
Summary Statistics, Summarizing Univariate Distributions

Describing single variable data with additional measures of the center and common percentiles


More Measures of Center 

The mean, median and mode are only a few possible measures of center. While they are the most commonly used measures of center, it is important to be familiar with some other measures of center that are sometimes used as well.


Midrange

The midrange (sometimes called the midextreme) is found by taking the mean of the maximum and minimum values of the data set.

Determining the Midrange 

Consider the following quiz grades: 75, 80, 90, 94, and 96. The midrange would be:

$$\frac{75 + 96}{2} = 85.5$$


Since it is based on only the two most extreme values, the midrange is not commonly used as a measure of central tendency.
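The midrange calculation can be sketched in a few lines of Python. The function name `midrange` is chosen here for illustration and is not part of any standard library.

```python
def midrange(values):
    """Return the mean of the smallest and largest values."""
    if not values:
        raise ValueError("midrange requires at least one value")
    return (min(values) + max(values)) / 2


quiz_grades = [75, 80, 90, 94, 96]
print(midrange(quiz_grades))  # 85.5
```

Because only `min` and `max` enter the calculation, changing any of the middle grades leaves the result unchanged.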

Trimmed Mean

Recall that the mean is not resistant to the effects of outliers. Many students ask their teacher to “drop the lowest grade.” The argument is that everyone has a bad day, and one extreme grade that is not typical of the rest of their work should not have such a strong influence on their mean grade. The problem is that this can work both ways; it could also be true that a student who is performing poorly most of the time could have a really good day (or even get lucky) and get one extremely high grade. We wouldn’t blame this student for not asking the teacher to drop the highest grade! Attempting to more accurately describe a data set by removing the extreme values is referred to as trimming the data. To be fair, though, a valid trimmed statistic must remove both the extreme maximum and minimum values. So, while some students might disapprove, to calculate a trimmed mean you remove the maximum and minimum values, add the values that remain, and divide by the number of values that remain.

Determining the Trimmed Mean 

Consider the following quiz grades: 75, 80, 90, 94, 96.

A trimmed mean would remove the largest and smallest values, 75 and 96, and divide the sum of the remaining values by 3:

$$\frac{80 + 90 + 94}{3} = \frac{264}{3} = 88$$
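As a minimal sketch, the same calculation can be written in Python. The helper name `trimmed_mean_drop_extremes` is hypothetical, and the function assumes exactly one value is dropped from each end.

```python
def trimmed_mean_drop_extremes(values):
    """Mean of the data after dropping one minimum and one maximum."""
    ordered = sorted(values)
    if len(ordered) < 3:
        raise ValueError("need at least three values to trim both ends")
    kept = ordered[1:-1]  # discard the smallest and the largest value
    return sum(kept) / len(kept)


quiz_grades = [75, 80, 90, 94, 96]
print(trimmed_mean_drop_extremes(quiz_grades))  # 88.0
```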


n% Trimmed Mean

Instead of removing just the minimum and maximum in a larger data set, a statistician may choose to remove a certain percentage of the extreme values. This is called an n% trimmed mean. To perform this calculation, remove the specified percentage of the values from each end of the ordered data. For example, in a data set that contains 100 numbers, to calculate a 10% trimmed mean, remove 10% of the data from each end. In this simplified example, the ten smallest and the ten largest values would be discarded, and the sum of the remaining numbers would be divided by 80.

Calculating a 5% Trimmed Mean 

In real data, it is not always so straightforward. To illustrate this, let’s return to our data from the number of children in a household and calculate a 5% trimmed mean. Here is the data set:

1, 3, 4, 3, 1, 2, 2, 2, 1, 2, 2, 3, 4, 5, 1, 2, 3, 2, 1, 2, 3, 6

Placing the data in order yields the following:

1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5, 6

Five percent of 22 values is 1.1, so we could remove one from each end (2 total), which is approximately 4.5% trimmed, or we could remove 2 numbers from each end (4 total), which is approximately 9% trimmed. Some statisticians would calculate both of these and then use proportions to find an approximation for 5%. Others might argue that 4.5% is closer, so we should use that value. For our purposes, and to stay consistent with the way we handle similar situations in later chapters, we will always opt to remove more numbers than necessary. The logic behind this is simple. You are claiming to remove 5% of the numbers. If you cannot remove exactly 5%, then you have to remove either more or fewer. We would prefer to err on the side of caution and remove at least the percentage reported. Following this rule, we remove two values from each end (1, 1, 5, and 6) and average the remaining 18 values, whose sum is 42, to get 42/18, or approximately 2.33. This is not a hard and fast rule and is a good illustration of how many concepts in statistics are open to individual interpretation. Some statisticians even say that the only correct answer to every question asked in statistics is, “It depends!”
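The round-up convention described above can be sketched in Python as follows. The function name `percent_trimmed_mean` is an assumption made for illustration, and the use of `math.ceil` encodes this lesson's rule of removing at least the stated percentage; other texts and software packages may round differently.

```python
import math


def percent_trimmed_mean(values, percent):
    """n% trimmed mean, rounding the trim count up on each end."""
    ordered = sorted(values)
    # Number of values to drop from EACH end; round up so that at
    # least the stated percentage is removed.
    k = math.ceil(len(ordered) * percent / 100)
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("trimming removed every value")
    return sum(kept) / len(kept)


children = [1, 3, 4, 3, 1, 2, 2, 2, 1, 2, 2, 3, 4, 5, 1, 2, 3, 2, 1, 2, 3, 6]
# 5% of 22 is 1.1, which rounds up to 2 values removed from each end.
print(round(percent_trimmed_mean(children, 5), 2))  # 2.33
```

Replacing `math.ceil` with `math.floor` would give the "remove fewer" alternative (one value from each end) mentioned in the text.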

Weighted Mean

The weighted mean is a method of calculating the mean where instead of each data point contributing equally to the mean, some data points contribute more than others. This could be because they appear more often or because a decision was made to increase their importance (give them more weight). The most common type of weight to use is the frequency, which is the number of times each number is observed in the data. When we calculated the mean for the children living at home, we could have used a weighted mean calculation. The calculation would look like this:

$$\frac{(5)(1) + (8)(2) + (5)(3) + (2)(4) + (1)(5) + (1)(6)}{22} = \frac{55}{22} = 2.5$$

The symbolic representation of this is as follows:

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$$

where:

x_i is the ith distinct data value.

f_i is the number of times that data value occurs.

n is the number of distinct data values (6 in the example above).
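A frequency-weighted mean can be sketched in Python as shown here. The function name `weighted_mean` is hypothetical; `collections.Counter` from the standard library is used only to tally how often each value occurs in the children data.

```python
from collections import Counter


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(f * x for x, f in zip(values, weights)) / total_weight


children = [1, 3, 4, 3, 1, 2, 2, 2, 1, 2, 2, 3, 4, 5, 1, 2, 3, 2, 1, 2, 3, 6]
counts = Counter(children)  # maps each distinct value to its frequency
print(weighted_mean(list(counts.keys()), list(counts.values())))  # 2.5
```

The result matches the ordinary mean of the 22 raw values, since weighting by frequency is just a shorter way of adding the repeated values.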

We may be interested in other sections of the data besides the center or middle. We could be interested in some lower percentage of the data or some higher portion of the data. The following topics will explain how to look at certain portions or percentages of a data set.

Percentiles and Quartiles

A percentile is a statistic that identifies the percentage of the data that is less than the given value. The most commonly used percentile is the median. Because it is in the numeric middle of the data, half of the data is below the median. Therefore, we could also call the median the 50th percentile. Similarly, the 40th percentile is a value below which 40% of the data fall.

To check a child’s physical development, pediatricians use height and weight charts that help them to know how the child compares to children of the same age. A child whose height is in the 70th percentile is taller than 70% of children of the same age.
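To make the definition concrete, here is a small Python sketch that reports the percentage of a data set lying strictly below a given value. The function name `percent_below` is chosen for illustration, and the strict "less than" comparison follows the definition used in this lesson; other definitions count values less than or equal to the given value.

```python
def percent_below(values, x):
    """Percentage of the data that is strictly less than x."""
    if not values:
        raise ValueError("need at least one value")
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


quiz_grades = [75, 80, 90, 94, 96]
# Three of the five grades (75, 80, 90) are below 94.
print(percent_below(quiz_grades, 94))  # 60.0
```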

Two very commonly used percentiles are the 25th and 75th percentiles. The median, 25th, and 75th percentiles divide the data into four parts. Because of this, the 25th percentile is notated as Q1 and is called the lower quartile, and the 75th percentile is notated as Q3 and is called the upper quartile. The median is a middle quartile and is sometimes referred to as Q2.

Finding the Median, Lower Quartile, and Upper Quartile 

Let's return to the previous data set, which is as follows:

1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5, 6

Find the median, lower quartile and upper quartile.

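Recall that the median (50th percentile) is 2. The quartiles can be thought of as the medians of the upper and lower halves of the data.

Lower half: 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, so the lower quartile (Q1) is 2.

Upper half: 2, 2, 3, 3, 3, 3, 3, 4, 4, 5, 6, so the upper quartile (Q3) is 3.

In this case, there is an odd number of values (11) in each half, so each quartile is the middle (6th) value of its half. If there were an even number of values, then we would follow the procedure for medians and average the middle two values of each half.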

Finding the Median, 1st Quartile, and 3rd Quartile 

Find the median, 1st quartile and 3rd quartile for the quiz grades from earlier: 75, 80, 90, 94, 96.

The median in this set is 90. Because it is the middle number, it is not technically part of either the lower or upper halves of the data, so we do not include it when calculating the quartiles. However, not all statisticians agree that this is the proper way to calculate the quartiles in this case. As we mentioned in the last section, some things in statistics are not quite as universally agreed upon as in other branches of mathematics. The exact method for calculating quartiles is another one of these topics.
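The median-of-halves method, with the median left out of both halves when the number of values is odd, can be sketched in Python like this. The function name `quartiles_excluding_median` is hypothetical. The sketch implements only the convention used in this lesson; software packages often use other quartile methods and may return different values. The quiz grades from earlier are used as the odd-sized example because their median is 90.

```python
from statistics import median


def quartiles_excluding_median(values):
    """Return (Q1, median, Q3) using medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    # With an odd count, skip the middle value so it is in neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


children = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 5, 6]
q1, med, q3 = quartiles_excluding_median(children)  # Q1 = 2, median = 2, Q3 = 3

quiz_grades = [75, 80, 90, 94, 96]
q1, med, q3 = quartiles_excluding_median(quiz_grades)  # Q1 = 77.5, median = 90, Q3 = 95
```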

Technology Notes:

Calculating Medians and Quartiles on the TI-83/84 Graphing Calculator

The median and quartiles can also be calculated using a graphing calculator. You may have noticed earlier that median is available in the MATH submenu of the LIST menu.

While there is a way to access each quartile individually, we will usually want them both, so we will access them through the one-variable statistics in the STAT menu.

You should still have the data in L1 and the frequencies, or weights, in L2, so press [STAT], and then arrow over to CALC and press [ENTER] or press [1] for '1-Var Stats', which returns you to the home screen. Press [2ND][L1][,][2ND][L2] to enter the data and frequency lists. When you press [ENTER], look at the bottom left-hand corner of the screen. You will notice there is an arrow pointing downward to indicate that there is more information. Scroll down to reveal the quartiles and the median.

Remember that Q1 corresponds to the 25th percentile, and Q3 corresponds to the 75th percentile.


Use the following data set for these examples:

2, 3, 6, 8, 11, 14, 15, 17, 18, 19, 20, 20, 24, 26, 27, 28, 28, 28, 32, 34, 38, 39, 43

Example 1

Find the minimum value

The minimum value is 2.

Example 2

Find the maximum value

The maximum value is 43.

Example 3

Find the median

Since there are 23 data points and they are listed in order, the 12th data point will be the median. This is 20.

Example 4

Find the upper quartile

The upper quartile is the median of the upper half of the data. Since there are 11 data points in the upper half, the upper quartile will be the 6th data point. The upper quartile will be 28.

Example 5

Find the lower quartile

The lower quartile will be the 6th data point in the first half of the data. The lower quartile is 14.


For 1-4, use the following data set:

2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 8, 8, 8, 9

Find the following:

  1. minimum and maximum
  2. midrange
  3. median
  4. upper and lower quartiles

For 5-11, the chart below shows the data from the Galapagos tortoise preservation program with just the number of individual tortoises that were bred in captivity and reintroduced into their native habitat.

Island or Volcano    Number of Individuals Repatriated
Wolf                 40
Darwin               0
Alcedo               0
Sierra Negra         286
Cerro Azul           357
Santa Cruz           210
Española             1293
San Cristóbal        55
Santiago             498
Pinzón               552
Pinta                0

Figure: Approximate Distribution of Giant Galapagos Tortoises in 2004 (“Estado Actual De Las Poblaciones de Tortugas Terrestres Gigantes en las Islas Galápagos,” Marquez, Wiedenfeld, Snell, Fritts, MacFarland, Tapia, y Nanjoa, Scologia Aplicada, Vol. 3, Num. 1,2, pp. 98-11).

For this data, calculate each of the following:

  5. mode
  6. median
  7. mean
  8. a 10% trimmed mean
  9. midrange
  10. upper and lower quartiles
  11. the percentile for the number of Santiago tortoises reintroduced
  12. Why is the answer to (7) significantly higher than the answer to (8)?
  13. How would you describe the difference between the midrange and the median?
  14. How can we represent data visually using the various measures of center?


Review (Answers) 

To view the Review answers, open this PDF file and look for section 1.4. 



lower quartile

The lower quartile, also known as Q1, is the median of the lower half of the data.

maximum

The maximum is the largest value in a data set.

midrange

The midrange is the mean of the maximum and minimum values.

minimum

The minimum is the smallest value in a data set.

percentile

A percentile is a data value for which the specified percentage of the data is below that value.

trimmed mean

In an n% trimmed mean, you remove n% of the values from each end of the ordered data before calculating the mean.

upper quartile

The upper quartile, also known as Q3, is the median of the upper half of the data.

weighted mean

A weighted mean involves multiplying individual data values by their frequencies or percentages before adding them and then dividing by the total of the frequencies (weights).
