13.9: Measures of Central Tendency and Dispersion
What if you polled 20 adults and asked them how much money they save for retirement each year? You record the results. How could you numerically describe the average amount your survey participants are saving annually? After completing this Concept, you'll be able to calculate and compare measures of central tendency to describe a data set like this one.
Watch This
CK-12 Foundation: Measures of Central Tendency
The following video is an introduction to the mean, median, and mode.
Khan Academy: Statistics: The Average
The narrator models finding the mean, median, and mode of a set of numbers. While this is similar to some of the content below, you may find it to be a helpful comparison of what the three measures of central tendency show.
Guidance
The word “average” is often used to describe the general characteristics of a group of unequal objects. Mathematically, an average is a single number which can be used to summarize a collection of numerical values. In mathematics, there are several types of “averages” with the most common being the mean, the median and the mode.
Mean
The arithmetic mean of a group of numbers is found by dividing the sum of the numbers by the number of values in the group. In other words, we add all the numbers together and divide by the number of numbers.
Example A
Find the mean of the numbers 11, 16, 9, 15, 5, 18.
Solution
There are six separate numbers, so the \begin{align*}\text{mean} = \frac{11 + 16 + 9+ 15+ 5+18}{6}=\frac{74}{6}=12\frac{1}{3}\end{align*}.
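As a quick check of Example A, the calculation can be sketched in Python. This is only an illustration: the names `values` and `arithmetic_mean` are made up here, and `Fraction` is used so the result stays exact instead of being rounded to a decimal.

```python
from fractions import Fraction


def arithmetic_mean(values):
    """Return the sum of the values divided by how many values there are."""
    return Fraction(sum(values), len(values))


values = [11, 16, 9, 15, 5, 18]  # the numbers from Example A
mean_value = arithmetic_mean(values)

print(mean_value)         # 37/3, which is the same as 12 1/3
print(float(mean_value))  # about 12.33
```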
The arithmetic mean is what most people automatically think of when the word average is used with numbers. It’s generally a good way to take an average, but it can be misleading when a small number of the values lie very far away from the rest. A classic example would be when calculating average income. If one person (such as former Microsoft Corporation chairman Bill Gates) earns a great deal more than everyone else who is surveyed, then that one value can sway the mean significantly away from what the majority of people earn.
Example B
The annual incomes for 8 professions are shown below. From the data, calculate the mean annual income of the 8 professions.
Profession | Annual Income |
---|---|
Farming, Fishing, and Forestry | $19,630 |
Sales and Related | $28,920 |
Architecture and Engineering | $56,330 |
Healthcare Practitioners | $49,930 |
Legal | $69,030 |
Teaching & Education | $39,130 |
Construction | $35,460 |
Professional Baseball Player* | $2,476,590 |
(Source: Bureau of Labor Statistics, except (*): The Baseball Players' Association (playbpa.com).)
Solution
There are 8 values listed, so the mean is
\begin{align*}\frac{19630+28920+56330+49930+69030+39130+35460+2476590}{8}= \$346,877.50\end{align*}
As you can see, the mean annual income is substantially larger than the income of 7 out of the 8 professions. The single outlier (the baseball player) has a dramatic effect on the mean, so the mean is not a good way to represent the ‘average’ salary in this case.
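To see how far one value can move the mean, the sketch below recomputes the mean of the Example B incomes with and without the baseball player. The variable and function names are illustrative only.

```python
# Annual incomes from the Example B table, in dollars.
incomes = {
    "Farming, Fishing, and Forestry": 19630,
    "Sales and Related": 28920,
    "Architecture and Engineering": 56330,
    "Healthcare Practitioners": 49930,
    "Legal": 69030,
    "Teaching & Education": 39130,
    "Construction": 35460,
    "Professional Baseball Player": 2476590,
}


def mean(values):
    """Add all the values and divide by the number of values."""
    values = list(values)
    return sum(values) / len(values)


mean_all = mean(incomes.values())
mean_without_outlier = mean(
    income
    for job, income in incomes.items()
    if job != "Professional Baseball Player"
)

print(f"{mean_all:,.2f}")              # 346,877.50
print(f"{mean_without_outlier:,.2f}")  # 42,632.86
```

Dropping the one outlier brings the mean down to a figure that sits among the other seven incomes.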
Median
The median is another type of average. It is defined as the value in the middle of a group of numbers. To find the median, we must first list all the numbers in order from least to greatest.
Example C
Find the median of the numbers 11, 21, 6, 17, 9.
Solution
We first list the numbers in ascending order: 6, 9, 11, 17, 21.
The median is the value in the middle of the set, which here is the third of the five ordered values.
The median is 11. There are two values higher than 11 and two values lower than 11.
If there is an even number of values, then the median is the arithmetic mean of the two numbers in the middle (in other words, the number halfway between them).
The median is a useful measure of average when the data set is highly skewed by a small number of points that are extremely large or extremely small. Such outliers will have a large effect on the mean, but will leave the median relatively unchanged.
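The procedure for odd and even numbers of values can be sketched as a short Python function. The function name `median` and the data sets other than Example C are assumptions made for illustration.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of values, return the mean of the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


example_c = [11, 21, 6, 17, 9]
print(median(example_c))   # 11

even_count = [6, 9, 11, 17]  # hypothetical set with an even number of values
print(median(even_count))  # 10.0

# Example C with 21 replaced by a made-up outlier.
with_outlier = [11, 2100, 6, 17, 9]
print(median(with_outlier))                       # 11
print(sum(example_c) / 5, sum(with_outlier) / 5)  # 12.8 428.6
```

The outlier leaves the median at 11 while pulling the mean from 12.8 up to 428.6.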
Mode
The mode can be a useful measure of data when that data falls into a small number of categories. It is simply a measure of the most common number, or sometimes the most popular choice. The mode is an especially useful concept for data sets that contain non-numerical information, such as surveys of eye color or favorite ice-cream flavor.
Of course, a data set can contain more than one mode; when it does, it is called multimodal. In fact, every value in a data set could be a mode, if every value appears an equal number of times. However, this situation is quite rare. You might encounter data sets with two or even three modes, but more than that would be unlikely unless you are working with very small sample sets.
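A minimal Python sketch of finding the mode (or modes) is shown below. The function name `modes` and both survey lists are made up for illustration; the function returns every value that ties for the highest frequency, so it also handles multimodal and non-numerical data.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]  # made-up survey
print(modes(eye_colors))  # ['brown']

flavors = ["vanilla", "mint", "vanilla", "mint", "chocolate"]  # made-up survey
print(modes(flavors))     # ['vanilla', 'mint']

print(modes([4, 8, 15]))  # [4, 8, 15]: each value appears once, so all tie
```

In the last case every value ties for most frequent. Some texts describe such a data set as having no mode instead.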
Example D
Jim is helping to raise money at his church bake sale by doing face painting for children. He collects the ages of his customers, and displays the data in the graph below. Find the mean, median and mode for the ages represented.
Solution
By reading the graph we can see that there was one 2-year-old, three 3-year-olds, four 4-year-olds, etc. In total, there were \begin{align*}1 + 3 + 4 + 5 + 6 + 7 + 3 + 1 = 30\end{align*} customers.
The mean age is found by adding up all the ages multiplied by the number of times each age appears, and then dividing by 30:
\begin{align*}\frac{2(1)+3(3)+4(4)+5(5)+6(6)+7(7)+8(3)+9(1)}{30} = \frac{170}{30} = 5 \frac{2}{3} \end{align*}
Since there are 30 children, the median is halfway between the \begin{align*}15^{th}\end{align*} and \begin{align*}16^{th}\end{align*} ages when the ages are listed in order (that way there will be 15 values on each side of the median). Counting up from the youngest, the first 13 children are 5 or younger and the next six are 6 years old, so both the \begin{align*}15^{th}\end{align*} and \begin{align*}16^{th}\end{align*} values are 6. Therefore the median is 6.
The mode is given by the age group with the highest frequency. Reading directly from the graph, we see that the mode is 7; there are more 7-year-olds than any other age.
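The same three calculations can be sketched in Python, working directly from the frequencies read off the graph. The dictionary `age_counts` and the other names are illustrative, and the median step assumes an even number of customers, as in this example.

```python
# Each age mapped to the number of customers of that age (from the solution above).
age_counts = {2: 1, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 3, 9: 1}

total = sum(age_counts.values())
weighted_sum = sum(age * count for age, count in age_counts.items())
mean_age = weighted_sum / total

# Write out every customer's age in order, then average the two middle values.
ages = [age for age, count in sorted(age_counts.items()) for _ in range(count)]
median_age = (ages[total // 2 - 1] + ages[total // 2]) / 2  # 15th and 16th values

# The mode is the age with the largest count.
mode_age = max(age_counts, key=age_counts.get)

print(total, weighted_sum)                       # 30 170
print(round(mean_age, 2), median_age, mode_age)  # 5.67 6.0 7
```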
Vocabulary
- The arithmetic mean of a group of numbers is found by dividing the sum of the numbers by the number of values in the group. In other words, we add all the numbers together and divide by the number of numbers.
- The median is another type of average. It is defined as the value in the middle of a group of numbers. To find the median, we must first list all the numbers in order from least to greatest. A useful formula for finding the middle value is as follows:
if there are \begin{align*}n\end{align*} values in the data set, the median is the \begin{align*}\frac{n+1}{2}\end{align*}th value.
- The mode is the most frequent number(s). If no number repeats, there is no mode. There can be more than one mode.
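As a brief illustration of the position formula given for the median above, consider one data set with an odd number of values and one with an even number (the sizes 5 and 8 are chosen only as examples):
\begin{align*}n = 5: \quad \frac{n+1}{2} = \frac{5+1}{2} = 3 \quad \text{(the median is the 3rd value in order)}\end{align*}
\begin{align*}n = 8: \quad \frac{n+1}{2} = \frac{8+1}{2} = 4.5 \quad \text{(the median is halfway between the 4th and 5th values in order)}\end{align*}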
Guided Practice
Find the mean, median and mode of the numbers 2, 17, 1, -3, 12, 8, 12, 16.
Solution:
There are 8 values, so \begin{align*}\text{Mean}=\frac{2+17+1+(-3)+12 + 8 + 12 +16}{8}= \frac{65}{8} = 8.125\end{align*}
We first list the numbers in ascending order: -3, 1, 2, 8, 12, 12, 16, 17.
The median is the value in the middle of the set, so the median lies between 8 and 12. Halfway between 8 and 12 is 10, so 10 is the median.
The mode is the most frequent number or numbers. The only number that repeats is 12, so 12 is the mode.
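Results like these can also be checked with Python's built-in `statistics` module, as in the short sketch below. The variable names are illustrative.

```python
import statistics

data = [2, 17, 1, -3, 12, 8, 12, 16]

mean_value = statistics.mean(data)      # sum of the 8 values divided by 8
median_value = statistics.median(data)  # mean of the two middle values
mode_value = statistics.mode(data)      # the most frequent value

print(mean_value, median_value, mode_value)  # 8.125 10.0 12
```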
Practice
- Find the median and mode of the numbers given in Example A.
- Find the median and mode of the salaries given in Example B.
- Find the mean, median and mode of the data set: \begin{align*}14, 9, 3, 14, 2, 7, 13, 6. \end{align*}
- Find the mean, median and mode of the data set: \begin{align*}5, 3, 5, 0, 1, 5, 3, 4, 4, 4\end{align*}.
- Find the mean, median and mode of the data set: \begin{align*}8, 5, 10, 4, 4, 10, 6, 4, 7, 8, 2, 8, 10, 9, 2, 1, 6, 10, 5, 3\end{align*}.
- Find the mean, median and mode of the following numbers. Which of these will give the best average? 15, 19, 15, 16, 11, 11, 18, 21, 165, 9, 11, 20, 16, 8, 17, 10, 12, 11, 16, 14
- Ten house sales in Encinitas, California are shown in the table below. Find the mean, median and standard deviation for the sale prices. Explain, using the data, why the median house price is most often used as a measure of the house prices in an area.
Address | Sale Price | Date Of Sale |
---|---|---|
643 3RD ST | $1,137,000 | 6/5/2007 |
911 CORNISH DR | $879,000 | 6/5/2007 |
911 ARDEN DR | $950,000 | 6/13/2007 |
715 S VULCAN AVE | $875,000 | 4/30/2007 |
510 4TH ST | $1,499,000 | 4/26/2007 |
415 ARDEN DR | $875,000 | 5/11/2007 |
226 5TH ST | $4,000,000 | 5/3/2007 |
710 3RD ST | $975,000 | 3/13/2007 |
68 LA VETA AVE | $796,793 | 2/8/2007 |
207 WEST D ST | $2,100,000 | 3/15/2007 |
For 8-10, determine which measure of center (mean, median or mode) would be most appropriate for the following.
- The life expectancy of store-bought goldfish.
- The age in years of audience for a kids TV program.
- The weight of potato sacks that a store labels as “5 pound bag.”
Term | Definition |
---|---|
arithmetic mean | The arithmetic mean is also called the average. |
descriptive statistics | In descriptive statistics, the goal is to describe the data found in a sample or given in a problem. |
inferential statistics | With inferential statistics, your goal is to use the data in a sample to draw conclusions about a larger population. |
measure of central tendency | In statistics, a measure of central tendency of a data set is a central or typical value of the data set. |
Median | The median of a data set is the middle value of an organized data set. |
Mode | The mode of a data set is the value or values with greatest frequency in the data set. |
multimodal | When a set of data has 2 or more values that occur with the same greatest frequency, the set is called multimodal. |
Outlier | In statistics, an outlier is a data value that is far from other data values. |
Population Mean | The population mean is the mean of all of the members of an entire population. |
resistant | A statistic that is not affected by outliers is called resistant. |
Sample Mean | A sample mean is the mean only of the members of a sample or subset of a population. |
Here you'll learn how to find the measures of central tendency (the mean, the median, and the mode) for a set of data.