
# Applications of Variance and Standard Deviation

## Analyzing data using technology.


Suppose you were given a histogram and asked to find the variance of the data it illustrates. Would you know how?

After this lesson, you will understand how to compare visualized data with variance.

Credit: Anders Sandberg
Source: https://www.flickr.com/photos/arenamontanus/2243540719

### Comparing Visualized Data with Variance

Knowing how to calculate the variance of a set when it is given to you as a list of values is great, but statistical data is often shared and disseminated in visual form rather than as raw data. Because of this, it is important to practice evaluating the variance of graphed data as well as tabular or raw data so you can actually apply your understanding of variance to real-world statistics.

In general, you will need to:

1. Identify the values of the dependent variable; these are the values whose variance you will find.
2. Sum the values and divide by the count of values to get the arithmetic mean.
3. Subtract the mean from each value to find its deviation, then square each deviation.
4. Sum the squared deviations and divide the total by the count of values in the data set. The result is the variance.
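
Written compactly, steps 2 through 4 amount to the two formulas below. This is only a restatement of the list above, assuming the data set is treated as a whole population; the symbols \begin{align*}N\end{align*} (the count of values) and \begin{align*}x_i\end{align*} (the individual values) are notation introduced here rather than taken from the lesson.

\begin{align*}
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2
\end{align*}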

#### Finding the Mean and Variance

Find the \begin{align*}\mu\end{align*} and \begin{align*}\sigma^2\end{align*} of the number of students in each classroom at Toni’s school:

| Classroom | Number of Students |
|---|---|
| A | 6 |
| B | 5 |
| C | 9 |
| D | 13 |
| E | 12 |
| F | 16 |
| G | 14 |

Follow the steps above to find the mean and variance of the number of students:

1. The frequency of students in each classroom is the dependent variable.

2. There are 7 values. Listed in ascending order, they are: 5, 6, 9, 12, 13, 14, and 16.

3. The sum of the values is \begin{align*}5+6+9+12+13+14+16=75\end{align*}; the mean is \begin{align*}\frac{75}{7}=10.714\end{align*}.

4. The deviations and squared deviations are:

| \begin{align*}\text{Value} - \text{Mean} = \text{Deviation}\end{align*} | \begin{align*}\text{Deviation}^2\end{align*} |
|---|---|
| \begin{align*}5-10.714=-5.714\end{align*} | 32.65 |
| \begin{align*}6-10.714=-4.714\end{align*} | 22.22 |
| \begin{align*}9-10.714=-1.714\end{align*} | 2.94 |
| \begin{align*}12-10.714=1.286\end{align*} | 1.654 |
| \begin{align*}13-10.714=2.286\end{align*} | 5.226 |
| \begin{align*}14-10.714=3.286\end{align*} | 10.798 |
| \begin{align*}16-10.714=5.286\end{align*} | 27.942 |

5. The sum of the squared deviations is 103.43. The variance is \begin{align*}\frac{103.43}{7}=14.776\end{align*}.
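
The five steps above can be retraced in a few lines of Python. This is a minimal sketch rather than part of the original lesson: the dictionary name `students` and the other variable names are chosen here for illustration, and the population formula (dividing by the count of values) is assumed, as in the worked example.

```python
# Number of students in each classroom, copied from the table above.
students = {"A": 6, "B": 5, "C": 9, "D": 13, "E": 12, "F": 16, "G": 14}

values = sorted(students.values())      # step 2: the values in ascending order
mean = sum(values) / len(values)        # step 3: arithmetic mean

# Step 4: deviation of each value from the mean, then its square.
squared_deviations = [(v - mean) ** 2 for v in values]

# Step 5: population variance = sum of squared deviations / count of values.
variance = sum(squared_deviations) / len(values)

print(f"mean = {mean:.3f}")
print(f"variance = {variance:.3f}")
```

Run as written, this should report a mean of about 10.714 and a variance of about 14.776, in agreement with the hand calculation.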

#### Finding the Mean and Variance of Graphed Data

Find the \begin{align*}\mu\end{align*} and \begin{align*}\sigma^2\end{align*} of the graphed data.

1. Most often, the dependent variable is represented by the vertical axis, and this histogram is no exception. The number of 4.0’s each year is the dependent variable, while the year is the independent variable.

2. In ascending order, the dependent variable values are:

\begin{align*}39, 45, 47, 51, 51, 54, 54, 56\end{align*}

3. The sum of the values is: \begin{align*}39+45+47+51+51+54+54+56=397\end{align*}.

The mean (μ) is \begin{align*}\frac{397}{8}=49.625\end{align*}, which suggests that a year with 50 or more 4.0 GPAs would be considered an above-average year.

4. The deviation and squared deviation of each value are:

| Deviation | Deviation² |
|---|---|
| \begin{align*}39-49.625=-10.625\end{align*} | \begin{align*}(-10.625)^2=112.89\end{align*} |
| \begin{align*}45-49.625=-4.625\end{align*} | \begin{align*}(-4.625)^2=21.39\end{align*} |
| \begin{align*}47-49.625=-2.625\end{align*} | \begin{align*}(-2.625)^2=6.89\end{align*} |
| \begin{align*}51-49.625=1.375\end{align*} | \begin{align*}(1.375)^2=1.89\end{align*} |
| \begin{align*}51-49.625=1.375\end{align*} | \begin{align*}(1.375)^2=1.89\end{align*} |
| \begin{align*}54-49.625=4.375\end{align*} | \begin{align*}(4.375)^2=19.14\end{align*} |
| \begin{align*}54-49.625=4.375\end{align*} | \begin{align*}(4.375)^2=19.14\end{align*} |
| \begin{align*}56-49.625=6.375\end{align*} | \begin{align*}(6.375)^2=40.64\end{align*} |

5. The sum of the squared deviations is 223.87, making the variance \begin{align*}\frac{223.87}{8}=27.98\end{align*}.
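
A table of deviations like this one is easy to mistype, so it can help to regenerate it from the raw values. The sketch below is not part of the original lesson; it assumes the eight values listed in step 2 were read correctly from the histogram, and it uses `mean` and `pvariance` from Python's standard `statistics` module. The name `counts` is chosen here for illustration.

```python
from statistics import mean, pvariance

# Yearly counts of 4.0 GPAs, as listed in step 2 (ascending order).
counts = [39, 45, 47, 51, 51, 54, 54, 56]

mu = mean(counts)

# Print one row per value: the value, its deviation from the mean,
# and the squared deviation, so each table row can be checked.
for x in counts:
    deviation = x - mu
    print(f"{x:>3}  {deviation:>8.3f}  {deviation ** 2:>8.2f}")

# pvariance divides by the number of values (population variance).
sigma_squared = pvariance(counts)
print(f"mean = {mu}, variance = {sigma_squared:.2f}")
```

Comparing the printed rows with the table, one row at a time, is a quick way to catch a copied or mis-squared entry before it reaches the final sum.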

#### Interpreting Frequency Polygons

Based on the data in the frequency polygon, which year had the greatest variance in number of shoe brands at various prices, and which had the least variance?

Each of the three data sets contains 6 values, and the mean of each set is:

• 2008: Sum: \begin{align*}19+25+21+25+9+6=105\end{align*} Mean: \begin{align*}\frac{105}{6}=17.5\end{align*}
• 2007: Sum: \begin{align*}16+19+17+19+7+3=81\end{align*} Mean: \begin{align*}\frac{81}{6}=13.5\end{align*}
• 2006: Sum: \begin{align*}14+17+16+15+6+3=71\end{align*} Mean: \begin{align*}\frac{71}{6}=11.83\end{align*}

The sum of the squared deviations for each year is:

• 2008: \begin{align*}(19-17.5)^2+(25-17.5)^2+(21-17.5)^2+(25-17.5)^2+(9-17.5)^2+(6-17.5)^2=331.5\end{align*}
• 2007: \begin{align*}(16-13.5)^2+(19-13.5)^2+(17-13.5)^2+(19-13.5)^2+(7-13.5)^2+(3-13.5)^2=231.5\end{align*}
• 2006: \begin{align*}(14-11.83)^2+(17-11.83)^2+(16-11.83)^2+(15-11.83)^2+(6-11.83)^2+(3-11.83)^2=170.833\end{align*}

The variance of each set is:

• 2008: \begin{align*}\frac{331.5}{6}=55.25\end{align*}
• 2007: \begin{align*} \frac{231.5}{6}=38.583\end{align*}
• 2006: \begin{align*}\frac{170.833}{6}=28.472 \end{align*}

\begin{align*}\therefore\end{align*} 2008 has the greatest variance and 2006 has the least variance.
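
The same comparison can be sketched in Python. This is an illustration rather than part of the lesson: it assumes the six values per year are exactly those listed in the sums above, treats each year as a whole population, and uses `pvariance` from the standard `statistics` module. The dictionary name `brands_by_year` is hypothetical.

```python
from statistics import pvariance

# Six values per year, copied from the sums listed above.
brands_by_year = {
    2008: [19, 25, 21, 25, 9, 6],
    2007: [16, 19, 17, 19, 7, 3],
    2006: [14, 17, 16, 15, 6, 3],
}

# Population variance for each year.
variances = {year: pvariance(counts) for year, counts in brands_by_year.items()}

# Years with the largest and smallest variance.
greatest = max(variances, key=variances.get)
least = min(variances, key=variances.get)

for year, var in variances.items():
    print(f"{year}: {var:.3f}")
print(f"greatest: {greatest}, least: {least}")
```

Storing the variances in a dictionary keyed by year keeps the final comparison to a single `max` and `min` call.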

#### Earlier Problem Revisited

Could you find the variance of a data set presented as a histogram?

After your practice above, this should no longer be a problem!

### Examples

The number of cars of various colors in a parking lot with 5 levels is summarized in the table below. Use the data to answer Examples 1-4.

| | Red | Yellow | Blue | White |
|---|---|---|---|---|
| Level 1 | 11 | 4 | 9 | 14 |
| Level 2 | 9 | 3 | 8 | 11 |
| Level 3 | 13 | 5 | 10 | 12 |
| Level 4 | 14 | 4 | 7 | 9 |
| Level 5 | 12 | 6 | 13 | 7 |

#### Example 1

What is the variance of red cars among the 5 levels?

The population of red cars across the 5 levels is: 11, 9, 13, 14, and 12.

• Add the values and divide by five to get the mean of 11.8.
• Square each of the values and sum the squares: \begin{align*}11^2+9^2+13^2+14^2+12^2=711\end{align*}.
• Divide the sum of the squares by the number of values in the set (since this is the whole population of red cars), getting \begin{align*}\frac{711}{5}=142.2\end{align*}, and subtract the square of the mean \begin{align*}(11.8^2=139.24)\end{align*}.
• The variance of the population of red cars is \begin{align*}142.2-139.24=2.96\end{align*}.
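
Example 1 uses a shortcut: the mean of the squared values minus the square of the mean. The short derivation below is standard algebra rather than part of the original lesson, and it shows why the shortcut equals the average squared deviation. Here \begin{align*}N\end{align*} is the count of values and \begin{align*}x_i\end{align*} are the individual values (notation assumed for this sketch).

\begin{align*}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2 \\
&= \frac{1}{N}\sum_{i=1}^{N}x_i^2 - 2\mu\cdot\frac{1}{N}\sum_{i=1}^{N}x_i + \mu^2 \\
&= \frac{1}{N}\sum_{i=1}^{N}x_i^2 - 2\mu^2 + \mu^2 \\
&= \frac{1}{N}\sum_{i=1}^{N}x_i^2 - \mu^2
\end{align*}

The third line uses the fact that \begin{align*}\frac{1}{N}\sum x_i\end{align*} is the mean \begin{align*}\mu\end{align*} itself. The shortcut applies to the population variance only; it does not divide by \begin{align*}N-1\end{align*}.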

#### Example 2

What is the variance in the number of cars of each color on the levels above level 3?

The levels above level 3 include only levels 4 and 5. The total numbers of red, yellow, blue, and white cars on those two levels are 26, 10, 20, and 16, respectively.

• The mean number of cars of each color is \begin{align*}\frac{26+10+20+16}{4}=18\end{align*}
• Square the values and find the sum: \begin{align*}26^2+10^2+20^2+16^2=1432\end{align*}
• Divide the sum of the squares by the number of values: \begin{align*}\frac{1432}{4}=358\end{align*}. Subtract the squared mean \begin{align*}(18^2=324)\end{align*} to get the variance: \begin{align*}358-324=34\end{align*}
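
The totals and the variance in this example can be reproduced from the table. The sketch below is illustrative only: the nested dictionary `cars` and the list `upper_levels` are names invented here, and the four color totals are treated as a whole population, as in the worked solution.

```python
# Car counts per level, copied from the table above.
cars = {
    "Level 1": {"Red": 11, "Yellow": 4, "Blue": 9, "White": 14},
    "Level 2": {"Red": 9, "Yellow": 3, "Blue": 8, "White": 11},
    "Level 3": {"Red": 13, "Yellow": 5, "Blue": 10, "White": 12},
    "Level 4": {"Red": 14, "Yellow": 4, "Blue": 7, "White": 9},
    "Level 5": {"Red": 12, "Yellow": 6, "Blue": 13, "White": 7},
}

# The levels above level 3.
upper_levels = ["Level 4", "Level 5"]

# Total number of cars of each color on those levels.
totals = {
    color: sum(cars[level][color] for level in upper_levels)
    for color in ["Red", "Yellow", "Blue", "White"]
}

n = len(totals)
mean = sum(totals.values()) / n
mean_of_squares = sum(t ** 2 for t in totals.values()) / n

# Shortcut form: mean of the squares minus the square of the mean.
variance = mean_of_squares - mean ** 2

print(totals)
print(variance)
```

Changing `upper_levels` to a different list of level names would answer the same question for another part of the parking lot.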

#### Example 3

What is the variance of blue cars across the 5 levels?

The blue car counts are: 9, 8, 10, 7, and 13.

• The mean number of blue cars is \begin{align*}\frac{47}{5}=9.4\end{align*}.
• The sum of the squared values is \begin{align*}9^2+8^2+10^2+7^2 +13^2=463\end{align*}; dividing by the number of levels (5) gives 92.6.
• Subtract the squared mean \begin{align*}(9.4^2=88.36)\end{align*} to get the variance.
• The variance is \begin{align*}92.6-88.36=4.24\end{align*}.

#### Example 4

If we take a sample of levels by rolling a die and end up with levels 1, 3, and 5, what is the variance of white cars in the sample?

The number of white cars on levels 1, 3, and 5 is 14, 12, and 7.

• The mean number of white cars in this sample is \begin{align*}\frac{33}{3} = 11\end{align*}.
• Since this is a sample, we need to use the individual deviations: subtract the mean from each value, square the result of each subtraction, then find the sum: \begin{align*}(14-11)^2+(12-11)^2+(7-11)^2=26\end{align*}.
• Divide the sum of the squared deviations by the number of values minus 1 (remember, this is a sample!): \begin{align*}\frac{26}{2}=13\end{align*}.
• The sample variance is 13.
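
The difference between dividing by the number of values and dividing by one less can be seen side by side. This sketch is an addition, not part of the lesson; it relies on the standard `statistics` module, where `variance` computes the sample variance and `pvariance` the population variance. The variable names are chosen here for illustration.

```python
from statistics import pvariance, variance

# White cars on the sampled levels 1, 3, and 5.
white_sample = [14, 12, 7]

# variance() divides the sum of squared deviations by n - 1 (sample variance).
sample_var = variance(white_sample)

# pvariance() divides by n; shown only for contrast with the sample formula.
population_style_var = pvariance(white_sample)

print(sample_var)
print(round(population_style_var, 3))
```

The sample variance matches the 13 found above, while dividing by 3 instead of 2 would give 26/3, or roughly 8.667. With so few values, the choice of divisor changes the result noticeably.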

### Review

Find the variance:

1. 365, 400.7, 303, 479, 514.2, 500, 489

2. 7200, 7020, 7165.9, 7000, 7796, 7012, 7016.1

3. 17, 10.3, 30.7, 70, 66, 76, 40, 53

4. 3607, 3600, 3600, 3631, 3600.6

5. 700, 700, 712, 756, 741, 716, 782

6. 3370, 3300.5, 3366, 3306.6, 3310, 3336, 3301.3

Calculate the sample variance:

7. 34.4, 34, 34.7, 34.6, 34, 34.1, 31, 31.3

8. 989.22, 990.6, 992, 996.9, 981.1, 986, 975

9. 10, 16, 10.33, 10.63, 18, 17, 16.36, 10.46

10. 3240, 3260, 3250, 3280, 3280, 3300, 3310, 3270


### Vocabulary

| Term | Definition |
|---|---|
| disseminated data | Disseminated data is data that has been given out to others. |
| normal distribution curve | A normal distribution curve is a symmetrical curve that shows the highest frequency in the center with an identical curve on either side of the center. |
| tabular data | Tabular data is data presented in the form of a table, or, depending on the use, it may refer to data points separated by tabs. |