
Applications of Variance and Standard Deviation

Analyzing data using technology.

Variance Practice

Suppose you were given a histogram and asked to find the variance of the data it illustrates. Would you know how?

After this lesson, you will understand how to compare visualized data with variance.

Credit: Anders Sandberg
Source: https://www.flickr.com/photos/arenamontanus/2243540719
License: CC BY-NC 3.0

Comparing Visualized Data with Variance 

Knowing how to calculate the variance of a set when it is given to you as a list of values is great, but statistical data is often shared and disseminated in visual form rather than as raw data. Because of this, it is important to practice evaluating the variance of graphed data as well as tabular or raw data so you can actually apply your understanding of variance to real-world statistics.

In general, you will need to:

  1. Identify the values of the dependent variable, as these are the values you will be finding the variance of.
  2. Sum the values and calculate the arithmetic mean.
  3. Subtract the mean from each value to find its deviation, then square each deviation.
  4. Sum the squared deviations and divide the total by the count of values in the data set. The result is the variance.
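
The four steps above can be sketched in Python. This is a minimal illustration rather than part of the lesson itself: the function name `population_variance` and the short data list are assumptions chosen for the example.

```python
def population_variance(values):
    """Return (mean, variance) for a whole population, following the four steps."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    # Step 2: sum the values and compute the arithmetic mean.
    mean = sum(values) / n
    # Step 3: subtract the mean from each value and square the deviation.
    squared_deviations = [(x - mean) ** 2 for x in values]
    # Step 4: sum the squared deviations and divide by the count of values.
    variance = sum(squared_deviations) / n
    return mean, variance


# Step 1: the dependent-variable values (hypothetical data for illustration).
mean, variance = population_variance([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, variance)  # prints: 5.0 4.0
```

Dividing by the count of values treats the data as a complete population, which matches step 4 above.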

 

Finding the Mean and Variance 

Find the \begin{align*}\mu\end{align*} and \begin{align*}\sigma^2\end{align*} of the number of students in each classroom at Toni’s school:

Classroom    Number of Students
A            6
B            5
C            9
D            13
E            12
F            16
G            14

Follow the steps above to find the mean and variance of the number of students:

1. The number of students in each classroom is the dependent variable.

2. There are 7 values. Listed in ascending order, they are: 5, 6, 9, 12, 13, 14, and 16.

3. The sum of the values is \begin{align*}5+6+9+12+13+14+16=75\end{align*}, so the mean is \begin{align*}\frac{75}{7}\approx 10.714\end{align*}.

4. The deviations and squared deviations are:

\begin{align*}\text{Value} - \text{Mean} = \text{Deviation}\end{align*}    \begin{align*}\text{Deviation}^2\end{align*}
\begin{align*}5-10.714=-5.714\end{align*}    32.65
\begin{align*}6-10.714=-4.714\end{align*}    22.22
\begin{align*}9-10.714=-1.714\end{align*}    2.94
\begin{align*}12-10.714=1.286\end{align*}    1.654
\begin{align*}13-10.714=2.286\end{align*}    5.226
\begin{align*}14-10.714=3.286\end{align*}    10.798
\begin{align*}16-10.714=5.286\end{align*}    27.942

5. The sum of the squared deviations is 103.43. The variance is \begin{align*}\frac{103.43}{7}=14.776\end{align*}.

Finding the Mean and Variance of Graphed Data 

Find the \begin{align*}\mu\end{align*} and \begin{align*}\sigma^2\end{align*} of the graphed data.

Follow the steps outlined above:

1. Most often, the dependent variable is represented by the vertical axis, and this histogram is no exception. The number of 4.0’s each year is the dependent variable, while the year is the independent variable.

2. In ascending order, the dependent variable values are:

\begin{align*}39, 45, 47, 51, 51, 54, 54, 56\end{align*}

3. The sum of the values is: \begin{align*}39+45+47+51+51+54+54+56=397\end{align*}.

The mean (μ) is \begin{align*}\frac{397}{8}=49.625\end{align*}, which suggests that a year with 50 or more 4.0 GPAs would be considered an above-average year.

4. The deviation and squared deviation of each value are:

\begin{align*}\text{Deviation}\end{align*}    \begin{align*}\text{Deviation}^2\end{align*}
\begin{align*}39-49.625=-10.625\end{align*}    \begin{align*}(-10.625)^2=112.89\end{align*}
\begin{align*}45-49.625=-4.625\end{align*}    \begin{align*}(-4.625)^2=21.39\end{align*}
\begin{align*}47-49.625=-2.625\end{align*}    \begin{align*}(-2.625)^2=6.89\end{align*}
\begin{align*}51-49.625=1.375\end{align*}    \begin{align*}(1.375)^2=1.89\end{align*}
\begin{align*}51-49.625=1.375\end{align*}    \begin{align*}(1.375)^2=1.89\end{align*}
\begin{align*}54-49.625=4.375\end{align*}    \begin{align*}(4.375)^2=19.14\end{align*}
\begin{align*}54-49.625=4.375\end{align*}    \begin{align*}(4.375)^2=19.14\end{align*}
\begin{align*}56-49.625=6.375\end{align*}    \begin{align*}(6.375)^2=40.64\end{align*}

5. The sum of the squared deviations is 223.87, making the variance \begin{align*}\frac{223.87}{8}=27.98\end{align*}.

\begin{align*}\therefore \sigma^2=27.98 \end{align*}
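
A per-value table like the one above can be regenerated with a short script, which makes a mis-copied row easy to spot. This is a sketch only: the variable names are assumptions, and the eight values are the ones listed in step 2.

```python
# Dependent-variable values read from the histogram (step 2 above).
values = [39, 45, 47, 51, 51, 54, 54, 56]
mean = sum(values) / len(values)

total = 0.0
for v in values:
    deviation = v - mean       # value minus mean
    squared = deviation ** 2   # squared deviation
    total += squared
    print(f"{v:>3} {deviation:>8.3f} {squared:>8.2f}")

# Population variance: sum of squared deviations divided by the count.
variance = total / len(values)
print(f"sum of squared deviations = {total:.2f}")
print(f"variance = {variance:.2f}")
```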

Interpreting Frequency Polygons 

Based on the data in the frequency polygon, which year had the greatest variance in number of shoe brands at various prices, and which had the least variance?

Each of the three data sets contains 6 values, and the mean of each set is:

  • 2008: Sum: \begin{align*}19+25+21+25+9+6=105\end{align*} Mean: \begin{align*}\frac{105}{6}=17.5\end{align*}
  • 2007: Sum: \begin{align*}16+19+17+19+7+3=81\end{align*} Mean: \begin{align*}\frac{81}{6}=13.5\end{align*}
  • 2006: Sum: \begin{align*}14+17+16+15+6+3=71\end{align*} Mean: \begin{align*}\frac{71}{6}=11.83\end{align*}

The sum of the squared deviances for each year is:

  • 2008: \begin{align*}(19-17.5)^2+(25-17.5)^2+(21-17.5)^2+(25-17.5)^2+(9-17.5)^2+(6-17.5)^2=331.5\end{align*}
  • 2007: \begin{align*}(16-13.5)^2+(19-13.5)^2+(17-13.5)^2+(19-13.5)^2+(7-13.5)^2+(3-13.5)^2=231.5\end{align*}
  • 2006: \begin{align*}(14-11.83)^2+(17-11.83)^2+(16-11.83)^2+(15-11.83)^2+(6-11.83)^2+(3-11.83)^2=170.833\end{align*}

The variance of each set is:

  • 2008: \begin{align*}\frac{331.5}{6}=55.25\end{align*}
  • 2007: \begin{align*} \frac{231.5}{6}=38.583\end{align*}
  • 2006: \begin{align*}\frac{170.833}{6}=28.472 \end{align*}

\begin{align*}\therefore\end{align*} 2008 has the greatest variance and 2006 has the least variance.
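
The comparison can also be sketched with Python's standard `statistics` module, whose `pvariance` function divides by the number of values, as the lesson does. The dictionary name `brands_by_year` is an assumption for illustration; the values are the ones listed in the sums above.

```python
import statistics

# Values read from the frequency polygon, as listed in the sums above.
brands_by_year = {
    2008: [19, 25, 21, 25, 9, 6],
    2007: [16, 19, 17, 19, 7, 3],
    2006: [14, 17, 16, 15, 6, 3],
}

# Population variance of each year's six values.
variances = {
    year: statistics.pvariance(counts)
    for year, counts in brands_by_year.items()
}

greatest = max(variances, key=variances.get)
least = min(variances, key=variances.get)
print(greatest, least)  # prints: 2008 2006
```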

Earlier Problem Revisited

Could you find the variance of a data set presented as a histogram?

After your practice above, this should no longer be a problem!

Examples 

The number of cars of various colors in a parking lot with 5 levels is summarized in the table below. Use the data to answer questions 1-4.

Level    Red    Yellow    Blue    White
1        11     4         9       14
2        9      3         8       11
3        13     5         10      12
4        14     4         7       9
5        12     6         13      7

 

Example 1

What is the variance of red cars among the 5 levels?

The population of red cars across the 5 levels is: 11, 9, 13, 14, and 12.

  • Add the values and divide by five to get the mean of 11.8.
  • Square each of the values and sum the squares: \begin{align*}11^2+9^2+13^2+14^2+12^2=711\end{align*}
  • Divide the sum of the squares by the number of values in the set (since this is the whole population of red cars) to get \begin{align*}\frac{711}{5}=142.2\end{align*}, then subtract the squared mean \begin{align*}(11.8^2=139.24)\end{align*}.
  • The variance of the population of red cars is \begin{align*}142.2-139.24=2.96\end{align*}
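
Example 1 uses a shortcut (the mean of the squares minus the square of the mean) instead of summing squared deviations. Both routes give the same population variance. A brief derivation follows, writing \begin{align*}N\end{align*} for the number of values and \begin{align*}x_i\end{align*} for each value; this notation is assumed here and is not taken from the lesson.

\begin{align*}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2 \\
&= \frac{1}{N}\sum_{i=1}^{N}x_i^2 - 2\mu\cdot\frac{1}{N}\sum_{i=1}^{N}x_i + \mu^2 \\
&= \frac{1}{N}\sum_{i=1}^{N}x_i^2 - 2\mu^2 + \mu^2 \\
&= \frac{1}{N}\sum_{i=1}^{N}x_i^2 - \mu^2
\end{align*}

The third line uses the definition of the mean, \begin{align*}\mu=\frac{1}{N}\sum_{i=1}^{N}x_i\end{align*}. Because the shortcut subtracts two numbers that are often close together, rounding the mean too early can noticeably change the result.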

Example 2

What is the variance of the total number of cars of each color on the levels above level 3?

The levels above level 3 are levels 4 and 5. On those two levels, the total number of red, yellow, blue, and white cars is 26, 10, 20, and 16, respectively.

  • The mean number of cars of each color is \begin{align*}\frac{26+10+20+16}{4}=18\end{align*}
  • Square the values and find the sum: \begin{align*}26^2+10^2+20^2+16^2=1432\end{align*}
  • Divide the sum of the squares by the number of values: \begin{align*}\frac{1432}{4}=358\end{align*}. Subtract the squared mean \begin{align*}(18^2=324)\end{align*} to get the variance: \begin{align*}358-324=34\end{align*}

Example 3

What is the variance of blue cars across the 5 levels?

The blue car counts are: 9, 8, 10, 7, and 13

  • The mean number of blue cars is \begin{align*}\frac{47}{5}=9.4\end{align*}
  • The sum of the squared values is \begin{align*}9^2+8^2+10^2+7^2 +13^2=463\end{align*}. Dividing by the number of levels (5) gives 92.6.
  • Subtract the squared mean \begin{align*}(9.4^2=88.36)\end{align*} to get the variance
  • The variance is \begin{align*}92.6-88.36=4.24\end{align*}

Example 4

If we take a sample of levels by rolling a die and end up with levels 1, 3, and 5, what is the variance of white cars in the sample?

The number of white cars on levels 1, 3, and 5 is 14, 12, and 7.

  • The mean number of white cars in this sample is \begin{align*}\frac{33}{3} = 11\end{align*}
  • Since this is a sample, work from the individual deviations: subtract the mean from each value, square each result, and then find the sum: \begin{align*}(14-11)^2+(12-11)^2+(7-11)^2=26\end{align*}
  • Divide the sum of the squared deviations by the number of values minus 1 (remember, this is a sample!): \begin{align*}\frac{26}{2}=13\end{align*}
  • The sample variance is 13.
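
The difference between the sample and population calculations can be seen side by side with Python's standard `statistics` module: `variance` divides by the number of values minus 1, while `pvariance` divides by the number of values. This is a small sketch, and the variable names are assumptions for illustration.

```python
import statistics

# White cars on the sampled levels 1, 3, and 5.
white_cars_sample = [14, 12, 7]

sample_var = statistics.variance(white_cars_sample)       # divides by n - 1
population_var = statistics.pvariance(white_cars_sample)  # divides by n

print(f"sample variance     = {sample_var:.2f}")
print(f"population variance = {population_var:.2f}")
```

The sample variance is the larger of the two because the same sum of squared deviations, 26, is divided by 2 rather than 3.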

Review 

Find the variance:

1. 365, 400.7, 303, 479, 514.2, 500, 489

2. 7200, 7020, 7165.9, 7000, 7796, 7012, 7016.1

3. 17, 10.3, 30.7, 70, 66, 76, 40, 53

4. 3607, 3600, 3600, 3631, 3600.6

5. 700, 700, 712, 756, 741, 716, 782

6. 3370, 3300.5, 3366, 3306.6, 3310, 3336, 3301.3

Calculate the sample variance:

7. 34.4, 34, 34.7, 34.6, 34, 34.1, 31, 31.3

8. 989.22, 990.6, 992, 996.9, 981.1, 986, 975

9. 10, 16, 10.33, 10.63, 18, 17, 16.36, 10.46

10. 3240, 3260, 3250, 3280, 3280, 3300, 3310, 3270

Review (Answers)

To view the Review answers, open this PDF file and look for section 5.7. 


Vocabulary

disseminated data

Disseminated data is data that has been given out to others.

normal distribution curve

A normal distribution curve is a symmetrical curve that shows the highest frequency in the center with an identical curve on either side of the center.

tabular data

Tabular data is data presented in the form of a table, or, depending on the use, it may refer to data points separated by tabs.

