# Variance of a Data Set

## The mean of the squares of the deviation of data values


Recently, we discussed the process of finding the mean of a discrete random variable. The process resembled that of finding the arithmetic mean of a set of basic numbers, yet had some significant differences as well. Suppose you needed to know the variance or standard deviation of a random variable. Would these values be calculated differently for random variables than for standard numerical data sets, or not?

### Random Variable Variance

As we discussed earlier, knowing the average, or mean, of a data set is sometimes not enough to get a feel for the set's overall trend. The same is true of a random variable: sometimes you need to know how spread out its values are to understand its overall behavior.

One of the additional measures we learned to calculate for data sets was the variance, which is the square of the standard deviation. Both measures describe how tightly values cluster around the mean. By evaluating the variance and standard deviation of a random variable, we get a better idea of the spread of its values than the mean alone provides.

Just as with the mean, or expected value, we have a formula to apply in order to calculate the variance:

\begin{align*}{\sigma^2}_X =\sum(x_i-\mu _x )^2 p_i\end{align*}

Then, to find the standard deviation, just take the square root of the variance:

\begin{align*}\sigma_X=\sqrt{{\sigma^2}_X}\end{align*}
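The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function names `discrete_variance` and `discrete_std` are made up for this sketch, and the fair six-sided die is a hypothetical distribution chosen because its variance, 35/12, is easy to check by hand.

```python
import math

def discrete_variance(values, probs):
    """Variance of a discrete random variable: sum of (x_i - mu)^2 * p_i."""
    mu = sum(x * p for x, p in zip(values, probs))  # expected value
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

def discrete_std(values, probs):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(discrete_variance(values, probs))

# Hypothetical example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

print(round(discrete_variance(faces, probs), 4))  # 2.9167
print(round(discrete_std(faces, probs), 4))       # 1.7078
```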

#### Calculating the Variance and Standard Deviation

1. In another lesson, we calculated the expected value of the number of kids that Sally babysits on any given day from the data in the table below. Using the table and the mean, calculate the variance and standard deviation of the number of kids she babysits.

| \begin{align*}x\end{align*} | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.35 | 0.4 | 0.15 | 0.05 | 0.05 |

\begin{align*}\mu _X=2\end{align*}

Use the data given in the question to fill in the formula and find the variance:

\begin{align*}\text{Formula:} \ {\sigma^2}_X =\sum(x_i-\mu _x )^2 p_i\end{align*}

\begin{align*}{\sigma^2}_X &=(1-2)^2 \times .35+(2-2)^2 \times .4+(3-2)^2 \times .15+(4-2)^2 \times .05+(5-2)^2 \times .05 \\ {\sigma^2}_X &=(1 \times .35)+(0 \times .4)+(1 \times .15)+(4 \times .05)+(9 \times .05)\\ {\sigma^2}_X &=.35+0+.15+.2+.45 \\ {\sigma^2}_X &=1.15 \end{align*}

Since the variance is 1.15, the standard deviation is \begin{align*}\sqrt{1.15} \approx 1.07\end{align*}

\begin{align*}\sigma_X =1.07\end{align*}
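As a quick check on this problem, the sketch below recomputes the variance in Python. One thing it makes visible: taken directly from the table, the mean works out to 2.05, while the worked solution uses the rounded value 2. The helper name `variance_about` is an assumption of this sketch. To two decimal places the standard deviation comes out the same either way.

```python
import math

kids = [1, 2, 3, 4, 5]
probs = [0.35, 0.4, 0.15, 0.05, 0.05]

def variance_about(center, values, probs):
    """Sum of (x_i - center)^2 * p_i for a chosen center."""
    return sum((x - center) ** 2 * p for x, p in zip(values, probs))

exact_mean = sum(x * p for x, p in zip(kids, probs))
var_rounded_mean = variance_about(2, kids, probs)         # mean rounded to 2
var_exact_mean = variance_about(exact_mean, kids, probs)  # mean left unrounded

print(round(exact_mean, 2))                                                # 2.05
print(round(var_rounded_mean, 4), round(math.sqrt(var_rounded_mean), 2))  # 1.15 1.07
print(round(var_exact_mean, 4), round(math.sqrt(var_exact_mean), 2))      # 1.1475 1.07
```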

2. Random variable \begin{align*}X\end{align*} has mean 18.84 and the probability distribution shown below. Calculate the variance and standard deviation.

| \begin{align*}x\end{align*} | 3 | 11 | 19 | 27 |
| --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.07 | 0.08 | 0.65 | 0.2 |

Using the variance formula: \begin{align*}{\sigma^2}_X =\sum(x_i-\mu _x )^2 p_i\end{align*}

\begin{align*}&{\sigma^2}_X =(3-18.84)^2 \times .07+(11-18.84)^2 \times .08+(19-18.84)^2 \times .65+(27-18.84)^2 \times .2 \\ &{\sigma^2}_X =(-15.84)^2 \times .07+(-7.84)^2 \times .08+(.16)^2 \times .65+(8.16)^2 \times .2\\ &{\sigma^2}_X =251 \times .07+61.5 \times .08+.03 \times .65+66.59 \times .2\\ &{\sigma^2}_X =17.6+4.9+.02+13.3\\ &{\sigma^2}_X =35.8 \end{align*}

Since the variance is 35.8, the standard deviation is \begin{align*}\sqrt{35.8} \approx 5.98\end{align*}

\begin{align*}\sigma_X =5.98\end{align*}
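To see how much each value contributes to the variance, the terms of the sum can be listed one by one. The short Python sketch below does this for the distribution in this problem, using the given mean of 18.84; the variable names are illustrative only.

```python
import math

values = [3, 11, 19, 27]
probs = [0.07, 0.08, 0.65, 0.2]
mu = 18.84  # mean given in the problem

# One term of the sum per outcome: (x - mu)^2 * p
contributions = [(x - mu) ** 2 * p for x, p in zip(values, probs)]
for x, p, c in zip(values, probs, contributions):
    print(f"x={x:>2}  p={p:.2f}  (x-mu)^2*p={c:.4f}")

variance = sum(contributions)
print(f"variance={variance:.4f}  sd={math.sqrt(variance):.4f}")
# last line printed: variance=35.8144  sd=5.9845
```

The outcome farthest from the mean, 3, contributes the most to the variance even though its probability is only 0.07.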

3. The random variable \begin{align*}Z\end{align*} has the probability distribution shown below. Find \begin{align*}\mu _Z\end{align*}, \begin{align*}{\sigma^2} _Z\end{align*}, and \begin{align*}\sigma _Z\end{align*}.

| \begin{align*}z\end{align*} | 0.65 | 0.84 | 1.03 | 1.22 | 1.41 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(Z=z)\end{align*} | 0.16 | 0.29 | 0.14 | 0.28 | 0.13 |

Start by finding the mean of \begin{align*}Z\end{align*}:

\begin{align*}\mu_Z=(.65 \times .16)+(.84 \times .29)+(1.03 \times .14)+(1.22 \times .28)+(1.41 \times .13) = 1.02\end{align*}

Now that we have the mean, we can use it to find the variance:

\begin{align*}{\sigma^2}_Z &=(.65-1.02)^2 \times .16+(.84-1.02)^2 \times .29+(1.03-1.02)^2 \times .14+(1.22-1.02)^2 \times .28+(1.41-1.02)^2 \times .13\\ {\sigma^2}_Z &=(-.37)^2 \times .16+(-.18)^2 \times .29+(.01)^2 \times .14+(.2)^2 \times .28+(.39)^2 \times .13\\ {\sigma^2}_Z &=.1369 \times .16+.0324 \times .29+.0001 \times .14+.04 \times .28+.1521 \times .13\\ {\sigma^2}_Z &\approx .0219+.0094+.0000+.0112+.0198\\ {\sigma^2}_Z &\approx .0623 \approx .06 \end{align*}

Finally, the standard deviation is just the square root of the variance. Because the values here are so small, use the variance before rounding it to two places:

\begin{align*}\sigma_Z=\sqrt{.0623} \approx .25\end{align*}
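When the mean is not given, all three quantities can be computed in one pass. The Python sketch below does so for the distribution of \begin{align*}Z\end{align*}; the function name `summarize` is an assumption of this sketch. It also shows why it is safer to round only at the end: with numbers this small, rounding the variance to two places before taking the square root changes the last digit of the standard deviation.

```python
import math

def summarize(values, probs):
    """Return (mean, variance, standard deviation) of a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, var, math.sqrt(var)

z_values = [0.65, 0.84, 1.03, 1.22, 1.41]
z_probs = [0.16, 0.29, 0.14, 0.28, 0.13]

mu, var, sd = summarize(z_values, z_probs)
print(round(mu, 2), round(var, 4), round(sd, 2))  # 1.02 0.0623 0.25

# Rounding the variance to two places first shifts the last digit of the result.
print(round(math.sqrt(round(var, 2)), 2))  # 0.24
```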

#### Earlier Problem Revisited

Suppose you needed to know the variance or standard deviation of a random variable. Would these values be calculated differently for random variables than for standard numerical data sets or not?

The variance and standard deviation are the same concepts when dealing with a random variable as with a numerical data set. However, the process of calculating the values is slightly different. Instead of adding up the squared differences between each number and the mean and dividing by the count of values: \begin{align*}\frac{\sum(x-\mu)^2}{n}\end{align*} you multiply the squared difference between each value and the mean by the probability of that value, and then add up the results: \begin{align*}\sum(x-\mu)^2 \times P(x)\end{align*}.

In either case, the standard deviation is the square root of the variance.
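The link between the two calculations can be checked numerically: if every value in a data set is treated as an outcome with probability \begin{align*}\frac{1}{n}\end{align*}, the random-variable formula gives the same result as the data-set formula. The Python sketch below uses a small hypothetical data set and compares both against `statistics.pvariance` from the standard library, which computes the population variance.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
n = len(data)
mu = sum(data) / n

# Data-set formula: sum of squared deviations divided by the count.
var_data = sum((x - mu) ** 2 for x in data) / n

# Random-variable formula: each value weighted by probability 1/n.
var_rv = sum((x - mu) ** 2 * (1 / n) for x in data)

print(var_data)                                             # 4.0
print(math.isclose(var_data, var_rv))                       # True
print(math.isclose(var_rv, statistics.pvariance(data)))     # True
```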

### Examples

#### Example 1

Calculate the variance and standard deviation of random variable \begin{align*}Y\end{align*}, given: \begin{align*}\mu_Y=43.2\end{align*} and:

| \begin{align*}x\end{align*} | 15 | 30 | 45 | 60 | 75 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(Y=x)\end{align*} | 0.2 | 0.25 | 0.15 | 0.27 | 0.13 |

All of the values we need for this one are given, so it is really just a “plug-n-chug” using the variance formula: \begin{align*}{\sigma^2}_X =\sum(x_i-\mu_x )^2 p_i\end{align*}

\begin{align*}{\sigma^2}_Y &=(15-43.2)^2 \times .2+(30-43.2)^2 \times .25+(45-43.2)^2 \times .15+(60-43.2)^2 \times .27+(75-43.2)^2 \times .13\\ {\sigma^2}_Y &=(-28.2)^2 \times .2+(-13.2)^2 \times .25+(1.8)^2 \times .15+(16.8)^2 \times .27+(31.8)^2 \times .13\\ {\sigma^2}_Y &\approx 159+43.6+.5+76.2+131.5\\ {\sigma^2}_Y&=410.8\\ \sigma_Y&=\sqrt{410.8} \approx 20.27 \end{align*}

#### Example 2

Find \begin{align*}\mu_X\end{align*}, \begin{align*}\sigma _X\end{align*}, and \begin{align*}{\sigma^2}_X\end{align*} given:

| \begin{align*}x\end{align*} | 4 | 8 | 12 | 16 | 20 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.5 | 0.25 | 0.15 | 0.05 | 0.05 |

First, calculate the mean:

\begin{align*}& \mu_X =(4 \times .5)+(8 \times .25)+(12 \times .15)+(16 \times .05)+(20 \times .05) = 7.6 \end{align*}

Then, use the mean to calculate the variance:

\begin{align*}& {\sigma^2}_X =(4-7.6)^2 \times .5+(8-7.6)^2 \times .25+(12-7.6)^2 \times .15+(16-7.6)^2 \times .05+(20-7.6)^2 \times .05\\ & {\sigma^2}_X =6.48+.04+2.904+3.528+7.688\\ & {\sigma^2}_X =20.64\\ & \sigma_X =\sqrt{20.64} \approx 4.54 \end{align*}
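Before plugging a table into the formulas, it can be worth confirming that it really is a probability distribution: no negative probabilities, and a total of 1. The Python sketch below adds that check and then reproduces the numbers from this example. The function names `check_distribution` and `mean_var_sd` and the tolerance value are assumptions made for the sketch.

```python
import math

def check_distribution(probs, tol=1e-9):
    """Raise ValueError unless probs are non-negative and sum to 1."""
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, abs_tol=tol):
        raise ValueError("probabilities must sum to 1")

def mean_var_sd(values, probs):
    """Validate the table, then return (mean, variance, standard deviation)."""
    check_distribution(probs)
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, var, math.sqrt(var)

x = [4, 8, 12, 16, 20]
p = [0.5, 0.25, 0.15, 0.05, 0.05]

mu, var, sd = mean_var_sd(x, p)
print(round(mu, 2), round(var, 2), round(sd, 2))  # 7.6 20.64 4.54
```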

#### Example 3

Marie has a part-time job walking dogs to earn money on weekends. The following probability distribution represents the probability of having a particular number of clients on any given day. If she earns $2.75 per client, how much could she expect to earn each day, on average, and what is the standard deviation of her daily earnings?

| # clients | 20 | 25 | 30 | 35 | 40 |
| --- | --- | --- | --- | --- | --- |
| probability | 0.15 | 0.35 | 0.3 | 0.15 | 0.05 |

Start by finding the mean:

\begin{align*}\mu_X = 20 \times .15+25 \times .35+30 \times .3+35 \times .15+40 \times .05 = 28\end{align*}

Use the mean to find the variance:

\begin{align*}{\sigma^2}_X =(20-28)^2 \times .15+(25-28)^2 \times .35+(30-28)^2 \times .30+(35-28)^2 \times .15+(40-28)^2 \times .05=28.5\end{align*}

Use the variance to find the standard deviation:

\begin{align*}\sigma_X=\sqrt{28.5} \approx 5.3\end{align*}

Now we can find her average income by multiplying the mean, 28, by Marie’s rate, $2.75, to get her average daily income of $77.

Finally, we can multiply the calculated standard deviation, 5.3, by the rate, $2.75, to get the standard deviation of her income:

\begin{align*}5.3 \times 2.75 \approx 14.58\end{align*}

What all this means is that Marie can expect to earn about $77 per day on average, give or take about $14.50.
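Multiplying the standard deviation by the rate works because scaling every value of a random variable by a constant scales its standard deviation by the same constant. The Python sketch below checks this by building the income distribution directly (each client count multiplied by $2.75) and comparing. The helper name `mean_and_sd` is an assumption of this sketch. Carrying the unrounded standard deviation through gives about $14.68 rather than $14.58; the difference comes only from rounding 5.34 down to 5.3 before multiplying.

```python
import math

clients = [20, 25, 30, 35, 40]
probs = [0.15, 0.35, 0.3, 0.15, 0.05]
rate = 2.75  # dollars earned per client

def mean_and_sd(values, probs):
    """Return (mean, standard deviation) of a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, math.sqrt(var)

mu_clients, sd_clients = mean_and_sd(clients, probs)

# Income distribution: same probabilities, each value scaled by the rate.
income = [rate * c for c in clients]
mu_income, sd_income = mean_and_sd(income, probs)

print(round(mu_income, 2), round(sd_income, 2))    # 77.0 14.68
print(math.isclose(sd_income, rate * sd_clients))  # True
```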

### Review

For questions 1 – 9, find the variance and standard deviation of the random variable, given the mean and probability distribution.

1. \begin{align*}\mu_x=4.435\end{align*}

| \begin{align*}x\end{align*} | 4.1 | 4.4 | 4.7 | 4.9 | 5.1 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.3 | 0.45 | 0.1 | 0.05 | 0.1 |

2. \begin{align*}\mu_x=7.6\end{align*}

| \begin{align*}x\end{align*} | 4 | 8 | 12 | 16 | 20 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.5 | 0.25 | 0.15 | 0.05 | 0.05 |

3. \begin{align*}\mu_x=43.2\end{align*}

| \begin{align*}x\end{align*} | 15 | 30 | 45 | 60 | 75 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.2 | 0.25 | 0.15 | 0.27 | 0.13 |

4. \begin{align*}\mu_X=93\end{align*}

| \begin{align*}x\end{align*} | 30 | 60 | 90 | 120 | 150 | 170 |
| --- | --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.18 | 0.16 | 0.24 | 0.22 | 0.2 | 0 |

5. \begin{align*}\mu_X=12.92\end{align*}

| \begin{align*}x\end{align*} | 5 | 9 | 13 | 17 |
| --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.07 | 0.08 | 0.65 | 0.2 |

6. \begin{align*}\mu_X=21.80\end{align*}

| \begin{align*}x\end{align*} | 13 | 17 | 21 | 25 | 29 | 33 | 37 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.15 | 0.17 | 0.23 | 0.3 | 0.1 | 0.03 | 0.02 |

7. \begin{align*}\mu_X= 57.98\end{align*}

| \begin{align*}x\end{align*} | 26 | 39 | 52 | 65 | 78 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 6% | 14% | 30% | 28% | 22% |

8. \begin{align*}\mu_X=64.99\end{align*}

| \begin{align*}x\end{align*} | 22 | 43 | 64 | 85 | 106 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 10.5% | 22.5% | 31.5% | 22.8% | 12.7% |

9. \begin{align*}\mu_X=7.46\end{align*}

| \begin{align*}x\end{align*} | 3.65 | 5.84 | 7.03 | 9.22 | 11.41 |
| --- | --- | --- | --- | --- | --- |
| \begin{align*}P(X=x)\end{align*} | 0.16 | 0.25 | 0.18 | 0.24 | 0.17 |

10. Dorian works for a construction company, where he earns $11.50 per hour. The number of hours he works each week varies between 25 and 40. Based on prior experience, Dorian has compiled the probability distribution below describing the probability that he will work a given number of hours. Can Dorian afford to buy a new truck that has a payment of $525/month, if he wants to be sure not to put more than 25% of his average monthly income into car payments? What is the standard deviation of his monthly income?

| # hours | 25 | 28 | 31 | 34 | 37 | 40 |
| --- | --- | --- | --- | --- | --- | --- |
| probability | 0.15 | 0.14 | 0.26 | 0.18 | 0.14 | 0.13 |


### Vocabulary

| Term | Definition |
| --- | --- |
| absolute deviation | The absolute deviation is the sum total of how different each number is from the mean. |
| deviation | Deviation is a measure of the difference between a given value and the mean. |
| mean | The mean of a data set is the average of the data set. The mean is found by calculating the sum of the values in the data set and then dividing by the number of values in the data set. |
| mean absolute deviation | The mean absolute deviation is an alternate measure of how spread out the data is. It involves finding the mean of the distance between each data value and the mean. While this method might seem more intuitive, in statistics it has been found to be too limited and is not commonly used. |
| population | In statistics, the population is the entire group of interest from which the sample is drawn. |
| sample | A sample is a specified part of a population, intended to represent the population as a whole. |
| skew | To skew a given set means to cause the trend of data to favor one end or the other. |
| standard deviation | The square root of the variance is the standard deviation. Standard deviation is one way to measure the spread of a set of data. |
| variance | A measure of the spread of the data set equal to the mean of the squared variations of each data value from the mean of the data set. |