
# Variance of a Data Set

## The mean of the squares of the deviation of data values

Random Variable Variance

#### Objective

Here you will learn to calculate the variance and standard deviation of discrete random variables.

#### Concept

Recently, we discussed the process of finding the mean of a discrete random variable. The process resembled that of finding the arithmetic mean of a set of basic numbers, yet had some significant differences as well. Suppose you needed to know the variance or standard deviation of a random variable. Would these values be calculated differently for random variables than for standard numerical data sets, or not?

#### Watch This

http://youtu.be/7h0TfaYVCv0 statslectures – Variance and Standard Deviation of Discrete Random Variables

#### Guidance

As we discussed some time ago, knowing the average, or mean, of a data set is sometimes not enough to get a feel for the trend(s) of the set. The same is true of a random variable: sometimes you need to know about the spread of the variable to get a better idea of its overall behavior.

One of the other measures we learned to calculate for data sets was the variance, which is the square of the standard deviation. Both measures describe how tightly values cluster around the mean. By evaluating the variance and standard deviation of a random variable, we can get a better idea of the spread of its values than from the mean alone.

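Just as with the mean, or expected value, we have a formula to apply in order to calculate the variance:

${\sigma^2}_X =\sum(x_i-\mu _X )^2 p_i$

Here $x_i$ is each possible value of $X$, $p_i$ is the probability of that value, and $\mu_X$ is the mean.

Then, to find the standard deviation, just take the square root of the variance: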

$\sigma_X=\sqrt{{\sigma^2}_X}$
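As a minimal sketch of these definitions, the variance sums each squared deviation from the mean, weighted by its probability, and the standard deviation is the square root of that sum. The function names and the fair-die distribution below are illustrative assumptions, not part of the lesson.

```python
import math

def mean(values, probs):
    """Expected value: sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))

def variance(values, probs):
    """Variance: sum of (x_i - mu)^2 * p_i."""
    mu = mean(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

def std_dev(values, probs):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(values, probs))

# Hypothetical example (not from the lesson): a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

print(round(mean(faces, probs), 4))      # 3.5
print(round(variance(faces, probs), 4))  # 2.9167
print(round(std_dev(faces, probs), 4))   # 1.7078
```

Keeping the three steps as separate functions mirrors the order of the hand calculation: mean first, then variance, then the square root.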

Example A

In another lesson, we calculated the expected value of the number of kids that Sally babysits on any given day from the data in the table below. Using the table and the mean, calculate the variance and standard deviation of the number of kids she babysits.

| $x$ | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| $P(X=x)$ | 0.35 | 0.4 | 0.15 | 0.05 | 0.05 |

$\mu _X=2$

Solution :

Use the data given in the question to fill in the formula and find the variance:

$\text{Formula:} \ {\sigma^2}_X =\sum(x_i-\mu _X )^2 p_i$

$\begin{aligned}{\sigma^2}_X &=(1-2)^2 \times .35+(2-2)^2 \times .4+(3-2)^2 \times .15+(4-2)^2 \times .05+(5-2)^2 \times .05 \\ &=(1 \times .35)+(0 \times .4)+(1 \times .15)+(4 \times .05)+(9 \times .05)\\ &=.35+0+.15+.2+.45 \\ {\sigma^2}_X &=1.15\end{aligned}$

Since the variance is 1.15, the standard deviation is $\sqrt{1.15} =1.07$

$\sigma_X =1.07$
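The arithmetic in Example A can be checked row by row. The sketch below uses the table and mean from the example; the variable names are illustrative assumptions.

```python
import math

# Sally's table from Example A: number of kids and their probabilities.
kids = [1, 2, 3, 4, 5]
probs = [0.35, 0.40, 0.15, 0.05, 0.05]
mu = 2  # mean given in the example

# One term (x - mu)^2 * p per column of the table.
terms = [(x - mu) ** 2 * p for x, p in zip(kids, probs)]
for x, p, t in zip(kids, probs, terms):
    print(f"x={x}  p={p:.2f}  (x-mu)^2*p={t:.2f}")

var_kids = sum(terms)
sd_kids = math.sqrt(var_kids)
print(f"variance = {var_kids:.2f}")            # variance = 1.15
print(f"standard deviation = {sd_kids:.2f}")   # standard deviation = 1.07
```

Printing each term makes it easy to see that the values farthest from the mean (4 and 5 kids) contribute more than half of the variance even though they are the least likely.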

Example B

Random variable $X$ has mean 18.84 and the probability distribution shown below. Calculate the variance and standard deviation.

| $x$ | 3 | 11 | 19 | 27 |
|---|---|---|---|---|
| $P(X=x)$ | 0.07 | 0.08 | 0.65 | 0.2 |

Solution:

Using the variance formula: ${\sigma^2}_X =\sum(x_i-\mu _x )^2 p_i$

$\begin{aligned}{\sigma^2}_X &=(3-18.84)^2 \times .07+(11-18.84)^2 \times .08+(19-18.84)^2 \times .65+(27-18.84)^2 \times .2 \\ &=(-15.84)^2 \times .07+(-7.84)^2 \times .08+(.16)^2 \times .65+(8.16)^2 \times .2\\ &\approx 251 \times .07+61.5 \times .08+.03 \times .65+66.59 \times .2\\ &\approx 17.6+4.9+.02+13.3\\ {\sigma^2}_X &\approx 35.8\end{aligned}$

Since the variance is 35.8, the standard deviation is $\sqrt{35.8} \approx 5.98$

$\sigma_X =5.98$
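The worked solution rounds the squared deviations before multiplying. As a rough check that this rounding does not change the answer at one decimal place, the sketch below compares the unrounded variance with the one built from the rounded squares shown in the solution. The variable names are assumptions for illustration.

```python
import math

# Distribution and mean from Example B.
values = [3, 11, 19, 27]
probs = [0.07, 0.08, 0.65, 0.20]
mu = 18.84

# Variance with no intermediate rounding.
exact_var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))

# Squared deviations rounded as they appear in the worked solution.
rounded_squares = [251, 61.5, 0.03, 66.59]
rounded_var = sum(s * p for s, p in zip(rounded_squares, probs))

print(round(exact_var, 4))             # 35.8144
print(round(rounded_var, 4))           # 35.8275
print(round(math.sqrt(exact_var), 2))  # 5.98
```

Both versions round to 35.8, so the hand calculation is safe here, although carrying more digits is the more reliable habit when the values are larger.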

Example C

The random variable $Z$ has the probability distribution shown below. Find $\mu _Z$, ${\sigma^2} _Z$, and $\sigma _Z$.

| $z$ | 0.65 | 0.84 | 1.03 | 1.22 | 1.41 |
|---|---|---|---|---|---|
| $P(Z=z)$ | 0.16 | 0.29 | 0.14 | 0.28 | 0.13 |

Solution: Start by finding the mean of $Z$ :

$\mu_Z=(.65 \times .16)+(.84 \times .29)+(1.03 \times .14)+(1.22 \times .28)+(1.41 \times .13) = 1.02$

Now that we have the mean, we can use it to find the variance:

$\begin{aligned}{\sigma^2}_Z &=(.65-1.02)^2 \times .16+(.84-1.02)^2 \times .29+(1.03-1.02)^2 \times .14+(1.22-1.02)^2 \times .28+(1.41-1.02)^2 \times .13\\ &=(-.37)^2 \times .16+(-.18)^2 \times .29+(.01)^2 \times .14+(.2)^2 \times .28+(.39)^2 \times .13\\ &=.1369 \times .16+.0324 \times .29+.0001 \times .14+.04 \times .28+.1521 \times .13\\ &\approx .0219+.0094+.0000+.0112+.0198\\ {\sigma^2}_Z &\approx .0623 \approx .06\end{aligned}$

Finally, the standard deviation is just the square root of the variance:

$\sigma_Z=\sqrt{.0623} \approx .25$
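Example C starts from the table alone, so the whole chain (mean, then variance, then standard deviation) can be sketched in a few lines. The check that the probabilities sum to 1 is an added safeguard, not a step from the lesson, and the variable names are assumptions.

```python
import math

# Distribution from Example C.
z_values = [0.65, 0.84, 1.03, 1.22, 1.41]
z_probs = [0.16, 0.29, 0.14, 0.28, 0.13]

# A valid probability distribution must sum to 1.
assert math.isclose(sum(z_probs), 1.0)

mu_z = sum(z * p for z, p in zip(z_values, z_probs))
var_z = sum((z - mu_z) ** 2 * p for z, p in zip(z_values, z_probs))
sd_z = math.sqrt(var_z)

print(f"mean = {mu_z:.2f}")      # mean = 1.02
print(f"variance = {var_z:.2f}") # variance = 0.06
print(f"sd = {sd_z:.2f}")        # sd = 0.25
```

This version uses the unrounded mean in the variance step, which is why it can differ from a hand calculation in the later decimal places while agreeing at two.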

##### Concept Problem Revisited

Suppose you needed to know the variance or standard deviation of a random variable. Would these values be calculated differently for random variables than for standard numerical data sets or not?

The variance and standard deviation are the same concepts for random variables as for numerical data sets. However, the process of calculating the values is slightly different. For a data set, you divide the squared difference between each number and the mean by the count of values, $\frac{(x-\mu)^2}{n}$, and add the results. For a random variable, you instead multiply the squared difference between each value and the mean by the probability of that value, $(x-\mu)^2 \times P(x)$, and add the results.

In either case, the standard deviation is the square root of the variance.
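One way to see that the two calculations agree is to treat a data set as a random variable in which each of the $n$ values has probability $\frac{1}{n}$. The sketch below does this for a small made-up data set (an assumption for illustration, not data from the lesson) and compares the result with the population variance from Python's `statistics` module.

```python
import math
import statistics

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mu = sum(data) / n

# Data-set approach: divide each squared deviation by the count.
data_var = sum((x - mu) ** 2 / n for x in data)

# Random-variable approach: weight each squared deviation by probability 1/n.
rv_var = sum((x - mu) ** 2 * (1 / n) for x in data)

print(data_var)  # 4.0
print(rv_var)    # 4.0
print(math.isclose(rv_var, statistics.pvariance(data)))  # True
```

Dividing by $n$ and multiplying by $\frac{1}{n}$ are the same operation, so the random-variable formula reduces to the data-set formula whenever every value is equally likely.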

#### Vocabulary

The variance of a random variable is a measure of how closely the values of the random variable tend to cluster around the mean, or expected value of the variable.

The mean or expected value of a random variable is the value that is expected to be the average of the outputs of the variable, over many, many trials.

#### Guided Practice

1. Calculate the variance and standard deviation of random variable $Y$, given $\mu_Y=43.2$ and:

| $y$ | 15 | 30 | 45 | 60 | 75 |
|---|---|---|---|---|---|
| $P(Y=y)$ | 0.2 | 0.25 | 0.15 | 0.27 | 0.13 |

2. Find $\mu_X$, $\sigma _X$, and ${\sigma^2}_X$ given:

| $x$ | 4 | 8 | 12 | 16 | 20 |
|---|---|---|---|---|---|
| $P(X=x)$ | 0.5 | 0.25 | 0.15 | 0.05 | 0.05 |

3. Marie has a part-time job walking dogs to earn money on weekends. The following probability distribution represents the probability of having a particular number of clients on any given day. If she earns $2.75 per client, how much could she expect to earn each day, on average, and what is the standard deviation of her daily earnings?

| # clients | 20 | 25 | 30 | 35 | 40 |
|---|---|---|---|---|---|
| probability | 0.15 | 0.35 | 0.3 | 0.15 | 0.05 |

Solutions:

1. All of the values we need for this one are given, so it is really just a “plug-n-chug” using the variance formula: ${\sigma^2}_X =\sum(x_i-\mu_X )^2 p_i$

$\begin{aligned}{\sigma^2}_Y &=(15-43.2)^2 \times .2+(30-43.2)^2 \times .25+(45-43.2)^2 \times .15+(60-43.2)^2 \times .27+(75-43.2)^2 \times .13\\ &=(-28.2)^2 \times .2+(-13.2)^2 \times .25+(1.8)^2 \times .15+(16.8)^2 \times .27+(31.8)^2 \times .13\\ &\approx 159+43.6+.5+76.2+131.5\\ {\sigma^2}_Y&\approx 410.8\\ \sigma_Y&=\sqrt{410.8} \approx 20.27\end{aligned}$

2. First, calculate the mean:

$\mu_X =(4 \times .5)+(8 \times .25)+(12 \times .15)+(16 \times .05)+(20 \times .05) = 7.6$

Then, use the mean to calculate the variance:

$\begin{aligned}{\sigma^2}_X &=(4-7.6)^2 \times .5+(8-7.6)^2 \times .25+(12-7.6)^2 \times .15+(16-7.6)^2 \times .05+(20-7.6)^2 \times .05\\ {\sigma^2}_X &=20.64\\ \sigma_X &=\sqrt{20.64} \approx 4.54\end{aligned}$

3. Start by finding the mean:

$\mu_X = 20 \times .15+25 \times .35+30 \times .3+35 \times .15+40 \times .05 = 28$

Use the mean to find the variance:

${\sigma^2}_X =(20-28)^2 \times .15+(25-28)^2 \times .35+(30-28)^2 \times .30+(35-28)^2 \times .15+(40-28)^2 \times .05=28.5$

Use the variance to find the standard deviation:

$\sigma_X=\sqrt{28.5} \approx 5.34$

Now we can find her average income by multiplying the mean, 28, by Marie’s rate, $2.75, to get her average daily income of $77. Finally, because her income is the number of clients times a constant rate, we can multiply the calculated standard deviation, 5.34, by the rate, $2.75, to get the standard deviation of her income:

$5.34 \times 2.75 \approx 14.7$

What all this means is that Marie can expect to earn $77 per day, on average, give or take about $14.70.
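The last step of solution 3 multiplies the standard deviation of the client count by the pay rate. A sketch of why that shortcut works: converting every client count to dollars first and then computing the standard deviation of the earnings directly gives the same number. The helper `mean_and_sd` is a hypothetical name introduced for this illustration.

```python
import math

# Marie's distribution and pay rate from problem 3.
clients = [20, 25, 30, 35, 40]
probs = [0.15, 0.35, 0.30, 0.15, 0.05]
rate = 2.75  # dollars per client

def mean_and_sd(values, probs):
    """Return (mean, standard deviation) of a discrete random variable."""
    mu = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, math.sqrt(var)

mu_clients, sd_clients = mean_and_sd(clients, probs)

# Convert each outcome to dollars, then recompute from scratch.
earnings = [rate * c for c in clients]
mu_earn, sd_earn = mean_and_sd(earnings, probs)

print(f"{mu_earn:.2f}")  # 77.00
print(f"{sd_earn:.2f}")  # 14.68
print(math.isclose(sd_earn, rate * sd_clients))  # True
```

Carrying the unrounded standard deviation through gives about 14.68; rounding the standard deviation before multiplying moves the last digits slightly.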

#### Practice

For questions 1 – 9, find the variance and standard deviation of the random variable, given the mean and probability distribution.

1. $\mu_x=4.435$

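| $x$ | 4.1 | 4.4 | 4.7 | 4.9 | 5.1 |
|---|---|---|---|---|---|
| $P(X=x)$ | 0.3 | 0.45 | 0.1 | 0.05 | 0.1 |

2. $\mu_x=7.6$

| $x$ | 4 | 8 | 12 | 16 | 20 |
|---|---|---|---|---|---|
| $P(X=x)$ | 0.5 | 0.25 | 0.15 | 0.05 | 0.05 |

3. $\mu_x=43.2$

| $x$ | 15 | 30 | 45 | 60 | 75 |
|---|---|---|---|---|---|
| $P(X=x)$ | 0.2 | 0.25 | 0.15 | 0.27 | 0.13 |

4. $\mu_X=93$

| $x$ | 30 | 60 | 90 | 120 | 150 | 170 |
|---|---|---|---|---|---|---|
| $P(X=x)$ | 0.18 | 0.16 | 0.24 | 0.22 | 0.2 | 0 |

5. $\mu_X=12.92$

| $x$ | 5 | 9 | 13 | 17 |
|---|---|---|---|---|
| $P(X=x)$ | 0.07 | 0.08 | 0.65 | 0.2 |

6. $\mu_X=21.80$

| $x$ | 13 | 17 | 21 | 25 | 29 | 33 | 37 |
|---|---|---|---|---|---|---|---|
| $P(X=x)$ | 0.15 | 0.17 | 0.23 | 0.3 | 0.1 | 0.03 | 0.02 |

7. $\mu_X= 57.98$

| $x$ | 26 | 39 | 52 | 65 | 78 |
|---|---|---|---|---|---|
| $P(X=x)$ | 6% | 14% | 30% | 28% | 22% |

8. $\mu_X=64.99$

| $x$ | 22 | 43 | 64 | 85 | 106 |
|---|---|---|---|---|---|
| $P(X=x)$ | 10.5% | 22.5% | 31.5% | 22.8% | 12.7% |

9. $\mu_X=7.46$

| $x$ | 3.65 | 5.84 | 7.03 | 9.22 | 11.41 |
|---|---|---|---|---|---|
| $P(X=x)$ | 0.16 | 0.25 | 0.18 | 0.24 | 0.17 |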

10. Dorian works for a construction company, where he earns $11.50 per hour. The number of hours he works each week varies between 25 and 40. Based on prior experience, Dorian has compiled the probability distribution below describing the probability that he will work a given number of hours. Can Dorian afford to buy a new truck that has a payment of $525/month, if he wants to be sure not to put more than 25% of his average monthly income into car payments? What is the standard deviation of his monthly income?

| # hours | 25 | 28 | 31 | 34 | 37 | 40 |
|---|---|---|---|---|---|---|
| probability | 0.15 | 0.14 | 0.26 | 0.18 | 0.14 | 0.13 |

### Vocabulary

absolute deviation

The absolute deviation is the sum total of how different each number is from the mean.

deviation

Deviation is a measure of the difference between a given value and the mean.

Mean

The mean of a data set is the average of the data set. The mean is found by calculating the sum of the values in the data set and then dividing by the number of values in the data set.

mean absolute deviation

The mean absolute deviation is an alternate measure of how spread out the data is. It involves finding the mean of the distance between each data value and the mean. While this method might seem more intuitive, in statistics it has been found to be too limited and is not commonly used.

Population

In statistics, the population is the entire group of interest from which the sample is drawn.

Sample

A sample is a specified part of a population, intended to represent the population as a whole.

Skew

To skew a given set means to cause the trend of data to favor one end or the other.

standard deviation

The square root of the variance is the standard deviation. Standard deviation is one way to measure the spread of a set of data.

variance

A measure of the spread of the data set equal to the mean of the squared variations of each data value from the mean of the data set.