4.4: Sums and Differences of Independent Random Variables
Learning Objectives
- Construct probability distributions of independent random variables.
- Calculate the mean and standard deviation for sums and differences of independent random variables.
Introduction
A probability distribution lists the values that a random variable can take on, together with the probability of each value. There are three ways that you can create a probability distribution. Sometimes previously collected data on the random variable that you are studying can help to create a probability distribution. A simulation is also a good way to create an approximate probability distribution. Finally, a probability distribution can be constructed from the basic principles, assumptions, and rules of theoretical probability. The examples in this lesson will lead you to a better understanding of these rules of theoretical probability.
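To illustrate the simulation approach, here is a minimal Python sketch that approximates the distribution of the sum of two fair dice. The seed, the number of trials, and the names `rng`, `trials`, and `approx_dist` are illustrative choices, not part of the lesson.

```python
import random
from collections import Counter

rng = random.Random(0)   # fixed seed so the run is repeatable (arbitrary choice)
trials = 100_000         # more trials give a closer approximation

# Roll two simulated dice many times and count how often each sum occurs.
counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(trials))

# Relative frequencies approximate the theoretical probabilities.
approx_dist = {total: counts[total] / trials for total in sorted(counts)}

for total, freq in approx_dist.items():
    print(total, round(freq, 3))
```

The relative frequencies should land close to, but not exactly on, the theoretical probabilities, which is why a simulated distribution is only approximate.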
Sums and Differences of Independent Random Variables
Example: Create a table that shows all the possible outcomes when two dice are rolled simultaneously. (Hint: There are 36 possible outcomes.)
\begin{align*}1^{st}\end{align*} Die (rows), \begin{align*}2^{nd}\end{align*} Die (columns) | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
1 | 1, 1 | 1, 2 | 1, 3 | 1, 4 | 1, 5 | 1, 6 |
2 | 2, 1 | 2, 2 | 2, 3 | 2, 4 | 2, 5 | 2, 6 |
3 | 3, 1 | 3, 2 | 3, 3 | 3, 4 | 3, 5 | 3, 6 |
4 | 4, 1 | 4, 2 | 4, 3 | 4, 4 | 4, 5 | 4, 6 |
5 | 5, 1 | 5, 2 | 5, 3 | 5, 4 | 5, 5 | 5, 6 |
6 | 6, 1 | 6, 2 | 6, 3 | 6, 4 | 6, 5 | 6, 6 |
The table above can now be used to construct various probability distributions. The first table below displays the probabilities for all the possible sums of the two dice, and the second table shows the probabilities for the larger of the two numbers produced by the dice. (When both dice show the same number, that number is used.)
Sum of Two Dice, \begin{align*}x\end{align*} | Probability, \begin{align*}p(x)\end{align*} |
---|---|
2 | \begin{align*}\frac{1}{36}\end{align*} |
3 | \begin{align*}\frac{2}{36}\end{align*} |
4 | \begin{align*}\frac{3}{36}\end{align*} |
5 | \begin{align*}\frac{4}{36}\end{align*} |
6 | \begin{align*}\frac{5}{36}\end{align*} |
7 | \begin{align*}\frac{6}{36}\end{align*} |
8 | \begin{align*}\frac{5}{36}\end{align*} |
9 | \begin{align*}\frac{4}{36}\end{align*} |
10 | \begin{align*}\frac{3}{36}\end{align*} |
11 | \begin{align*}\frac{2}{36}\end{align*} |
12 | \begin{align*}\frac{1}{36}\end{align*} |
Total | 1 |
Larger Number, \begin{align*}x\end{align*} | Probability, \begin{align*}p(x)\end{align*} |
---|---|
1 | \begin{align*}\frac{1}{36}\end{align*} |
2 | \begin{align*}\frac{3}{36}\end{align*} |
3 | \begin{align*}\frac{5}{36}\end{align*} |
4 | \begin{align*}\frac{7}{36}\end{align*} |
5 | \begin{align*}\frac{9}{36}\end{align*} |
6 | \begin{align*}\frac{11}{36}\end{align*} |
Total | 1 |
When you roll the two dice, what is the probability that the sum is 4? By looking at the first table above, you can see that the probability is \begin{align*}\frac{3}{36}\end{align*}.
What is the probability that the larger number is 4? By looking at the second table above, you can see that the probability is \begin{align*}\frac{7}{36}\end{align*}.
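Both tables can be rebuilt by enumerating the 36 outcomes directly. The following Python sketch does this with exact fractions; the helper name `distribution` is hypothetical, and note that `Fraction` reduces results automatically, so \begin{align*}\frac{3}{36}\end{align*} is displayed as 1/12.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

def distribution(statistic):
    """Map each value of statistic(first, second) to its exact probability."""
    counts = Counter(statistic(a, b) for a, b in outcomes)
    return {value: Fraction(n, len(outcomes)) for value, n in sorted(counts.items())}

sum_dist = distribution(lambda a, b: a + b)  # sum of the two dice
max_dist = distribution(max)                 # larger of the two numbers

print(sum_dist[4])  # probability that the sum is 4
print(max_dist[4])  # probability that the larger number is 4
```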
Example: The Regional Hospital has recently opened a new pulmonary unit and has released the following data on the proportion of silicosis cases caused by working in the coal mines. Suppose two silicosis patients are randomly selected from a large population with the disease.
Silicosis Cases | Proportion |
---|---|
Worked in the mine | 0.80 |
Did not work in the mine | 0.20 |
There are four possible outcomes for the two patients. With ‘yes’ representing “worked in the mines” and ‘no’ representing “did not work in the mines”, the possibilities are as follows:
Outcome | First Patient | Second Patient |
---|---|---|
1 | No | No |
2 | Yes | No |
3 | No | Yes |
4 | Yes | Yes |
As stated previously, the patients for this survey have been randomly selected from a large population, and therefore, the outcomes are independent. The probability for each outcome can be calculated by multiplying the appropriate proportions as shown:
\begin{align*}P(\text{no for} \ 1^{\text{st}}) \bullet P(\text{no for} \ 2^{\text{nd}}) &= (0.2)(0.2)=0.04\\ P(\text{yes for} \ 1^{\text{st}}) \bullet P(\text{no for} \ 2^{\text{nd}}) &= (0.8)(0.2)=0.16\\ P(\text{no for} \ 1^{\text{st}}) \bullet P(\text{yes for} \ 2^{\text{nd}}) &= (0.2)(0.8)=0.16\\ P(\text{yes for} \ 1^{\text{st}}) \bullet P(\text{yes for} \ 2^{\text{nd}}) &= (0.8)(0.8)=0.64\end{align*}
If \begin{align*}X\end{align*} represents the number of silicosis patients who worked in the mines in this random sample, then the first of these outcomes results in \begin{align*}x = 0\end{align*}, the second and third each result in \begin{align*}x = 1\end{align*}, and the fourth results in \begin{align*}x = 2\end{align*}. Because the second and third outcomes are disjoint, their probabilities can be added. The probability distribution for \begin{align*}X\end{align*} is given in the table below:
\begin{align*}x\end{align*} | Probability, \begin{align*}p(x)\end{align*} |
---|---|
0 | 0.04 |
1 | \begin{align*}0.16 + 0.16 = 0.32\end{align*} |
2 | 0.64 |
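The same distribution can be produced by listing the four outcomes in code, multiplying probabilities for independent patients, and adding the probabilities of disjoint outcomes that give the same value of \begin{align*}X\end{align*}. This is a sketch only; the names `p` and `x_dist` are illustrative.

```python
from itertools import product

# Proportions from the table: worked in the mine ("yes") or did not ("no").
p = {"yes": 0.80, "no": 0.20}

# X = number of the two selected patients who worked in the mine.
x_dist = {0: 0.0, 1: 0.0, 2: 0.0}
for first, second in product(p, repeat=2):
    x = (first == "yes") + (second == "yes")
    # Independence: multiply. Disjoint outcomes with the same x: add.
    x_dist[x] += p[first] * p[second]

for x, prob in x_dist.items():
    print(x, round(prob, 2))
```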
Example: The Quebec Major Junior Hockey League has five teams from the Maritime Provinces. These teams are the Cape Breton Screaming Eagles, the Halifax Mooseheads, the PEI Rockets, the Moncton Wildcats, and the Saint John Sea Dogs. Each team has its own hometown arena, and each arena has a seating capacity that is listed below:
Team | Seating Capacity (Thousands) |
---|---|
Screaming Eagles | 5 |
Mooseheads | 10 |
Rockets | 4 |
Wildcats | 7 |
Sea Dogs | 6 |
A schedule is being drawn up for the teams to play pre-season exhibition games. Two teams are paired, and one game is played in each of their home arenas, so the combined capacity attendance for the two games can be calculated. In addition, the probability of the combined capacity attendance being at least 12,000 people can also be calculated.
The number of possible combinations of two teams from these five is \begin{align*}_5 C_2 = 10\end{align*}. The following table shows the combined capacity attendance for each possible pairing.
Teams | Combined Attendance Capacity for Both Games (Thousands) |
---|---|
Eagles/Mooseheads | \begin{align*}5 + 10 = 15\end{align*} |
Eagles/Rockets | \begin{align*}5 + 4 = 9\end{align*} |
Eagles/Wildcats | \begin{align*}5 + 7 = 12\end{align*} |
Eagles/Sea Dogs | \begin{align*}5 + 6 = 11\end{align*} |
Mooseheads/Rockets | \begin{align*}10 + 4 = 14\end{align*} |
Mooseheads/Wildcats | \begin{align*}10 + 7 = 17\end{align*} |
Mooseheads/Sea Dogs | \begin{align*}10 + 6 = 16\end{align*} |
Rockets/Wildcats | \begin{align*}4 + 7 = 11\end{align*} |
Rockets/Sea Dogs | \begin{align*}4 + 6 = 10\end{align*} |
Sea Dogs/Wildcats | \begin{align*}6 + 7 = 13\end{align*} |
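Assuming that each of the 10 pairings is equally likely, each one has a probability of 0.1. A capacity attendance of 11 arises from two different pairings, so its probability is 0.2. The probability distribution for the capacity attendance is shown in the table below.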
Capacity Attendance, \begin{align*}x\end{align*} | Probability, \begin{align*}p(x)\end{align*} |
---|---|
9 | 0.1 |
10 | 0.1 |
11 | 0.2 |
12 | 0.1 |
13 | 0.1 |
14 | 0.1 |
15 | 0.1 |
16 | 0.1 |
17 | 0.1 |
From the table, it can be determined that the probability that the capacity attendance will be at least 12,000 is \begin{align*}0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 = 0.6\end{align*}.
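The pairings and the resulting distribution can also be generated programmatically. The sketch below assumes, as the table does, that each of the 10 pairings is equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Seating capacity of each home arena, in thousands.
capacity = {
    "Screaming Eagles": 5,
    "Mooseheads": 10,
    "Rockets": 4,
    "Wildcats": 7,
    "Sea Dogs": 6,
}

# Every unordered pair of teams (5 choose 2 = 10 pairings).
pairs = list(combinations(capacity, 2))

# Count how many pairings produce each combined capacity.
counts = Counter(capacity[a] + capacity[b] for a, b in pairs)

# Assumption: each pairing is equally likely.
attendance_dist = {x: Fraction(n, len(pairs)) for x, n in sorted(counts.items())}

p_at_least_12 = sum(p for x, p in attendance_dist.items() if x >= 12)
print(float(p_at_least_12))
```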
Expected Values and Standard Deviation
Example: Suppose an individual plays a gambling game where it is possible to lose $2.00, break even, win $6.00, or win $20.00 each time he plays. The probability distribution for each outcome is provided by the following table:
Winnings, \begin{align*}x\end{align*} | Probability, \begin{align*}p(x)\end{align*} |
---|---|
\begin{align*}-\end{align*}$2 | 0.30 |
$0 | 0.40 |
$6 | 0.20 |
$20 | 0.10 |
The table can be used to calculate the expected value and the variance of this distribution:
\begin{align*}\mu &= \sum x p(x)\\ \mu &= (-2 \cdot 0.30)+(0 \cdot 0.40)+(6 \cdot 0.20)+(20 \cdot 0.10)\\ \mu &= 2.6\end{align*}
Thus, over many plays of this game, the player can expect to win an average of $2.60 per play.
The variance of this distribution can be calculated as shown:
\begin{align*}\sigma^2 &= \sum (x-\mu)^2 p(x)\\ \sigma^2 &= (-2-2.6)^2 (0.30)+(0-2.6)^2 (0.40)+(6-2.6)^2 (0.20)+(20-2.6)^2 (0.10)\\ \sigma^2 & \approx 41.64\\ \sigma & \approx \sqrt{41.64} \approx \$ 6.45\end{align*}
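The two formulas translate directly into a few lines of Python. This is a minimal sketch; the function name `mean_and_variance` and the dictionary layout (value mapped to probability) are illustrative choices.

```python
from math import sqrt

def mean_and_variance(dist):
    """Return (mu, sigma squared) for a dict mapping each value x to p(x)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

# Winnings in dollars and their probabilities, from the table above.
winnings = {-2: 0.30, 0: 0.40, 6: 0.20, 20: 0.10}

mu, var = mean_and_variance(winnings)
print(round(mu, 2), round(var, 2), round(sqrt(var), 2))
```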
Example: The following probability distribution was constructed from the results of a survey at the local university. The random variable is the number of fast food meals purchased by a student during the preceding year (12 months). For this distribution, calculate the expected value and the standard deviation.
Number of Meals Purchased Within 12 Months, \begin{align*}x\end{align*} | Probability, \begin{align*}p(x)\end{align*} |
---|---|
0 | 0.04 |
\begin{align*}[1 - 6)\end{align*} | 0.30 |
\begin{align*}[6 - 11)\end{align*} | 0.29 |
\begin{align*}[11 - 21)\end{align*} | 0.17 |
\begin{align*}[21 - 51)\end{align*} | 0.15 |
\begin{align*}[51 - 60)\end{align*} | 0.05 |
Total | 1.00 |
You must begin by estimating a mean for each interval, and this can be done by finding the center of each interval. For the first interval of \begin{align*}[1 - 6)\end{align*}, 6 is not included in the interval, so a value of 3 would be the center. This same procedure can be used to estimate the mean of all the intervals. Therefore, the expected value can be calculated as follows:
\begin{align*}\mu &= \sum x p(x)\\ \mu &= (0)(0.04)+(3)(0.30)+(8)(0.29)+(15.5)(0.17)+(35.5)(0.15)+(55)(0.05)\\ \mu &= 13.93\end{align*}
Likewise, the standard deviation can be calculated:
\begin{align*}\sigma^2 &= \sum (x-\mu)^2 p(x)\\ &= (0-13.93)^2 (0.04)+(3-13.93)^2 (0.30)\\ & \quad +(8-13.93)^2(0.29)+(15.5-13.93)^2(0.17)\\ & \quad +(35.5-13.93)^2(0.15)+(55-13.93)^2(0.05)\\ & \approx 208.3451\end{align*}
\begin{align*}\sigma \approx 14.43\end{align*}
Thus, the expected number of fast food meals purchased by a student at the local university is 13.93, and the standard deviation is 14.43. Note that the mean should not be rounded to a whole number, since it does not have to be one of the values in the distribution. You should also notice that the standard deviation is very close to the expected value. Because the number of meals cannot be negative, this suggests that the distribution is skewed to the right, with a long tail toward the larger numbers.
Technology Note: Calculating the mean and variance for a probability distribution on the TI-83/84 Calculator
In the calculator output, the mean, which is denoted by \begin{align*}\overline{x}\end{align*} in this case, is 13.93, and the standard deviation, which is denoted by \begin{align*}\sigma_x\end{align*}, is approximately 14.43.
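If a calculator is not at hand, the same grouped-data calculation can be sketched in Python. The midpoint rule below treats each interval as the whole numbers from its lower bound up to, but not including, its upper bound, which is how the text arrives at a center of 3 for the first interval. The helper name `midpoint` and the choice to write the single value 0 as the interval from 0 to 1 are assumptions made for this illustration.

```python
from math import sqrt

# (low, high, probability) for each half-open interval [low, high) of whole meals.
# The single value 0 is written here as [0, 1).
intervals = [
    (0, 1, 0.04),
    (1, 6, 0.30),
    (6, 11, 0.29),
    (11, 21, 0.17),
    (21, 51, 0.15),
    (51, 60, 0.05),
]

def midpoint(low, high):
    """Center of the whole numbers low, ..., high - 1."""
    return (low + (high - 1)) / 2

# Replace each interval by its center, then apply the usual formulas.
grouped = [(midpoint(low, high), p) for low, high, p in intervals]
mu = sum(x * p for x, p in grouped)
var = sum((x - mu) ** 2 * p for x, p in grouped)
sigma = sqrt(var)

print(round(mu, 2), round(var, 4), round(sigma, 2))
```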
Effects of Linear Transformations of \begin{align*}X\end{align*} on the Mean and Standard Deviation of \begin{align*}X\end{align*}
If you add the same value to all the numbers of a data set, the shape and standard deviation of the data set remain the same, but the value is added to the mean. This is referred to as re-centering the data set. Likewise, if you rescale the data, or multiply all the data values by the same nonzero number, the basic shape will not change, but the mean and the standard deviation will each be a multiple of this number. (Note that the standard deviation must actually be multiplied by the absolute value of the number.) If you multiply the numbers of a data set by a constant \begin{align*}d\end{align*} and then add a constant \begin{align*}c\end{align*}, the mean and the standard deviation of the transformed values are expressed as follows:
\begin{align*}\mu_{c+dX} &= c+d \mu_{X}\\ \sigma_{c+dX} &= |d| \sigma_{X}\end{align*}
These are called linear transformations, and the implications of this can be better understood if you return to the casino example.
Example: The casino has decided to triple the prizes for the game being played. What are the expected winnings for a person who plays one game? What is the standard deviation? Recall that the expected value was $2.60, and the standard deviation was $6.45.
Solution:
The simplest way to calculate the expected value of the tripled prize is (3)($2.60), or $7.80, with a standard deviation of (3)($6.45), or $19.35. Here, \begin{align*}c = 0\end{align*} and \begin{align*}d = 3\end{align*}. Another method of calculating the expected value and standard deviation would be to create a new table for the tripled prize:
Winnings, \begin{align*}x\end{align*} | Probability, \begin{align*}p\end{align*} |
---|---|
\begin{align*}-\end{align*}$6 | 0.30 |
$0 | 0.40 |
$18 | 0.20 |
$60 | 0.10 |
The calculations can be done using the formulas or by using a graphing calculator. Notice that the results are the same either way.
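As a check on the two methods, the sketch below builds the tripled-prize table from the original one and compares the directly computed mean and standard deviation with the values given by the linear transformation formulas. The helper `mean_and_sd` is a hypothetical name. Working from the unrounded standard deviation gives about $19.36; the $19.35 above comes from multiplying the rounded $6.45 by 3.

```python
from math import sqrt

def mean_and_sd(dist):
    """Return (mu, sigma) for a dict mapping each value x to p(x)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, sqrt(var)

original = {-2: 0.30, 0: 0.40, 6: 0.20, 20: 0.10}

# Linear transformation c + dX with c = 0 and d = 3 (tripled prizes).
c, d = 0, 3
tripled = {c + d * x: p for x, p in original.items()}

mu_x, sd_x = mean_and_sd(original)
mu_t, sd_t = mean_and_sd(tripled)

print(round(mu_t, 2), round(c + d * mu_x, 2))   # mean: new table vs. formula
print(round(sd_t, 2), round(abs(d) * sd_x, 2))  # sd: new table vs. formula
```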
This same problem can be changed again in order to introduce the Addition Rule and the Subtraction Rule for random variables. Suppose the casino wants to encourage customers to play more, so it begins demanding that customers play the game in sets of three. What are the expected value (total winnings) and standard deviation now?
Let \begin{align*}X, Y\end{align*} and \begin{align*}Z\end{align*} represent the winnings on the first, second, and third games played. If this is the case, then \begin{align*}\mu_{X+Y+Z}\end{align*} is the expected value of the total winnings when three games are played. The expected value of the winnings for playing one game was $2.60, so for three games the expected value is:
\begin{align*}\mu_{X+Y+Z} &= \mu_X+\mu_Y +\mu_Z\\ \mu_{X+Y+Z} &= \$ 2.60 + \$ 2.60 + \$ 2.60\\ \mu_{X+Y+Z} &= \$ 7.80\end{align*}
Thus, the expected value is the same as that for the tripled prize.
Since the winnings on the three games played are independent, the standard deviation of the total, \begin{align*}X + Y + Z\end{align*}, can be calculated as shown below:
\begin{align*}\sigma{^2}_{X+Y+Z} &= \sigma{^2}_X + \sigma{^2}_Y + \sigma{^2}_Z\\ \sigma{^2}_{X+Y+Z} &= 6.45^2 + 6.45^2 + 6.45^2\\ \sigma{^2}_{X+Y+Z} &\approx 124.8075\\ \sigma_{X+Y+Z} &\approx \sqrt{124.8075}\\ \sigma_{X+Y+Z} &\approx 11.17\end{align*}
This means that the person playing the three games can expect to win $7.80 with a standard deviation of $11.17. Note that when the prize was tripled, there was a greater standard deviation ($19.35) than when the person played three games ($11.17).
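The addition of variances can be checked by brute force: list every combination of three independent plays, build the distribution of the total, and compute its mean and variance directly. This is a sketch with illustrative names. Using the unrounded single-game variance of 41.64, the total variance is 124.92 and the standard deviation is about $11.18; the $11.17 in the text reflects the rounded $6.45.

```python
from itertools import product
from math import prod, sqrt

game = {-2: 0.30, 0: 0.40, 6: 0.20, 20: 0.10}

# Distribution of the total winnings over three independent plays.
total_dist = {}
for plays in product(game.items(), repeat=3):
    total = sum(x for x, _ in plays)
    probability = prod(p for _, p in plays)  # independence: multiply
    total_dist[total] = total_dist.get(total, 0.0) + probability

mu = sum(t * p for t, p in total_dist.items())
var = sum((t - mu) ** 2 * p for t, p in total_dist.items())

print(round(mu, 2), round(var, 2), round(sqrt(var), 2))
```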
The Addition and Subtraction Rules for random variables are as follows:
If \begin{align*}X\end{align*} and \begin{align*}Y\end{align*} are random variables, then:
\begin{align*}\mu_{X+Y} &= \mu_X + \mu_Y\\ \mu_{X-Y} &= \mu_X - \mu_Y\end{align*}
If \begin{align*}X\end{align*} and \begin{align*}Y\end{align*} are independent, then:
\begin{align*}\sigma{^2}_{X+Y} &= \sigma{^2}_X+\sigma{^2}_Y\\ \sigma{^2}_{X-Y} &= \sigma{^2}_X+\sigma{^2}_Y\end{align*}
Variances are added for both the sum and difference of two independent random variables, because the variation in each variable contributes to the overall variation in both cases. (Subtracting is the same as adding the opposite.) Suppose you have two dice, one die, \begin{align*}X\end{align*}, with the usual positive numbers 1 through 6, and another, \begin{align*}Y\end{align*}, with the negative numbers \begin{align*}-1\end{align*} through \begin{align*}-6\end{align*}. Next, suppose you perform two experiments. In the first, you roll the first die, \begin{align*}X\end{align*}, and then the second die, \begin{align*}Y\end{align*}, and you compute the difference of the two rolls. In the second experiment, you roll the first die and then the second die, and you calculate the sum of the two rolls.
\begin{align*}\mu_X &= \sum x p(x) && \mu_Y = \sum y p(y)\\ \mu_X &= 3.5 && \mu_Y=-3.5\\ \sigma{^2}_X &= \sum (x-\mu_X)^2 p(x) && \sigma{^2}_Y = \sum (y-\mu_Y)^2 p(y)\\ \sigma{^2}_X & \approx 2.917 && \sigma{^2}_Y \approx 2.917\\ \mu_{X-Y}&=\mu_X - \mu_Y && \mu_{X+Y} = \mu_X+\mu_Y\\ \mu_{X-Y} &= 3.5 - (-3.5)=7 && \mu_{X+Y} = 3.5 + (-3.5)=0\\ \sigma{^2}_{X-Y}&=\sigma{^2}_X+\sigma{^2}_Y && \sigma{^2}_{X+Y} = \sigma{^2}_X+\sigma{^2}_Y\\ \sigma{^2}_{X-Y} &\approx 2.917 + 2.917 = 5.834 && \sigma{^2}_{X+Y} \approx 2.917 + 2.917 = 5.834\end{align*}
Notice how the expected values and the variances for the two dice combine in these two experiments.
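The two experiments can be verified exactly by enumerating all 36 equally likely pairs of rolls. The sketch below uses exact fractions, so the variance appears as 35/6 (about 5.833) rather than the 5.834 obtained above by adding rounded values. The helper name `mean_var` is illustrative.

```python
from fractions import Fraction
from itertools import product

x_faces = range(1, 7)    # die X: 1 through 6
y_faces = range(-6, 0)   # die Y: -6 through -1
p = Fraction(1, 36)      # each (x, y) pair is equally likely

def mean_var(values):
    """Exact mean and variance of 36 equally likely values."""
    mu = sum(v * p for v in values)
    var = sum((v - mu) ** 2 * p for v in values)
    return mu, var

differences = [x - y for x, y in product(x_faces, y_faces)]
sums = [x + y for x, y in product(x_faces, y_faces)]

print(mean_var(differences))  # mean and variance of X - Y
print(mean_var(sums))         # mean and variance of X + Y
```

The means differ (7 versus 0), but the variances are identical, which is what the rule for independent variables predicts.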
Example: Beth earns $25.00 an hour for tutoring but spends $20.00 an hour for piano lessons. She saves the difference between her earnings for tutoring and the cost of the piano lessons. The numbers of hours she spends on each activity in one week vary independently according to the probability distributions shown below. Determine her expected weekly savings and the standard deviation of these savings.
Hours of Piano Lessons, \begin{align*}x\end{align*} | Probability, \begin{align*}p(x)\end{align*} |
---|---|
0 | 0.3 |
1 | 0.3 |
2 | 0.4 |
Hours of Tutoring, \begin{align*}y\end{align*} | Probability, \begin{align*}p(y)\end{align*} |
---|---|
1 | 0.2 |
2 | 0.3 |
3 | 0.2 |
4 | 0.3 |
\begin{align*}X\end{align*} represents the number of hours per week taking piano lessons, and \begin{align*}Y\end{align*} represents the number of hours tutoring per week. The mean and standard deviation for each can be calculated as follows:
\begin{align*}E(X) &= \mu_X = \sum x p(x) && \sigma{^2}_X = \sum (x-\mu_X)^2 p(x)\\ \mu_X &= (0)(0.3)+(1)(0.3)+(2)(0.4) && \sigma{^2}_X = (0-1.1)^2 (0.3)+(1-1.1)^2(0.3)+(2-1.1)^2(0.4)\\ \mu_X &= 1.1 && \sigma{^2}_X = 0.69\\ &&& \sigma_X \approx 0.831\end{align*}
\begin{align*}E(Y) &= \mu_Y = \sum y p(y) && \sigma{^2}_Y = \sum (y-\mu_Y)^2 p(y)\\ \mu_Y &= (1)(0.2)+(2)(0.3)+(3)(0.2)+(4)(0.3) && \sigma{^2}_Y = (1-2.6)^2 (0.2)+(2-2.6)^2(0.3)+(3-2.6)^2(0.2)\\ &&& +(4-2.6)^2(0.3)\\ \mu_Y &= 2.6 && \sigma{^2}_Y = 1.24\\ &&& \sigma_Y \approx 1.11\end{align*}
The expected number of hours Beth spends on piano lessons is 1.1 with a standard deviation of 0.831 hours. Likewise, the expected number of hours Beth spends tutoring is 2.6 with a standard deviation of 1.11 hours.
Beth spends $20 for each hour of piano lessons, so her mean weekly cost for piano lessons is as follows:
\begin{align*}\mu_{20 X}=(20)(\mu_X)=(20)(1.1)=\$ 22\end{align*} by the Linear Transformation Rule.
Beth earns $25 for each hour of tutoring, so her mean weekly earnings from tutoring are as follows:
\begin{align*}\mu_{25 Y}=(25)(\mu_Y)=(25)(2.6)=\$ 65\end{align*} by the Linear Transformation Rule.
Thus, Beth's expected weekly savings are:
\begin{align*}\mu_{25 Y}-\mu_{20 X}=\$ 65 - \$ 22 = \$ 43\end{align*} by the Subtraction Rule.
The standard deviation of the cost of her piano lessons is:
\begin{align*}\sigma_{20 X}=(20)(0.831)=\$ 16.62\end{align*} by the Linear Transformation Rule.
The standard deviation of her earnings from tutoring is:
\begin{align*}\sigma_{25 Y}=(25)(1.11)=\$ 27.75\end{align*} by the Linear Transformation Rule.
Finally, the variance and standard deviation of her weekly savings are:
\begin{align*}\sigma{^2}_{25Y-20X} &= \sigma{^2}_{25 Y}+\sigma{^2}_{20 X}=(27.75)^2+(16.62)^2=1046.2869\\ \sigma_{25Y-20X} &\approx \$ 32.35\end{align*}
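As a cross-check, the distribution of Beth's weekly savings can be built directly from the two tables, using independence to multiply the probabilities. This is a sketch with illustrative names. Because it works from the unrounded variances (0.69 and 1.24), it gives a variance of 1051 and a standard deviation of about $32.42; the slightly smaller value above comes from squaring the rounded standard deviations $16.62 and $27.75.

```python
from itertools import product
from math import sqrt

piano = {0: 0.3, 1: 0.3, 2: 0.4}             # hours of lessons, x
tutoring = {1: 0.2, 2: 0.3, 3: 0.2, 4: 0.3}  # hours of tutoring, y

# Savings = 25Y - 20X; X and Y vary independently, so probabilities multiply.
savings = {}
for (x, px), (y, py) in product(piano.items(), tutoring.items()):
    amount = 25 * y - 20 * x
    savings[amount] = savings.get(amount, 0.0) + px * py

mu = sum(s * p for s, p in savings.items())
var = sum((s - mu) ** 2 * p for s, p in savings.items())

print(round(mu, 2), round(var, 2), round(sqrt(var), 2))
```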
Lesson Summary
A chance process can be displayed as a probability distribution that describes all the possible outcomes, \begin{align*}x\end{align*}. You can also determine the probability of any set of possible outcomes. A probability distribution table for a random variable, \begin{align*}X\end{align*}, consists of a table with all the possible outcomes, along with the probability associated with each of the outcomes. The expected value and the variance of a probability distribution can be calculated using the following formulas:
\begin{align*}E(X) &= \mu_X=\sum x p(x)\\ \sigma{^2}_X &= \sum (x-\mu _X)^2 p(x)\end{align*}
For a random variable \begin{align*}X\end{align*} and constants \begin{align*}c\end{align*} and \begin{align*}d\end{align*}, the mean and the standard deviation of a linear transformation are given by the following:
\begin{align*}\mu_{c+dX} &= c+d \mu_X\\ \sigma_{c+dX} &= |d| \sigma_X\end{align*}
If the random variables \begin{align*}X\end{align*} and \begin{align*}Y\end{align*} are added or subtracted, the mean is calculated as shown below:
\begin{align*}\mu_{X+Y} &= \mu_X+\mu_Y\\ \mu_{X-Y} &= \mu_X-\mu_Y\end{align*}
If \begin{align*}X\end{align*} and \begin{align*}Y\end{align*} are independent, then the following formulas can be used to compute the variance:
\begin{align*}\sigma{^2}_{X+Y} &= \sigma{^2}_X+\sigma{^2}_Y\\ \sigma{^2}_{X-Y} &= \sigma{^2}_X + \sigma{^2}_Y\end{align*}
Points to Consider
- Are these concepts applicable to real-life situations?
- Will knowing these concepts allow you to estimate information about a population?
Multimedia Links
For examples of finding means and standard deviations of sums and differences of random variables (5.0), see mrjaffesclass, Linear Combinations of Random Variables (6:41).
Review Questions
- It is estimated that 70% of the students attending a school in a rural area take the bus to school. Suppose you randomly select three students from the population. Construct the probability distribution of the random variable, \begin{align*}X\end{align*}, defined as the number of students who take the bus to school. (Hint: Begin by listing all of the possible outcomes.)
- The Safe Grad Committee at a high school is selling raffle tickets on a Christmas Basket filled with gifts and gift cards. The prize is valued at $1200, and the committee has decided to sell only 500 tickets. What is the expected value of a ticket? If the students decide to sell tickets on three monetary prizes (one valued at $1500 and two valued at $500 each), what is the expected value of the ticket now?
- A recent law has been passed banning the use of hand-held cell phones while driving, and a survey has revealed that 76% of drivers now refrain from using their cell phones while driving. Three drivers were randomly selected, and a probability distribution table was constructed to record the outcomes. Let \begin{align*}N\end{align*} represent those drivers who never use their cell phones while driving and \begin{align*}S\end{align*} represent those who seldom use their cell phones while driving. Calculate the expected value and the variance using technology.