
# Independence versus Dependence

Everyday Examples of Independence and Probability

10% of the emails that Michelle receives are spam emails. Her spam filter catches spam 95% of the time. Her spam filter misidentifies non-spam as spam 2% of the time. What percent of the emails in the spam folder are not spam emails?

#### Guidance

In everyday situations, a conditional probability is a probability calculated when additional information is known. For example, the probability that a random smoker gets lung cancer is a conditional probability, while the probability that a random person gets lung cancer is not. The additional information that the person is a smoker changes the probability being calculated. If the additional information does not ultimately change the probability, then the two events are independent.

There are many everyday situations having to do with probabilities. It is important for you to be able to differentiate between a regular probability and a conditional probability. Always read problems carefully in order to be sure that you are interpreting the information correctly.

Example A

A test for a certain disease is said to be 99% accurate. What does this mean? What does this have to do with conditional probability?

Solution: You should consider four groups of people:

1. People with the disease who test positive for the disease (true positive).
2. People with the disease who test negative for the disease (false negative).
3. People without the disease who test positive for the disease (false positive).
4. People without the disease who test negative for the disease (true negative).

If a test is 99% accurate, it implies that:

1. If a person has the disease, 99% of the time they will receive a positive test result. $P(positive|disease)=99\%$

2. If a person does not have the disease, 99% of the time they will receive a negative test result. $P(negative|no \ disease)=99\%$

The 99% is a conditional probability in each case. Note that these are two completely different probability calculations, and they do not automatically have to be the same. It is in fact more realistic if these two probabilities are different.
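To see why the two conditional probabilities need not match, here is a minimal Python sketch. The counts are hypothetical, chosen only for illustration, and are not from the example above. Each conditional probability is computed by restricting attention to one group (people with the disease, or people without it).

```python
from fractions import Fraction

# Hypothetical counts for 10,000 people (illustrative only, not from the text).
true_positive = 990    # have the disease, test positive
false_negative = 10    # have the disease, test negative
false_positive = 180   # no disease, test positive
true_negative = 8820   # no disease, test negative

# P(positive | disease): look only at the people who have the disease.
p_pos_given_disease = Fraction(true_positive, true_positive + false_negative)

# P(negative | no disease): look only at the people without the disease.
p_neg_given_no_disease = Fraction(true_negative, true_negative + false_positive)

print(p_pos_given_disease)     # 99/100
print(p_neg_given_no_disease)  # 49/50
```

With these made-up counts, the test is right 99% of the time for people with the disease but only 98% of the time for people without it.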

Example B

10% of the emails that Michelle receives are spam emails. Her spam filter catches spam 95% of the time. Her spam filter misidentifies non-spam as spam 2% of the time. Let  $A$ be the event that an email is spam. Let  $B$ be the event that the spam filter identifies the email as spam.

1. What does  $P(A)$ mean in English?
2. What does  $P(B|A)$ mean in English?
3. What does  $P(B^\prime|A^\prime)$ mean in English?

Solution:

1. $P(A)$ is the probability that a random email is spam.
2. $P(B|A)$ is the probability that a spam email gets identified as spam.
3. $P(B^\prime|A^\prime)$ is the probability that a non-spam email does not get identified as spam.

Example C

10% of the emails that Michelle receives are spam emails. Her spam filter catches spam 95% of the time. Her spam filter misidentifies non-spam as spam 2% of the time. Let  $A$ be the event that an email is spam. Let  $B$ be the event that the spam filter identifies the email as spam.

1. Find $P(A)$.
2. Find $P(B|A)$.
3. Find $P(B^\prime|A^\prime)$.

Solution:

1. $P(A)=10\%$
2. $P(B|A)=95\%$
3. $P(B^\prime|A^\prime)=98\%$. Note that in the problem, 2% is $P(B|A^\prime)$. $P(B|A^\prime)$ and $P(B^\prime|A^\prime)$ must add to 100% because $B$ and $B^\prime$ are complements.

Concept Problem Revisited

10% of the emails that Michelle receives are spam emails. Her spam filter catches spam 95% of the time. Her spam filter misidentifies non-spam as spam 2% of the time. What percent of the emails in the spam folder are not spam emails?

This question is asking for the probability that an email that has been identified as spam is a regular email, $P(A^\prime|B)$. You were not given this probability directly. One way to approach this problem is to make a two-way frequency table for some number of emails. Suppose you have 1000 emails.

| | Spam | Not Spam | Total |
|---|---|---|---|
| Identified as Spam | | | |
| Not Identified as Spam | | | |
| Total | | | 1000 |

You know that 10% of those (100 emails) will be spam. This means 90% of those (900 emails) will not be spam. Fill these numbers into the table.

| | Spam | Not Spam | Total |
|---|---|---|---|
| Identified as Spam | | | |
| Not Identified as Spam | | | |
| Total | 100 | 900 | 1000 |

You also know that 95% of the spam emails (95 emails) will be identified as spam. This means the other 5 spam emails will not be identified as spam. Fill these numbers into the table.

| | Spam | Not Spam | Total |
|---|---|---|---|
| Identified as Spam | 95 | | |
| Not Identified as Spam | 5 | | |
| Total | 100 | 900 | 1000 |

You also know that 98% of the non-spam emails (882 emails) will not be identified as spam. This means that the other 18 emails will be identified as spam. Fill these numbers into the table.

| | Spam | Not Spam | Total |
|---|---|---|---|
| Identified as Spam | 95 | 18 | 113 |
| Not Identified as Spam | 5 | 882 | 887 |
| Total | 100 | 900 | 1000 |

Now go back to the question. The question is asking for the probability that an email that has been identified as spam is a regular email. 113 emails were identified as spam, and 18 of them are not spam emails. $P(A^\prime|B)=\frac{18}{113} \approx 16\%$. Even though the spam filter is pretty accurate, about 16% of the emails in the spam folder will be regular emails.
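The table-filling steps can also be sketched in Python. This is one possible way to organize the arithmetic, not the only one. The function name `spam_table` and the dictionary keys are made up for this illustration, and exact fractions are used to avoid rounding error.

```python
from fractions import Fraction

def spam_table(total, p_spam, p_flag_given_spam, p_flag_given_not_spam):
    """Return the four inner cells of the two-way frequency table."""
    spam = total * p_spam                # column total: spam emails
    not_spam = total - spam              # column total: non-spam emails
    spam_flagged = spam * p_flag_given_spam
    not_spam_flagged = not_spam * p_flag_given_not_spam
    return {
        "spam_flagged": spam_flagged,
        "spam_not_flagged": spam - spam_flagged,
        "not_spam_flagged": not_spam_flagged,
        "not_spam_not_flagged": not_spam - not_spam_flagged,
    }

# Numbers from the problem: 10% spam, 95% of spam caught, 2% of non-spam flagged.
table = spam_table(1000, Fraction(10, 100), Fraction(95, 100), Fraction(2, 100))

# Row total: every email that ended up in the spam folder.
flagged_total = table["spam_flagged"] + table["not_spam_flagged"]

# P(A' | B): of the flagged emails, the fraction that are not spam.
p_not_spam_given_flagged = table["not_spam_flagged"] / flagged_total

print(flagged_total)                                     # 113
print(p_not_spam_given_flagged)                          # 18/113
print(round(float(p_not_spam_given_flagged) * 100, 1))   # 15.9
```

Changing the first argument from 1000 to any other number of emails changes the counts in the table but not the final probability.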

#### Vocabulary

A false negative is when a person with a disease tests negative for the disease.

A false positive is when a person without a disease tests positive for the disease.

A true negative is when a person without a disease tests negative for the disease.

A true positive is when a person with a disease tests positive for the disease.

The probability of an event is the chance of the event occurring.

Two events are independent if one event occurring does not change the probability of the second event occurring. $P(A \cap B)=P(A) P(B)$ if and only if $A$ and $B$ are independent events. Also, $P(A|B)=P(A)$ and $P(B|A)=P(B)$ if and only if $A$ and $B$ are independent events.

Two events are dependent if one event occurring changes the probability of the second event occurring, either up or down.

The conditional probability of event $A$ given event $B$ is the probability of event $A$ occurring given that event $B$ occurred. The notation is $P(A|B)$, which is read as “the probability of $A$ given $B$”.

A two-way frequency table organizes data when two categories are associated with each person/object being classified.

#### Guided Practice

Karl takes the bus to school. Each day, there is a 10% chance that his bus will be late, a 20% chance that he will be late, and a 2% chance that both he and the bus will be late. Let  $C$ be the event that Karl is late. Let  $D$ be the event that the bus is late.

1. State the 10%, 20%, and 2% probabilities in probability notation in terms of events $C$ and $D$.

2. Are events  $C$ and  $D$ independent? Explain.

3. Find the probability that Karl is not late but the bus is late.

Solution:

1. $P(D)=10\%$, $P(C)=20\%$, and $P(C \cap D)=2\%$.

2. Events  $C$ and  $D$ are independent if $P(C \cap D)=P(C) P(D)$

$P(C) P(D)=(0.2)(0.1)=0.02=2\%=P(C \cap D)$

Therefore, the events are independent.

3. This is $P(C^\prime \cap D)$. Because the two events are independent, $P(C^\prime \cap D)=P(C^\prime) P(D)$. Since there is a 20% chance that Karl will be late, there is an 80% chance that Karl will not be late. This means $P(C^\prime)=80\%$. Therefore, $P(C^\prime \cap D)=P(C^\prime) P(D)=(0.8)(0.1)=8\%$.
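The independence check and the final calculation can be sketched in Python as follows. The helper name `are_independent` is made up for this illustration, and exact fractions are used so that the equality test is not affected by floating-point rounding.

```python
from fractions import Fraction

p_c = Fraction(20, 100)       # P(C): Karl is late
p_d = Fraction(10, 100)       # P(D): the bus is late
p_c_and_d = Fraction(2, 100)  # P(C and D): both are late

def are_independent(p_x, p_y, p_x_and_y):
    """Two events are independent exactly when P(X and Y) = P(X) * P(Y)."""
    return p_x_and_y == p_x * p_y

independent = are_independent(p_c, p_d, p_c_and_d)

# Product rule for independent events: P(C' and D) = P(C') * P(D).
p_not_c_and_d = (1 - p_c) * p_d

# Cross-check that holds for any two events: P(C' and D) = P(D) - P(C and D).
cross_check = p_d - p_c_and_d

print(independent)     # True
print(p_not_c_and_d)   # 2/25
print(cross_check)     # 2/25
```

The fraction 2/25 equals 8%, matching the answer above. The cross-check is useful because it gives the right answer even when the events turn out not to be independent.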

#### Practice

0.1% of the population is said to have a new disease. A test is developed to test for the disease. 97% of people without the disease will receive a negative test result. 99.5% of people with the disease will receive a positive test result. Let  $D$ be the event that a random person has the disease. Let  $E$ be the event that a random person gets a positive test result.

1. State the 0.1%, 97%, and 99.5% probabilities in probability notation in terms of events $D$ and $E$.

Fill in the two-way frequency table for this scenario for a group of 1,000,000 people. Follow the steps to help.

| | Disease | No Disease | Total |
|---|---|---|---|
| Positive Test | | | |
| Negative Test | | | |
| Total | | | 1,000,000 |

2. How many of the 1,000,000 people have the disease? How many don't have the disease?

3. How many of the people without the disease will receive a negative test result (true negative)? How many of the people without the disease will receive a positive test result (false positive)?

4. How many of the people with the disease will receive a positive test result (true positive)? How many of the people with the disease will receive a negative test result (false negative)?

5. What does  $P(D|E)$ mean in English?

6. Find $P(D|E)$. Is this a surprising result?

7. What does $P(D|E^\prime)$ mean in English?

8. Find $P(D|E^\prime)$.

9. Are the two events  $D$ and  $E$ independent? Justify your answer.

After finishing his homework, Matt often plays video games and/or has a snack. There is a 60% chance that Matt plays video games, an 80% chance that Matt has a snack, and a 55% chance that Matt plays video games and has a snack. Let  $G$ be the event that Matt plays video games and  $S$ be the event that Matt has a snack.

10. State the 60%, 80%, and 55% probabilities in probability notation in terms of events $G$ and $S$.

11. Are events  $G$ and  $S$ independent? Explain.

12. Consider 100 days after Matt has finished his homework. Use the probabilities in the problem to fill in the two-way frequency table.

| | Snack | No Snack | Total |
|---|---|---|---|
| Video Games | | | |
| No Video Games | | | |
| Total | | | 100 |

13. Given that Matt played video games, find the probability that he had a snack.

14. What does $P(G|S)$ mean in English?

15. Find $P(G|S)$.