From the FlexBook® textbook CK-12 Probability and Statistics - Advanced (Teachers Edition).

# 2.7: Sampling Distributions and Estimations

Created by: CK-12

## Sampling Distribution

Activity: Making a Sampling Distribution

In this activity students will create several sampling distributions. The concept of a sampling distribution typically takes students a while to grasp. Stress that the distribution includes all possible samples of a specific size.

Procedure:

1. Divide the class into groups of five or six students. Each group represents a population to be studied. The parameter of interest could be the mean number of siblings of the students in the group.

Now the students will make sampling distributions for different sample sizes.

2. First make a sampling distribution with sample size one. This will be a dot plot with one value for each student in the group (five or six values): that student's number of siblings.

3. Now use a sample size of two. Look at all possible combinations of two in the group. Find the average number of siblings of each combination of two and graph it on a separate dot plot.

4. Continue by making a new dot plot with samples of size three, then four, and so on until there is just one sample that contains all the members of the group. Use the combination formula to make sure you have found all the possible samples of each size.

Analysis:

1. Calculate the mean of each sampling distribution.
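2. Calculate the standard deviation of each sampling distribution, otherwise known as the standard error. Because each dot plot contains every possible sample mean, calculate it directly from the plotted values; the formula $s = \sqrt{\frac{PQ}{n}}$ applies to sample proportions, not to sample means.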
3. How do the shape, center, and spread of the sampling distributions change as the sample size increases?
4. If just one sample were to be taken randomly from each distribution, which one would most likely have a mean closest to the true population mean?
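The procedure and analysis above can be checked with a short script. This is only a sketch: the sibling counts are made-up data for a hypothetical group of six, and `sampling_distribution` is an illustrative helper name, not something from the text.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical sibling counts for a group of six students (made-up data).
siblings = [0, 1, 1, 2, 3, 5]

def sampling_distribution(population, n):
    """Return the mean of every possible sample of size n (without replacement)."""
    return [mean(sample) for sample in combinations(population, n)]

for n in range(1, len(siblings) + 1):
    means = sampling_distribution(siblings, n)
    # The list holds every possible sample mean, so pstdev (not stdev) is used.
    print(
        f"n={n}: samples={len(means)}, "
        f"center={mean(means):.3f}, spread={pstdev(means):.3f}"
    )
```

The number of samples printed for each size should match the combination formula, the center should stay at the population mean, and the spread should shrink as the sample size grows.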

## The z-score and the Central Limit Theorem

Discuss and Explore: The Relationship between Sample Size, Sampling Error, and Probability

This process lets students see, both visually and quantitatively, how sample size affects sampling distributions and the reliability of estimates of population parameters taken from samples.

Discussion:

Ask the students to consider the formula $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. What happens to $\sigma_{\bar{x}}$ as $n$ gets large?

Explore:

1. Use the calculator to graph a normal distribution with mean $100$ and standard deviation $35$.
2. Then graph sampling distributions for this normal distribution with sample size $3, 6,$ and $12$. The sampling distribution will also be a normal distribution with the same mean as the original distribution and standard deviation given by the above formula.
3. Ask the students to compare the graphs.
4. Calculate the probability of randomly selecting a member of the population with a value less than seventy.
5. Calculate the probability of randomly selecting a sample of three with a sample mean less than seventy.
6. Calculate the probability of randomly selecting a sample of six with a sample mean less than seventy.
7. Calculate the probability of randomly selecting a sample of twelve with a sample mean less than seventy.
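For teachers who want to check the calculator results, steps 4 through 7 can be sketched in Python using the standard library's `NormalDist`. The function name `prob_mean_below` is an illustrative choice, and a sample of size one stands in for selecting a single member of the population.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 100, 35  # population mean and standard deviation from the Explore steps

def prob_mean_below(cutoff, n):
    """P(sample mean < cutoff) for samples of size n drawn from N(MU, SIGMA)."""
    standard_error = SIGMA / sqrt(n)  # sigma_xbar = sigma / sqrt(n)
    return NormalDist(MU, standard_error).cdf(cutoff)

for n in (1, 3, 6, 12):
    print(
        f"n={n:2d}: standard error={SIGMA / sqrt(n):6.2f}, "
        f"P(mean < 70)={prob_mean_below(70, n):.4f}"
    )
```

The probabilities fall as $n$ grows because the sampling distribution narrows around the mean of $100$.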

Analysis:

Ask the students to explain the effect sample size has on the shape, center, and spread of the sampling distribution and its effect on the probability that the sample mean is close to the population mean.

## Binomial Distributions and Binomial Experiments

Extension: The Normal Approximation to the Binomial Distribution

The binomial distribution is discrete; the probability for each possible value must be calculated separately. The probability of an event that contains many outcomes is tedious to calculate by hand with the binomial formula. It could be done with technology as shown in the text, or with the normal approximation to the binomial distribution. This will be a brief introduction to the latter method.

Conditions and Formulas:

• The binomial distribution is only close enough to a normal distribution if both $np$ and $nq$ are greater than $10$, where $q = 1 - p$.
• If both of these conditions are met, a $z$-score can be calculated with $\mu = np$ and $\sigma = \sqrt{np(1 - p)}$.
• A correction for continuity must be made on the value of $x$ so that the entire discrete bar can be included or excluded as the situation requires. This will usually involve adding or subtracting $0.5$ from the $x$-value.

Example:

In early August 2009, approximately $60 \%$ of Americans were in favor of a public option for health care. In a random sample of $500$ Americans, what is the probability that fewer than $200$ are in favor of a public option?

Step One – Check

$500 \times 0.6 = 300 > 10$ and $500 \times 0.4 = 200 > 10$. It is appropriate to use the normal approximation because both of these statements are true.

Step Two

$\mu = 500(.6) = 300, \sigma = \sqrt{500(.6)(.4)} \approx 11, x = 200 - 0.5 = 199.5$

Step Three

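$z = \frac{199.5 - 300}{11} \approx -9.14$

Step Four

$P(x < 200) \approx \ \text{normcdf}(-999, -9.14) \approx 0$

A count below $200$ lies more than nine standard deviations below the expected $300$, so the probability is effectively zero.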
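As a cross-check on the four steps, the sketch below computes the normal approximation with the continuity correction and compares it with the exact binomial sum for $n = 500$, $p = 0.6$, and the event "fewer than $200$". The function names are illustrative only.

```python
from math import comb, sqrt
from statistics import NormalDist

def normal_approx_less_than(x, n, p):
    """Approximate P(X < x) for X ~ B(n, p), with a continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    # "Fewer than x" excludes the bar at x, so the cutoff is x - 0.5.
    return NormalDist(mu, sigma).cdf(x - 0.5)

def exact_less_than(x, n, p):
    """Exact P(X < x) = P(X <= x - 1), summed from the binomial formula."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x))

print("normal approximation:", normal_approx_less_than(200, 500, 0.6))
print("exact binomial sum:  ", exact_less_than(200, 500, 0.6))
```

Both values should come out vanishingly small, since $200$ is far below the expected count of $300$.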

## Confidence Intervals

Project: Confidence Intervals in the News

Students become much more interested in a topic and motivated to learn about it when they see applications for the topic outside of the classroom. Statistics is prevalent in our daily lives, and many examples can be found in the news.

Objective: Collect and analyze examples of confidence intervals used in the reporting of news stories.

Guidelines:

• Look for examples of confidence intervals in newspapers, news magazines, television broadcasts, and in news stories covered online. Cite your sources.
• Find examples from a variety of areas. Include science, politics, weather, updates on the war, or other topics of interest to you.
• Identify the confidence level, the margin of error, and interpret the meaning of the confidence interval for the given situation.

Note:

This assignment should extend over a long period of time. Ideally, students will spend the entire length of the assignment on the lookout for confidence intervals. It will also take some time to get confidence intervals in a variety of subject areas.

This could be a written report, a presentation made to the class, or could take the form of a poster that will decorate the walls of the classroom. If time is available, the presentations are preferable since they are easy to grade and allow all the students to benefit from the work of their peers.

## Sums and Differences of Independent Random Variables

Practice and Extend: Sums of Independent Random Variables from Normal Distributions and from Binomial Distributions

Students have already learned how to use the normal distribution and the binomial distribution to calculate probabilities. These exercises review these old skills, and give the students the opportunity to see more examples of sums of independent random variables.

Exercises:

1. A college gives an entrance exam with both a math and a writing section. The math scores are normally distributed with a mean of $500$ and a standard deviation of $35$. The writing scores are also normally distributed with a mean of $485$ and a standard deviation of $50$. If the scores are independent and a student is randomly selected, what is the probability that the sum of her math and writing scores is higher than $1020$?

Answer: The sum of two independent random variables, both from normal distributions, also has a normal distribution with $\mu_{X + Y} = \mu_{X} + \mu_{Y} = 500 + 485 = 985$ and standard deviation $\sigma_{X + Y} = \sqrt{\sigma_{X}^2 + \sigma_{Y}^2} = \sqrt{35^2 + 50^2} \approx 61$. Therefore, $P(X + Y > 1020) = \;\mathrm{normcdf}(1020, 10^{11}, 985, 61) \approx 0.2831$.
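The same answer can be reproduced away from the calculator. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
from math import hypot
from statistics import NormalDist

math_scores = NormalDist(mu=500, sigma=35)
writing_scores = NormalDist(mu=485, sigma=50)

# For independent variables the means add and the variances add,
# so the standard deviation of the sum is sqrt(35^2 + 50^2).
total = NormalDist(
    mu=math_scores.mean + writing_scores.mean,
    sigma=hypot(math_scores.stdev, writing_scores.stdev),
)

p_above_1020 = 1 - total.cdf(1020)
print(f"mean={total.mean:.0f}, sd={total.stdev:.2f}, P(sum > 1020)={p_above_1020:.4f}")
```

The result differs slightly in the last decimal places from a hand calculation that rounds the standard deviation to $61$.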

2. Oscar plays little league baseball, and has a batting average of $0.235$. This means he gets a hit $23.5 \%$ of the times he is at bat. Oscar has two games this weekend. If he is at bat $5\;\mathrm{times}$ in Saturday’s game, and $4\;\mathrm{times}$ in Sunday’s game, what is the probability that he will get more than three hits this weekend? Assume the number of hits he gets in each game is independent.

Answer: The sum of two independent random variables with binomial distributions that share the same probability of success $p$ also has a binomial distribution.

Let $X$ be the number of hits in Saturday’s game. $X \sim B(5, .235)$

Let $Y$ be the number of hits in Sunday’s game. $Y \sim B(4, .235)$

Then $X + Y \sim B(9, .235)$. Therefore $P(X + Y > 3) = 1 - P(X + Y \le 3) = 1 - (P(0) + P(1) + P(2) + P(3)) \approx 0.1388$
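A short script can confirm the binomial sum for Oscar's nine at-bats. This is a sketch with illustrative names. It reports both "more than three" and "at least three" hits, since the two are easy to confuse.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes) for a B(n, p) random variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 5 + 4, 0.235  # nine at-bats over the weekend, same hit probability each time

p_more_than_three = 1 - sum(binomial_pmf(k, n, p) for k in range(4))  # 1 - P(0..3)
p_at_least_three = 1 - sum(binomial_pmf(k, n, p) for k in range(3))   # 1 - P(0..2)

print(f"P(more than three hits) = {p_more_than_three:.4f}")
print(f"P(at least three hits)  = {p_at_least_three:.4f}")
```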

## Student’s t Distribution

Practice: t Distribution, Standard Normal Distribution, or Neither

For each of the following situations determine if a confidence interval could be calculated using the standard normal distribution, the t distribution, or if the requirements of neither are met. Explain your reasoning.

1. A sample of $50$ tomatoes is taken from a field. The sample had a mean weight of $120\;\mathrm{grams}$ with a standard deviation of $20\;\mathrm{grams}$.

Answer: The standard normal distribution should be used because the sample size is large. With a large sample, the shape of the population distribution and the fact that the population standard deviation is not known are irrelevant.

2. A college instructor analyzes all the midterm scores for Biology $101$. The scores are normally distributed with a mean of $72$ and a standard deviation of $13$. One class of twelve scored exceptionally high, with an average of $92$. He will treat this small class as a sample.

Answer: The standard normal distribution should be used. Even though the sample size is small, the population is normally distributed and the population standard deviation is known.

3. The time a dog spends in a shelter before being adopted is approximately normally distributed. The Mountain View shelter found homes for $5$ dogs last month. The mean time these dogs spent in the shelter was $4$ months, with a standard deviation of one month.

Answer: The $t$ distribution should be used. The sample is small, the population standard deviation is not known, and the population is approximately normally distributed.

4. Five students compare their recent test scores. Their scores have an average of $81$ and a standard deviation of $6$ percentage points.

Answer: Neither of the distributions can be used. The shape of the population distribution is not known, nor is the population standard deviation.
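To show how the choice of distribution affects the interval itself, the sketch below computes intervals for exercise 1 (standard normal) and exercise 3 ($t$ distribution). The $95\%$ confidence level is an assumption, since the exercises do not state one. The critical value $t^* = 2.776$ for $4$ degrees of freedom is taken from a standard $t$ table, and `interval` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist

def interval(sample_mean, sample_sd, n, critical_value):
    """Confidence interval: mean +/- critical value * sd / sqrt(n)."""
    margin = critical_value * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Assumed 95% confidence level (not stated in the exercises).
z_star = NormalDist().inv_cdf(0.975)  # standard normal critical value
t_star = 2.776                        # t table value, 4 degrees of freedom

tomatoes = interval(120, 20, 50, z_star)  # exercise 1: large sample, use z
dogs = interval(4, 1, 5, t_star)          # exercise 3: small sample, use t

print(f"tomato mean weight (grams): {tomatoes[0]:.2f} to {tomatoes[1]:.2f}")
print(f"dog shelter time (months):  {dogs[0]:.2f} to {dogs[1]:.2f}")
```

The $t$ critical value is larger than the $z$ critical value, so it widens the interval to reflect the extra uncertainty of a small sample.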

Feb 23, 2012

Aug 19, 2014