You are reading an older version of this FlexBook® textbook: CK-12 Probability and Statistics - Advanced (Second Edition).

# 7.2: The z-Score and the Central Limit Theorem

Difficulty Level: At Grade | Created by: CK-12

## Learning Objectives

• Understand the Central Limit Theorem and calculate a sampling distribution using the mean and standard deviation of a normally distributed random variable.
• Understand the relationship between the Central Limit Theorem and the normal approximation of a sampling distribution.

## Introduction

In the previous lesson, you learned that sampling is an important tool for determining the characteristics of a population. Although the parameters of the population (mean, standard deviation, etc.) were unknown, random sampling was used to yield reliable estimates of these values. The estimates were plotted on graphs to provide a visual representation of the distribution of the sample means for various sample sizes. It is now time to define some properties of a sampling distribution of sample means and to examine what we can conclude about the entire population based on these properties.

### Central Limit Theorem

The Central Limit Theorem is one of the most important theorems in statistics. It confirms what may already seem intuitive: as the sample size increases, the distribution of the sample means more closely approximates a normal distribution.

Before going any further, you should become familiar with (or reacquaint yourself with) the symbols that are commonly used when dealing with properties of the sampling distribution of sample means. These symbols are shown in the table below:

| | Population Parameter | Sample Statistic | Sampling Distribution |
|---|---|---|---|
| Mean | $\mu$ | $\bar{x}$ | $\mu_{\bar{x}}$ |
| Standard Deviation | $\sigma$ | $s$ | $S_{\bar{x}}$ or $\sigma_{\bar{x}}$ |
| Size | $N$ | $n$ | |

As the sample size, $n$, increases, the resulting sampling distribution would approach a normal distribution with the same mean as the population and with $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. The notation $\sigma_{\bar{x}}$ reminds you that this is the standard deviation of the distribution of sample means and not the standard deviation of a single observation.
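The formula for $\sigma_{\bar{x}}$ can be motivated by a short derivation. The sketch below assumes that the $n$ observations $x_1, \ldots, x_n$ in a sample are independent and that each has variance $\sigma^2$; the notation $\operatorname{Var}$ for variance is introduced here for illustration and is not used elsewhere in this lesson.

$$\operatorname{Var}(\bar{x}) = \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(x_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}, \qquad \text{so} \qquad \sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$$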

The Central Limit Theorem states the following:

If samples of size $n$ are drawn at random from any population with a finite mean and standard deviation, then the sampling distribution of the sample means, $\bar{x}$, approximates a normal distribution as $n$ increases.

The mean of this sampling distribution equals the population mean, and its standard deviation equals the standard deviation of the population divided by the square root of the sample size: $\mu_{\bar{x}}=\mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.
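The theorem can be explored with a small simulation. The following is a minimal sketch, not part of the original lesson: it assumes an exponential population with $\mu = 1$ and $\sigma = 1$ (chosen only because it is clearly not normal), and the function name `simulate_sample_means` and the fixed seed are illustrative choices.

```python
import random
import statistics
from math import sqrt


def simulate_sample_means(n, num_samples, seed=0):
    """Return the means of num_samples random samples of size n.

    Assumption: the population is exponential with mean 1 and
    standard deviation 1, a strongly skewed (non-normal) distribution.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


for n in (2, 10, 40):
    means = simulate_sample_means(n, num_samples=5000)
    print(
        f"n={n:2d}",
        f"mean of sample means={statistics.fmean(means):.3f}",
        f"sd of sample means={statistics.stdev(means):.3f}",
        f"sigma/sqrt(n)={1 / sqrt(n):.3f}",
    )
```

For each sample size, the mean of the sample means should land near 1 and their standard deviation near $\frac{1}{\sqrt{n}}$. Plotting a histogram of `means` for each $n$ would show the shape becoming more bell-like as $n$ grows.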

These properties of the sampling distribution of sample means can be applied to determining probabilities. If the sample size is sufficiently large (a common rule of thumb is $n > 30$), the sampling distribution of sample means can be assumed to be approximately normal, even if the population is not normally distributed.

Example: Suppose you wanted to answer the question, “What is the probability that a random sample of 20 families in Canada will have an average of 1.5 pets or fewer?” where the mean of the population is 0.8 and the standard deviation of the population is 1.2.

For the sampling distribution, $\mu_{\bar{x}}=\mu=0.8$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1.2}{\sqrt{20}} = 0.268$.

Sketched with technology, the sampling distribution is a normal curve centered at 0.8, and the shaded area to the left of 1.5 represents the probability that the sample mean is less than 1.5.

The $z$-score for the value 1.5 is $z = \frac{\bar{x}-\mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{1.5-0.8}{0.27} \approx 2.6$.

The area under the standard normal curve to the left of 1.5 (a $z$-score of 2.6) is approximately 0.9953. This value can also be determined with a graphing calculator.

Therefore, the probability that the sample mean will be below 1.5 is approximately 0.9953. In other words, with a random sample of 20 families, it is almost certain that the average number of pets per family will be less than 1.5. Because the sample size of 20 is smaller than 30, this normal approximation should be treated as a rough estimate unless the population itself is approximately normal.
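The same calculation can be reproduced in a few lines of Python. This is a sketch using the standard library's `statistics.NormalDist`; the variable names are illustrative. It keeps the unrounded standard error, so it gives $z \approx 2.61$ and an area of about 0.9955, whereas rounding $z$ to 2.6 gives about 0.9953.

```python
from math import sqrt
from statistics import NormalDist

mu = 0.8      # population mean (pets per family)
sigma = 1.2   # population standard deviation
n = 20        # sample size
x_bar = 1.5   # sample mean of interest

# Standard deviation of the sampling distribution of sample means
standard_error = sigma / sqrt(n)

# z-score of the sample mean within the sampling distribution
z = (x_bar - mu) / standard_error

# Area under the standard normal curve to the left of z
probability = NormalDist().cdf(z)

print(f"standard error = {standard_error:.3f}")
print(f"z = {z:.2f}")
print(f"P(sample mean <= 1.5) = {probability:.4f}")
```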

The properties associated with the Central Limit Theorem are displayed in the diagram below:

The vertical axis now reads probability density, rather than frequency, since frequency can only be used when you are dealing with a finite number of sample means. Sampling distributions, on the other hand, are theoretical depictions of an infinite number of sample means, and probability density is the relative density of the selections from within this set.

Example: A random sample of size 40 is selected from a known population with a mean of 23.5 and a standard deviation of 4.3. Samples of the same size are repeatedly collected, allowing a sampling distribution of sample means to be drawn.

a) What is the expected shape of the resulting distribution?

b) Where is the sampling distribution of sample means centered?

c) What is the approximate standard deviation of the sample means?

The question indicates that multiple samples of size 40 are being collected from a known population, multiple sample means are being calculated, and then the sampling distribution of the sample means is being studied. Therefore, an understanding of the Central Limit Theorem is necessary to answer the question.

a) The sampling distribution of the sample means will be approximately bell-shaped.

b) The sampling distribution of the sample means will be centered about the population mean of 23.5.

c) The approximate standard deviation of the sample means is 0.68, which can be calculated as shown below:

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4.3}{\sqrt{40}} \approx 0.68$
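Parts b) and c) can be checked with a small helper. The function name `sampling_distribution` is a hypothetical choice for this sketch; it simply applies $\mu_{\bar{x}}=\mu$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.

```python
from math import sqrt


def sampling_distribution(mu, sigma, n):
    """Return (center, spread) of the sampling distribution of sample means.

    center is the population mean; spread is sigma / sqrt(n).
    """
    return mu, sigma / sqrt(n)


center, spread = sampling_distribution(mu=23.5, sigma=4.3, n=40)
print(f"center = {center}, spread = {spread:.2f}")
```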

Example: Multiple samples with a sample size of 40 are taken from a known population, where $\mu=25$ and $\sigma=4$. The following chart displays the sample means:

$\begin{array}{cccccccccc}
25 & 25 & 26 & 26 & 26 & 24 & 25 & 25 & 24 & 25\\
26 & 25 & 26 & 25 & 24 & 25 & 25 & 25 & 25 & 25\\
24 & 24 & 24 & 24 & 26 & 26 & 26 & 25 & 25 & 25\\
25 & 25 & 24 & 24 & 25 & 25 & 25 & 24 & 25 & 25\\
25 & 24 & 25 & 25 & 24 & 26 & 24 & 26 & 24 & 26
\end{array}$

a) What is the population mean?

b) Using technology, determine the mean of the sample means.

c) What is the population standard deviation?

d) Using technology, determine the standard deviation of the sample means.

e) As the sample size increases, what value will the mean of the sample means approach?

f) As the sample size increases, what value will the standard deviation of the sample means approach?

a) The population mean of 25 was given in the question: $\mu=25$.

b) The mean of the sample means is 24.94 and is determined by using '1-Var Stats' on the TI-83/84 calculator: $\mu_{\bar{x}}=24.94$.

c) The population standard deviation of 4 was given in the question: $\sigma = 4$.

d) The standard deviation of the sample means is 0.71 and is determined by using '1-Var Stats' on the TI-83/84 calculator: $S_{\bar{x}}=0.71$. Note that the Central Limit Theorem states that the standard deviation should be approximately $\frac{4}{\sqrt{40}}=0.63$.

e) The mean of the sample means will approach 25 and is determined by a property of the Central Limit Theorem: $\mu_{\bar{x}}=25$.

f) The standard deviation of the sample means will approach $\frac{4}{\sqrt{n}}$ and is determined by a property of the Central Limit Theorem: $\sigma_{\bar{x}}=\frac{4}{\sqrt{n}}$.
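For readers without a TI-83/84, parts b) and d) can be checked with Python's standard library. This is a sketch: the list `sample_means` is the chart above typed in row by row, and `statistics.stdev` uses the sample ($n-1$) formula, which corresponds to the calculator's $S_x$ value.

```python
from math import sqrt
from statistics import fmean, stdev

# The 50 sample means from the chart, entered row by row
sample_means = [
    25, 25, 26, 26, 26, 24, 25, 25, 24, 25,
    26, 25, 26, 25, 24, 25, 25, 25, 25, 25,
    24, 24, 24, 24, 26, 26, 26, 25, 25, 25,
    25, 25, 24, 24, 25, 25, 25, 24, 25, 25,
    25, 24, 25, 25, 24, 26, 24, 26, 24, 26,
]

mean_of_means = fmean(sample_means)   # part b)
sd_of_means = stdev(sample_means)     # part d), sample standard deviation
clt_sd = 4 / sqrt(40)                 # value suggested by the Central Limit Theorem

print(f"mean of sample means = {mean_of_means:.2f}")
print(f"standard deviation of sample means = {sd_of_means:.2f}")
print(f"sigma / sqrt(n) = {clt_sd:.2f}")
```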

On the Web

http://tinyurl.com/2f969wj Explore how the sample size and the number of samples affect the mean and standard deviation of the distribution of sample means.

## Lesson Summary

The Central Limit Theorem confirms the intuitive notion that as the sample size increases, the distribution of the sample means more closely approximates a normal distribution. The mean of this distribution equals the mean of the underlying population, and its standard deviation equals the standard deviation of the population divided by the square root of the sample size, $n$.

## Point to Consider

• How does sample size affect the variation in sample results?

For an explanation of the Central Limit Theorem (16.0), see Lutemann, The Central Limit Theorem, Part 1 of 2 (2:29).

For the second part of the explanation of the Central Limit Theorem (16.0), see Lutemann, The Central Limit Theorem, Part 2 of 2 (4:39).

For an example of using the Central Limit Theorem (9.0), see jsnider3675, Application of the Central Limit Theorem, Part 1 (5:44).

For the continuation of an example using the Central Limit Theorem (9.0), see jsnider3675, Application of the Central Limit Theorem, Part 2 (6:38).

## Review Questions

1. The lifetimes of a certain type of calculator battery are normally distributed. The mean lifetime is 400 days, with a standard deviation of 50 days. For a sample of 6000 new batteries, determine how many batteries will last:
    1. between 360 and 460 days.
    2. more than 320 days.
    3. less than 280 days.

Feb 23, 2012

Dec 15, 2014