
# 7.2: The z-Score and the Central Limit Theorem

Created by: CK-12

## Learning Objectives

• Calculate the $z-$score of a mean distribution of a random variable in problem situations.
• Understand the Central Limit Theorem and calculate a sampling distribution using the mean and standard deviation of a normally distributed random variable.
• Understand the relationship between the Central Limit Theorem and normal approximation of the binomial distribution.

## Introduction

In the previous lesson you learned that sampling is an important tool for determining the characteristics of a population. Although the parameters of the population (mean, standard deviation, etc.) were unknown, random sampling was used to yield reliable estimates of these values. The estimates were plotted on graphs to provide a visual representation of the distribution of the sample mean for various sample sizes. It is now time to define some properties of the sampling distribution of the sample mean and to examine what we can conclude about the entire population based on it.

All normal distributions have the same basic shape, so rescaling and recentering can change any normal distribution into one with a mean of zero and a standard deviation of one. This configuration is referred to as the standard normal distribution. In this distribution, the variable along the horizontal axis is called the $z-$score. This score is another measure of the performance of an individual score in a population. The $z-$score measures how many standard deviations a score is away from the mean. The $z-$score of a term $x$ in a population distribution whose mean is $\mu$ and whose standard deviation is $\sigma$ is given by:

$z = \frac {x - \mu}{\sigma}$

Since $\sigma$ is always positive, $z$ will be positive when $x$ is greater than $\mu$ and negative when $x$ is less than $\mu$. A $z-$score of zero means that the term has the same value as the mean. For a normal distribution with $\mu = 0$, if we let $x = \sigma$, then $z = 1$; if we let $x = 2 \sigma$, then $z = 2$. Thus, the value of $z$ tells the number of standard deviations the given value of $x$ is above or below the mean.

Example: On a nationwide math test the mean was $65$ and the standard deviation was $10$. If Robert scored $81$, what was his $z-$score?

Solution:

$\begin{aligned}z & = \frac {x - \mu}{\sigma}\\z & = \frac{81 - 65}{10}\\z & = \frac{16}{10}\\z & = 1.6\end{aligned}$

Example: On a college entrance exam, the mean was $70$ and the standard deviation was $8$. If Helen’s $z-$score was $-1.5$, what was her exam mark?

Solution:

$\begin{aligned}z & = \frac{x - \mu} {\sigma}\\\therefore z \cdot \sigma & = x - \mu\\x & = \mu + z \cdot \sigma\\x & = (70) + (-1.5)(8)\\x & = 58\end{aligned}$
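The two worked examples above can be checked with a short Python sketch. The function names `z_score` and `raw_score` are illustrative choices, not part of the lesson; they simply encode the formula $z = \frac{x - \mu}{\sigma}$ and its rearrangement $x = \mu + z \cdot \sigma$.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) mu."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """Rearranged z-score formula: recover x from z, mu and sigma."""
    return mu + z * sigma


# Robert: score 81 on a test with mean 65 and standard deviation 10
robert_z = z_score(81, 65, 10)

# Helen: z-score of -1.5 on an exam with mean 70 and standard deviation 8
helen_mark = raw_score(-1.5, 70, 8)

print(robert_z)    # z-score for Robert
print(helen_mark)  # exam mark for Helen
```

Applying `z_score` to Helen's recovered mark should return her original $z-$score, which is a quick way to confirm that the two functions are inverses of each other.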

Now you will see how $z-$scores are used to determine the probability of an event.

Suppose you were to toss $8$ coins $2560$ times. The following figure shows the histogram and the approximating normal curve for the experiment. The random variable represents the number of tails obtained.

The blue section of the graph represents the probability that exactly $3$ of the coins turned up tails. One way to determine this is by the following calculation:

$\begin{aligned}P (3 \;\mathrm{tails}) & = \frac{_8C_3} {2^8}\\P (3 \;\mathrm{tails}) & = \frac{56} {256}\\P (3 \;\mathrm{tails}) & \cong 0.2188\end{aligned}$

Geometrically this probability represents the area of the blue shaded bar divided by the total area of the bars. The area of the shaded bar is approximately equal to the area under the normal curve from $2.5$ to $3.5$.
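As a rough check of this claim, the sketch below compares the exact binomial probability with the area under a normal curve from $2.5$ to $3.5$. It assumes the approximating curve has mean $np = 4$ and standard deviation $\sqrt{npq} = \sqrt{2}$ (the usual binomial values, which the text has not stated at this point), and the helper `normal_cdf` is a hypothetical name built on Python's `math.erf`.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative probability P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


n, p = 8, 0.5                         # 8 fair coins
exact = math.comb(n, 3) / 2 ** n      # 8C3 / 2^8 = 56 / 256

# Assumed parameters of the approximating normal curve
mu = n * p                            # 4 tails on average
sigma = math.sqrt(n * p * (1 - p))    # sqrt(2)

# Area under the normal curve over the bar for 3 tails
approx = normal_cdf(3.5, mu, sigma) - normal_cdf(2.5, mu, sigma)

print(round(exact, 4))
print(round(approx, 4))
```

The two printed values should agree to about two decimal places, which is what "approximately equal" means here.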

Since areas under normal curves correspond to the probability of an event occurring, a special normal distribution table is used to calculate the probabilities. This table can be found in any statistics book, but is seldom used today. Below is an example of a table of $z-$scores and a brief explanation of how it works.

As shown in the illustration below, the values inside the given table represent the areas under the standard normal curve for values between $0$ and the relative $z-$score. For example, to determine the area under the curve between $0$ and $2.36$, look in the intersecting cell for the row labeled $2.3$ and the column labeled $0.06$. The area under the curve is $0.4909$. To determine the area between $0$ and a negative value, look in the intersecting cell of the row and column whose labels sum to the absolute value of the number in question. For example, the area under the curve between $-1.3$ and $0$ is equal to the area under the curve between $0$ and $1.3$, so look at the cell in the $1.3$ row and the $0.00$ column (the area is $0.4032$).

| $z$ | $0.00$ | $0.01$ | $0.02$ | $0.03$ | $0.04$ | $0.05$ | $0.06$ | $0.07$ | $0.08$ | $0.09$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $0.0$ | $0.0000$ | $0.0040$ | $0.0080$ | $0.0120$ | $0.0160$ | $0.0199$ | $0.0239$ | $0.0279$ | $0.0319$ | $0.0359$ |
| $0.1$ | $0.0398$ | $0.0438$ | $0.0478$ | $0.0517$ | $0.0557$ | $0.0596$ | $0.0636$ | $0.0675$ | $0.0714$ | $0.0753$ |
| $0.2$ | $0.0793$ | $0.0832$ | $0.0871$ | $0.0910$ | $0.0948$ | $0.0987$ | $0.1026$ | $0.1064$ | $0.1103$ | $0.1141$ |
| $0.3$ | $0.1179$ | $0.1217$ | $0.1255$ | $0.1293$ | $0.1331$ | $0.1368$ | $0.1406$ | $0.1443$ | $0.1480$ | $0.1517$ |
| $0.4$ | $0.1554$ | $0.1591$ | $0.1628$ | $0.1664$ | $0.1700$ | $0.1736$ | $0.1772$ | $0.1808$ | $0.1844$ | $0.1879$ |
| $0.5$ | $0.1915$ | $0.1950$ | $0.1985$ | $0.2019$ | $0.2054$ | $0.2088$ | $0.2123$ | $0.2157$ | $0.2190$ | $0.2224$ |
| $0.6$ | $0.2257$ | $0.2291$ | $0.2324$ | $0.2357$ | $0.2389$ | $0.2422$ | $0.2454$ | $0.2486$ | $0.2517$ | $0.2549$ |
| $0.7$ | $0.2580$ | $0.2611$ | $0.2642$ | $0.2673$ | $0.2704$ | $0.2734$ | $0.2764$ | $0.2794$ | $0.2823$ | $0.2852$ |
| $0.8$ | $0.2881$ | $0.2910$ | $0.2939$ | $0.2967$ | $0.2995$ | $0.3023$ | $0.3051$ | $0.3078$ | $0.3106$ | $0.3133$ |
| $0.9$ | $0.3159$ | $0.3186$ | $0.3212$ | $0.3238$ | $0.3264$ | $0.3289$ | $0.3315$ | $0.3340$ | $0.3365$ | $0.3389$ |
| $1.0$ | $0.3413$ | $0.3438$ | $0.3461$ | $0.3485$ | $0.3508$ | $0.3531$ | $0.3554$ | $0.3577$ | $0.3599$ | $0.3621$ |
| $1.1$ | $0.3643$ | $0.3665$ | $0.3686$ | $0.3708$ | $0.3729$ | $0.3749$ | $0.3770$ | $0.3790$ | $0.3810$ | $0.3830$ |
| $1.2$ | $0.3849$ | $0.3869$ | $0.3888$ | $0.3907$ | $0.3925$ | $0.3944$ | $0.3962$ | $0.3980$ | $0.3997$ | $0.4015$ |
| $1.3$ | $0.4032$ | $0.4049$ | $0.4066$ | $0.4082$ | $0.4099$ | $0.4115$ | $0.4131$ | $0.4147$ | $0.4162$ | $0.4177$ |
| $1.4$ | $0.4192$ | $0.4207$ | $0.4222$ | $0.4236$ | $0.4251$ | $0.4265$ | $0.4279$ | $0.4292$ | $0.4306$ | $0.4319$ |
| $1.5$ | $0.4332$ | $0.4345$ | $0.4357$ | $0.4370$ | $0.4382$ | $0.4394$ | $0.4406$ | $0.4418$ | $0.4429$ | $0.4441$ |
| $1.6$ | $0.4452$ | $0.4463$ | $0.4474$ | $0.4484$ | $0.4495$ | $0.4505$ | $0.4515$ | $0.4525$ | $0.4535$ | $0.4545$ |
| $1.7$ | $0.4554$ | $0.4564$ | $0.4573$ | $0.4582$ | $0.4591$ | $0.4599$ | $0.4608$ | $0.4616$ | $0.4625$ | $0.4633$ |
| $1.8$ | $0.4641$ | $0.4649$ | $0.4656$ | $0.4664$ | $0.4671$ | $0.4678$ | $0.4686$ | $0.4693$ | $0.4699$ | $0.4706$ |
| $1.9$ | $0.4713$ | $0.4719$ | $0.4726$ | $0.4732$ | $0.4738$ | $0.4744$ | $0.4750$ | $0.4756$ | $0.4761$ | $0.4767$ |
| $2.0$ | $0.4772$ | $0.4778$ | $0.4783$ | $0.4788$ | $0.4793$ | $0.4798$ | $0.4803$ | $0.4808$ | $0.4812$ | $0.4817$ |
| $2.1$ | $0.4821$ | $0.4826$ | $0.4830$ | $0.4834$ | $0.4838$ | $0.4842$ | $0.4846$ | $0.4850$ | $0.4854$ | $0.4857$ |
| $2.2$ | $0.4861$ | $0.4864$ | $0.4868$ | $0.4871$ | $0.4875$ | $0.4878$ | $0.4881$ | $0.4884$ | $0.4887$ | $0.4890$ |
| $2.3$ | $0.4893$ | $0.4896$ | $0.4898$ | $0.4901$ | $0.4904$ | $0.4906$ | $0.4909$ | $0.4911$ | $0.4913$ | $0.4916$ |
| $2.4$ | $0.4918$ | $0.4920$ | $0.4922$ | $0.4925$ | $0.4927$ | $0.4929$ | $0.4931$ | $0.4932$ | $0.4934$ | $0.4936$ |
| $2.5$ | $0.4938$ | $0.4940$ | $0.4941$ | $0.4943$ | $0.4945$ | $0.4946$ | $0.4948$ | $0.4949$ | $0.4951$ | $0.4952$ |
| $2.6$ | $0.4953$ | $0.4955$ | $0.4956$ | $0.4957$ | $0.4959$ | $0.4960$ | $0.4961$ | $0.4962$ | $0.4963$ | $0.4964$ |
| $2.7$ | $0.4965$ | $0.4966$ | $0.4967$ | $0.4968$ | $0.4969$ | $0.4970$ | $0.4971$ | $0.4972$ | $0.4973$ | $0.4974$ |
| $2.8$ | $0.4974$ | $0.4975$ | $0.4976$ | $0.4977$ | $0.4977$ | $0.4978$ | $0.4979$ | $0.4979$ | $0.4980$ | $0.4981$ |
| $2.9$ | $0.4981$ | $0.4982$ | $0.4982$ | $0.4983$ | $0.4984$ | $0.4984$ | $0.4985$ | $0.4985$ | $0.4986$ | $0.4986$ |
| $3.0$ | $0.4987$ | $0.4987$ | $0.4987$ | $0.4988$ | $0.4988$ | $0.4989$ | $0.4989$ | $0.4989$ | $0.4990$ | $0.4990$ |
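The entries of such a table can be reproduced numerically. The following is a minimal sketch, assuming the standard identity that the area between $0$ and $z$ equals $\frac{1}{2}\,\mathrm{erf}\!\left(\frac{z}{\sqrt{2}}\right)$; the names `area_zero_to_z` and `table_row` are hypothetical helpers, not anything defined in the lesson.

```python
import math


def area_zero_to_z(z):
    """Area under the standard normal curve between 0 and z.

    The curve is symmetric about 0, so a negative z gives the same
    area as its absolute value.
    """
    return 0.5 * math.erf(abs(z) / math.sqrt(2.0))


def table_row(row_label):
    """Ten table entries for one row: columns 0.00, 0.01, ..., 0.09."""
    return [round(area_zero_to_z(row_label + col / 100), 4) for col in range(10)]


row_2_3 = table_row(2.3)
print(row_2_3[6])                        # entry for z = 2.36
print(round(area_zero_to_z(-1.3), 4))    # area between -1.3 and 0
```

Calling `table_row` with any row label from $0.0$ to $3.0$ should give that row of the table to four decimal places.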

The graphing calculator will give greater accuracy in finding the proportion of values that lie between two specified values in a standard normal distribution.

Using the TI-83 calculator for this operation is quite simple. Follow these steps:

• Press $2^{nd}$ Vars to access the distribution functions.
• Scroll down to $2$: normalcdf( and press Enter.
• Type in the numbers $0, 2.36$ followed by a closing parenthesis, then press Enter.

The calculator has given an answer that is more accurate than that given in the chart. However, if the answer is rounded to the nearest ten-thousandth, then both answers would be the same. Using the calculator is a more efficient method of obtaining the area for a given $z-$score, since you have one on hand.

Example: For a normal distribution curve based on values of $\sigma = 5$ and $\mu = 20$, find the area between $x = 24$ and $x = 32$.

Solution:

$\begin{aligned}& z = \frac{x - \mu} {\sigma} & & \text{and} & & z = \frac{x - \mu} {\sigma}\\& z = \frac{24 - 20} {5} & & \text{and} & & z = \frac{32 - 20} {5}\\& z = 0.8 & & \text{and} & & z = 2.4\end{aligned}$

Using the TI-83, the area between $0$ and $z = 0.8$ is $0.2881$, and the area between $0$ and $z = 2.4$ is $0.4918$. Therefore the area between $x = 24$ and $x = 32$ is:

$0.4918 - 0.2881 = 0.2037$

This means that the relative frequency of the values between $x = 24$ and $x = 32$ is $20.37\%$.
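The same area can be computed directly in code. This sketch defines a Python helper named `normalcdf` after the calculator function mentioned earlier; the helper itself and its argument order (lower bound, upper bound, mean, standard deviation) are assumptions made for illustration.

```python
import math


def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under a normal curve between lower and upper."""

    def cdf(x):
        # Convert x to a z-score, then take the standard normal cumulative area
        z = (x - mu) / sigma
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    return cdf(upper) - cdf(lower)


# Area between x = 24 and x = 32 for mu = 20, sigma = 5
area = normalcdf(24, 32, mu=20, sigma=5)

# Same area after converting to z-scores by hand (z = 0.8 and z = 2.4)
area_from_z = normalcdf(0.8, 2.4)

print(round(area, 4))
print(round(area_from_z, 4))
```

Both calls should print the same value, since converting to $z-$scores first and using the standard curve is equivalent to passing the mean and standard deviation directly.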

## Central Limit Theorem

The Central Limit Theorem is a very important theorem in statistics. It confirms what might be an intuitive truth to you: as the size of each random sample increases, the distribution of the sample means better approximates a normal distribution.

Before going any further, you should become familiar with (or reacquaint yourself with) the symbols that are commonly used when dealing with properties of the sampling distribution of the sample mean. These symbols are shown in the table below:

| | Population Parameter | Sample Statistic | Sampling Distribution |
| --- | --- | --- | --- |
| Mean | $\mu$ | $\bar {x}$ | $\mu_{\bar {x}}$ |
| Standard Deviation | $\sigma$ | $s$ | $S_{\bar {x}}$ or $\sigma_{\bar{x}}$ |
| Size | $N$ | $n$ | |

In the previous lesson, you discovered that the standard error is the standard deviation of the sampling distribution, and that this value was calculated by using the formula $s = \sqrt{\frac{P \cdot Q} {n}}$. By making a few substitutions, this formula can be rewritten using the symbols from the chart above. The formula can be expressed as the quotient of two radical expressions, $s = \frac{\sqrt{P \cdot Q}} {\sqrt{n}}$. The square root of the product of the parameters $P$ and $Q$ is the standard deviation of the population $(\sigma)$. When this value is divided by the square root of the sample size, the result is the standard error $(s)$, also known as the standard deviation of the sampling distribution $(S_{\bar {x}})$. Therefore $s = \sqrt{\frac{P \cdot Q} {n}}$ can be written as $S_{\bar {x}} = \frac{\sigma} {\sqrt{n}}$.

A frequency distribution built from a finite number of sample means only approximates the true sampling distribution of the sample mean. If, hypothetically, an infinite number of sample means were used, the resulting distribution would be the desired sampling distribution and the following would be true:

$\sigma_{\bar {x}} = \frac{\sigma} {\sqrt{n}}$

The notation $\sigma_{\bar {x}}$ reminds you that this is the standard deviation of the sample mean $(\bar {x})$ and not the standard deviation $(\sigma)$ of a single observation.
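The substitution described above can be checked numerically. In this sketch the values $P = 0.5$, $Q = 0.5$ and $n = 100$ are illustrative assumptions, not figures from the lesson, and `standard_error` is a hypothetical helper name.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# Illustrative values only (not taken from the lesson)
P, Q, n = 0.5, 0.5, 100

# Standard error written as sqrt(P*Q / n)
from_proportions = math.sqrt(P * Q / n)

# Same quantity written as sigma / sqrt(n), with sigma = sqrt(P*Q)
sigma = math.sqrt(P * Q)
from_sigma = standard_error(sigma, n)

print(from_proportions)
print(from_sigma)
```

Both expressions should give the same number up to floating-point rounding. Note also that quadrupling $n$ halves the standard error, because $n$ appears under a square root.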

The Central Limit Theorem states the following:

• If samples of size $n$ are drawn at random from any population with a finite mean and standard deviation, then the sampling distribution of the sample mean $(\bar {x})$ approximates a normal distribution as $n$ increases.
• The mean of this sampling distribution approximates the population mean as $n$ becomes large:

$\mu \approx \mu_{\bar {x}}$

• The standard deviation of the sampling distribution of the sample mean is approximately equal to the following:

$\sigma_{\bar {x}} = \frac{\sigma} {\sqrt{n}}$
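A small simulation can illustrate these three properties. The sketch below is only a demonstration under stated assumptions: the population is a fair six-sided die (clearly not normal), each sample has size $n = 30$, and $5000$ samples are drawn with a fixed random seed. None of these choices come from the lesson.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population: outcomes of a fair six-sided die (not normally distributed)
FACES = [1, 2, 3, 4, 5, 6]
mu = statistics.mean(FACES)        # population mean, 3.5
sigma = statistics.pstdev(FACES)   # population standard deviation

n = 30                # size of each sample
num_samples = 5000    # number of samples drawn

# Draw many random samples and record the mean of each one
sample_means = [
    sum(random.choices(FACES, k=n)) / n
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
predicted_sd = sigma / n ** 0.5    # sigma / sqrt(n)

print(round(mean_of_means, 3), "vs population mean", mu)
print(round(sd_of_means, 3), "vs predicted", round(predicted_sd, 3))
```

The mean of the sample means should land close to $\mu$, and their standard deviation close to $\frac{\sigma}{\sqrt{n}}$. A histogram of `sample_means` would show the roughly bell-shaped curve described in the first bullet.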

These properties of the sampling distribution of the mean can be applied to determining probabilities. The sampling distribution of the sample mean can be assumed to be approximately normal, even if the population is not normally distributed. Now that it has been clarified that the sampling distribution of the mean is approximately normal, let’s see how these properties work. Suppose you wanted to answer the question, “What is the probability that a random sample of $20$ families in Canada will have an average of $1.5$ pets or fewer?” where the mean of the population is $0.8$ and the standard deviation of the population is $1.2$.

For the sampling distribution, $\mu_{\bar {x}} = \mu = 0.8$ and $\sigma_{\bar {x}} = \frac{\sigma} {\sqrt{n}} = \frac{1.2} {\sqrt{20}} \approx 0.27$.

Using technology, a sketch of this problem can be drawn (figure not shown).

The shaded area shows the probability that the sample mean is less than $1.5$.

The $z-$score for the value $1.5$ is $z = \frac{\bar {x} - \mu_{\bar {x}}} {\sigma_{\bar{x}}} \approx \frac{1.5 - 0.8} {0.27} \approx 2.6$

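The area under the standard normal curve to the left of $1.5$ (a $z-$score of $2.6$) is the area to the left of the mean, $0.5$, plus the table value for $z = 2.6$, $0.4953$, which gives approximately $0.9953$. This value can also be determined by using the graphing calculator.

The probability that the sample mean will be below $1.5$ is therefore approximately $0.9953$, or about $99.5\%$.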
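The pet example can be worked through in code as a check on the arithmetic. This is a sketch under the assumption that the sampling distribution is treated as normal; `normal_cdf` is a hypothetical helper built on `math.erf`. It runs the calculation twice: once with the intermediate rounding used in the worked steps ($0.27$ and $2.6$), and once without rounding.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma, n = 0.8, 1.2, 20   # population mean, population sd, sample size
x_bar = 1.5                   # sample mean of interest

# Unrounded calculation
se = sigma / math.sqrt(n)                 # standard error of the mean
z_exact = (x_bar - mu) / se
p_exact = normal_cdf(z_exact)

# Same steps with the standard error rounded to 2 places and z to 1 place
se_rounded = round(se, 2)
z_rounded = round((x_bar - mu) / se_rounded, 1)
p_rounded = normal_cdf(z_rounded)

print(se_rounded, z_rounded, round(p_rounded, 4))
print(round(z_exact, 3), round(p_exact, 4))
```

The rounded and unrounded probabilities differ only in the fourth decimal place, so either way a random sample of $20$ families is very likely to average fewer than $1.5$ pets.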

Feb 23, 2012

Jul 03, 2014