
# 5.1: The Standard Normal Probability Distribution

Difficulty Level: At Grade Created by: CK-12

## Learning Objectives

• Identify the characteristics of a normal distribution.
• Identify and use the Empirical Rule ($68-95-99.7$ rule) for normal distributions.
• Calculate a $z-$score and relate it to probability.
• Determine if a data set corresponds to a normal distribution.

## Introduction

Most high schools have a set amount of time in between classes in which students must get to their next class. If you were to stand at the door of your statistics class and watch the students coming in, think about how the students would enter. Usually, one or two students enter early, then more students come in, then a large group of students enter, and then the number of students entering decreases again, with one or two students barely making it on time, or perhaps even coming in late! Try the same by watching students enter your school cafeteria at lunchtime. Spend some time in a fast food restaurant or café before, during, and after the lunch hour and you will most likely observe similar behavior.

Have you ever popped popcorn in a microwave? Think about what happens in terms of the rate at which the kernels pop. Better yet, actually do it and listen to what happens! For the first few minutes nothing happens, then after a while a few kernels start popping. This rate increases to the point at which you hear most of the kernels popping and then it gradually decreases again until just a kernel or two pops. Try measuring the height, or shoe size, or the width of the hands of the students in your class. In most situations, you will probably find that there are a couple of students with very low measurements and a couple with very high measurements, with the majority of students centered around a particular value.

Sometimes the door handles in office buildings show a wear pattern caused by thousands, maybe millions of times being pulled or pushed to open the door. Often you will see that there is a middle region that shows by far the most amount of wear at the place where people opening the door are the most likely to grab the handle, surrounded by areas on either side showing less wear. On average, people are more likely to have grabbed the handle in the same spot and less likely to use the extremes on either side.

All of these examples show a typical pattern that seems to be a part of many real-life phenomena. In statistics, because this pattern is so pervasive, it seems fitting to call it “normal”, or more formally the normal distribution. The normal distribution is an extremely important concept because it occurs so often in the data we collect from the natural world, as well as in many of the more theoretical ideas that are the foundation of statistics. This chapter explores the details of the normal distribution.

## The Characteristics of a Normal Distribution

### Shape

If you think of graphing data from each of the examples in the introduction, the distributions from each of these situations would be mound-shaped and mostly symmetric. A normal distribution is a perfectly symmetric, mound-shaped distribution. It is commonly referred to as a normal curve, or bell curve.

Because so many real data sets closely approximate a normal distribution, we can use the idealized normal curve to learn a great deal about such data. In practical data collection, the distribution will never be exactly symmetric, so just like situations involving probability, a true normal distribution results from an infinite collection of data, or from the probabilities of a continuous random variable.

### Center

Due to this exact symmetry, the center of the normal distribution, or of a data set that approximates a normal distribution, is located at the highest point of the distribution, and all the statistical measures of center we have already studied (mean, median, and mode) are equal.

It is also important to realize that this center peak divides the data into two equal parts.

### Spread

Let’s go back to our popcorn example. The bag advertises a certain time, beyond which you risk burning the popcorn. From experience, the manufacturers know when most of the popcorn will stop popping, but there is still a chance that a rare kernel will pop after a longer or shorter period of time. The directions usually tell you to stop when the time between pops is a few seconds, but aren’t you tempted to keep going so you don’t end up with a bag full of un-popped kernels? Because this is real, and not theoretical, there will be a time when it will stop popping and start burning, but there is always a chance, no matter how small, that one more kernel will pop if you keep the microwave going. In the idealized normal distribution of a continuous random variable, the distribution continues infinitely in both directions.

Because of this infinite spread, range would not be a possible statistical measure of spread. The most common way to measure the spread of a normal distribution is the standard deviation, or typical distance away from the mean. Because of the symmetry of a normal distribution, the standard deviation indicates how far away from the maximum peak the data will be. Here are two normal distributions with the same center (mean):

The first distribution pictured above has a smaller standard deviation and so the bulk of the data is concentrated more heavily around the mean. There is less data at the extremes compared to the second distribution pictured above, which has a larger standard deviation and therefore the data is spread farther from the mean value with more of the data appearing in the tails.

## Investigating the Normal Distribution on a TI-83/4 Graphing Calculator

We can graph a normal curve for a probability distribution on the TI-83/4. Press [y=]. To create a normal distribution, we will draw an idealized curve using something called a density function. We will learn more about density functions in the next lesson. The command is called a probability density function and it is found by pressing [2nd] [DISTR] [1]. Enter an $X$ to represent the random variable, followed by the mean and the standard deviation. For this example, choose a mean of $5$ and a standard deviation of $1$.

Choose [2nd] [QUIT] to go to the home screen. We can draw a vertical line at the mean to show it is in the center of the distribution by pressing [2nd] [DRAW] and choosing VERTICAL. Enter the mean ($5$) and press [ENTER].

Remember that even though the graph appears to touch the $x-$axis it is actually just very close to it.

This will graph $3$ different normal distributions with various standard deviations to make it easy to see the change in spread.
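For readers without a TI-83/4, the same density function can be explored in Python. The following is a minimal sketch, assuming Python 3.8 or later and its standard-library `statistics.NormalDist`; the helper name `peak_heights` and the three standard deviations ($0.5$, $1$, and $2$) are illustrative choices, not values from the lesson.

```python
from statistics import NormalDist


def peak_heights(mean, sds):
    """Return {sd: height of the normal density at the mean} for each sd."""
    return {sd: NormalDist(mu=mean, sigma=sd).pdf(mean) for sd in sds}


# Same mean as the calculator example; the standard deviations are assumed.
heights = peak_heights(5, (0.5, 1, 2))
for sd, height in heights.items():
    print(f"sd = {sd}: height of the curve at the mean = {height:.4f}")

# Four standard deviations from the mean the curve is tiny but still positive,
# so it never actually touches the x-axis.
tail_height = NormalDist(mu=5, sigma=1).pdf(9)
print(f"height at x = 9 when sd = 1: {tail_height:.6f}")
```

A smaller standard deviation gives a taller, narrower curve, and a larger one gives a lower, wider curve, because the total area under every curve is the same.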

## The Empirical Rule

Because of the similar shape of all normal distributions we can measure the percentage of data that is a certain distance from the mean no matter what the standard deviation of the set is. The following graph shows a normal distribution with $\mu=0$ and $\sigma=1$. This curve is called a standard normal distribution. In this case, the values of $x$ represent the number of standard deviations away from the mean.

Notice that vertical lines are drawn at points that are exactly one standard deviation to the left and right of the mean. We have consistently described standard deviation as a measure of the “typical” distance away from the mean. How much of the data is actually within one standard deviation of the mean? To answer this question, think about the space, or area under the curve. The entire data set, or $100\%$ of it, is contained by the whole curve. What percentage would you estimate is between the two lines? It is a reasonable estimate to say it is about $2/3$ of the total area.

In a more advanced statistics course, you could use calculus to actually calculate this area. To help estimate the answer, we can use a graphing calculator. Graph a standard normal distribution over an appropriate window.

Now press [2nd] [DISTR] and choose DRAW ShadeNorm. Insert $-1$, $1$ after the ShadeNorm command and it will shade the area within one standard deviation of the mean.

The calculator also gives a very accurate estimate of the area. We can see from this that approximately $68\%$ of the area is within one standard deviation of the mean. If we venture two standard deviations away from the mean, how much of the data should we expect to capture? Make the changes to the ShadeNorm command to find out.

Notice from the shading that almost all of the distribution is shaded, and the percentage of data is close to $95\%$. If you were to venture $3$ standard deviations from the mean, $99.7\%$, or virtually all, of the data is captured, which tells us that very little of the data in a normal distribution is more than $3$ standard deviations from the mean.

Notice that the shading of the calculator actually makes it look like the entire distribution is shaded because of the limitations of the screen resolution, but as we have already discovered, there is still some area under the curve further out than that. These three approximate percentages, $68, 95$, and $99.7$, are extremely important and useful for beginning statistics students. Together they are called the empirical rule.

The empirical rule states that the percentages of data in a normal distribution within $1, 2$, and $3$ standard deviations of the mean, are approximately $68, 95$, and $99.7$, respectively.
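The three percentages can also be checked without a calculator. Here is a minimal sketch, again assuming Python 3.8 or later and `statistics.NormalDist`; the function name `area_within` is made up for this illustration. It finds the area between $-k$ and $k$ by subtracting two cumulative areas, which plays the same role as ShadeNorm.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the standard normal curve between -k and k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

The printed areas round to $0.68$, $0.95$, and $0.997$, matching the empirical rule.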

## Z-Scores

A $z-$score is a measure of the number of standard deviations a particular data point is away from the mean. For example, let’s say the mean score on a test for your statistics class was an $82$, with a standard deviation of $7$ points. If your score was an $89$, it is exactly one standard deviation to the right of the mean, so your $z-$score would be $1$. If, on the other hand, you scored a $75$, your score is exactly one standard deviation below the mean. To show that it is below the mean, we assign it a $z-$score of $-1$. All values that are below the mean have negative $z-$scores. A $z-$score of $-2$ would represent a value that is exactly $2$ standard deviations below the mean, or $82 - 14 = 68$ in this example.

To calculate a $z-$score in which the numbers are not so obvious, you take the deviation and divide it by the standard deviation.

$z=\frac{\text{Deviation}}{\text{Standard Deviation}}$

You may recall that the deviation is the observed value of the variable minus the mean value, so in symbolic terms, the $z-$score would be:

$z=\frac {x-\bar x}{sd}$

Ex. What is the $z-$score for an $A$ on this test? (assume that an $A$ is a $93$).

$\begin{aligned}z&=\frac {x-\bar x}{sd}\\z&=\frac {93-82}{7}\\z&=\frac {11}{7}\approx 1.57\end{aligned}$
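The same arithmetic can be written as a short function. This is only a sketch; the function name `z_score` is an assumption made for illustration, and the numbers are the test scores used in this section.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Test scores from this section: mean 82, standard deviation 7.
for score in (93, 89, 75, 68):
    print(f"score {score}: z = {z_score(score, 82, 7):.2f}")
```

Scores above the mean give positive results and scores below the mean give negative ones, as described above.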

It is not necessary to have a normal distribution to calculate a $z-$score, but the $z-$score has much more significance when it relates to a normal distribution. For example, if we know that the test scores from the last example are distributed normally, then a $z-$score can tell us something about how our test score relates to the rest of the class. From the empirical rule we know that about $68\%$ of the students would have scored between a $z-$score of $-1$ and $1$, or between a $75$ and an $89$. If $68\%$ of the data is between those two values, then that leaves a remaining $32\%$ in the tail areas. Because of symmetry, that leaves $16\%$ in each individual tail.

If we combine the $68\%$ in the middle with the $16\%$ in the lower tail, approximately $84\%$ of the data is below a score of $89$. We typically refer to this as a percentile. A student with this score could conclude that he or she performed better than $84\%$ of the class, and that he or she was in the $84^{th}$ percentile.

This same conclusion can be put in terms of a probability distribution as well. We could say that if a student from this class were chosen at random the probability that we would choose a student with a score of $89$ or less is $.84$, or there is an $84\%$ chance of picking such a student.
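The empirical-rule estimate of $0.84$ can be compared with the area computed directly from the normal model. Below is a minimal sketch, assuming Python 3.8 or later and `statistics.NormalDist`, and assuming (as in the example) that the scores are normally distributed with mean $82$ and standard deviation $7$; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumed model for the test scores: normal with mean 82 and sd 7.
scores = NormalDist(mu=82, sigma=7)

# Probability that a randomly chosen student scored 89 or less.
p_below_89 = scores.cdf(89)

# Empirical-rule estimate: the middle 68% plus the 16% in the lower tail.
empirical_estimate = 0.68 + 0.16

print(f"computed area below 89: {p_below_89:.4f}")
print(f"empirical-rule estimate: {empirical_estimate:.2f}")
```

The two values agree to two decimal places, which is as much precision as the empirical rule offers.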

## Assessing Normality

The best way to determine if a data set approximates a normal distribution is to look at a visual representation. Histograms and box plots can be useful indicators of normality, but are not always definitive. It is often easier to tell if a data set is not normal from these plots.

If a data set is skewed right, it means that the right tail is significantly larger than the left. Likewise, skewed left means the left tail has more weight than the right. A bimodal distribution has two modes, or peaks, as if two normal distributions were added together. Multimodal distributions with two or more modes often reflect data drawn from two or more different types of individuals. For instance, in a histogram of the heights of American $30$-year-old adults, you will see a bimodal distribution: one mode for males and one mode for females.

Now that we know how to calculate $z-$scores, there is a plot we can use to determine if a distribution is normal. A normal probability plot, or normal quantile plot, pairs each data value with the $z-$score that a value of its rank would be expected to have if the data came from a normal distribution, and plots one against the other. If the data set is normal, then this plot will be very close to linear. The closer to being linear the normal probability plot is, the more closely the data set approximates a normal distribution.

Look below at a histogram and the normal probability plot for the same data.

The histogram is fairly symmetric and mound-shaped and appears to display the characteristics of a normal distribution. When the $z-$scores are plotted against the data values, the normal probability plot appears strongly linear, indicating that the data set closely approximates a normal distribution.

Example:

The following data set tracked high school seniors' involvement in traffic accidents. The participants were asked the following question: “During the last $12$ months, how many accidents have you had while you were driving (whether or not you were responsible)?”

| Year | Percentage of high school seniors who said they were involved in no traffic accidents |
|------|------|
| 1991 | $75.7$ |
| 1992 | $76.9$ |
| 1993 | $76.1$ |
| 1994 | $75.7$ |
| 1995 | $75.3$ |
| 1996 | $74.1$ |
| 1997 | $74.4$ |
| 1998 | $74.4$ |
| 1999 | $75.1$ |
| 2000 | $75.1$ |
| 2001 | $75.5$ |
| 2002 | $75.5$ |
| 2003 | $75.8$ |

Figure: Percentage of high school seniors who said they were involved in no traffic accidents. Source: Sourcebook of Criminal Justice Statistics: http://www.albany.edu/sourcebook/pdf/t352.pdf

Here is a histogram and a box plot of this data.

The histogram appears to show a roughly mound-shaped and symmetric distribution. The box plot does not appear to be significantly skewed, but the various sections of the plot do not appear to be especially symmetric either. In the following chart the $z-$scores for this data set have been calculated. The mean percentage is approximately $75.35$.

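| Year | Percentage | $z-$score |
|------|------|------|
| 1991 | $75.7$ | $0.45$ |
| 1992 | $76.9$ | $2.03$ |
| 1993 | $76.1$ | $0.98$ |
| 1994 | $75.7$ | $0.45$ |
| 1995 | $75.3$ | $-0.07$ |
| 1996 | $74.1$ | $-1.65$ |
| 1997 | $74.4$ | $-1.25$ |
| 1998 | $74.4$ | $-1.25$ |
| 1999 | $75.1$ | $-0.33$ |
| 2000 | $75.1$ | $-0.33$ |
| 2001 | $75.5$ | $0.19$ |
| 2002 | $75.5$ | $0.19$ |
| 2003 | $75.8$ | $0.59$ |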

Figure: Table of $z-$scores for senior no-accident data.

Here is the normal probability plot, which plots the percentages against the $z-$scores expected for their ranks.

While not perfectly linear, this plot does have a strong linear pattern, and we would therefore conclude that the distribution is reasonably normal.
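The points of a normal probability plot for this data can be computed by hand. The sketch below assumes Python 3.8 or later, and it assumes one common convention for the expected $z-$scores: the $i$-th smallest of $n$ values is paired with the standard normal quantile at $(i - 0.5)/n$. Calculators and software may use slightly different positions. The helper names `expected_z_scores` and `pearson_r` are made up for this illustration, and the correlation coefficient is used here only as a rough numerical summary of how linear the plot is.

```python
from math import sqrt
from statistics import NormalDist, mean

# Percentages of seniors reporting no accidents, 1991-2003 (from the table).
percentages = [75.7, 76.9, 76.1, 75.7, 75.3, 74.1, 74.4,
               74.4, 75.1, 75.1, 75.5, 75.5, 75.8]


def expected_z_scores(n):
    """Standard normal quantiles at the assumed positions (i - 0.5) / n."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def pearson_r(xs, ys):
    """Correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


ordered = sorted(percentages)
theoretical = expected_z_scores(len(ordered))

# Each printed pair is one point on the normal probability plot.
for value, z in zip(ordered, theoretical):
    print(f"{value:5.1f}  {z:6.2f}")

r = pearson_r(ordered, theoretical)
print(f"correlation between data and expected z-scores: {r:.3f}")
```

A correlation close to $1$ corresponds to a plot that is close to a straight line.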

One additional clue about normality might be gained from investigating the empirical rule. Remember that in an idealized normal curve, approximately $68\%$ of the data should be within one standard deviation of the mean. If we count, there are $9$ years for which the $z-$scores are between $-1$ and $1$. As a percentage of the total data, $9/13$ is about $69\%$, which is very close to the ideal value. This data set is so small that it is difficult to verify the other percentages, but they are still not unreasonable. About $92\%$ of the data (all but one of the points) ends up within $2$ standard deviations of the mean, and all of the data (which is in line with the theoretical $99.7\%$) is located between $z-$scores of $-3$ and $3$.
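The counts in this check can be reproduced in a few lines. This is a minimal sketch, assuming Python's standard-library `statistics` module and assuming the sample standard deviation (`stdev`, which divides by $n - 1$), since that choice reproduces the $z-$scores in the table; the helper name `share_within` is illustrative.

```python
from statistics import mean, stdev

# Percentages of seniors reporting no accidents, 1991-2003 (from the table).
percentages = [75.7, 76.9, 76.1, 75.7, 75.3, 74.1, 74.4,
               74.4, 75.1, 75.1, 75.5, 75.5, 75.8]

avg = mean(percentages)
sd = stdev(percentages)  # sample standard deviation (assumed; divides by n - 1)
z_scores = [(x - avg) / sd for x in percentages]


def share_within(zs, k):
    """Fraction of z-scores that lie between -k and k."""
    return sum(1 for z in zs if abs(z) <= k) / len(zs)


print(f"mean = {avg:.2f}, standard deviation = {sd:.2f}")
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(z_scores, k):.0%}")
```

With only $13$ data points each percentage can change only in steps of about $7.7\%$, so close agreement with $68$, $95$, and $99.7$ should not be expected.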

## Lesson Summary

A normal distribution is a perfectly symmetric, mound-shaped distribution that appears in many practical and real data sets and is an especially important foundation for making conclusions about data called inference. A standard normal distribution is a normal distribution in which the mean is $0$ and the standard deviation is $1$.

A $z-$score is a measure of the number of standard deviations a particular data value is away from the mean. The formula for calculating a $z-$score is:

$z=\frac {x-\bar x}{sd}$

$Z-$scores are useful for comparing two distributions with different centers and/or spreads. When you convert an entire distribution to $z-$scores, you are actually changing it to a standardized distribution. A distribution has $z-$scores regardless of whether or not it is normal in shape. If the distribution is normal, however, the $z-$scores are useful in explaining how much of the data is contained within a certain distance of the mean. The empirical rule is the name given to the observation that approximately $68\%$ of the data is within $1$ standard deviation of the mean, about $95\%$ is within $2$ standard deviations of the mean, and $99.7\%$ of the data is within $3$ standard deviations of the mean. Some refer to this as the $68-95-99.7$ rule.

There is no straightforward test for normality. You should learn to recognize the normality of a distribution by examining the shape and symmetry of its visual display. However, a normal probability or normal quantile plot is a useful tool to help check the normality of a distribution. This graph plots the actual values of a data set against the $z-$scores expected for their ranks under a normal distribution. If the distribution is normal, this plot will be approximately linear.

## Points To Consider

1. How can we use normal distributions to make meaningful conclusions about samples and experiments?
2. How do we calculate probabilities and areas under the normal curve that are not covered by the empirical rule?
3. What are the other types of distributions that can occur in different probability situations?

## Review Questions

1. Which of the following data sets is most likely to be normally distributed? For the other choices, explain why you believe they would not follow a normal distribution.
    (a) The hand span (measured from the tip of the thumb to the tip of the extended $5^{th}$ finger) of a random sample of high school seniors.
    (b) The annual salaries of all employees of a large shipping company.
    (c) The annual salaries of a random sample of $50$ CEOs of major companies, $25$ women and $25$ men.
    (d) The dates of $100$ pennies taken from a cash drawer in a convenience store.
2. The grades on a statistics mid-term for a high school are normally distributed with $\mu = 81$ and $\sigma = 6.3$. Calculate the $z-$scores for each of the following exam grades. Draw and label a sketch for each example.
    (a) $65$
    (b) $83$
    (c) $93$
    (d) $100$
3. Assume that the weight of $1$-year-old girls in the US is normally distributed with a mean of about $9.5 \;\mathrm{kilograms}$ and a standard deviation of approximately $1.1 \;\mathrm{kilograms}$. Without using a calculator, estimate the percentage of $1$-year-old girls in the US that meet the following conditions. Draw a sketch and shade the proper region for each problem.
    (a) Less than $8.4 \;\mathrm{kg}$
    (b) Between $7.3 \;\mathrm{kg}$ and $11.7 \;\mathrm{kg}$
    (c) More than $12.8 \;\mathrm{kg}$
4. For a standard normal distribution, place the following in order from smallest to largest.
    (a) The percentage of data below $1$
    (b) The percentage of data below $-1$
    (c) The mean
    (d) The standard deviation
    (e) The percentage of data above $2$
5. The 2007 AP Statistics examination scores were not normally distributed, with $\mu = 2.80$ and $\sigma = 1.34$.$^1$ What is the approximate $z-$score that corresponds to an exam score of $5$? (The scores range from $1$ to $5$.)
    (a) $0.786$
    (b) $1.46$
    (c) $1.64$
    (d) $2.20$
    (e) A $z-$score cannot be calculated because the distribution is not normal.

$^1$Data available on the College Board Website:

6. The heights of $5^{th}$ grade boys in the United States are approximately normally distributed with a mean height of $143.5 \;\mathrm{cm}$ and a standard deviation of about $7.1 \;\mathrm{cm}$. What is the probability that a randomly chosen $5^{th}$ grade boy would be taller than $157.7 \;\mathrm{cm}$?
7. A statistics class bought some sprinkle (or jimmies) doughnuts for a treat and noticed that the number of sprinkles seemed to vary from doughnut to doughnut. So, they counted the sprinkles on each doughnut. Here are the results: $241, 282, 258, 224, 133, 335, 322, 323, 354, 194, 332, 274, 233, 147, 213, 262, 227, 366$

(a) Create a histogram, dot plot, or box plot for this data. Comment on the shape, center and spread of the distribution.

(b) Find the mean and standard deviation of the distribution of sprinkles. Complete the following chart by standardizing all the values:

$\mu = \underline{\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;} \qquad \qquad \sigma = \underline{\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;}$

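| Number of Sprinkles | Deviation | $Z-$score |
|------|------|------|
| $241$ | | |
| $282$ | | |
| $258$ | | |
| $224$ | | |
| $133$ | | |
| $335$ | | |
| $322$ | | |
| $323$ | | |
| $354$ | | |
| $194$ | | |
| $332$ | | |
| $274$ | | |
| $233$ | | |
| $147$ | | |
| $213$ | | |
| $262$ | | |
| $227$ | | |
| $366$ | | |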

Figure: A table to be filled in for the sprinkles question.

(c) Create a normal probability plot from your results.

(d) Based on this plot, comment on the normality of the distribution of sprinkle counts on these doughnuts.

Open-ended Investigation: Munchkin Lab.

Teacher Notes: For this activity, obtain two large boxes of Dunkin Donuts’ munchkins. Each box should contain only one type of munchkin. I have found students prefer the glazed and the chocolate, but the activity can be modified according to your preference. If you do not have Dunkin Donuts near you, the bakery section of your supermarket should have boxed donut holes or something similar you can use. You will also need an electronic balance capable of measuring to the nearest $10^{th}$ of a gram. Your science teachers will be able to help you out with this if you do not have one. I have used this activity before introducing the concepts in this chapter. If you remove the words “$z-$score”, the normal probability plot and the last two questions, students will be able to investigate and develop an intuitive understanding for standardized scores and the empirical rule, before defining them. Experience has shown that this data very closely approximates a normal distribution and students will be able to calculate the $z-$scores and verify that their results come very close to the theoretical values of the empirical rule.

## Review Answers

1. (a) You would expect this situation to vary normally, with most students’ hand spans centering around a particular value and a few students having much larger or much smaller hand spans.
    (b) Most employees could be hourly laborers and drivers and their salaries might be normally distributed, but the few management and corporate salaries would most likely be much higher, giving a skewed right distribution.
    (c) Many studies have been published detailing the shrinking, but still prevalent, income gap between male and female workers. This distribution would most likely be bimodal, with each gender distribution by itself possibly being normal.
    (d) You might expect most of the pennies to be from this year or last year, fewer still from the previous few years, and the occasional penny that is even older. The distribution would most likely be skewed left.
2. (a) $z \approx -2.54$
    (b) $z \approx 0.32$
    (c) $z \approx 1.90$
    (d) $z \approx 3.02$
3. Because the data is normally distributed, students should use the $68-95-99.7$ rule to answer these questions.
    (a) about $16\%$ (more than one standard deviation below the mean)
    (b) about $95\%$ (within $2$ standard deviations)
    (c) about $0.15\%$ (more than $3$ standard deviations above the mean)
4. The standard normal curve has a mean of zero and a standard deviation of one, so all the values correspond to $z-$scores. The corresponding values are approximately:
    (a) $0.84$
    (b) $0.16$
    (c) $0$
    (d) $1$
    (e) $0.025$

    Therefore the correct order is: c, e, b, a, d
5. c
6. $0.025$. A height of $157.7$ is exactly $2$ standard deviations above the mean height. According to the empirical rule, the probability of a randomly chosen value being within $2$ standard deviations is about $0.95$, which leaves $0.05$ in the tails. We are interested in the upper tail only, as we are looking for the probability of being above this value, so the probability is half of $0.05$.
7. (a) Here are the possible plots showing a symmetric, mound-shaped distribution.

(b) $\mu = 262.222 \qquad \qquad s = 67.837$

| Number of Sprinkles | Deviation | $Z-$score |
|------|------|------|
| $241$ | $-21.2222$ | $-0.313$ |
| $282$ | $19.7778$ | $0.292$ |
| $258$ | $-4.22222$ | $-0.062$ |
| $224$ | $-38.2222$ | $-0.563$ |
| $133$ | $-129.222$ | $-1.905$ |
| $335$ | $72.7778$ | $1.073$ |
| $322$ | $59.7778$ | $0.881$ |
| $323$ | $60.7778$ | $0.896$ |
| $354$ | $91.7778$ | $1.353$ |
| $194$ | $-68.2222$ | $-1.006$ |
| $332$ | $69.7778$ | $1.029$ |
| $274$ | $11.7778$ | $0.174$ |
| $233$ | $-29.2222$ | $-0.431$ |
| $147$ | $-115.222$ | $-1.699$ |
| $213$ | $-49.2222$ | $-0.726$ |
| $262$ | $-0.222222$ | $-0.003$ |
| $227$ | $-35.2222$ | $-0.519$ |
| $366$ | $103.778$ | $1.530$ |

(c) [Normal probability plot]

(d) The normal probability plot shows a fairly linear pattern, which is an indication that there are no obvious departures from normality in this distribution.

## References

$^1$http://www.albany.edu/sourcebook/pdf/t352.pdf

Feb 23, 2012

Jul 03, 2014