CK-12 Probability and Statistics - Advanced

5.2: The Density Curve of the Normal Distribution

Created by: CK-12

Learning Objectives

  • Identify the properties of a normal density curve, and the relationship between concavity and standard deviation.
  • Convert between z-scores and areas under a normal probability curve.
  • Calculate probabilities that correspond to left, right, and middle areas from a left-tail z-score table.
  • Calculate probabilities that correspond to left, right, and middle areas using a graphing calculator.

Introduction

In this section we will continue our investigation of normal distributions to include density curves and learn various methods for calculating probabilities from the normal density curve.

Density Curves

A density curve is an idealized representation of a distribution in which the area under the curve is defined to be 1. Density curves need not be normal, but the normal density curve will be the most useful to us.

Inflection Points on a Normal Density Curve

We already know from the empirical rule that approximately 68\% (about 2/3) of the data in a normal distribution lies within 1 standard deviation of the mean. In a density curve, this means that about 68\% of the total area under the curve lies between z-scores of \pm 1. Look at the following three density curves:

Notice that each curve is spread wider than the last. Lines have been drawn to show the points one standard deviation on either side of the mean. Look at where this happens on each density curve. Here is a normal distribution with an even larger standard deviation.

Could you predict the standard deviation of this distribution from estimating the point on the density curve?

You may notice that the density curve changes shape at this point in each of our examples. In Calculus, we learn to call this shape changing location an inflection point. It is the point where the curve changes concavity. Starting from the mean and heading outward to the left and right, the curve is concave down (it looks like a mountain, or "n" shape). After passing this point, the curve is concave up (it looks like a valley or "u" shape). We will leave it to the Calculus students to prove it, but in a normal density curve, this inflection point is always exactly one standard deviation away from the mean.
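This concavity change can be checked numerically. The following Python sketch (using a hypothetical mean of 10 and standard deviation of 3, chosen only for illustration) estimates the second derivative of the normal density curve on either side of \mu + \sigma and confirms the sign flip at the inflection point:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal density curve at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def second_derivative(f, x, h=1e-4):
    """Estimate f''(x) with a central difference."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

mu, sigma = 10.0, 3.0          # hypothetical parameters for illustration
f = lambda x: normal_pdf(x, mu, sigma)

# Just inside mu + sigma the curve is concave down (mountain, "n" shape);
# just outside it is concave up (valley, "u" shape).
print(second_derivative(f, mu + sigma - 0.1))  # negative
print(second_derivative(f, mu + sigma + 0.1))  # positive
```

The same sign change appears on the left side of the curve, at \mu - \sigma.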

In this example, the standard deviation was 3\;\mathrm{units}. We can use these concepts to estimate the standard deviation of a normally distributed data set.

Can you estimate the standard deviation of the distribution represented by the following histogram?

This distribution is fairly normal, so we could draw a density curve to approximate it as follows.

Now estimate the inflection points:

It appears that the mean is about 0.5 and the inflection points are 0.45 and 0.55 respectively. This would lead to an estimate of about 0.05 for the standard deviation.

The actual statistics for this distribution are:

\bar x \approx 0.49997 \qquad s \approx 0.04988

We can verify this using expectations from the empirical rule. In the following graph, we have highlighted the bins that are contained within one standard deviation of the mean.

If you estimate the relative frequencies from each bin, they total remarkably close to 68\%!
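The 68\% figure can also be checked by simulation. The Python sketch below (using the estimated mean 0.5 and standard deviation 0.05 from above, and an arbitrary random seed for reproducibility) draws normally distributed values and counts the fraction that fall within one standard deviation of the mean:

```python
import random

random.seed(42)                      # arbitrary seed for reproducibility
mu, sigma = 0.5, 0.05                # the histogram's estimated parameters
samples = [random.gauss(mu, sigma) for _ in range(100_000)]

within = sum(mu - sigma <= x <= mu + sigma for x in samples) / len(samples)
print(round(within, 3))              # close to 0.68, as the empirical rule predicts
```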

Calculating Density Curve Areas

While it is convenient to estimate areas using the empirical rule, we need more precise methods to calculate the areas for other values. In Calculus you study methods for calculating the area under a curve, but in statistics, we are not so concerned about the specific method used to calculate these areas. We will use formulas or technology to do the calculations for us.

Z-Tables

Before software and graphing calculator technology was readily available, it was common to use tables to approximate the amount of area under a normal density curve between any two given z-scores. We have included two commonly used tables at the end of this lesson. Here are a few things you should know about reading these tables:

The values in these tables are all in terms of z-scores, or standardized, meaning that they correspond to a standard normal curve in which the mean is 0 and the standard deviation is 1. It is important to understand that the table shows the areas below the given z-score in the table. It is possible and often necessary to calculate the area above, or between z-scores as well. You could generate new tables to show these values, but it is just as easy to calculate them from the one table.

The values in these tables can represent areas under the density curve. For example, .500 means half of the area (because the area of the total density curve is 1). However, they are most frequently expressed as probabilities, e.g. .500 means the probability of a randomly chosen value from this distribution being in that region is .5, or a 50\% chance.

Z-scores must be rounded to the nearest hundredth to use the table.

Most z-score tables do not go much beyond 3 standard deviations away from the mean in either direction because as you know, the probability of experiencing results that extreme in a normal distribution is very low.
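If you are curious where the table entries come from, they are values of the standard normal cumulative distribution function, \Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}(z/\sqrt{2})\right), which Python's standard library can compute via math.erf. This sketch regenerates one row of the table:

```python
import math

def phi(z):
    """Standard normal CDF: area under the curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Regenerate the table row with stem -1.5 (leaves .00 through .09):
row = [round(phi(-1.50 - leaf / 100), 4) for leaf in range(10)]
print(row)  # matches the printed -1.5 row, including 0.0571 at z = -1.58
```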

Table 5.5 shows the probabilities for z-scores below the mean (negative z-scores), and Table 5.6 shows the probabilities for z-scores at or above the mean. To help you understand how to read the table, look at the top left entry of Table 5.6. It reads .500.

Think of the table as a stem-and-leaf plot, with the stems of the z-scores running down the left side of the table and the leaves across the top. The leaves represent hundredths of a z-score. So, this value represents a z-score of 0.00. This should make sense because we are talking about the actual mean.

Let’s look at another common value. In Table 5.6 find the z-score of 1 and read the associated probability value.

As we have already discovered, approximately 84\% of the data is below this value (68\% in the middle, and 16\% in the tail). This corresponds to the probability in the table of .8413.

Now find the probability for a z-score of -1.58. It is often a good idea to estimate this value before using the table when you are first getting started. This z-score is between -2 and -1. We know from the empirical rule that the probability for z = -1 is approximately .16 and similarly, for -2 it is around .025, so we should expect to get a value somewhere between these two estimates.

Locate the stem and the leaf for -1.58 on Table 5.5 and follow them across and down to the corresponding probability. The answer appears to be approximately 0.0571, or approximately 5.7\% of the data in a standard normal curve is below a z-score of -1.58.

It is extremely important, especially when you first start with these calculations, that you get in the habit of relating it to the normal distribution by drawing a sketch of the situation. In this case, simply draw a sketch of a standard normal curve with the appropriate region shaded and labeled.

Let’s try an example in which we want to find the probability of choosing a value that is greater than z = -0.528. Before even using the table, draw a sketch and estimate the probability. This z-score is just below the mean, so the answer should be more than 0.5. A z-score of -0.5 would be halfway between 0 and -1, but because there is more area concentrated around the mean, we could guess that more than half of the 34\% of the area between z = -1 and the mean lies in this section. If we guess about 20-25\%, we would estimate an answer between 0.70 and 0.75.

First read the table to find the probability for the data below this z-score. We must round the z-score to -0.53 to use the table. This will slightly underestimate the left-tail area, but it is the best we can do using the table. The table returns a value of 0.2981 as the area below this z-score. Because the total area under the density curve is equal to 1, we can subtract this value from 1 to find the probability we want, about .7019.
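As a check, the same complement calculation can be done in Python with math.erf, both with the rounded table value and with the unrounded z-score:

```python
import math

def phi(z):
    """Standard normal CDF (left-tail area)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Table route: round z to -0.53, look up the left area, take the complement.
table_area = 0.2981
print(round(1 - table_area, 4))      # 0.7019

# Unrounded route: no need to round the z-score at all.
print(round(1 - phi(-0.528), 5))     # about 0.70125
```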

What about values between two z-scores? While it is an interesting and worthwhile exercise to do this using a table, it is so much simpler using software or a graphing calculator that we will leave this for one of the homework exercises.

Using Graphing Calculators: The Normal CDF Command.

Your graphing calculator has already been programmed to calculate probabilities for a normal density curve using what is called a cumulative distribution function, or cdf. This is found in the distributions menu above the [VARS] key.

Press [2nd] [VARS], then [2] to select the normalcdf( command. The syntax is:

normalcdf(lower bound, upper bound, mean, standard deviation)

The command has been programmed so that if you do not specify a mean and standard deviation, it will default to the standard normal curve with \mu = 0 and \sigma = 1.

For example, entering normalcdf(-1,1) will calculate the area within one standard deviation of the mean, which we already know to be approximately 68\%.

Try to verify the other values from the empirical rule.

Summary:

normalpdf(x,0,1) gives values of the probability density function. It returns the height of the density curve at any value of x (the vertical distance to the graph, not a probability). This is the function we graphed in Lesson 5.1.

normalcdf(a,b,0,1) gives values of the cumulative distribution function. It returns the probability of an event occurring between x=a and x=b (the area under the probability density curve between two vertical lines).
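For readers without the calculator, here is a minimal Python sketch of these two commands, using math.erf for the cumulative areas; the function names simply mirror the calculator's:

```python
import math

def normalpdf(x, mu=0.0, sigma=1.0):
    """Height of the density curve at x (a density, not a probability)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def normalcdf(a, b, mu=0.0, sigma=1.0):
    """Probability that a value falls between x = a and x = b."""
    phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

print(round(normalcdf(-1, 1), 4))   # about 0.6827: the empirical rule's 68%
print(round(normalcdf(-2, 2), 4))   # about 0.9545
```

As with the calculator, omitting the mean and standard deviation defaults to the standard normal curve.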

Let’s look at the two examples we did in the last section using the table.

Example:

Find the probability for x < -1.58.

Solution:

The calculator command must have both an upper and lower bound. Technically though, the density curve does not have a lower bound as it continues infinitely in both directions. We do know however, that a very small percentage of the data is below 3 standard deviations to the left of the mean. Use -3 as the lower bound and see what answer you get.

The answer is accurate to the nearest 1\%, but remember that there really still is some data, no matter how little, that we are leaving out if we stop at -3. In fact, if you look at Table 1, you will see that about 0.0013 has been left out. Try going out to -4 and -5.

Notice that if we use -5, the answer is as accurate as the one in the table. Since we cannot really capture “all” the data, entering a sufficiently small value should be enough for any reasonable degree of accuracy. A quick and easy way to handle this is to enter -99999 (or “a bunch of nines”). It really doesn’t matter exactly how many nines you enter. The difference between five and six nines will be beyond the accuracy that even your calculator can display.
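The effect of the lower-bound choice can be seen directly in this Python sketch, which mimics the standard-normal normalcdf command (via math.erf) for several lower bounds:

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normalcdf(a, b):
    """Area between z = a and z = b under the standard normal curve."""
    return phi(b) - phi(a)

# How much of the left tail do different lower bounds capture for z < -1.58?
for lower in (-3, -4, -5, -99999):
    print(lower, round(normalcdf(lower, -1.58), 4))
# By -5 the answer has already settled at the table value, 0.0571.
```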

Example:

Find the probability for x \ge -0.528.

Solution:

Right away we are at an advantage using the calculator because we do not have to round off the z-score. Enter a normalcdf command from -0.528 to “a bunch of nines”. This ridiculously large upper bound ensures that the probability of missed data is so small as to be virtually undetectable.

Remember that our answer from the table was slightly too small, so when we subtracted it from 1, it became too large. The calculator answer of about .70125 is a more accurate approximation than the table value.

Standardizing

In most practical problems involving normal distributions, the curve will not be standardized (\mu = 0 and \sigma = 1). When using a z-table, you will have to first standardize the distribution by calculating the z-score(s).

Example:

A candy company sells small bags of candy and attempts to keep the number of pieces in each bag the same, though small differences due to random variation in the packaging process lead to different amounts in individual packages. A quality control expert from the company has determined that the mean number of pieces in each bag is normally distributed with a mean of 57.3 and a standard deviation of 1.2. Endy opened a bag of candy and felt he was cheated. His bag contained only 55 candies. Does Endy have reason to complain?

Solution:

Calculate the z-score for 55.

z = \frac{x-\mu}{\sigma} = \frac{55 - 57.3}{1.2} \approx -1.9167

Using Table 5.5, the probability of experiencing a value this low is approximately 0.0274. In other words, there is about a 3\% chance that you would get a bag of candy with 55 or fewer pieces, so Endy should feel cheated.

Using the graphing calculator, the results would look as follows (the ANS function has been used to avoid rounding off the z-score):

However, the advantage of using the calculator is that it is unnecessary to standardize. We can simply enter the mean and standard deviation from the original population distribution of candy, avoiding the z-score calculation completely.
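Both routes, standardizing first and entering the raw mean and standard deviation, can be sketched in Python; with the candy-bag numbers from the example they agree exactly:

```python
import math

def normalcdf(a, b, mu=0.0, sigma=1.0):
    """Probability that a value falls between x = a and x = b."""
    phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

# Route 1: standardize first (z is about -1.9167).
z = (55 - 57.3) / 1.2
p_z = normalcdf(-99999, z)

# Route 2: skip the z-score and work in the original units.
p_x = normalcdf(-99999, 55, mu=57.3, sigma=1.2)

print(round(p_x, 4))   # about 0.0276: roughly a 3% chance
```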

Lesson Summary

A density curve is an idealized representation of a distribution in which the area under the curve is defined as 1, or in terms of percentages, 100\% of the data. A normal density curve is simply a density curve for a normal distribution. Normal density curves have two inflection points, which are the points on the curve where it changes concavity. Remarkably, these points correspond to the points in the normal distribution that are exactly 1 standard deviation away from the mean. Applying the empirical rule tells us that the area under the normal density curve between these two points is approximately 0.68. This is most commonly thought of in terms of probability, e.g. the probability of choosing a value at random from this distribution and having it be within 1 standard deviation of the mean is 0.68. Calculating other areas under the curve can be done using a z-table or using the normalcdf command on the TI-83/84. The z-table provides the area less than a particular z-score for the standard normal density curve. The calculator command allows you to specify two values, either standardized or not, and will calculate the area between those values.

Points to Consider

  1. How do we calculate the areas/probabilities for distributions that are not normal?
  2. How do we calculate the z-scores, mean, standard deviation, or actual value given the probability or area?

Tables

There are two tables here: Table 1 for z-scores less than 0, and Table 2 for z-scores greater than or equal to 0. The table entry for z is the probability of lying below z. Essentially, these tables list the area of the shaded region in the figure below for each value of z.

For example, to look up P(z < -2.68) = 0.0037 in the first table, find -2.6 in the left hand column, then read across that row until you reach the value in the hundredths place (8) to read off the value.

Using this same technique and the second table, you should find that P(z < 1.42) = 0.9222.
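Both lookups can be cross-checked in Python, since the table entries are values of the standard normal cumulative distribution function computable with math.erf:

```python
import math

def phi(z):
    """Standard normal CDF: the table's left-tail area."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(phi(-2.68), 4))   # 0.0037, as in the first table
print(round(phi(1.42), 4))    # 0.9222, as in the second table
```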

Table of Standard Normal Probabilities for z < 0
z 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
 -3 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
 -2.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
 -2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
 -2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
 -2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
 -2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
 -2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
 -2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
 -2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
 -2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
 -2 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
 -1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
 -1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
 -1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
 -1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455
 -1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
 -1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
 -1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
 -1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
 -1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
 -1 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
 -0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
 -0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
 -0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
 -0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
 -0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
 -0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
 -0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
 -0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
 -0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
 -0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
