- Apply the characteristics of the normal distribution to solving problems.
The normal distribution is the foundation for statistical inference and will be an essential part of many topics in later chapters. In the meantime, this section covers some of the types of questions that can be answered using the properties of a normal distribution. The first examples deal with more theoretical questions that will help you master the basic concepts and computational skills, while the later problems provide examples with real data, or at least a real context.
Unknown Value Problems
If you truly understand the relationship between the area under a density curve and the mean, standard deviation, and score, you should be able to solve problems in which you are provided all but one of these values and are asked to calculate the remaining value. While perhaps not directly practical, it is the thorough understanding of these calculations that will lead to a high degree of comfort when a more relevant context is provided.
In the last lesson we found the probability, or area under the density curve. What if you are asked to find a value that gives a particular probability?
Given a normally distributed random variable with and , what is the value of where the probability of experiencing a value less than that is ?
As suggested before, it is important and helpful to sketch the distribution.
If we had to estimate an actual value first, we know from the empirical rule that about 84% of the data is below one standard deviation to the right of the mean.
We expect the answer to be slightly below this value.
When we were given a value of the variable and were asked to find the percentage or probability, we used the table or a normalcdf command. But how do we find a value given the percentage? Again, the table has its limitations in this case and graphing calculators or computer software are much more convenient and accurate. The command on the TI-83/84 calculator is invNorm. You may have seen it already in the distribution menu.
The syntax for this command is:
invNorm(percentage or probability to the left, mean, standard deviation)
Enter the values in the correct order:
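For readers working on a computer rather than a calculator, the same lookup can be sketched in Python. This is only an illustration: the helper name `inv_norm` is hypothetical, and the mean, standard deviation, and probability are assumed example values. The standard library's `NormalDist.inv_cdf` plays the role of invNorm here.

```python
from statistics import NormalDist

def inv_norm(area_left, mean=0.0, sd=1.0):
    """Return the value that has `area_left` of the distribution below it."""
    return NormalDist(mean, sd).inv_cdf(area_left)

# Assumed example values, chosen only for illustration.
mean, sd, area_left = 65, 2, 0.80
x = inv_norm(area_left, mean, sd)

print(f"value with {area_left:.0%} of the data below it: {x:.2f}")
# Empirical-rule check: 80% is less than 84%, so x should sit below mean + 1 sd.
print("below mean + 1 sd?", x < mean + sd)
```

As with the calculator command, the arguments go in the order area to the left, mean, standard deviation. Leaving out the mean and standard deviation returns a z-score on the standard normal curve.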
Unknown Mean or Standard Deviation
For a normally distributed random variable, , and , Estimate
First draw a sketch:
Remember that about of the data is within standard deviations of the mean. This would leave of the data in the lower tail, so our value must be less than from the mean.
Because we do not know the mean, we have to use the standard normal curve and calculate a z-score using the invNorm command. The result confirms the prediction that the value should be less than standard deviations from the mean.
This is one of the few instances in beginning statistics where we use algebra. Substitute the known quantities into the z-score formula, z = (x - μ) / σ, and solve for the unknown mean, μ = x - zσ.
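The algebra for an unknown mean can be checked with a short Python sketch. The function name `unknown_mean` and the numbers are assumptions made for illustration, not the values from the example above. The z-score comes from the standard normal curve, and the mean follows from rearranging z = (x - mean) / sd.

```python
from statistics import NormalDist

def unknown_mean(x, area_left, sd):
    """Solve z = (x - mean) / sd for the mean, given P(X < x) = area_left."""
    z = NormalDist().inv_cdf(area_left)  # z-score from the standard normal curve
    return x - z * sd

# Assumed example values, chosen only for illustration.
x, area_left, sd = 500, 0.10, 40
mean = unknown_mean(x, area_left, sd)

print(f"estimated mean: {mean:.2f}")
# Check: with this mean, the area below x should match area_left.
print(f"area below x: {NormalDist(mean, sd).cdf(x):.4f}")
```

Because the area to the left is less than one half, the z-score is negative and the mean lands above the given value, which is what a sketch of the lower tail would suggest.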
For a normally distributed random variable, , and , find .
Again, let’s first look at a sketch of the distribution.
Since about of the data is below standard deviations, it seems reasonable to estimate that the value is less than two standard deviations away and might be around or .
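Again, use invNorm to calculate the z-score. Remember that we are not entering a mean or standard deviation, so the result comes from the standard normal distribution, with mean 0 and standard deviation 1.
Use the z-score formula, z = (x - μ) / σ, and solve for the standard deviation, σ = (x - μ) / z.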
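Solving for an unknown standard deviation works the same way, and a small Python sketch makes the rearrangement concrete. The name `unknown_sd` and the example numbers are assumptions for illustration only. The guard clauses cover the two cases where the formula gives no sensible answer.

```python
from statistics import NormalDist

def unknown_sd(x, area_left, mean):
    """Solve z = (x - mean) / sd for sd, given P(X < x) = area_left."""
    z = NormalDist().inv_cdf(area_left)  # z-score from the standard normal curve
    if z == 0:
        raise ValueError("an area of 0.5 puts x at the mean, so sd is undetermined")
    sd = (x - mean) / z
    if sd <= 0:
        raise ValueError("x and area_left fall on opposite sides of the mean")
    return sd

# Assumed example values, chosen only for illustration.
x, area_left, mean = 130, 0.975, 100
sd = unknown_sd(x, area_left, mean)

print(f"estimated standard deviation: {sd:.2f}")
# Check: with this sd, the area below x should match area_left.
print(f"area below x: {NormalDist(mean, sd).cdf(x):.4f}")
```

In this assumed case the value sits close to two standard deviations above the mean, so the standard deviation comes out near half of the distance between the value and the mean.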
Drawing a Distribution on the Calculator
As you saw in Lesson 1 of this chapter, the TI-83/84 will also draw the distribution for you. But before doing that, we need to set an appropriate window (see screen below) and delete or turn off any functions or plots. Let’s use the last example and draw the shaded region of the normal curve with and below . Remember from the empirical rule that we probably want to show about standard deviations away from in either direction. If we use as an estimate for , then we should open our window above and below . The settings can be a bit tricky, but with a little practice you will get used to determining the maximum percentage of area near the mean.
We extended the window below the x-axis to leave room for the text, as you will see.
Now press [2nd] [DISTR] and arrow over to the Draw option.
Choose the ShadeNorm command. You enter the values just as if you were doing a normalcdf calculation:
ShadeNorm(lower bound, upper bound, mean, standard deviation)
Press [ENTER] to see the result.
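The area that ShadeNorm shades is the same number that normalcdf reports. As a rough sketch, that area can be computed in Python by subtracting two cumulative probabilities. The helper name `normal_cdf` is hypothetical, and the bounds below are illustrative z-scores rather than the values from the example.

```python
from statistics import NormalDist

def normal_cdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Empirical-rule checks on the standard normal curve.
within_one = normal_cdf(-1, 1)
within_two = normal_cdf(-2, 2)

# For "everything below a value", use a very large negative lower bound,
# the same trick as typing -1E99 on the calculator.
below_one = normal_cdf(-1e99, 1)

print(f"within 1 sd: {within_one:.4f}")
print(f"within 2 sd: {within_two:.4f}")
print(f"below +1 sd: {below_one:.4f}")
```

The argument order matches the calculator: lower bound, upper bound, mean, standard deviation.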
Normalpdf on the Calculator
You may have noticed that the first option in the distribution menu is Normalpdf, which stands for normal probability density function. It is the option you used in Lesson 5.1 to draw the graph of the normal distribution. Many students wonder what this function is for and occasionally even use it by mistake to calculate what they think are cumulative probabilities. This function is actually the mathematical formula for drawing the normal distribution. You can find this formula in the resources at the end of the lesson if you are interested. The values it returns are heights of the density curve, not probabilities, so they are rarely useful to us statistically. The primary purpose of this function is to draw the normal curve.
As you did in Lesson 5.1, plot Y1=Normalpdf with the window shown below. Be sure to turn off any plots and clear out any functions. Enter and close the parentheses. Because we did not specify a mean and standard deviation, we will draw the standard normal curve. Enter the window settings necessary to fit most of the curve on the screen as shown below (think about the empirical rule to help with this).
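The difference between the density function and a cumulative probability is easy to see numerically. The following Python sketch uses the standard library's `NormalDist` as a stand-in for the calculator's Normalpdf and normalcdf; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

height_at_mean = std_normal.pdf(0)   # height of the curve, not a probability
area_below_mean = std_normal.cdf(0)  # cumulative probability to the left of 0

print(f"pdf(0) = {height_at_mean:.4f}")
print(f"cdf(0) = {area_below_mean:.4f}")

# Heights at whole z-scores, the kind of points a grapher plots.
for z in range(-3, 4):
    print(z, round(std_normal.pdf(z), 4))
```

The curve peaks at a height of about 0.4 above the mean and is close to zero three standard deviations out, which is why a vertical maximum a little above 0.4 fits the standard normal curve on the screen.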
Normal Distributions with Real Data
The analysis of surveys, samples, and experiments is most often based on the normal distribution, as you will learn in later chapters. Here are two examples.
The Information Centre of the National Health Service in Britain collects and publishes a great deal of information and statistics on health issues affecting the population. One such comprehensive data set tracks information about the health of children. According to their statistics, in 2006 the mean height of year-old boys was with a standard deviation estimate of approximately (these are not the exact figures for the population and in later chapters we will learn how they are calculated and how accurate they may be, but for now we will assume that they are a reasonable estimate of the true parameters).
Part 1 If old Cecil is , approximately what percentage of all year-old boys in Britain is he taller than?
We first must assume that the height of year-old boys in Britain is normally distributed. This seems a reasonable assumption to make. As always, the first step should be to draw a sketch and estimate a reasonable answer prior to calculating the percentage. In this case, let’s use the calculator to sketch the distribution and the shading. First decide on an appropriate window that includes about standard deviations on either side of the mean. In this case, standard deviations is about , so add and subtract that value to/from the mean to find the horizontal extremes. Then enter the appropriate ShadeNorm command.
From this data, we would estimate Cecil is taller than of year-old boys. We could also phrase this answer as follows: the probability of a randomly selected British year-old boy being shorter than Cecil is . Often with data like this we use percentiles. We would say Cecil is in the percentile for height among year-old boys in Britain.
Part 2 How tall would Cecil need to be to be in the top of all year-old boys in Britain?
Here is a sketch:
In this case we are given the percentage, so we need to use the invNorm command.
Cecil would need to be about tall to be in the top of year-old boys in Britain.
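Both kinds of question in this example, finding a percentile from a value and finding a value from a percentage in the upper tail, can be sketched in Python. The mean, standard deviation, and heights below are assumed figures for illustration only and are not the NHS values. The helper names `percentile_of` and `cutoff_for_top` are hypothetical.

```python
from statistics import NormalDist

# Assumed figures for illustration only (heights in cm).
heights = NormalDist(mu=150, sigma=7)

def percentile_of(value, dist):
    """Percentage of the distribution that lies below `value`."""
    return 100 * dist.cdf(value)

def cutoff_for_top(share, dist):
    """Value that separates the top `share` (e.g. 0.05) from the rest."""
    # inv_cdf expects the area to the LEFT, so convert the upper tail first.
    return dist.inv_cdf(1 - share)

pct = percentile_of(157, heights)       # a boy one sd above the mean
cutoff = cutoff_for_top(0.05, heights)  # height needed for the top 5%

print(f"percentile: {pct:.1f}")
print(f"cutoff for top 5%: {cutoff:.1f} cm")
```

The step that most often goes wrong is the conversion in `cutoff_for_top`: like invNorm, `inv_cdf` takes the area to the left, so a "top" percentage has to be subtracted from 1 before it is entered.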
Suppose that the distribution of mass of female marine iguanas in Puerto Villamil in the Galapagos Islands is approximately normal with a mean mass of and a standard deviation of . There are very few young marine iguanas in the populated areas of the islands because feral cats tend to kill them. How rare is it that we would find a female marine iguana with a mass less than in this area?
Using the graphing calculator we need to approximate the probability of being less than .
With a probability of approximately , we could say it is rather unlikely (only about of the time) that we would find an iguana this small.
In order to find the percentage of data in between two values (or the probability of a randomly chosen value being between those values) in a normal distribution, we can use the normalcdf command on the TI-83/84 calculator. When you know the percentage or probability, use the invNorm command to find a score or value of the variable. In order to use these tools in real situations, we need to know that the distribution of the variable in question is approximately normal. When solving problems using normal probabilities, it helps to draw a sketch of the distribution and shade the appropriate region.
Points to Consider
- How do the probabilities of a standard normal curve apply to making decisions about unknown parameters for a population given a sample?
- Which of the following intervals contains the middle of the data in a standard normal distribution?
- For each of the following problems, is a continuous random variable with a normal distribution and the given mean and standard deviation. is the probability of a value of the distribution being less than . Find the missing value and sketch and shade the distribution.
- What is the z-score for the lower quartile in a standard normal distribution?
- The manufacturing process at a metal parts factory produces some slight variation in the diameter of metal ball bearings. The quality control experts claim that the bearings produced have a mean diameter of . If the diameter is more than too wide or too narrow, they will not work properly. In order to maintain its reliable reputation, the company wishes to ensure that no more than of of the bearings that are made are ineffective. What should the standard deviation of the manufactured bearings be in order to meet this goal?
- Suppose that the wrapper of a certain candy bar lists its weight as . Naturally, the weights of individual bars vary somewhat. Suppose that the weights of these candy bars vary according to a normal distribution with and .
- What proportion of candy bars weigh less than the advertised weight?
- What proportion of candy bars weigh between and ?
- What weight candy bar would be heavier than all but of the candy bars out there?
- If the manufacturer wants to adjust the production process so that no more than candy bar in weighs less than the advertised weight, what should the mean of the actual weights be? (Assume the standard deviation remains the same.)
- If the manufacturer wants to adjust the production process so that the mean remains at and no more than candy bar in weighs less than the advertised weight, how small does the standard deviation of the weights need to be?
Other sites of interest
Standard Normal Curve
Normal Probability Plot (or Normal Quantile Plot)
Cumulative Density Function
Probability Density Function