When students ask their teachers to curve exams, what they often mean is they want everyone to simply get a higher grade. Curving a grade can also mean fitting to a bell curve where lots of people get Cs, some people get Ds and Bs and very few people get As and Fs. Even though this second interpretation is not what most students mean, the normal curve is one of the most widely used and applied probability distributions. What are other examples that follow a normal distribution?
Watch This
http://www.youtube.com/watch?v=hgtMWR3TFnY Khan Academy: Introduction to the Normal Distribution
Guidance
The standard normal distribution is the graph of the following function, which is represented by the Greek letter phi, @$\phi@$ .
@$\phi(x)=\frac{1}{\sqrt{2 \pi}}e^{- \frac{1}{2}x^2}@$
This distribution represents a population with a mean of 0 and a standard deviation of 1. The numbers along the @$x@$ -axis represent standard deviations. For data that is normally distributed, the empirical rule states that:
- Approximately 68% of the data will be within 1 standard deviation of the mean.
- Approximately 95% of the data will be within 2 standard deviations of the mean.
- Approximately 99.7% of the data will be within 3 standard deviations of the mean.
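The three percentages in the empirical rule can be checked numerically. The following is a minimal sketch in Python, assuming the standard library class `statistics.NormalDist` (Python 3.8 or later); the helper name `fraction_within` is made up for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def fraction_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    # Area to the left of +k minus area to the left of -k.
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.4f}")
```

The printed values should round to roughly 68%, 95% and 99.7%, which is where the rule gets its numbers.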
Some other important points about the normal distribution:
- The total area between the normal curve and the @$x@$ -axis is 1. This area represents the total probability of all possible outcomes.
- If data is distributed normally, you can use the normal distribution to determine the percentage of the data between any two values by calculating the area under the curve between those two values. When you take calculus, you will learn how to calculate this area analytically, but for now you can use the normalcdf function on your calculator.
- Many histograms approximate a normal curve, but a true normal curve is infinitely smooth.
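Before learning the calculus behind it, you can approximate "area under the curve" by adding up many thin rectangles. The sketch below does this for @$\phi@$ with a midpoint sum; the function names `phi` and `area_under_phi`, the number of rectangles, and the choice of -8 to 8 as a stand-in for the whole @$x@$ -axis are all assumptions made for illustration.

```python
import math


def phi(x: float) -> float:
    """Standard normal density: (1 / sqrt(2*pi)) * e^(-x^2 / 2)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def area_under_phi(lower: float, upper: float, steps: int = 10_000) -> float:
    """Approximate the area under phi between lower and upper.

    Uses a midpoint sum: `steps` thin rectangles, each as tall as the
    curve at the middle of its base.
    """
    width = (upper - lower) / steps
    heights = (phi(lower + (i + 0.5) * width) for i in range(steps))
    return sum(heights) * width


# Almost all of the area lies between -8 and 8, so this should be close to 1.
print(f"area from -8 to 8: {area_under_phi(-8, 8):.6f}")
# Area within one standard deviation of the mean.
print(f"area from -1 to 1: {area_under_phi(-1, 1):.6f}")
```

The calculator's normalcdf function is doing a more refined version of the same job: finding the area between two values.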
Example A
The amount of rain each year in Connecticut follows a normal distribution. What is the probability that a given year's rainfall is more than one standard deviation below the mean?
Solution: You are looking for the area under the standard normal curve to the left of -1; call this the shaded region. By the empirical rule, you know that approximately 68% of the data is between -1 and 1, so by symmetry approximately 34% of the data is between -1 and 0. Also, 50% of the data is above 0. Therefore, approximately 84% of the data is unshaded, and @$100\%-84\%=16\%@$ of the data is shaded. The approximate probability is 16%.
To get a more precise probability, use the normalcdf function on your calculator to calculate the area under the curve. Go to [DISTR] (which is @$[2^{nd}]@$ [VARS]) and choose normalcdf. This is the normal cumulative distribution function, and it calculates the area under the curve between two @$x@$ -values. The syntax (how you will type it in) for normalcdf is:
normalcdf(lower, upper, mean, standard deviation)
The lower bound for this shaded region is technically @$- \infty@$ , but the TI-84 cannot handle that, so use -1E99. -1E99 is @$-1 \times 10^{99}@$ , an extremely large negative number, and it gives results that are correct to many decimal places. The upper bound is -1. For a standard normal distribution with a mean of 0 and a standard deviation of 1 you don’t need to type anything else in, but since you will be working with normal distributions that have different means and standard deviations, it makes sense to get used to using the whole syntax.
normalcdf(-1E99, -1, 0, 1)
A more precise answer is about 15.87%.
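If you do not have a TI-84 handy, the same calculation can be sketched in Python. The function below is a hypothetical helper named `normalcdf` to mirror the calculator syntax; it is not a built-in Python function. It is built from `math.erf`, the error function in Python's standard library, using the identity that the area to the left of @$z@$ under the standard normal curve is @$\frac{1}{2}\left(1+\text{erf}\left(z/\sqrt{2}\right)\right)@$ .

```python
import math


def normalcdf(lower: float, upper: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Area under a normal curve between lower and upper.

    Hypothetical helper named after the calculator function; the argument
    order (lower, upper, mean, standard deviation) copies the TI-84 syntax.
    """
    def area_left_of(x: float) -> float:
        z = (x - mean) / sd  # distance from the mean in standard deviations
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    return area_left_of(upper) - area_left_of(lower)


# Python can represent negative infinity directly...
print(normalcdf(float("-inf"), -1, 0, 1))
# ...and the calculator's stand-in of -1E99 gives the same area.
print(normalcdf(-1e99, -1, 0, 1))
```

Both calls should print a value close to 0.1587, matching the calculator result to the digits shown there.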
Example B
On your first college exam, you score an 82. After the exam the professor tells the class that the mean was a 62 and the standard deviation was 10. What percentage of the class did better than you?
Solution: An 82 is 20 points above the mean, which is 2 standard deviations. Therefore, this question is asking for the percentage of students who scored more than 2 standard deviations above the mean.
In future statistics courses you will learn how to create the equation for this distribution and then transform it to standard normal. For now, you can use the fact that your score was exactly 2 standard deviations above the mean. Or, you can calculate the probability using the actual numbers.
- normalcdf(2, 1E99, 0, 1) = 0.022750 or 2.275%
- normalcdf(82, 1E99, 62, 10) = 0.022750 or 2.275%
About 2.275% of the class did better than you on the exam. Even though you seemed to score a B-, the professor would probably note that you were near the top of the class and adjust grades accordingly.
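The two routes in this example (convert to standard deviations first, or use the actual mean and standard deviation) can be compared side by side. This is a sketch assuming Python's `statistics.NormalDist`; the variable names are chosen only for this illustration.

```python
from statistics import NormalDist

score, mean, sd = 82, 62, 10

# Route 1: convert the score to a number of standard deviations (a z-score),
# then use the standard normal distribution.
z = (score - mean) / sd
above_standard = 1 - NormalDist(mu=0, sigma=1).cdf(z)

# Route 2: use the exam's own mean and standard deviation directly.
above_raw = 1 - NormalDist(mu=mean, sigma=sd).cdf(score)

print(f"z-score: {z}")
print(f"fraction above (standard normal): {above_standard:.6f}")
print(f"fraction above (mean 62, sd 10):  {above_raw:.6f}")
```

Both fractions should come out near 0.02275, which is a little under 2.3% of the class.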
Example C
The quality control technician at a widget-making factory observes that widgets that are more than three standard deviations larger or more than three standard deviations smaller than the precise widget size are unusable. What is the probability of producing a usable widget?
Solution: This question is essentially asking for the area between -3 standard deviations and positive 3 standard deviations. The empirical rule says this should be 99.7%. Use the normalcdf function to find the exact value.
normalcdf(-3, 3, 0, 1) = 0.997300 or 99.73%
The quality control technician would decide if this is a high enough success rate for producing a usable widget.
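Another way to build intuition for this result is to simulate the factory. The sketch below assumes widget sizes, measured in standard deviations from the precise size, behave like draws from a standard normal distribution. The function name, the number of trials, and the random seed are arbitrary choices for illustration, and a simulation only gives an estimate, not the calculator's value.

```python
import random


def simulate_usable_fraction(trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the fraction of widgets within 3 standard deviations of the target."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    usable = 0
    for _ in range(trials):
        deviation = rng.gauss(0.0, 1.0)  # size error in standard deviations
        if abs(deviation) <= 3.0:
            usable += 1
    return usable / trials


print(f"estimated usable fraction: {simulate_usable_fraction():.4f}")
```

With this many trials the estimate should land close to the 99.73% found with normalcdf, which corresponds to roughly 27 unusable widgets in every 10,000 produced.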
Concept Problem Revisited
Height, weight and other measurements of people, animals or plants are often approximately normally distributed.
Vocabulary
A standard normal distribution is a normal distribution with mean of 0 and a standard deviation of 1.
The empirical rule states that for data that is normally distributed, approximately 68% of the data will fall within one standard deviation of the mean, approximately 95% of the data will fall within two standard deviations of the mean, and approximately 99.7% of the data will fall within three standard deviations of the mean. It is a good way to quickly approximate probabilities.
Normalcdf is the normal cumulative distribution function. It calculates the area between any two values for data that is normally distributed, as long as you know the mean and standard deviation of the data. Your calculator has this function built in, and it produces a more precise answer than the empirical rule.
Guided Practice
1. What is the probability that a person in Texas is exactly 6 feet tall?
2. Two percent of high school football players are invited to play at a competitive college level. How many standard deviations above the average player would someone need to be to have this opportunity?
3. On average, a pumpkin at your local farm weighs 10 pounds with a standard deviation of 6 pounds. You go and find a pumpkin weighing 26 pounds. Of all the pumpkins at the farm, what percent weigh less than this enormous pumpkin?
Answers:
1. Since height is a continuous variable, meaning any number within a reasonable interval is possible, the probability of choosing any single number is zero. Many people may be close to 6 feet tall, but in reality they are 5.9999 or 6.0001 feet tall. There must be someone in Texas who is the closest to being exactly 6 feet tall, but even that person, when measured accurately enough, will still be slightly off from 6 feet. This is why, instead of calculating the probability of a single outcome, you calculate the probability over an interval, like between 5.9 feet and 6.1 feet. For continuous variables, the probability of any specific outcome, like 6 feet, will always be 0.
2. This situation is the inverse of the previous questions. Instead of being given a number of standard deviations and asked to find the probability, you are given the probability and asked to find the number of standard deviations.
There is a second programmed feature in the distribution menu that performs this calculation. You are looking for how many standard deviations above the mean will include 98% of the data.
invNorm(0.98) = 2.0537
A person would have to be greater than about 2 standard deviations above the mean to be in the top 2 percent.
3. normalcdf(-1E99, 26, 10, 6) ≈ 0.9962 or about 99.62%
The vast majority of the pumpkins weigh less than the 26 pound pumpkin you found.
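The three answers above can be explored with a short Python sketch, assuming the standard library class `statistics.NormalDist`, whose `inv_cdf` method plays the role of invNorm. The height parameters (mean 5.6 feet, standard deviation 0.3 feet) are invented purely for illustration and are not data about Texas.

```python
from statistics import NormalDist

# Answer 1: ASSUMED height distribution (made-up parameters, for illustration only).
heights = NormalDist(mu=5.6, sigma=0.3)


def interval_probability(half_width: float) -> float:
    """Probability that a height falls within half_width feet of 6 feet."""
    return heights.cdf(6 + half_width) - heights.cdf(6 - half_width)


# As the interval around 6 feet shrinks, the probability shrinks toward 0.
for half_width in (0.1, 0.01, 0.001):
    print(f"within {half_width} ft of 6 ft: {interval_probability(half_width):.5f}")

# Answer 2: how many standard deviations above the mean leaves 98% below?
z_top_2_percent = NormalDist().inv_cdf(0.98)
print(f"invNorm(0.98) equivalent: {z_top_2_percent:.4f}")

# Answer 3: fraction of pumpkins (mean 10 lb, sd 6 lb) lighter than 26 lb.
below_26 = NormalDist(mu=10, sigma=6).cdf(26)
print(f"fraction of pumpkins under 26 lb: {below_26:.4f}")
```

The second value should be a little over 2.05 standard deviations, and the third a bit over 99.6%, in line with the calculator results.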
Practice
Consider the standard normal distribution for the following questions.
1. What is the mean?
2. What is the standard deviation?
3. What is the percentage of the data below 1?
4. What is the percentage of the data below -1?
5. What is the percentage of the data above 2?
6. What is the percentage of the data between -2 and 2?
7. What is the percentage of the data between -0.5 and 1.7?
8. What is the probability of a value of 2?
Assume that the weight of 1-year-old girls in the USA is normally distributed, with a mean of about 9.5 kilograms and a standard deviation of approximately 1.1 kilograms.
9. What percent of 1 year old girls weigh between 8 and 12 kilograms?
10. What percent of girls weigh above 12 kilograms?
11. Girls in the bottom 5% by weight need their weight monitored every 2 months. How many standard deviations below the mean would a girl need to be to have their weight monitored?
Suppose that adult women’s heights are normally distributed with a mean of 65 inches and a standard deviation of 2 inches.
12. What percent of adult women have heights between 60 inches and 65 inches?
13. Use the empirical rule to describe the range of heights for women within one standard deviation of the mean.
14. What is the probability that a randomly selected adult woman is more than 64 inches tall?
15. What percent of adult women are either less than 60 inches or greater than 72 inches tall?