Have you ever heard your teacher say that he or she was going to "grade on a curve"? If 100 people take an exam, the data of their scores can be plotted. How would you expect this plot to look? Suppose that the highest score is 75% and the lowest score is 15%. Your score was the highest score! Your teacher is going to "grade on the curve" and you'll receive an A for this exam. Your friend is upset because his score was 28%. Based on this curve will your friend pass the exam?
Previously you've spent some time learning about probability distributions. A distribution is simply a description of the possible values of a random variable and how often each of these values occurs. Remember that a probability distribution shows you all the possible values of your variable (X) and the probability associated with each of these values (P(X)). You were also introduced to binomial distributions, which describe experiments with a fixed number of independent trials, each with only 2 possible outcomes (success or failure), where the random variable X counts the number of successes. In addition, you compared binomial distributions with multinomial distributions. Remember that multinomial distributions involve experiments where the number of possible outcomes for each trial is greater than 2, and a probability is associated with each outcome on each trial.
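As a quick refresher, here is a minimal Python sketch of a binomial probability distribution. The experiment (4 flips of a fair coin, with X counting heads) and the function name binomial_pmf are assumptions made up for illustration; they do not come from the earlier lessons.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical experiment: 4 flips of a fair coin, X = number of heads.
n, p = 4, 0.5
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

# List every possible value of X with its probability P(X).
for k, prob in distribution.items():
    print(f"P(X = {k}) = {prob:.4f}")

# The probabilities of all possible values add up to 1.
total = sum(distribution.values())
print("total probability:", total)
```

Notice that X can only be 0, 1, 2, 3, or 4 here, so the distribution is a short list of values and probabilities.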
In this first concept on probability distributions, you are going to begin by learning about normal distributions. A normal distribution curve can be easily recognized by its shape. The first 2 diagrams above show examples of normal distributions. What shape do they look like? Do they look like a bell to you? Compare the first 2 diagrams above to the third diagram. A normal distribution is called a bell curve because its shape is comparable to a bell. It has this shape because the majority of the data is concentrated at the middle and slowly decreases symmetrically on either side. This gives it a shape similar to a bell.
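To see the bell shape in numbers, the following sketch uses Python's statistics.NormalDist to measure the height of a normal curve at the center and at equal distances on either side. The mean of 50 and standard deviation of 10 are assumed values chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical exam-score distribution: mean 50, standard deviation 10.
scores = NormalDist(mu=50, sigma=10)

# Height of the curve at the center and at equal distances on either side.
for offset in (0, 10, 20, 30):
    left = scores.pdf(50 - offset)
    right = scores.pdf(50 + offset)
    print(f"{offset:>2} from the mean: left = {left:.5f}, right = {right:.5f}")
```

The left and right heights match at every distance, and they shrink as you move away from the center. That is the symmetry and the concentration in the middle that give the curve its bell shape.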
Actually, the normal distribution curve was first called a Gaussian curve after a very famous mathematician, Carl Friedrich Gauss. He lived between 1777 and 1855 in Germany. Gauss studied many aspects of mathematics. One of these was probability distributions, and in particular, the bell curve. It is interesting to note that Gauss also spoke about global warming and predicted where astronomers would find Ceres again after it was lost from view. Ceres is the dwarf planet that orbits between Mars and Jupiter. A neat fact about Gauss is that he was also known to have beautiful handwriting. If you want to read more about Carl Friedrich Gauss, look at http://en.wikipedia.org/wiki/Carl_Friedrich_Gauss.
You previously learned about discrete random variables. Remember that discrete random variables are ones that have a finite number of values within a certain range. In other words, a discrete random variable cannot take on all values within an interval. For example, say you were counting from 1 to 10. You would count 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. These are discrete values. 3.5 would not count as a discrete value within the limits of 1 to 10. For a normal distribution, however, you are working with continuous variables. Continuous variables, unlike discrete variables, can take on any value within the limits of the variable. So, for example, if you were counting from 1 to 10, 3.5 would count as a value for the continuous variable. Lengths, temperatures, ages, and heights are all examples of continuous variables. Compare these to discrete variables, such as the number of people attending your class, the number of correct answers on a test, or the number of tails on a coin flip. You can see how a continuous variable would take on an infinite number of values, whereas a discrete variable would take on a finite number of values. As you may know, you can actually see this when you graph discrete and continuous data.
Look at the 2 graphs below. The first graph is a graph of the height of a child as he or she ages. The second graph is the cost of a gallon of gasoline as the years progress. Which graph represents discrete data? Which graph represents continuous data?
If you look at the first graph, the data points are joined, because as the child ages from birth to age 1, for example, his height also increases. As he continues to age, he continues to grow. The data is said to be continuous and, therefore, you can connect the points on the graph. For the second graph, the price of a gallon of gas at the end of each year is recorded. In 1930, a gallon of gas cost 10c. You would not have gone in and paid 10.2c or 9.75c. The data is, therefore, discrete, and the data points cannot be connected.
Let’s look at a few problems to show how histograms approximate normal distribution curves.
Jillian takes a survey of the heights of all of the students in her high school. There are 50 students in her school. She prepares a histogram of her results. Is the data normally distributed?
If you take a normal distribution curve and place it over Jillian’s histogram, you can see that her data does not represent a normal distribution.
If the histogram were actually shaped like a normal distribution, it would have a shape like the curve below:
Thomas did a survey similar to Jillian’s in his school. His high school had 100 students. Is his data normally distributed?
If you take a normal distribution curve and place it over Thomas’s histogram, you can see that his data also does not represent a normal distribution.
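Placing a normal curve over a histogram can also be done with numbers. The sketch below counts how many values fall in each histogram bin and compares that with the count a normal curve with the same mean and standard deviation would predict. The heights are made up for illustration (they are not Jillian's or Thomas's survey data), and the helper name compare_to_normal is hypothetical.

```python
from statistics import NormalDist, mean, stdev

def compare_to_normal(data, bin_edges):
    """Return (observed, expected) counts per bin under a fitted normal curve."""
    fitted = NormalDist(mean(data), stdev(data))
    observed, expected = [], []
    for low, high in zip(bin_edges, bin_edges[1:]):
        # Observed: how many data values land in the bin [low, high).
        observed.append(sum(low <= x < high for x in data))
        # Expected: area under the fitted curve over the bin, times sample size.
        expected.append(len(data) * (fitted.cdf(high) - fitted.cdf(low)))
    return observed, expected

# Made-up heights in inches (NOT Jillian's or Thomas's survey data).
heights = [60, 61, 61, 62, 62, 62, 63, 63, 63, 63, 64, 64, 64, 64, 64,
           65, 65, 65, 65, 66, 66, 66, 67, 67, 68]
edges = [60, 62, 64, 66, 68, 70]

observed, expected = compare_to_normal(heights, edges)
for low, obs, exp in zip(edges, observed, expected):
    print(f"{low}-{low + 2}: observed {obs:2d}, expected {exp:5.1f}")
```

When the observed and expected counts are close in every bin, the histogram approximates a normal distribution. Large gaps, such as one tall bar far from the center, suggest that it does not.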
Joanne posted a problem to her friends on Facebook. She told her friends that her grade 12 math project was to measure the lifetimes of the batteries used in different toys. She surveyed people in her neighborhood and asked them, on average, how many hours their typical battery lasts. Her results are shown below:
Is her data normally distributed? Where is the center of the distribution?
If you take a normal distribution curve and place it over Joanne’s histogram, you can see that her data appears to come from a normal distribution.
This means that the data fits a normal distribution with a mean around 105. Using the TI-84 calculator, you can actually find the mean of this data to be 105.7.
What Joanne’s data tells us is that the mean lifetime (105.7 hours) is at the center of the distribution, and the other lifetimes are spread out on either side of that mean. You will be learning much more about normal distributions, including the standard normal distribution, in a later Concept. But for now, remember the 2 key points about a normal distribution. The first key point is that the data represented is continuous. The second key point is that the data is centered at the mean and is symmetrically distributed on either side of that mean.
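The second key point can be checked quickly without a graph. In the sketch below, the battery lifetimes are made up for illustration (they are not Joanne's survey results). For data that is roughly symmetric, the mean and the median should be close together, with about as many values below the mean as above it.

```python
from statistics import mean, median

# Made-up battery lifetimes in hours (NOT Joanne's survey results).
lifetimes = [98, 101, 103, 104, 105, 105, 106, 107, 109, 112]

center = mean(lifetimes)    # balance point of the data
middle = median(lifetimes)  # value that splits the sorted data in half

# Count how many lifetimes fall on each side of the mean.
below = sum(x < center for x in lifetimes)
above = sum(x > center for x in lifetimes)

print("mean:", center, "median:", middle)
print("values below the mean:", below, "values above the mean:", above)
```

This is only a rough check. Data can pass it and still fail to be normal, so it works best alongside a histogram.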
The following data was collected on a recent 25-point math quiz. Does the data represent a normal distribution? Can you determine anything from the data? 201418141617152211322141119142317182010259191812
A recent blockbuster movie was rated PG, with an additional violence warning. The manager of a movie theater did a survey of moviegoers to see what ages were attending the movie in an attempt to see if people were adhering to the warnings. Is his data normally distributed? Do moviegoers at the theater regularly adhere to warnings? 17151951791471013202421151327191823121614121414
The heights of coniferous trees were measured in a local park in a regular inspection. Is the data normally distributed? Are there areas of the park that seem to be in danger? The measurements are all in feet. 22.8 18.2 25.0 23.1 4.3 9.7 7.0 8.8 18.5 7.8 23.2 8.8 23.0 21.7 3.4 21.2 25.7 23.2 21.7 20.0 23.5 19.4 184.108.40.206
Determine if the points representing each of the following data sets can be connected when graphed.
The number of students enrolled in a college each semester
The weight of a baby seal each day as it grows
The number of coins a coin collector owns each week
The speed of a rocket each second as it accelerates
The amount of water in a swimming pool each minute as it is drained
The number of employees a company has each month as it expands
Discrete values are data where only a finite number of values exist between any two values.
A distribution is a description of the possible values of a random variable and the possible occurrences of these values.
normal distribution curve
A normal distribution curve is a symmetrical curve that shows the highest frequency in the center with an identical curve on either side of the center.
A continuous variable is a variable that takes on any value within the limits of the variable.
The empirical rule states that for data that is normally distributed, approximately 68% of the data will fall within one standard deviation of the mean, approximately 95% of the data will fall within two standard deviations of the mean, and approximately 99.7% of the data will fall within three standard deviations of the mean.
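The percentages in the empirical rule can be reproduced with Python's statistics.NormalDist. This is a minimal sketch that uses a normal distribution with mean 0 and standard deviation 1, so that "k standard deviations from the mean" is simply the interval from -k to k. The variable names are chosen for illustration.

```python
from statistics import NormalDist

curve = NormalDist()  # normal distribution with mean 0 and standard deviation 1

# Share of the data within k standard deviations of the mean:
# area under the curve between -k and +k.
within = {k: curve.cdf(k) - curve.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The same percentages hold for any normal distribution, whatever its mean and standard deviation.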
normal probability plot
A normal probability plot is a plot of the z-scores of the data as quantiles against the actual data values. If a distribution is normal, this plot will be linear.
normal quantile plot
Normal quantile plot is another name for a normal probability plot.
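The points of a normal probability plot can be computed by hand, as in the sketch below. The data is made up for illustration, the function names are hypothetical, and the plotting position (i - 0.5) / n is one common convention assumed here; calculators and software may use a slightly different one. The correlation coefficient r is used as a rough measure of how close the points are to a straight line.

```python
from math import sqrt
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Pair each sorted data value with its theoretical z-score."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: the i-th smallest value sits at (i - 0.5) / n.
    z_scores = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z_scores, sorted(data)))

def pearson_r(points):
    """Correlation coefficient of the plotted points (1.0 = perfectly linear)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data values for illustration only.
lifetimes = [98, 101, 103, 104, 105, 105, 106, 107, 109, 112]
points = normal_plot_points(lifetimes)
r = pearson_r(points)
print(f"straightness of the plot (r) = {r:.3f}")
```

An r value close to 1 means the points lie nearly on a line, which is what you expect when the data comes from a normal distribution.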
standard normal distribution
The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1.
The z-score of a value is the number of standard deviations between the value and the mean of the set.
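Here is a minimal sketch of a z-score calculation. The data set is made up for illustration, and the sketch assumes the set is treated as a whole population, so it uses the population standard deviation (pstdev). For a sample, statistics.stdev would be the usual choice.

```python
from statistics import mean, pstdev

def z_score(value, data):
    """Number of standard deviations between value and the mean of data."""
    # Assumption: data is the whole population, so pstdev is used.
    return (value - mean(data)) / pstdev(data)

# Made-up data set with mean 5 and population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("z-score of 9:", z_score(9, data))
print("z-score of 5:", z_score(5, data))
print("z-score of 2:", z_score(2, data))
```

A positive z-score means the value is above the mean, a negative z-score means it is below the mean, and a z-score of 0 means the value equals the mean.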
Determine if a data set approximates a normal distribution.
Here you'll learn how to distinguish between graphs of discrete and continuous data. You'll also become familiar with the properties of a normal distribution and determine if a specific data set approximates a normal distribution.