Previously, you spent some time learning about probability distributions. A distribution is simply a description of the possible values of a random variable and how likely each of these values is to occur. Remember that a probability distribution shows you all the possible values of your variable and the probability associated with each of these values. You were also introduced to binomial distributions, which describe experiments with a fixed number of independent trials, where each trial has only 2 possible outcomes and the random variable counts the number of successes. You were introduced to binomial distributions in order to compare them with multinomial distributions. Remember that multinomial distributions involve experiments where the number of possible outcomes is greater than 2, and the probability is calculated for each outcome for each trial.
In this first Concept on probability distributions, you will begin by learning about normal distributions. A normal distribution curve can be easily recognized by its shape. The first 2 diagrams above show examples of normal distributions. What shape do they look like? Do they look like a bell to you? Compare the first 2 diagrams above to the third diagram. A normal distribution is called a bell curve because its shape is comparable to a bell. It has this shape because the majority of the data is concentrated at the middle, and the amount of data gradually decreases symmetrically on either side.
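The bell shape can also be seen by simulating data. The following is a minimal sketch, not part of the original lesson: it assumes 10,000 simulated measurements with a mean of 100 and a standard deviation of 10 (both values chosen only for illustration) and prints a rough text histogram.

```python
import random

random.seed(1)  # fixed seed so the sketch behaves the same on every run

# Assumed example: 10,000 simulated measurements with mean 100 and
# standard deviation 10 (illustrative values, not from the lesson).
values = [random.gauss(100, 10) for _ in range(10_000)]

# Count how many values fall in each bin of width 10 between 60 and 140.
bin_edges = list(range(60, 141, 10))
counts = [0] * (len(bin_edges) - 1)
for v in values:
    for i in range(len(counts)):
        if bin_edges[i] <= v < bin_edges[i + 1]:
            counts[i] += 1
            break

# Print a rough text histogram: one '#' for every 100 values in a bin.
for i, c in enumerate(counts):
    print(f"{bin_edges[i]:>3}-{bin_edges[i + 1]:<3} {'#' * (c // 100)}")
```

The longest bars should appear in the 2 middle bins, with shorter bars on either side, which is the bell shape described above.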
Actually, the normal distribution curve was first called a Gaussian curve after a very famous mathematician, Carl Friedrich Gauss. He lived between 1777 and 1855 in Germany. Gauss studied many aspects of mathematics. One of these was probability distributions, and in particular, the bell curve. It is interesting to note that Gauss also spoke about global warming and predicted where Ceres, the dwarf planet residing between Mars and Jupiter, could be found. A neat fact about Gauss is that he was also known to have beautiful handwriting.
You previously learned about discrete random variables. Remember that discrete random variables are ones that have a finite number of values within a certain range. In other words, a discrete random variable cannot take on all values within an interval. For example, say you were counting from 1 to 10. You would count 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. These are discrete values. 3.5 would not count as a discrete value within the limits of 1 to 10.
For a normal distribution, however, you are working with continuous variables. Continuous variables, unlike discrete variables, can take on any value within the limits of the variable. So, for example, if you were measuring values between 1 and 10, 3.5 would count as a value for the continuous variable. Lengths, temperatures, ages, and heights are all examples of continuous variables. Compare these to discrete variables, such as the number of people attending your class, the number of correct answers on a test, or the number of tails in a series of coin flips. You can see how a continuous variable would take on an infinite number of values, whereas a discrete variable would take on a finite number of values. As you may know, you can actually see this when you graph discrete and continuous data.
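The difference between the 2 kinds of variables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `count_tails` and the height range of 150 to 190 cm are hypothetical and do not come from the lesson.

```python
import random

random.seed(2)  # fixed seed so the sketch behaves the same on every run

# Discrete: the number of tails in 10 coin flips can only be 0, 1, ..., 10.
def count_tails(flips=10):
    """Hypothetical helper: flip a fair coin `flips` times and count tails."""
    return sum(random.choice([0, 1]) for _ in range(flips))

tails_counts = [count_tails() for _ in range(5)]

# Continuous: a height (assumed range of 150 to 190 cm) can be any value
# in the interval, including values such as 163.27.
heights_cm = [random.uniform(150, 190) for _ in range(5)]

print("tails:", tails_counts)
print("heights:", [round(h, 2) for h in heights_cm])
```

Every tails count is a whole number, while the heights land on values between the whole numbers.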
Look at the 2 graphs below. The first graph is a graph of the height of a child as he or she ages. The second graph is the cost of a gallon of gasoline as the years progress. Which graph represents discrete data? Which graph represents continuous data?
If you look at the first graph, the data points are joined, because as the child ages from birth to age 1, for example, his or her height also increases. As the child continues to age, he or she continues to grow. The data is said to be continuous and, therefore, you can connect the points on the graph. For the second graph, the price of a gallon of gas at the end of each year is recorded. In 1930, for example, a gallon of gas had one recorded price. You would not have gone in and paid an amount slightly above or below that price. The data is, therefore, discrete, and the data points cannot be connected.
Let’s look at a few problems to show how histograms approximate normal distribution curves.
Understanding Data Distributions
Jillian takes a survey of the heights of all of the students in her high school. There are 50 students in her school. She prepares a histogram of her results. Is the data normally distributed?
If you take a normal distribution curve and place it over Jillian’s histogram, you can see that her data does not represent a normal distribution.
If the histogram were actually shaped like a normal distribution, it would have a shape like the curve below:
Thomas did a survey similar to Jillian’s in his school. His high school had 100 students. Is his data normally distributed?
If you take a normal distribution curve and place it over Thomas’s histogram, you can see that his data also does not represent a normal distribution.
Joanne posted a problem to her friends on Facebook. She told her friends that her grade 12 math project was to measure the lifetimes of the batteries used in different toys. She surveyed people in her neighborhood and asked them how many hours, on average, their typical battery lasts. Her results are shown below:
Is her data normally distributed? Where is the center of the distribution?
If you take a normal distribution curve and place it over Joanne’s histogram, you can see that her data appears to come from a normal distribution.
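Placing a normal curve over a histogram can also be sketched numerically: fit a normal curve to the data, then compare the number of values observed in each bar with the number the curve would predict. The lifetimes below are hypothetical values made up for illustration. They are not Joanne's survey data, and the bin edges are likewise an assumption.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical battery lifetimes in hours (NOT Joanne's actual data).
lifetimes = [88, 93, 96, 98, 100, 101, 103, 104, 105, 105,
             106, 107, 108, 110, 112, 114, 117, 122]

# Fit a normal curve using the sample mean and standard deviation.
curve = NormalDist(mean(lifetimes), stdev(lifetimes))

# Compare observed counts in each bin with the counts the curve predicts.
bin_edges = [85, 95, 105, 115, 125]  # assumed bin edges
observed_counts = []
expected_counts = []
for lo, hi in zip(bin_edges, bin_edges[1:]):
    observed = sum(lo <= x < hi for x in lifetimes)
    expected = len(lifetimes) * (curve.cdf(hi) - curve.cdf(lo))
    observed_counts.append(observed)
    expected_counts.append(expected)
    print(f"{lo}-{hi} hours: observed {observed}, expected {expected:.1f}")
```

When the observed and expected counts are close in every bin, the histogram follows the curve reasonably well. Large gaps in several bins suggest the data is not normally distributed.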
This means that the data fits a normal distribution with a mean around 105. Using the TI-84 calculator, you can actually find the mean of this data to be 105.7.
What Joanne’s data does tell us is that the mean lifetime (105.7 hours) is at the center of the distribution, and the other lifetimes are spread out on either side of that mean. You will be learning much more about standard normal distributions in a later Concept. But for now, remember the 2 key points about a normal distribution. The first key point is that the data represented is continuous. The second key point is that the data is centered at the mean and is symmetrically distributed on either side of that mean.
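The second key point can be checked roughly in code. The sketch below uses hypothetical lifetimes (not Joanne's data) and compares the mean with the median, then counts how many values fall on each side of the mean. For symmetric data, the mean and median should be close, and the 2 counts should be similar.

```python
from statistics import mean, median

# Hypothetical battery lifetimes in hours, used only to illustrate the check.
lifetimes = [88, 93, 96, 98, 100, 101, 103, 104, 105, 105,
             106, 107, 108, 110, 112, 114, 117, 122]

center = mean(lifetimes)
middle = median(lifetimes)

# Count the values on each side of the mean.
below = sum(x < center for x in lifetimes)
above = sum(x > center for x in lifetimes)

print(f"mean = {center:.1f}, median = {middle}")
print(f"{below} values below the mean, {above} values above the mean")
```

This is only a quick check. A small data set can pass it without being normally distributed, so it complements the histogram comparison instead of replacing it.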
- The following data was collected on a recent 25-point math quiz. Does the data represent a normal distribution? Can you determine anything from the data?
- A recent blockbuster movie was rated PG, with an additional violence warning. The manager of a movie theater did a survey of moviegoers to see what ages were attending the movie in an attempt to see if people were adhering to the warnings. Is his data normally distributed? Do moviegoers at the theater regularly adhere to warnings?
- The heights of coniferous trees were measured in a local park in a regular inspection. Is the data normally distributed? Are there areas of the park that seem to be in danger? The measurements are all in feet.
Determine if the points representing each of the following data sets can be connected when graphed.
- The number of students enrolled in a college each semester
- The weight of a baby seal each day as it grows
- The number of coins a coin collector owns each week
- The speed of a rocket each second as it accelerates
- The amount of water in a swimming pool each minute as it is drained
- The number of employees a company has each month as it expands
- The thickness of a glacier each year as it melts
To view the Review answers, open this PDF file and look for section 4.1.