- Be able to calculate and understand z-scores
- Understand the concept of a percentile and be able to calculate it for a particular result
- Be able to calculate percentages of data above, below, or in between any specific values in a normal distribution
- Be able to use z-scores to compare results for two different but related situations
In section 7.1, we analyzed normal distributions in situations where the values of interest fell exactly 1, 2, or 3 standard deviations from the mean, so the 68-95-99.7 rule applied directly. Most situations, however, require us to answer questions that do not involve whole numbers of standard deviations above or below the mean. For example, what score would a student need in order to be in the top 10% of ACT test takers? We need a tool to help us deal with these types of situations.
Our first tool will be the z-score formula. The z-score is a measure of how many standard deviations above or below the mean a particular value is. If a z-score is negative, the result is below the mean and if it is positive, the result is above the mean. For example, if the ACT mathematics exam scores are normally distributed with a mean of 18 and a standard deviation of 6, then an ACT score of 30 would be equivalent to a z-score of 2 because 30 would be 2 standard deviations above the mean.
The formula below gives a quick way to calculate z-scores. In the formula, x is the observation, μ is the mean of the distribution, and σ is the standard deviation of the distribution.
$$z = \frac{x - \mu}{\sigma}$$
Suppose the mean length of the hair of 10th grade girls is 10 inches with a standard deviation of 4 inches. What would be the z-score for hair length for a 10th grade girl whose hair is 16 inches long and what does it mean in terms of the normal curve?
It is often a good idea to draw a sketch for these sorts of situations so we can visualize what is happening.
Because 16 is located between 1 and 2 standard deviations above the mean, we expect a z-score between 1 and 2. Use the formula $z = \frac{x - \mu}{\sigma}$ to calculate the z-score. Our observation, x, is 16 inches, while the mean is μ = 10 inches and the standard deviation is σ = 4 inches, so $z = \frac{16 - 10}{4} = 1.5$. This tells us that a hair length of 16 inches is 1.5 standard deviations above the mean.
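The calculation above can also be checked with a few lines of Python. This is only a minimal sketch: the function name z_score and its argument names are illustrative choices, not something defined in this book.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Hair length example: observation 16 inches, mean 10 inches, standard deviation 4 inches
hair_z = z_score(16, 10, 4)
print(hair_z)  # 1.5
```

A positive result means the observation is above the mean, and a negative result means it is below the mean.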
Suppose that the z-score for a particular 10th grade girl's hair length is z = -1.25. What is the length of the girl's hair?
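We will use the z-score formula to find our answer. Substituting z = -1.25, μ = 10, and σ = 4 into $z = \frac{x - \mu}{\sigma}$ gives $-1.25 = \frac{x - 10}{4}$. Multiplying both sides by 4 gives $x - 10 = -5$, so $x = 5$.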
The length of the hair for this girl would be 5 inches.
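Working backward from a z-score to an observation amounts to rearranging the z-score formula into x = μ + zσ. The short Python sketch below illustrates that rearrangement. The helper name value_from_z is hypothetical and is used here only for illustration.

```python
def value_from_z(z, mean, sd):
    """Return the observation that sits z standard deviations from the mean."""
    return mean + z * sd


# Hair length example: z = -1.25, mean 10 inches, standard deviation 4 inches
hair_length = value_from_z(-1.25, 10, 4)
print(hair_length)  # 5.0
```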
Suppose a student can submit either their SAT score or their ACT score to a particular college, but not both. Suppose their SAT score was 620 and that the SAT has a mean of 500 and a standard deviation of 100. Suppose also that the same student scored a 25 on their ACT exam and that the ACT exam has a mean of 18 and a standard deviation of 6. Which score should the student submit?
Looking at the diagram below, it is not exactly clear which score is better. They appear to be quite similar and we will need to do some calculations to make a distinction.
Calculate the z-score for each exam. For the SAT, $z = \frac{620 - 500}{100} = 1.2$. For the ACT, $z = \frac{25 - 18}{6} \approx 1.17$. Since the z-score is higher on the SAT, the student should submit the SAT exam score.
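The same comparison can be sketched in Python. Putting both scores on the z-score scale makes them directly comparable even though the two exams use different units. The function and variable names below are illustrative assumptions.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd


sat_z = z_score(620, 500, 100)  # SAT: mean 500, standard deviation 100
act_z = z_score(25, 18, 6)      # ACT: mean 18, standard deviation 6

# The exam with the larger z-score is the stronger result relative to other test takers.
better = "SAT" if sat_z > act_z else "ACT"
print(round(sat_z, 2), round(act_z, 2), better)  # 1.2 1.17 SAT
```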
In order to understand how to apply z-scores beyond what we have already done, we must first understand percentiles. A percentile is a marker on a normal curve such that the stated percentage of results falls at or below the marker. For example, suppose you are at the 30th percentile for typing speed. This means that you type faster than about 30% of all people. The percentile can also be thought of as the percent of area to the left of its marker. The graphic below shows where the 30th percentile is located. The shaded area to the left of the marker represents 30% of the normal curve.
It is very common for colleges and universities to use percentiles for entrance criteria. For example, a rather elite university might require that you score at the 90th percentile or higher on your ACT exam to be considered for admissions. Doctors often use percentiles to track the growth of babies. For example, can you picture what a baby would look like that is at the 70th percentile for weight and the 25th percentile for length?
Now we must ask what percentiles have to do with z-scores. Find the Normal Distribution Table in Appendix A, Part 2 of your book. Let's examine the z-score of -1.25 from Example 2. Find the row for a z-value of -1.2 and then go across until you are under the 0.05 column. A partial table is given in Figure 7.3 below and the value in the cell we are looking for is bold and underlined. The value of 0.1056 can be interpreted as a percentile. This means that the girl in Example 2 has hair that is longer than that of 10.56% of all 10th grade girls. In other words, she is at about the 10th or 11th percentile for hair length for 10th grade girls.
At what percentile for hair length is a 10th grade girl if her hair is 17 inches long?
Start by determining her z-score, which would be $z = \frac{17 - 10}{4} = \frac{7}{4} = 1.75$. We now go to the Normal Distribution Table in Appendix A, Part 2 of the book. We go across the row with z = 1.7 until we are under 0.05. This gives a value of 0.9599. This tells us the girl is at about the 96th percentile for hair length. In other words, this girl's hair is longer than 96% of all 10th grade girls.
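A table lookup like the one above can be checked in software. The sketch below assumes Python 3.8 or later, whose standard library includes statistics.NormalDist. Its cdf method returns the area to the left of a value, which is the percentile written as a decimal. The variable names are illustrative.

```python
from statistics import NormalDist

# Hair length model from the examples: mean 10 inches, standard deviation 4 inches
hair = NormalDist(mu=10, sigma=4)

# Area to the left of 17 inches, i.e. the percentile as a decimal
percentile_17 = hair.cdf(17)
print(round(percentile_17, 4))  # 0.9599

# The standard normal curve (mean 0, standard deviation 1) works directly with z-scores
percentile_z = NormalDist().cdf(-1.25)
print(round(percentile_z, 4))  # 0.1056
```

Both results agree with the table values of 0.9599 for z = 1.75 and 0.1056 for z = -1.25.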
'Between' and 'Above' Problems
While it is nice to find percentiles for certain situations, we are often asked for the percentage of results that are between two given parameters or above a given parameter. For example, we might be asked to find the percentage of all 10th grade girls that have hair lengths between 8 inches and 15 inches long. To find these types of results, we often must do multiple z-score calculations and some addition or subtraction.
Suppose the weights of adult males of a particular species of whale are distributed normally with a mean of 11,600 pounds and a standard deviation of 640 pounds.
a) What percent of these adult male whales will weigh between 11,000 and 12,000 pounds?
b) What percent of these adult male whales will weigh more than 12,000 pounds?
a) Begin by finding the z-scores for both of the weights given: $z = \frac{11{,}000 - 11{,}600}{640} = \frac{-600}{640} = -0.9375$ and $z = \frac{12{,}000 - 11{,}600}{640} = \frac{400}{640} = 0.625$. For z = -0.9375, our Normal Distribution Table in Appendix A, Part 2 gives us a value between 0.1736 and 0.1762. Since -0.9375 is closer to -0.94 than to -0.93, we will use a value of 0.174. For z = 0.625, we get a value between 0.7324 and 0.7357. We will split the difference and use 0.734. All that is left to do now is subtract 0.174 from 0.734 to get 0.56, so about 56% of all adult male whales of this species weigh between 11,000 and 12,000 pounds. The shaded region in Figure 7.4 below represents about 56% of the normal curve.
b) Use z=0.625 from part a) to get a value from the table of 0.734. This means that 73.4% of all whales weigh 12,000 pounds or less. Therefore, 100%-73.4%=26.6% of all whales weigh more than 12,000 pounds.
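Both parts of the whale example follow the same pattern: a 'between' percentage is the difference of two left-hand areas, and an 'above' percentage is 1 minus a left-hand area. The sketch below illustrates this with Python's standard-library statistics.NormalDist (Python 3.8 or later is assumed). The variable names are illustrative.

```python
from statistics import NormalDist

# Whale weights: mean 11,600 pounds, standard deviation 640 pounds
whales = NormalDist(mu=11_600, sigma=640)

# a) Between 11,000 and 12,000 pounds: left area at the upper value minus left area at the lower value
between = whales.cdf(12_000) - whales.cdf(11_000)

# b) More than 12,000 pounds: total area of 1 minus the left area at 12,000
above = 1 - whales.cdf(12_000)

print(round(between, 3), round(above, 3))  # roughly 0.56 and 0.266
```

The software results differ from the table results only by rounding, since the table values were estimated between neighboring entries.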
It is also important to note that graphing calculators can be used to quickly solve the types of problems discussed in this section by using the NormalCdf command. Typically, this command requires four values: the lower bound, the upper bound, the mean, and the standard deviation. In Example 5, we can solve the problem in part a) simply by typing in the command string NormalCdf(11000,12000,11600,640) and obtain the immediate result of 0.5598, or about 56%.
Be sure you know how to access this command if you have a graphing calculator. Appendix C has some notes for users of graphing calculators. An online calculator that is very similar to a graphing calculator and gives us the same information can be found at http://wolframalpha.com .
You might also be wondering how to solve a problem using the NormalCdf command if only one boundary is given. Let's revisit Example 4 to see how this works.
At what percentile for hair length is a 10th grade girl if her hair is 17 inches long?
There is only one boundary given in this problem. It is your job to come up with a second boundary. In this case, the percentile we want to calculate is found by finding the percentage of all girls whose hair is 17 inches or less. We will use a lower bound of -100 and an upper bound of 17. We use -100 simply because we are confident that we will not find any results further left than this. In general, choose the missing boundary to be so extreme that it lies well outside the realm of possible results. NormalCdf(-100,17,10,4)=0.9599, so the length of the girl's hair is at about the 96th percentile.
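To see why an extreme lower bound is harmless, the sketch below imitates the calculator command in Python and compares it with the exact one-sided area. The function normal_cdf is a hypothetical stand-in written for this illustration, not the calculator's actual implementation. It relies on statistics.NormalDist from the Python standard library (Python 3.8 or later is assumed).

```python
from statistics import NormalDist


def normal_cdf(lower, upper, mean, sd):
    """Hypothetical stand-in for the calculator command: area between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)


# Made-up lower bound of -100, as in the example above
with_extreme_bound = normal_cdf(-100, 17, 10, 4)

# Exact area to the left of 17, with no lower bound at all
one_sided = NormalDist(mu=10, sigma=4).cdf(17)

print(round(with_extreme_bound, 4), round(one_sided, 4))  # 0.9599 0.9599
```

The area to the left of -100 is so small that the two results agree to every displayed decimal place.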
Problem Set 7.2
For problems 1) through 14) use the following information: On a particular stretch of road, the number of cars per hour produces a normal distribution with a mean of 125 cars per hour and a standard deviation of 40 cars per hour.
1) Sketch a normal curve for this situation. Be sure to label and mark the mean and 1 and 2 standard deviations above and below the mean.
2) What is the z-score for an observation of 165 cars in one hour?
3) What is the z-score for an observation of 85 cars in one hour?
4) Calculate the z-score associated with an observation of 171 cars in one hour.
5) Suppose 135 cars are observed in one hour. At what percentile would this observation occur?
6) Suppose 70 cars are observed in one hour. At what percentile would this observation occur?
7) At what percentile would an observation of 125 cars occur?
8) What is the probability of observing at least 145 cars on the road in an hour?
9) What is the probability of observing between 100 and 150 cars on the road in an hour?
10) Determine the percentile for an observation of 140 cars on the road in one hour.
11) Determine the percentile for an observation of 65 cars on the road in one hour.
12) Determine the probability of observing between 90 and 130 cars on the road in one hour.
13) Determine the probability of observing at least 160 cars on the road in one hour.
14) Determine the probability of observing no more than 110 cars on the road in one hour.
For problems 15) through 20) use the following information: The number of ants found in one mature colony of leafcutter ants is normally distributed with a mean of 136 ants and a standard deviation of 14 ants.
15) One ant colony has 165 ants. At what percentile for size is this ant colony?
16) An ant colony has a z-score of -1.35 for size. How many ants would we expect to find in this colony?
17) Another ant colony has 131 ants. What is the z-score for this ant colony?
18) What is the probability of finding an ant colony with 160 ants or less?
19) What is the probability of finding an ant colony with 150 ants or more?
20) What is the probability of finding an ant colony that has between 120 and 155 ants in it?
21) Twin brothers Ricky and Robbie each took a college entrance exam. Ricky took the SAT, which had a mean of 1000 with a standard deviation of 200, while Robbie took the ACT, which had a mean of 18 with a standard deviation of 6. Which brother did better if Ricky scored a 1140 and Robbie scored a 22?
22) Suppose the average height of an adult American male is 69.5 inches with a standard deviation of 2.5 inches and the average height of an adult American female is 64.5 inches with a standard deviation of 2.3 inches. Who would be considered taller when compared to their gender, an adult American male who is 74 inches tall or an adult American female who is 68.5 inches tall? Explain your answer.
23) Professional golfer John Daly is one of the longest hitting golfers in history. Suppose his drives average 315 yards with a standard deviation of 12 yards. Will a drive of 345 yards be in the top 1% of his drives for length? Explain your answer.
24) What is the area under any density curve equal to?
25) In a standard deck of 52 cards, what is the probability of being dealt two queens if you are dealt two cards from the deck without replacement?
26) In a class competition, each grade (9-12) enters 10 students to run in a 500 meter race. Boys' times for 9th graders and 12th graders are given below in seconds. Build a back-to-back stem plot to compare data for the two groups of students.
9th Grade Times = 115, 118, 118, 121, 126, 127, 131, 134, 140
12th Grade Times = 106, 106, 109, 112, 114, 116, 116, 121, 122, 133
27) It turns out that countries that have higher percentages of people with computers also tend to have people who live longer. Is it logical to assume that shipping many computers to countries whose people have lower life-expectancies will help the people in those countries live longer? Answer the question including justification that references either Cause and Effect, Common Response, Confounding, or Coincidence.
28) A sample survey at a local college campus asked 250 students how many textbooks they were currently carrying. The table below shows a summary of the findings. Use the table to determine the expected number of textbooks that an average college student at this campus would be carrying.
Textbooks Carried by Students
# of Books