11.6: Misleading Statistics
Introduction
Prize Money
Hayley is investigating the prize money that mushers have earned in the Iditarod. In completing her research, she finds that the prize money for first place hasn’t changed very much in the past few years and hovers around $69,000. The overall prize money has changed significantly though.
She notices that there was a big year for race money in 2008 with mushers earning a total of $935,000. She shows this to her friend Tiffany.
“That is a lot of money,” Tiffany comments.
“Yes, but they don’t all get the same amount. The first-place winner receives about the same amount each year, and the other mushers get less based on where they finish. Less money is being given out this year than last, though.”
Hayley creates this line graph of the data that she has discovered.
When finished, she shows it to Tiffany.
“Here is the graph that I created,” Hayley says to Tiffany.
“This won’t work. It isn’t accurate.”
“What do you mean?”
Hayley is puzzled by Tiffany’s comment. Do you see what is misleading in Hayley’s graph? Creating graphs that accurately display data can be a tricky thing. Hayley thought that her graph was accurate, but Tiffany disagrees.
This lesson is all about misleading graphs and data. By the end of it, you will know why Tiffany made the comment about Hayley’s graph.
What You Will Learn
By the end of this lesson you will be able to:
- Identify and analyze misleading data displays.
- Identify and explain misleading comparisons of data.
- Revise data displays to eliminate misleading representations or conclusions.
Teaching Time
I. Identify and Analyze Misleading Data Displays
Graphs provide a visual picture of data. Graphs can be used to present, persuade, or even mislead the viewer. The same set of data can be presented on a graph in different ways. Sometimes, the way that a graph is drawn can present only one side of the statistics.
Why would this happen?
Sometimes, companies try to sell more of a specific product by creating misleading graphs. The scale on these graphs can be divided in a way that makes one product look more popular than another.
Companies can also do this with purchases by showing that a percentage of people like their product better than another one. However, if you look at how the graph is divided and created, it will give you an idea whether or not the statistics represented give a good overall picture of the data.
There are several ways that data can be misrepresented. By the end of this lesson, you will know how to identify misleading data.
The first way is to show a break in the vertical axis.
A break in the scale on the vertical axis can show more detail or emphasize an increase or decrease in data values. A break in the scale means that the vertical axis does not run straight up from zero to the highest value. Instead, it 'breaks' somewhere, so part of the range is left off the chart. This trick can exaggerate differences between values and make it difficult to determine where the data actually begins and where it ends.
Using a different scale or spacing along the horizontal or vertical axis will also change the appearance of the data. You may choose to space the values closely together to depict a greater change in the data. Or, you may choose to spread the values out to depict less of a change in the data. If the display makes a real change in the data impossible to detect, then it is misleading.
Let’s see how this looks in three different examples.
Example
A survey was taken to determine people’s favorite fruit. Both graphs depict the results of the survey. Which graph provides a more reliable view of the data?
The vertical axis scale on Graph 1 uses smaller intervals than the scale on Graph 2. Graph 1 counts by twenty-fives, while Graph 2 counts by hundreds. Because the intervals are smaller, differences in the data are easier to detect on Graph 1 than on Graph 2. Therefore, Graph 1 depicts the data in a more reliable manner because it shows more detail.
Example
Both graphs below depict the average monthly water temperature in Hawaii. Which graph provides the most reliable view of the data?
Graph 2 shows a sharp increase in water temperature. The water temperature does not actually vary greatly from month to month, and therefore Graph 2 is not a reliable display. Graph 1 shows the change in water temperature as more gradual and is therefore a more reasonable indicator of the gradual temperature changes for which Hawaii is known.
Example
Both graphs below depict the population of five animals on the endangered species list. Explain why Graph 1 is misleading and Graph 2 is more reliable.
Graph 1 has a break in the scale on the vertical axis. You can see that the scale begins at 1,000. This provides a distorted and misleading view of the data.
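To see why a break in the vertical axis distorts a comparison, it can help to work out how tall two bars are actually drawn. The short Python sketch below does this. The function name `apparent_ratio` and the two population values are made up for illustration and are not taken from the graphs in this example.

```python
def apparent_ratio(larger, smaller, axis_start=0):
    """Ratio of the drawn bar heights when the vertical axis begins at axis_start."""
    if smaller <= axis_start:
        raise ValueError("axis_start must be below both values")
    # A bar is only drawn from axis_start up to its value.
    return (larger - axis_start) / (smaller - axis_start)

# Hypothetical populations, chosen only for illustration.
pop_a, pop_b = 3000, 1500

print(apparent_ratio(pop_a, pop_b))        # axis starts at 0: prints 2.0
print(apparent_ratio(pop_a, pop_b, 1000))  # axis starts at 1,000: prints 4.0
```

With the axis starting at zero, the taller bar is twice as tall as the shorter one, which matches the data. With the axis starting at 1,000, the taller bar is drawn four times as tall, even though the populations have not changed.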
II. Identify and Explain Misleading Comparisons of Data
Now that you have an idea of a couple of ways that data can be misrepresented, let’s look at identifying why data is misleading by investigating a few examples.
Remember, you are looking for three things.
- Does the data show an accurate change from one piece of information to the next?
- Does the data start at zero or is there a break in the vertical axis?
- Is the scale that is being used one that makes sense? Are the sections even or spread too far out?
Write these three things down in your notebooks under “Data Display Checking.”
Sometimes, you can find another problem on a graph: a comparison of things that are not really comparable. When you look for the above three things on a graph, also be sure that the things that are being compared are similar.
Example
Explain why the data compared on the bar graph is misleading.
While the incidence rates of shark attacks are correct, one must take several things into consideration when analyzing the data. First, each state on the graph has a different size coastline. For example, Florida’s vast amount of coastline may contribute to the fact that it has a far higher incidence of shark attacks than any other U.S. state. In addition, California, Florida, and Hawaii are all big beach destination states; therefore the incidence of shark attacks will be greater than in states such as Alabama. To make the data more reliable, a graph should be created comparing the incidence of shark attacks among states that are more geographically similar.
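The coastline point can be made concrete with a little arithmetic. The sketch below is one possible way to adjust for coastline length, by dividing the number of attacks by the miles of coastline. This is a different approach from the one suggested above (comparing only similar states), and the two "states" and all of their numbers are invented for illustration.

```python
def attacks_per_100_miles(attacks, coastline_miles):
    """Number of attacks for every 100 miles of coastline."""
    return 100 * attacks / coastline_miles

# Hypothetical data: (number of attacks, miles of coastline).
states = {
    "State A": (40, 1000),
    "State B": (10, 100),
}

for name, (attacks, miles) in states.items():
    print(name, attacks, attacks_per_100_miles(attacks, miles))
# State A has more attacks in total (40 versus 10),
# but State B has more attacks per 100 miles (10.0 versus 4.0).
```

A bar graph of the raw counts would make State A look far more dangerous. A graph of attacks per 100 miles of coastline would tell the opposite story, which is why it matters that the things being compared are similar.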
Let’s look at another example where the sample size makes the data misleading.
Example
The graph below compares the percentage of students that passed the California High School Exit Exam in four different counties in California. Explain why comparing the passing rates of these four counties is misleading.
Comparison of passing rates for these four counties is misleading because the number of students that took the CAHSEE in each county varies greatly. In order to make an accurate comparison, the number of students tested should be similar. The population of students might also be taken into consideration. You might ask, do any of the students have learning disabilities? How many of the students speak English as their first language? These factors will influence results and therefore one should be careful when comparing the passing rates.
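Here is a small Python sketch of why the number of students tested matters. The county sizes are hypothetical and are not the figures from the graph. The helper names `pass_rate` and `one_student_swing` are also made up for this illustration.

```python
def pass_rate(passed, tested):
    """Percentage of tested students who passed."""
    return 100 * passed / tested

def one_student_swing(tested):
    """Percentage points the pass rate moves if one more student passes."""
    return 100 / tested

# Hypothetical counties: one tests 20 students, the other tests 2,000.
print(pass_rate(18, 20))        # prints 90.0
print(pass_rate(1800, 2000))    # prints 90.0

print(one_student_swing(20))    # prints 5.0
print(one_student_swing(2000))  # prints 0.05
```

Both counties show a 90% passing rate, so their bars would be the same height. In the small county, though, a single student changes the rate by 5 percentage points, while in the large county one student changes it by only 0.05 of a point. The same percentage can rest on very different amounts of data.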
11I. Lesson Exercises
Think about misleading data to answer each of the following questions.
- Why would data about ocean swimmers in Michigan be misleading?
- Why would data about temperatures below \begin{align*}32^\circ\end{align*} comparing Georgia and Tennessee with North Dakota be misleading?
- Why would a graph whose vertical axis starts at 50 be misleading?
Take a few minutes to discuss your answers with a friend. Explain why you wrote what you did and justify your reasoning. Then continue with the next section.
III. Revise Data Displays to Eliminate Misleading Representations or Conclusions
Suppose that your task was to revise any and all misleading data displays. How would you go about it? This section will teach you how to revise data displays so that you can eliminate misleading data.
There are four ways to be sure that your data display is not misleading.
- One way to ensure that data is represented accurately is to begin the vertical axis scale at zero.
- You should also ensure that there is ample and equal space between the values on the horizontal and vertical axes.
- Choose a scale for the vertical axis that is not too large or too small.
- Be sure that the things that you are comparing are similar.
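The first two checks in the list above can be written as a short routine. The Python sketch below is only one way to do it. The function name `check_vertical_axis` and the sample tick values are assumptions made for this illustration, and the routine does not check the last two items on the list, which call for judgment.

```python
def check_vertical_axis(ticks):
    """Return a list of warnings for the tick values on a vertical axis."""
    warnings = []
    # Check 1: the scale should begin at zero.
    if ticks[0] != 0:
        warnings.append("axis does not start at zero")
    # Check 2: the distance between neighboring ticks should always be the same.
    gaps = {upper - lower for lower, upper in zip(ticks, ticks[1:])}
    if len(gaps) > 1:
        warnings.append("tick intervals are not equal")
    return warnings

print(check_vertical_axis([0, 100, 200, 300]))    # prints []
print(check_vertical_axis([500, 600, 800, 900]))  # prints both warnings
```

The first axis passes both checks. The second axis starts at 500 and jumps by 100, then 200, then 100, so it fails both.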
Example
The graph below depicts a sharp increase in panda population since 1976. Use the data to create another graph that depicts less of an increase in population and is more reliable.
To create a graph that depicts less of an increase in the panda population:
- Allow for more space in between the years on the horizontal axis.
- Increase the interval between values on the vertical axis to five hundred.
- Plot the data points (1976, 1,100), (1986, 900), (1996, 1,000), and (2006, 1,600) on the graph.
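The effect of these steps can be estimated with arithmetic before drawing anything. The Python sketch below uses the four panda data points from this example and computes how steep the last line segment looks on paper under two layouts. The layout values (years per centimeter and pandas per centimeter) are assumptions chosen for illustration, since the scale of the original graph is not given here.

```python
# Panda data from the example: (year, population).
data = [(1976, 1100), (1986, 900), (1996, 1000), (2006, 1600)]

def drawn_slope(p1, p2, years_per_cm, pandas_per_cm):
    """Steepness of a line segment as drawn: cm of rise for each cm of run."""
    run_cm = (p2[0] - p1[0]) / years_per_cm
    rise_cm = (p2[1] - p1[1]) / pandas_per_cm
    return rise_cm / run_cm

# Assumed layout 1: years packed tightly, vertical scale counts by 100.
steep = drawn_slope(data[2], data[3], years_per_cm=10, pandas_per_cm=100)

# Assumed layout 2: years spread out, vertical scale counts by 500.
gentle = drawn_slope(data[2], data[3], years_per_cm=5, pandas_per_cm=500)

print(steep)   # 6 cm of rise over 1 cm of run
print(gentle)  # 1.2 cm of rise over 2 cm of run
```

The data is identical in both layouts. Only the spacing and the scale change, yet the segment from 1996 to 2006 is drawn ten times steeper in the first layout than in the second.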
Here is another example that shows a misleading graph and how we can go about correcting it.
Example
The vertical axis break on the graph “Enrollment at Park High School” makes it appear as if enrollment has doubled since 1995. Redraw the graph to represent the data in a more accurate manner.
To draw a graph that is more accurate, the vertical axis should begin at zero.
Now that you know how to examine different graphs, you will be able to determine which ones are misleading and which ones aren’t.
Real Life Example Completed
Prize Money
Here is the original problem once again. Reread it and then think about why Hayley’s graph is misleading. Read the comments that Tiffany makes at the end of the problem.
Hayley is investigating the prize money that mushers have earned in the Iditarod. In completing her research, she finds that the prize money for first place hasn’t changed very much in the past few years and hovers around $69,000. The overall prize money has changed significantly though.
She notices that there was a big year for race money in 2008 with mushers earning a total of $935,000. She shows this to her friend Tiffany.
“That is a lot of money,” Tiffany comments.
“Yes, but they don’t all get the same amount. The first-place winner receives about the same amount each year, and the other mushers get less based on where they finish. Less money is being given out this year than last, though.”
Hayley creates this line graph of the data that she has discovered.
When finished, she shows it to Tiffany.
“Here is the graph that I created,” Hayley says to Tiffany.
“This won’t work. It isn’t accurate.”
“What do you mean?”
Tiffany looks at the graph over Hayley’s shoulder.
“There are two things wrong with this graph. The first one is that your spacing isn’t even. That means that the money amounts aren’t clearly represented. It makes it look like the jumps are more drastic than they are.”
“The second thing is that the graph doesn’t start at 0. Because it doesn’t start at zero, you can’t get a good idea of how the money has changed over time.”
Hayley looks at her graph again. With Tiffany’s help, she now has an idea of how to fix her graph.
Now it is your turn. Look at Hayley’s graph and use Tiffany’s comments to revise the graph. Be sure the graph is accurate and not misleading.
Technology Integration
Khan Academy, Misleading Line Graphs
Time to Practice
Directions: Answer each question regarding misleading data.
1. Is this a misleading graph?
2. What is one thing that makes it a misleading graph?
3. What is one thing that you could do to fix this graph?
The data table below depicts the amount of time students at different grade levels spend on homework and studying. Read the directions carefully, as you will be asked to create two graphs with the data, one correct and one deliberately incorrect.
Grade | Time (hours) |
---|---|
\begin{align*}6^{th}\end{align*} | 1.75 |
\begin{align*}7^{th}\end{align*} | 2 |
\begin{align*}8^{th}\end{align*} | 2.25 |
\begin{align*}9^{th}\end{align*} | 2.5 |
\begin{align*}10^{th}\end{align*} | 2.75 |
\begin{align*}11^{th}\end{align*} | 3 |
\begin{align*}12^{th}\end{align*} | 3.5 |
4. Use the data in the homework table above to create two bar graphs. The first graph should show the data accurately: time spent on homework in twelfth grade is double that of sixth grade.
5. The second graph should be deliberately misleading. Draw it so that time spent on homework in twelfth grade appears to be triple that of sixth grade.
6. If the students doubled the time that they spend on homework in the \begin{align*}7^{th}\end{align*} grade, how many hours would they be spending?
7. If the students in the \begin{align*}11^{th}\end{align*} grade spent half as much time on homework, how many more hours of free time would they gain?
8. True or false. All students spend at least one hour on homework.
The data table below depicts the sales tax rate for several U.S. states.
State | Sales Tax Rate (%) |
---|---|
Alaska | 0 |
Alabama | 4.0 |
Arizona | 5.6 |
California | 6.25 |
New Jersey | 7.0 |
9. Use the information on the data table to create two graphs. One graph should depict the data accurately. On this graph, the sales tax rate for New Jersey is almost double the sales tax rate in Alabama.
10. The second graph should present the data in a misleading manner to suggest that the sales tax rate in New Jersey is more than triple the tax rate in Alabama.
11. Which state has the highest sales tax rate?
12. If Alaska doesn’t have a state sales tax, does it make sense to put it on the list?
The data below depicts the daily temperature in Juneau, Alaska for ten days.
\begin{align*}64 \quad 60 \quad 57 \quad 55 \quad 49 \quad 57 \quad 58 \quad 60 \quad 59 \quad 56\end{align*}
13. Draw a line graph that depicts a sharp decrease in temperature.
14. Draw another line graph that depicts the decrease accurately.
15. What is the highest temperature on the list?
16. What is the lowest temperature on the list?
17. – 20. Look through a newspaper and choose three different graphs. Then write a few sentences about each one explaining how the data represented is or is not misleading and why.