10.2: Stem-and-Leaf Plots and Histograms
Introduction
Measuring Up
One afternoon, Mr. Watson had all of the boys on the distance running team line up on the field.
“Hey coach, what’s up?” Manuel asked, taking his place on the team.
“I am checking on heights,” Mr. Watson explained, taking out his tape measure. He began to measure each boy in centimeters.
“Why?” Carl asked curiously.
“Well, I have the heights of Markwell’s team and I want to compare our heights with theirs. I am wondering if there is a correlation between speed and height. So I am going to start with the heights,” Mr. Watson explained.
Mr. Watson wrote the following heights from smallest to largest.
Markwell Cougars: 170, 172, 175, 176, 176, 176, 178, 181, 182, 183, 183, 183, 185, 185, 187, 188, 188, 189, 190, 195
Hawks: 169, 175, 176, 176, 178, 179, 180, 183, 183, 186, 186, 186, 187, 187, 187, 187, 187, 188, 190, 191, 192
There are many different ways to display this data. In this lesson, you will learn about stem-and-leaf plots and histograms. By the end of the lesson, you will be able to create a display of the heights of both teams.
What You Will Learn
In this lesson, you will learn how to complete the following skills.
- Draw a stem-and-leaf plot to represent and interpret a set of data.
- Organize data in a frequency table and make a histogram to represent and interpret a set of data, justifying size of intervals.
- Compare stem-and-leaf plots and histograms of the same data.
- Make, compare, and interpret back-to-back stem-and-leaf plots and double histograms of real-world data.
Teaching Time
I. Draw a Stem-and-Leaf Plot to Represent and Interpret a Set of Data
Measures of central tendency are an important method for interpreting a set of data. However, humans tend to be very visual. That is, many people understand things best when they can see them. For that reason, we have a variety of tools which allow us to see a set of data. These tools include plots and graphs. Each type of visual tool has advantages and the best type of plot or graph depends on the situation. Indeed, sometimes it is a matter of preference as many different graphs could be used to illustrate the same data. Here we’ll consider stem-and-leaf plots and histograms. Let’s start with stem-and-leaf plots.
Consider a stem and its leaves. The stem is a stronger sort of base from which the leaves sprout. This is the idea of a stem-and-leaf plot. Stem-and-leaf plots allow us to see groups of data and tendencies quickly while at the same time showing every single piece of data. We organize a stem-and-leaf plot by place value: the larger base ten place value forms the stem, and the smaller place value forms the leaves.
Example
An accountant must consider the cost of healthcare for the employees at a company. The healthcare costs are based on age of the employees. She finds that the ages of the employees are as follows: 32, 19, 37, 22, 25, 46, 58, 35, 41, 45, 35, 27, 29, 42, 53, 70, 56, 34, 29, 30, 21, 24, 27, and 45.
This data is difficult to understand in an unorganized list. Measures of central tendency could be calculated but they would not help to determine healthcare costs. A stem-and-leaf plot will give her a better idea of numbers of employees per age group.
To create a stem-and-leaf plot, we first must put the data in order from smallest to largest:
19, 21, 22, 24, 25, 27, 27, 29, 29, 30, 32, 34, 35, 35, 37, 41, 42, 45, 45, 46, 53, 56, 58, 70.
Now, choose stem values. That means to choose values that would be the first digit(s) of appropriate groupings. In this case, because our youngest employee is 19 and the oldest is 70, we can use the tens place as our stem. We construct the stem vertically, then, as shown below. Then, we place each piece of data, the leaves, in the plot, next to its stem. We place the leaves in order, only separated by a column. The stem, the tens place, is not repeated.
The stem-and-leaf plot is complete.
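Because a stem-and-leaf plot follows a fixed recipe (sort the data, split each value into a tens digit and a ones digit, then list the leaves beside their stem), the construction can also be sketched in a few lines of Python. The function name `stem_and_leaf` and the text layout, with a `|` between the stem and its leaves, are assumptions made for this illustration and are not part of the lesson itself.

```python
from collections import defaultdict

# Ages of the employees from the example, in the order they were listed.
ages = [32, 19, 37, 22, 25, 46, 58, 35, 41, 45, 35, 27,
        29, 42, 53, 70, 56, 34, 29, 30, 21, 24, 27, 45]

def stem_and_leaf(values, leaf_unit=10):
    """Return the rows of a stem-and-leaf plot as a list of strings.

    Each value is split into a stem (value // leaf_unit) and a
    leaf (value % leaf_unit). With leaf_unit=10 the stem is the tens digit.
    """
    rows = defaultdict(list)
    for value in sorted(values):          # step 1: order the data
        stem, leaf = divmod(value, leaf_unit)
        rows[stem].append(leaf)           # step 2: attach each leaf to its stem
    lines = []
    # Step 3: list every stem from smallest to largest, including empty ones.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

for line in stem_and_leaf(ages):
    print(line)
```

Running the sketch prints one row for each stem from 1 to 7. The row for stem 6 has no leaves because no employee is in his or her 60s, and the rows for stems 2 and 3 are the longest.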
Now, look at the plot. What trends can you see? The bulk of the employees are in their 20’s and 30’s, for example. By counting off, you can quickly locate the median. You may also notice the importance of lining up the numbers in columns so that you can quickly see how many data items there are per row.
If our example had used numbers in the hundreds, then the hundreds would have become the largest stem. If it had been in the thousands, then the thousands would have been the largest stem. You get the idea!
II. Organize Data in a Frequency Table and Make a Histogram to Represent and Interpret a Set of Data, Justifying the Size of the Intervals
What does the word frequency mean? Frequency is used to measure how often something occurs. When we think about the word frequency, we think about “how often”. We can use a frequency table to measure and visually show how often a data value occurs. Let’s think about this.
Example
A teacher is preparing for parent conferences. In order to provide parents with the most information possible about their children, he wants to organize the grades of the class so that parents can compare their child’s grade to the rest of the class. The math percents have been calculated and his students earned the following grades: 88, 86, 92, 65, 72, 75, 81, 84, 85, 93, 99, 50, 78, 80, 86, 76, 74, 95, 81, 87, 90, 72, 76, 61, 85, 84, 78, 83. Grades are determined by percent, where 0-59% is an F, 60-69% is a D, 70-79% is a C, 80-89% is a B, and 90-100% is an A, so these ranges make the most logical intervals. Intervals are always chosen depending on the range of the data. He will make a frequency table to illustrate the information. For each student who scored in the given range, he puts an X.
Interval | Tally | Frequency |
---|---|---|
90-100 | XXXXX | 5 |
80-89 | XXXXXXXXXXXX | 12 |
70-79 | XXXXXXXX | 8 |
60-69 | XX | 2 |
0-59 | X | 1 |
This tally is useful because it shows parents how many students in the class scored in the A range, the B range, and so on. For parents, seeing the total number of each grade is more important than seeing the individual score of every student. That way, if their child earned a B, they would know that the child falls in the category where most other students scored. If a child earned a D, it would indicate that the child is below the general level of the other students and might need additional help.
Notice that a frequency table showed you how often a particular score was earned. We could see it in a visual way. Sometimes, frequency tables use X’s and other times, they can use lines for tally marks.
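The tallying in the grade example can be checked with a short Python sketch. The grades and the letter-grade intervals come from the example; the function name `frequency_table` and the printed layout are assumptions chosen for illustration.

```python
# Math percents earned by the students in the example.
grades = [88, 86, 92, 65, 72, 75, 81, 84, 85, 93, 99, 50, 78, 80,
          86, 76, 74, 95, 81, 87, 90, 72, 76, 61, 85, 84, 78, 83]

# Letter-grade intervals (A, B, C, D, F); both ends are included.
intervals = [(90, 100), (80, 89), (70, 79), (60, 69), (0, 59)]

def frequency_table(values, intervals):
    """Count how many values fall inside each inclusive (low, high) interval."""
    return {
        (low, high): sum(low <= value <= high for value in values)
        for low, high in intervals
    }

table = frequency_table(grades, intervals)

# Print one row per interval: label, a tally of X's, and the frequency.
for (low, high), count in table.items():
    print(f"{low}-{high} | {'X' * count} | {count}")
```

The frequencies add up to 28, the number of grades in the list, which is a quick way to confirm that no grade was missed or counted twice.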
Is there another way to display this information? That is a great question. Yes, we could create a histogram.
A histogram is similar to a bar graph in that it uses columns to illustrate data on x- and y-axes. In a histogram, we can use the same intervals as we did for the frequency table. The bars in the histogram will have no space between them.
The histogram shows the same information as the frequency table does. However, the histogram is a type of graph, meaning that it is a visual representation. Of course, we read all data with our eyes, so in that sense all data is visual. But the bars of a histogram are easier to compare by size than a list of numbers is.
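To see how the frequencies turn into bars, here is a minimal Python sketch that draws the grade histogram with text characters. The frequencies are the ones from the frequency table; the function name `text_histogram` and the choice of `#` for the bars are assumptions made for this illustration. A plotting library would draw a nicer graph, but plain text keeps the idea visible.

```python
# Intervals and frequencies from the grade frequency table, lowest interval first.
labels = ["0-59", "60-69", "70-79", "80-89", "90-100"]
counts = [1, 2, 8, 12, 5]

def text_histogram(labels, counts):
    """Return the lines of a vertical histogram drawn with '#' characters.

    Every bar has the same width and the bars touch each other,
    just as the bars of a histogram have no space between them.
    """
    width = max(len(label) for label in labels) + 1
    lines = []
    # Work from the tallest frequency down to 1, one text row per level.
    for level in range(max(counts), 0, -1):
        bars = "".join(("#" if count >= level else " ") * width for count in counts)
        lines.append(f"{level:>2} |{bars}".rstrip())
    # Horizontal axis and the interval labels centered under each bar.
    lines.append("   +" + "-" * (width * len(counts)))
    lines.append("    " + "".join(label.center(width) for label in labels).rstrip())
    return lines

for line in text_histogram(labels, counts):
    print(line)
```

The tallest bar sits over the 80-89 interval, so the picture shows at a glance what the table shows with numbers: most students earned a B.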
III. Compare Stem-and-Leaf Plots and Histograms of the Same Data
As mentioned, each type of plot or graph has advantages. At times, it is only personal preference that decides which is best. At other times, certain tools make interpretation easier or more logical. Let’s compare stem-and-leaf plots and histograms of the same data.
Example
Create a stem-and-leaf plot and a histogram of the mass of geodes found at a volcanic site. Scientists measured 24 geodes in kilograms and got the following data: .8, .9, 1.1, 1.1, 1.2, 1.5, 1.5, 1.6, 1.7, 1.7, 1.7, 1.9, 2.0, 2.3, 5.3, 6.8, 7.5, 9.6, 10.5, 11.2, 12.0, 17.6, 23.9, 26.8.
Now we need to build a stem-and-leaf plot. Remember, we are trying to determine if the data would be better displayed in a stem-and-leaf plot or in a histogram. We can start with the stem-and-leaf plot.
The stem for this plot could be either the ones place or the tens place. If we use the ones place, the stems would run from 0 through 26, which would require 27 rows. That’s too many to be useful. We should, then, use the tens place.
You can see that either using the ones place or the tens place has distinct disadvantages. This stem-and-leaf plot is large and limited to only two options that are not ideal.
The histogram, on the other hand, offers more choices because we could make our intervals whatever values we choose. We should choose even intervals that are based on the data.
The minimum value is 0.8 kg and the maximum is 26.8 kg. To get a good picture of the data, we could use intervals of perhaps 4 kg, 5 kg, or 6 kg. Let’s try intervals of 5 kg.
Begin with a frequency table.
Interval | Tally | Frequency |
---|---|---|
0-5 | XXXXXXXXXXXXXX | 14 |
5.1-10 | XXXX | 4 |
10.1-15 | XXX | 3 |
15.1-20 | X | 1 |
20.1-25 | X | 1 |
25.1-30 | X | 1 |
Now we can create a histogram for this data.
In this case, we can see that the histogram is more useful than the stem-and-leaf plot because it gave us more flexibility with the intervals. This histogram clearly shows that most geodes are smaller, in the 0-5 kg interval. Although the stem-and-leaf plot shows us the same data, it was unable to break it down into intervals of 5. Had we used intervals of 1 on the stem-and-leaf plot, it would have been too tall.
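The flexibility of histogram intervals can be explored with a short Python sketch that counts the geode masses for interval widths of 4 kg, 5 kg, and 6 kg. The function name `interval_counts` is hypothetical. The sketch assumes that a value landing exactly on a boundary belongs to the lower interval, which matches labels such as 5.1-10 in the frequency table.

```python
import math

# Masses of the 24 geodes, in kilograms.
geodes = [0.8, 0.9, 1.1, 1.1, 1.2, 1.5, 1.5, 1.6, 1.7, 1.7, 1.7, 1.9,
          2.0, 2.3, 5.3, 6.8, 7.5, 9.6, 10.5, 11.2, 12.0, 17.6, 23.9, 26.8]

def interval_counts(values, width):
    """Count positive values in the intervals (0, width], (width, 2*width], ...

    A value exactly on a boundary is counted in the lower interval.
    """
    number_of_intervals = math.ceil(max(values) / width)
    counts = [0] * number_of_intervals
    for value in values:
        counts[math.ceil(value / width) - 1] += 1
    return counts

for width in (4, 5, 6):
    print(f"Intervals of {width} kg")
    for i, count in enumerate(interval_counts(geodes, width)):
        low = "0" if i == 0 else f"{i * width + 0.1:.1f}"
        high = (i + 1) * width
        print(f"  {low}-{high} | {'X' * count} | {count}")
```

With a width of 5 kg the counts agree with the frequency table above. Trying the other widths shows how the choice of interval changes the shape of the histogram while the first interval always holds most of the geodes.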
Now you know that you have to consider the data itself and the intervals that make the most sense when comparing data.
IV. Make, Compare and Interpret Back-to-Back Stem-and-Leaf Plots and Double Histograms of Real-World Data
Both stem-and-leaf plots and histograms are useful. So far, we have only seen them illustrating a single factor. However, they can both be used to illustrate two factors. Plants generally have leaves on both sides of the stem. A double stem-and-leaf plot is used to show two factors, one on either side of the stem. Let’s look at an example.
Example
A zoologist takes the weights of male and female chimpanzees at 1 year of age. She finds the following data, in pounds, and places it in order from smallest to largest.
Females: 14, 17, 19, 19, 20, 21, 23, 24
Males: 18, 22, 24, 25, 26, 28, 31, 32, 34
Make a double stem-and-leaf plot that represents this data so we can compare the males and the females in the same data display.
In this plot, the female data on the left begins with smallest data closest to the stem and increases as you go left. You can see from the stem-and-leaf plot that the tendency is for the males to weigh more than the females after a year. We can also see that there are more males in this group than females.
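A double stem-and-leaf plot can also be sketched in Python. The weights come from the example; the function name `back_to_back` and the text layout, with the shared stem between two `|` characters, are assumptions made for this illustration. As described above, the leaves on the left side are written so that the smallest leaf is closest to the stem.

```python
# Weights in pounds of the chimpanzees at 1 year of age.
females = [14, 17, 19, 19, 20, 21, 23, 24]
males = [18, 22, 24, 25, 26, 28, 31, 32, 34]

def back_to_back(left, right):
    """Return the rows of a double stem-and-leaf plot with a shared tens stem."""
    everything = left + right
    stems = range(min(everything) // 10, max(everything) // 10 + 1)

    def leaves(values, stem):
        # Ones digits of the values on this stem, from smallest to largest.
        return [value % 10 for value in sorted(values) if value // 10 == stem]

    rows = []
    for stem in stems:
        # Left leaves are reversed so the smallest leaf sits next to the stem.
        left_text = " ".join(str(leaf) for leaf in reversed(leaves(left, stem)))
        right_text = " ".join(str(leaf) for leaf in leaves(right, stem))
        rows.append((left_text, stem, right_text))

    pad = max(len(left_text) for left_text, _, _ in rows)
    return [f"{left_text:>{pad}} | {stem} | {right_text}".rstrip()
            for left_text, stem, right_text in rows]

print("Females | Stem | Males")
for row in back_to_back(females, males):
    print(row)
```

The row for stem 3 has leaves only on the male side, which is the same tendency described above: after a year, the males tend to weigh more than the females.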
We can also create a double histogram to show the data. This will allow us to compare data for both the males and the females on one graph.
Now we can go back and apply what we have learned to the problem in the introduction.
Real-Life Example Completed
Measuring Up
Here is the problem from the introduction. Reread it and then create a display of the data using histograms and frequency tables.
One afternoon, Mr. Watson had all of the boys on the distance running team line up on the field.
“Hey coach, what’s up?” Manuel asked, taking his place on the team.
“I am checking on heights,” Mr. Watson explained, taking out his tape measure. He began to measure each boy in centimeters.
“Why?” Carl asked curiously.
“Well, I have the heights of Markwell’s team and I want to compare our heights with theirs. I am wondering if there is a correlation between speed and height. So I am going to start with the heights,” Mr. Watson explained.
Mr. Watson wrote the following heights from smallest to largest.
Markwell Cougars: 170, 172, 175, 176, 176, 176, 178, 181, 182, 183, 183, 183, 185, 185, 187, 188, 188, 189, 190, 195
Hawks: 169, 175, 176, 176, 178, 179, 180, 183, 183, 186, 186, 186, 187, 187, 187, 187, 187, 188, 190, 191, 192
Remember, there are two parts to your answer.
Solution to Real-Life Example
The first thing to do is to create a frequency table using the data. Here is an example.
Interval | Cougars Tally | Cougars Frequency | Hawks Tally | Hawks Frequency |
---|---|---|---|---|
160-169 | | 0 | X | 1 |
170-179 | XXXXXXX | 7 | XXXXX | 5 |
180-189 | XXXXXXXXXXX | 11 | XXXXXXXXXXXX | 12 |
190-199 | XX | 2 | XXX | 3 |
Now, you can use this data to create a histogram that compares the data.
You can see from the histogram that both teams have the most players in the 180-189 interval. However, while the Cougars have more players in the 170-179 interval, the Hawks have slightly more in the 180-189 and 190-199 intervals. The Hawks have a slight height advantage.
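The frequencies behind this comparison can be verified with a short Python sketch. The heights are the ones Mr. Watson recorded; the function name `interval_frequencies` and the side-by-side text layout are assumptions made for illustration.

```python
# Heights in centimeters recorded for each team.
cougars = [170, 172, 175, 176, 176, 176, 178, 181, 182, 183,
           183, 183, 185, 185, 187, 188, 188, 189, 190, 195]
hawks = [169, 175, 176, 176, 178, 179, 180, 183, 183, 186, 186,
         186, 187, 187, 187, 187, 187, 188, 190, 191, 192]

def interval_frequencies(heights, start=160, stop=200, width=10):
    """Count the heights in each interval start-169, 170-179, and so on."""
    return {
        f"{low}-{low + width - 1}": sum(low <= h < low + width for h in heights)
        for low in range(start, stop, width)
    }

cougar_freq = interval_frequencies(cougars)
hawk_freq = interval_frequencies(hawks)

# Print both teams side by side, one row per interval, like a double histogram
# turned on its side.
for interval in cougar_freq:
    c = cougar_freq[interval]
    h = hawk_freq[interval]
    print(f"{interval} | Cougars {'X' * c:<12} {c:>2} | Hawks {'X' * h:<12} {h:>2}")
```

Placing the two rows of X's next to each other for every interval gives the same comparison as a double histogram: both teams peak at 180-189, and the Hawks have one more player in each of the two tallest intervals.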
Vocabulary
Here are the vocabulary words that are found in this lesson.
- Stem-and-Leaf Plots
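- a visual display of data that splits each value into a stem (its larger base ten place value) and a leaf (its smallest place value), so that groups of data and every individual value can be seen at once.
- Histograms
- A visual display of data that uses bars drawn on x- and y-axes to show the frequency of each interval, with no spaces between the bars.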
- Frequency Table
- a table that shows how often different values occur in a data set. They are arranged using tally marks or X’s.
- Double Stem-and-Leaf Plots
- Stem-and-leaf plots that show two different sets of data on the same display by organizing according to stems and leaves of base ten values.
- Double Histograms
- a visual display of data using bars and intervals to compare data sets that contain two different sets of values.
Time to Practice
Directions: Use each situation to answer the following questions.
The following data was taken from employees’ paychecks at a company: $2105, $2390, $2087, $2345, $2166, $2051, $2432, $2344, $2580, $2017, $1977, $2406, $2113, $2501, $2475, $2030.
- Make a stem-and-leaf plot that represents the data.
- What can you interpret from the plot?
- Explain the intervals that you chose.
- Why is it necessary to show intervals for which there was no data?
A company sent its employees on 18 business trips last year. They pay for employees’ food and housing expenses while they are away. The company would like to get a better idea of what the trips are costing.
The following data shows the total receipts for each trip, in dollars: 160, 175, 255, 267, 290, 295, 310, 332, 350, 352, 364, 375, 410, 462, 490, 495, 580, 710.
- Create a histogram that illustrates this data.
- Explain why you chose the intervals that you chose.
- What can you interpret from your histogram?
Compare the stem-and-leaf plot to the histogram of Melanie’s Christmas gift expenses. She told her husband, “Most of the gifts were about $60.”
- Is she telling the truth?
- Which tool is more useful in making a decision about her truthfulness?
A hybrid car and a gasoline-only car filled up on the same days of the month. The drivers recorded the gasoline costs for the two cars.
Hybrid: $17, $24, $19, $21, $10, $12, $15, $20, $6, $16
Gasoline-Only: $34, $27, $15, $31, $29, $27, $24, $14, $35, $28
- Create a double stem-and-leaf plot to represent this data.
- What can you conclude from your stem-and-leaf plot?
A company that has operations in Germany and in the United States tallied the number of unpaid vacation days taken by its employees in both countries. The frequency table below shows the results.
# of Days | Germany Tally | Germany Frequency | United States Tally | United States Frequency |
---|---|---|---|---|
0-5 | XX | 2 | XXXXXXXXX | 9 |
6-10 | XXXXX | 5 | XXXXXXXX | 8 |
11-15 | XXXXXXXX | 8 | XXXXXX | 6 |
16-20 | XXXXXXXXXX | 10 | XXXX | 4 |
21-25 | XXXXXXXXX | 9 | X | 1 |
- Create a double histogram that illustrates this data.
- What does the histogram tell you about the unpaid vacation days taken in the two countries? Why do you think this might be true?
- How might this affect the company’s decisions?
- Why do you think they chose intervals of 5 days?
16-20. Conduct your own survey and collect data. For example, choose attendance rates in your class or vacation days per year. Then create a frequency table and a histogram, and analyze your data.