You are reading an older version of this FlexBook® textbook: CK-12 Basic Probability and Statistics - A Short Course.

# 7.3: Box-and-Whisker Plots

Difficulty Level: At Grade Created by: CK-12

Learning Objectives

• Construct a box-and-whisker plot.
• Construct and interpret a box-and-whisker plot.
• Construct box-and-whisker plots for comparison.
• Use technology to create box-and-whisker plots.

Introduction

An oil company claims that its premium grade gasoline contains an additive that significantly increases gas mileage. To prove its claim, the company selected 15 drivers, filled each of their cars with 45 L of regular gasoline, and asked them to record their mileage. It then filled each of the cars with 45 L of premium gasoline and again asked the drivers to record their mileage. The results below show the number of kilometers each car traveled.

Display each set of data in a way that helps you decide whether the claim made by the oil company is true or false.

We will revisit this problem later in the lesson to determine whether or not the oil company did place an additive in its premium gasoline that improved gas mileage.

Box-and-Whisker Plot

A box-and-whisker plot is another type of graph used to display data. It shows how the data are dispersed around the median but does not show specific values in the data. It does not show a distribution in as much detail as a stem-and-leaf plot or a histogram does, but it clearly shows where the data are located. This type of graph is often used when the number of data values is large or when two or more data sets are being compared. The center of the distribution, its spread, and the range of the data are easy to see from the graph. The box-and-whisker plot (often called a box plot) divides the data into quarters by using three medians: the median of all the data and the medians of its lower and upper halves.

As we construct a box-and-whisker plot for a given set of data, you will understand how this type of graph is very useful in statistics.

Example 1:

You have a summer job working at Paddy’s Pond, a recreational fishing spot where children can catch salmon that have been raised in a nearby fish hatchery and then transferred into the pond. The cost of fishing depends upon the length of the fish caught ($\$0.75$ per inch). Your job is to transfer 15 fish into the pond three times a day. Before the fish are transferred, you must measure the length of each one and record the results. Below are the lengths (in inches) of the first 15 fish you transferred to the pond:

Length of Fish (in.)

$\begin{array}{rrrrr} 13 & 14 & 6 & 9 & 10\\ 21 & 17 & 15 & 15 & 7\\ 10 & 13 & 13 & 8 & 11 \end{array}$

Since the box-and-whisker plot is based on medians, the first step is to organize the data in order from smallest to largest.

$\begin{array}{rrrrr} 6 & 7 & 8 & 9 & 10\\ 10 & 11 & 13 & 13 & 13\\ 14 & 15 & 15 & 17 & 21 \end{array}$

$6, 7, 8, 9, 10, 10, 11, \fbox{{\color{blue}13}}, 13, 13, 14, 15, 15, 17, 21$

There is an odd number of data values, so the median of all the data is the value in the middle position, which is 13. There are 7 numbers before and 7 numbers after this middle value. The next step is to find the median of the first half of the data, that is, the 7 numbers before the median. This value is called the lower quartile since it marks the end of the first quarter of the data. On the graphing calculator this value is referred to as $Q_1$.

$6, 7, 8, \fbox{{\color{blue}9}}, 10, 10, 11$

The median of the lower quartile is 9.

This step must be repeated for the second half of the data, that is, the 7 numbers above the median of 13. This value is called the upper quartile since it marks the end of the third quarter of the data. On the graphing calculator this value is referred to as $Q_3$.

$13, 13, 14, \fbox{{\color{blue}15}}, 15, 17, 21$

The median of the upper quartile is 15.

Now that the medians have all been determined, it is time to construct the actual graph. The graph is drawn above a number line that includes all the values in the data set (graph paper works very well since the numbers can be placed evenly using the lines of the graph paper). Represent the following values by using small vertical lines above their corresponding values on the number line:

• Smallest Number: 6
• Median of the Lower Quartile: 9
• Median: 13
• Median of the Upper Quartile: 15
• Largest Number: 21

The five data values listed above are often called the five-number summary for the data set and are used to graph every box-and-whisker plot.
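The steps above can also be sketched in a few lines of Python. This is a minimal sketch, not part of the original lesson: the function names `median` and `five_number_summary` are chosen here for illustration, and the code assumes the method used in this example, in which the middle value is left out of both halves when the number of data values is odd.

```python
def median(sorted_values):
    """Middle value, or the mean of the two middle values for an even count."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) using medians of the halves."""
    data = sorted(values)          # step 1: order from smallest to largest
    n = len(data)
    half = n // 2
    lower = data[:half]            # values before the middle position
    upper = data[n - half:]        # values after the middle position
    return data[0], median(lower), median(data), median(upper), data[-1]


# Lengths (in inches) of the 15 fish from Example 1
fish = [13, 14, 6, 9, 10, 21, 17, 15, 15, 7, 10, 13, 13, 8, 11]
print(five_number_summary(fish))  # (6, 9, 13, 15, 21)
```

The result agrees with the five values found by hand: 6, 9, 13, 15 and 21.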

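Join the tops and bottoms of the vertical lines that were drawn to represent the three median values. This will complete the box. Then draw a horizontal line (a whisker) from each end of the box to the vertical line for the smallest number and to the vertical line for the largest number.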

The three medians divide the data into four equal parts. In other words:

• One-quarter of the data values are located between 6 and 9
• One-quarter of the data values are located between 9 and 13
• One-quarter of the data values are located between 13 and 15
• One-quarter of the data values are located between 15 and 21

Any outliers (unusual data values that can be either low or high) can be easily seen on a box plot. An outlier would create a very long whisker.

The next diagram will show where these numbers are actually located on the box-and-whisker plot.

Each whisker contains 25% of the data and the remaining 50% of the data is contained within the box. It is easy to see the range of the values as well as how these values are distributed around the middle value. The smaller the box, the more consistent the data values are with the median of the data.
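As a quick worked check on Example 1, the lengths of the box and the whiskers can be computed from the five-number summary. The label "interquartile range" (IQR) for the width of the box is a standard term that this lesson does not introduce; it is used here only for convenience.

```latex
\begin{align*}
\text{Range} &= \text{Largest} - \text{Smallest} = 21 - 6 = 15\\
\text{Width of box (IQR)} &= Q_3 - Q_1 = 15 - 9 = 6\\
\text{Left whisker} &= Q_1 - \text{Smallest} = 9 - 6 = 3\\
\text{Right whisker} &= \text{Largest} - Q_3 = 21 - 15 = 6
\end{align*}
```

The right whisker is twice as long as the left one, so the longest quarter of the fish is more spread out than the shortest quarter.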

Example 2:

After one month of growing, the heights of 30 parsley seed plants were measured and recorded. The measurements (in inches) are shown in the table below.

Heights of Parsley (in.)
6 26 23 33 11 26
22 28 30 40 38 18
11 37 12 34 49 17
25 37 46 39 8 27
16 38 18 23 26 14

Construct a box-and-whisker plot to represent the data.

The data, organized from smallest to largest, are shown in the table below. (You could use your calculator to quickly sort these values.)

Heights of Parsley (in.)
6 8 11 11 12 14
16 17 18 18 22 23
23 25 26 26 26 27
28 30 33 34 37 37
38 38 39 40 46 49

There is an even number of data values, so the median is the mean of the two middle values, the $15^{th}$ and $16^{th}$: $Med = \frac{26 + 26}{2} = 26$. The median of the lower quartile is the number in the $8^{th}$ position of the lower 15 values, which is 17. The median of the upper quartile is the number in the $8^{th}$ position of the upper 15 values, which is 37. The smallest number is 6 and the largest number is 49.
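The values in Example 2 can be checked with software as well. The sketch below assumes Python's standard `statistics` module and splits the 30 sorted values into two halves of 15, as described above. It also calls `statistics.quantiles`, which by default interpolates between positions rather than taking medians of halves, so software may report quartiles that differ slightly from the hand-calculated ones.

```python
import statistics

# Heights (in inches) of the 30 parsley plants from Example 2
parsley = [6, 26, 23, 33, 11, 26, 22, 28, 30, 40, 38, 18, 11, 37, 12,
           34, 49, 17, 25, 37, 46, 39, 8, 27, 16, 38, 18, 23, 26, 14]

data = sorted(parsley)
n = len(data)                         # 30 values, an even count
lower_half = data[:n // 2]            # positions 1 to 15
upper_half = data[n - n // 2:]        # positions 16 to 30

q1 = statistics.median(lower_half)    # 8th value of the lower half
med = statistics.median(data)         # mean of the 15th and 16th values
q3 = statistics.median(upper_half)    # 8th value of the upper half
print(data[0], q1, med, q3, data[-1])

# A library routine for comparison. It interpolates between positions,
# so its quartiles need not equal the medians of the halves exactly.
library_quartiles = statistics.quantiles(data, n=4)
print(library_quartiles)
```

When a calculator or program reports a quartile that is a little different from the one found by hand, the difference usually comes from the method used, not from a mistake in the data.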

The TI83 can also be used to create a box-and-whisker plot. The five-number summary values can be determined by using the trace function of the calculator.
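For readers without a graphing calculator, a rough text version of the plot can be produced from the five-number summary. The function `ascii_box_plot` below is a hypothetical helper written for this illustration, not a calculator or library feature. It assumes whiskers that extend to the smallest and largest values, as in this lesson, and draws one character per unit by default.

```python
def ascii_box_plot(smallest, q1, median, q3, largest, scale=1):
    """Return a one-line text box plot; each character spans `scale` units."""
    def column(value):
        # Horizontal position of a value, measured from the smallest value
        return round((value - smallest) / scale)

    row = [" "] * (column(largest) + 1)
    for i in range(column(smallest), column(largest) + 1):
        row[i] = "-"                  # whiskers cover the full range first
    for i in range(column(q1), column(q3) + 1):
        row[i] = "="                  # the box covers Q1 to Q3
    row[column(smallest)] = "|"       # end of the left whisker
    row[column(largest)] = "|"        # end of the right whisker
    row[column(q1)] = "["             # left edge of the box
    row[column(q3)] = "]"             # right edge of the box
    row[column(median)] = "|"         # median line inside the box
    return "".join(row)


# Five-number summary of the parsley heights from Example 2
print(ascii_box_plot(6, 17, 26, 37, 49))
```

In the printed line, `[` and `]` mark the lower and upper quartiles, the `|` between them marks the median, and the outer `|` characters mark the smallest and largest values.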

Box-and-whisker plots are very useful when two data sets need to be compared. The graphs are plotted, one above the other, on the same number line. This method can be used to determine whether or not the additive, which the oil company put in its premium gas, improved gas mileage.

From the above box-and-whisker plots, where the blue one represents the regular gasoline and the yellow one the premium gasoline, the plots indicate that the additive in the premium gasoline increases the mileage. However, the value of 500 seems to be an outlier.

Lesson Summary

In this lesson you learned how the medians of a set of data can be used to represent the values in a meaningful graph called the box-and-whisker plot. You also learned that two sets of data can be compared by representing them using box-and-whisker plots graphed on the same number line. In addition, you learned the importance of the five-number summary associated with a data set and how these values can be found on the TI83 when a box-and-whisker plot is created using technology.

Points to Consider

• Are there still other ways to represent data graphically?
• We have seen how the mean and the median are used for graphical representations of data. Is the mode ever used to produce a graph?

Review Questions

1. Below is the data that represents the amount of money that males spent on prom night. $\begin{array}{rrrrrrrr} 25 & 60 & 120 & 64 & 65 & 28 & 110 & 60\\ 70 & 34 & 35 & 70 & 58 & 100 & 55 & 95\\ 55 & 95 & 93 & 50 & 75 & 35 & 40 & 75\\ 90 & 40 & 50 & 80 & 85 & 50 & 80 & 47\\ 50 & 80 & 90 & 42 & 49 & 84 & 35 & 70 \end{array}$ Construct a box-and-whisker graph to represent the data.
2. Using the following box-and-whisker plot, list three pieces of information that you can determine from the graph.
3. In a recent survey done at a high school cafeteria, a random selection of males and females were asked how much money they spent each month on school lunches. The following box-and-whisker plots compare the responses of males to those of females. The lower one is the response by males.
1. How much money did the middle 50% of each sex spend on school lunches each month?
2. What is the significance of the value $\$42$ for males and $\$46$ for females?
3. What conclusions can be drawn from the above plots? Explain.
4. The following box-and-whisker plot shows final grades last semester. How would you best describe a typical grade in that course?
1. Students typically made between 82 and 88.
2. Students typically made between 41 and 82.
3. Students typically made around 62.
4. Students typically made between 58 and 82.

Answer Key for Review Questions

2. Three things we can say from the graph are:

• The smallest number is 100
• The largest number is 195
• 50% of the data is between 120 and 155

3. Answers for each part:

1. Males: $\$22$ to $\$58$; Females: $\$28$ to $\$68$
2. These are the median values.
3. Females spend more money on lunches than males spend.

4. Students typically made between 41 and 82.

Vocabulary

Broken-Line Graph
A graph with line segments joining points that represent data.
Continuous Data
Data in which every value between the plotted points has meaning for the problem.
Correlation
A linear relationship between two variables.
Data
A set of numbers or observations that have meaning and are collected from a sample or a population.
Discrete Data
Data in which the values between the plotted points have no meaning for the problem.
Double Broken-Line Graph
Two broken-line graphs plotted on the same axis and used for comparison of data.
Dot Plot
A graph that shows the values of a variable along a number line.
Linear Graph
A graph of a straight line that has an equation in the form $y = mx + b$
Line of Best Fit
A line drawn through the points on a scatter plot that best represents the trend in the data.
Scatter Plot
A plot of dots that shows the relationship between two variables.
Bar Graph
Graph that compares data using equally spaced bars to represent the data.
Histogram
A type of bar graph that has no spaces between the bars.
Stem-and-Leaf Plot
A type of graph that is similar to a histogram and in which the data are arranged according to place value.

Feb 23, 2012

Dec 29, 2014