13.12: Box-and-Whisker Plots
What if your teacher recorded each of her students' scores on the last math test? How could she display that data so that it is broken up into four distinct segments? After completing this Concept, you'll be able to make and interpret box-and-whisker plots for data such as this.
Watch This
CK-12 Foundation: Box-and-Whisker Plots
Guidance
Consider the following list of numbers: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
The median is the $\frac{n+1}{2}$th value. There are 10 values, so $\frac{10+1}{2} = 5.5$ and the median lies halfway between the 5th and the 6th value. The median is therefore 5.5. This splits the list cleanly into two halves.
The lower list is: 1, 2, 3, 4, 5
And the upper list is: 6, 7, 8, 9, 10
The median of the lower half is 3. The median of the upper half is 8. These numbers, together with the median, cut the list into four quarters. We call the division between the lower two quarters the first quartile. The division between the upper two quarters is the third quartile (the second quartile is, of course, the median).
A box-and-whisker plot is formed by placing vertical lines at five positions, corresponding to the smallest value, the first quartile, the median, the third quartile and the greatest value. (These five numbers are often referred to as the five number summary.) A box is drawn between the positions of the first and third quartiles, and horizontal line segments (the whiskers) connect the box with the two extreme values.
The box-and-whisker plot for the integers 1 through 10 is shown below.
With a box-and-whisker plot, a simple measure of dispersion can be gained from the distance from the first quartile to the third quartile. This interquartile range is a measure of the spread of the middle half of the data.
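The steps above can be sketched in Python. This is a minimal sketch, assuming the split-halves method described in this Concept (the quartiles are the medians of the lower and upper halves); the function names `median` and `five_number_summary` are illustrative, not part of any library.

```python
def median(sorted_values):
    """Return the median of an already-sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the split-halves method."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]         # values below the median position
    upper = data[(n + 1) // 2 :]   # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])


summary = five_number_summary(range(1, 11))
iqr = summary[3] - summary[1]      # third quartile minus first quartile
print(summary)  # (1, 3, 5.5, 8, 10)
print(iqr)      # 5
```

For the integers 1 through 10 this reproduces the quartiles 3 and 8 and the median 5.5 found above, and gives an interquartile range of 5.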
Example A
Forty students took a college algebra entrance test and the results are summarized in the box-and-whisker plot below. How many students would be allowed to enroll in the class if the pass mark was set at:
a) 65%
b) 60%
Solution
From the plot, we can see the following information:
Lowest score = 50%
First quartile = 60%
Median score = 65%
Third quartile = 77%
Highest score = 97%
Since the pass marks given in the question correspond with the median and the first quartile, the question is really asking how many students there are in: a) the upper half and b) the upper three quarters.
a) Since there are 40 students, there are 20 in the upper half; that is, 20 students scored above 65%.
b) Similarly, there are 30 students in the upper three quarters, so 30 students scored above 60%.
Example B
Harika is rolling 3 dice and adding the numbers together. She records the total score for each of 50 rolls, and the scores she gets are shown below. Display the data in a box-and-whisker plot, and find both the range and the interquartile range.
9, 10, 12, 13, 10, 14, 8, 10, 12, 6, 8, 11, 12, 12, 9, 11, 10, 15, 10, 8, 8, 12, 10, 14, 10, 9, 7, 5, 11, 15, 8, 9, 17, 12, 12, 13, 7, 14, 6, 17, 11, 15, 10, 13, 9, 7, 12, 13, 10, 12
Solution
First we’ll put the list in order. Since there are 50 data points, $\frac{50+1}{2} = 25.5$, so the median will be the mean of the 25th and 26th values. The median will split the data into two lists of 25 values; we can write them as two distinct lists.
Since each sublist has 25 values, the first and third quartiles of the entire data set can be found from the median of each smaller list. For 25 values, $\frac{25+1}{2} = 13$, and so the quartiles are given by the 13th value from each smaller sublist.
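The ordering and splitting step can be sketched in Python. This is only an illustration of the procedure just described; the variable names are arbitrary, and the split assumes an even number of data points, as is the case here.

```python
scores = [9, 10, 12, 13, 10, 14, 8, 10, 12, 6, 8, 11, 12, 12, 9, 11, 10,
          15, 10, 8, 8, 12, 10, 14, 10, 9, 7, 5, 11, 15, 8, 9, 17, 12, 12,
          13, 7, 14, 6, 17, 11, 15, 10, 13, 9, 7, 12, 13, 10, 12]

ordered = sorted(scores)
half = len(ordered) // 2            # 25 values in each sublist
lower, upper = ordered[:half], ordered[half:]

# With an even count, the median is the mean of the two middle values.
median = (lower[-1] + upper[0]) / 2

print(lower)
print(upper)
print(median)  # 10.5
```

The 13th entry of each printed sublist (index 12 in Python, which counts from zero) is the corresponding quartile.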
From the ordered list we can see the five number summary:
 The lowest value is 5
 The first quartile is 9
 The median is 10.5
 The third quartile is 12
 The highest value is 17.
The box-and-whisker plot therefore looks like this:
The range is given by subtracting the smallest value from the largest value: $17 - 5 = 12$.
The interquartile range is given by subtracting the first quartile from the third quartile: $12 - 9 = 3$.
Representing Outliers in a Box-and-Whisker Plot
Box-and-whisker plots can be misleading if we don’t take outliers into account. An outlier is a data point that does not fit well with the other data in the list. For box-and-whisker plots, we can define which points are outliers by how far they are from the box part of the diagram. Defining which data are outliers is somewhat arbitrary, but many books use the norm that follows. Our basic measure of distance will be the interquartile range (IQR).
 A mild outlier is a point that falls more than 1.5 times the IQR outside of the box.
 An extreme outlier is a point that falls more than 3 times the IQR outside of the box.
When we draw a box-and-whisker plot, we don’t include the outliers in the “whisker” part of the plot; instead, we draw them as separate points.
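The rule above can be sketched in Python as follows. This is a minimal sketch: the function name `classify` is hypothetical, and the quartiles 9 and 14 are sample values chosen to match Example C below.

```python
def classify(value, q1, q3):
    """Label a value 'extreme', 'mild', or 'none' by its distance outside the box."""
    iqr = q3 - q1
    # Distance outside the box; zero for values between q1 and q3.
    distance = max(q1 - value, value - q3, 0)
    if distance > 3 * iqr:
        return "extreme"
    if distance > 1.5 * iqr:
        return "mild"
    return "none"


q1, q3 = 9, 14  # sample quartiles, so the IQR is 5
for value in (1, 2, 19, 25, 30):
    print(value, classify(value, q1, q3))
```

With these quartiles, 1 and 25 are labeled mild outliers, 30 is labeled an extreme outlier, and 2 and 19 fall within the whiskers.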
Example C
Draw a box-and-whisker plot for the following ordered list of data:
Solution
From the ordered list we see:
 The lowest value is 1.
 The first quartile is 9.
 The median is 11.5.
 The third quartile is 14.
 The highest value is 30.
Before we start to draw our box-and-whisker plot, we can determine the IQR: $\text{IQR} = Q_3 - Q_1 = 14 - 9 = 5$.
Outliers are points that fall more than 1.5 times the IQR outside of the box—in other words, values that are more than 7.5 units less than 9 or greater than 14. So any values less than 1.5 or greater than 21.5 are outliers.
Looking back at the data we see:
 The value of 1 is less than 1.5, so it is a mild outlier.
 The value 2 is the lowest value that falls within the included range.
 The value 30 is greater than 21.5. In fact, it’s not just more than 7.5 units outside the box, it’s more than twice that far outside the box. Since it falls more than 3 times the IQR above the third quartile, it’s an extreme outlier.
 The value 25 is also greater than 21.5, so it is a mild outlier.
 The value 19 is the highest value that falls within the included range.
So when we draw our box-and-whisker plot, the whiskers will only go out as far as 2 and 19 respectively. The points outside of that range are all outliers. Here is the plot:
Making Box-and-Whisker Plots Using a Graphing Calculator
Graphing calculators make analyzing large lists of data easy. They have built-in algorithms for finding the median and the quartiles, and can be used to display box-and-whisker plots.
Example D
The ages of all the passengers traveling in a train carriage are shown below.
35, 42, 38, 57, 2, 24, 27, 36, 45, 60, 38, 40, 40, 44, 1, 44, 48, 84, 38, 20, 4, 2, 48, 58, 3, 20, 6, 40, 22, 26, 17, 18, 40, 51, 62, 31, 27, 48, 35, 27, 37, 58, 21
Use a graphing calculator to:
a) obtain the 5 number summary for the data.
b) create a box-and-whisker plot.
c) determine if any of the points are outliers.
Solution
Enter the data in your calculator:
Press [STAT], then choose [EDIT].
Enter all 43 data points in list $L_1$.
Find the 5 number summary:
Press [STAT] again. Use the right arrow to choose [CALC].
Highlight the 1-Var Stats option. Press [ENTER].
The single variable statistics summary appears.
Note the mean ($\bar{x}$) is the first item given.
Use the down arrow to bring up the data for the five number summary. $n$ is the number of data points, and the final five numbers on the screen are the numbers we require.
Statistic        Symbol   Value
Lowest Value     minX     1
First Quartile   Q1       21
Median           Med      37
Third Quartile   Q3       45
Highest Value    maxX     84
Display the box-and-whisker plot:
Bring up the [STAT PLOT] option by pressing [2nd] [Y=].
Highlight 1:Plot1 and press [ENTER].
There are two types of box-and-whisker plots available. The first automatically identifies outliers. Highlight it and press [ENTER].
Press [WINDOW] and ensure that Xmin and Xmax allow for all data points to be shown. In this example, the data run from 1 to 84, so Xmin should be no greater than 1 and Xmax no less than 84.
Press [GRAPH] and the box-and-whisker plot should appear.
The calculator will automatically identify outliers and plot them as such. You can use the [TRACE] function along with the arrows to identify outlier values. In this case there is one outlier: 84.
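The calculator's results can be cross-checked with a short Python sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the ordered data (leaving out the middle value when the count is odd) and uses the 1.5 times IQR rule from this Concept; the helper name `median` is illustrative.

```python
ages = [35, 42, 38, 57, 2, 24, 27, 36, 45, 60, 38, 40, 40, 44, 1, 44, 48,
        84, 38, 20, 4, 2, 48, 58, 3, 20, 6, 40, 22, 26, 17, 18, 40, 51, 62,
        31, 27, 48, 35, 27, 37, 58, 21]


def median(sorted_values):
    """Return the median of an already-sorted list."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


data = sorted(ages)
n = len(data)
q1 = median(data[: n // 2])          # median of the lower half
med = median(data)
q3 = median(data[(n + 1) // 2 :])    # median of the upper half
iqr = q3 - q1

# Points more than 1.5 * IQR outside the box are treated as outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(data[0], q1, med, q3, data[-1])  # 1 21 37 45 84
print(outliers)                        # [84]
```

Here the IQR is 24, so the upper cutoff is 45 + 36 = 81, which is why 84 is the only value flagged.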
Watch this video for help with the Examples above.
CK-12 Foundation: Box-and-Whisker Plots
Vocabulary
 We call the division between the lower two quarters the first quartile. The division between the upper two quarters is the third quartile (the second quartile is, of course, the median).
 A box-and-whisker plot is formed by placing vertical lines at five positions, corresponding to the smallest value, the first quartile, the median, the third quartile and the greatest value. (These five numbers are often referred to as the five number summary.) A box is drawn between the positions of the first and third quartiles, and horizontal line segments (the whiskers) connect the box with the two extreme values.
Guided Practice
The box-and-whisker plots below represent the times taken by a school class to complete an obstacle course. The times have been separated into boys and girls. The boys and the girls each think that they did best. Determine the five number summary for both the boys and the girls and give a convincing argument for each of them.
Solution
Comparing two sets of data with a box-and-whisker plot is relatively straightforward. For example, you can see that the data for the boys is more spread out, both in terms of the range and the interquartile range.
The five number summary for each is shown in the table below.
Statistic        Boys   Girls
Lowest value     1:30   1:40
First Quartile   2:00   2:30
Median           2:30   2:55
Third Quartile   3:30   3:20
Highest value    5:10   4:10
Here are some points each side could use in their argument:
Boys:
 The boys had the fastest time (1 minute 30 seconds), so the fastest individual was a boy.
 The boys also had the smaller median (2 minutes 30 seconds), meaning half of the boys were finished when only one fourth of the girls were finished (since the girls’ first quartile is also 2:30). In other words, the boys’ average time was faster.
Girls:
 The boys had the slowest time (5 minutes 10 seconds), so by the time all the girls were finished there was still at least one boy completing the course.
 The girls had the smaller third quartile (3 min 20 seconds), meaning that even without taking the slowest fourth of each group into account, the girls were still quickest.
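The comparison of spread in this solution can be checked with a few lines of Python. This is a small sketch with illustrative names (`to_seconds`, `spread`); it converts each time in the five number summaries to seconds and then computes the range and the interquartile range.

```python
def to_seconds(clock):
    """Convert a 'minutes:seconds' string to a number of seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def spread(summary):
    """Return (range, interquartile range) in seconds for a five number summary."""
    low, q1, _, q3, high = (to_seconds(t) for t in summary)
    return high - low, q3 - q1


boys = ["1:30", "2:00", "2:30", "3:30", "5:10"]
girls = ["1:40", "2:30", "2:55", "3:20", "4:10"]

print(spread(boys))   # (220, 90)
print(spread(girls))  # (150, 50)
```

The boys' range (220 seconds) and interquartile range (90 seconds) are both larger than the girls' (150 and 50 seconds), which matches the observation that the boys' data is more spread out.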
Practice
 Draw a box-and-whisker plot for the following unordered data: 49, 57, 53, 54, 57, 49, 67, 51, 57, 56, 59, 57, 50, 49, 52, 53, 50, 58
 A simulation of a large number of runs of rolling 3 dice and adding the numbers results in the following 5-number summary: 3, 8, 10.5, 13, 18. Make a box-and-whisker plot for the data and comment on the differences between it and the plot in Example B.
 The box-and-whisker plots below represent the percentage of people living below the poverty line by county in both Texas and California. Determine the 5-number summary for each state, and comment on the spread of each distribution.

 The 5-number summary for the average daily temperature in Atlantic City, NJ (given in °F) is: 31, 39, 52, 68, 76. Draw the box-and-whisker plot for this data and use it to determine which of the following, if any, would be considered outliers if they were included in the data:
 January’s record high temperature of
 January’s record low temperature of
 April’s record high temperature of
 The all time record high of
 In 1887 Albert Michelson and Edward Morley conducted an experiment to determine the speed of light. The data for the first 10 runs (5 results in each run) is given below. Each value represents how many kilometers per second over 299,000 km/s was measured. Create a box-and-whisker plot of the data. Be sure to identify outliers and plot them as such. 850, 740, 900, 1070, 930, 850, 950, 980, 980, 880, 960, 940, 960, 940, 880, 800, 850, 880, 900, 840, 880, 880, 800, 860, 720, 720, 620, 860, 970, 950, 890, 810, 810, 820, 800, 770, 760, 740, 750, 760, 890, 840, 780, 810, 760, 810, 790, 810, 820, 850
 Is it possible to have outliers on both ends of a data set? Explain.
 Is it possible for more than half the values in a data set to be outliers? Explain.
 Is it possible for more than a quarter of the values in a data set to be outliers? Explain.
 Is it possible for either of the whiskers in a box-and-whisker plot to be of zero length? Explain.
 Is it possible for either of the whiskers in a box-and-whisker plot to be longer than the box? Explain.
 Is it possible for either of the whiskers in a box-and-whisker plot to be twice as long as the box? Explain.
Information taken from data published by Rutgers University Climate Lab ( http://climate.rutgers.edu )
Extremes
The extremes are the maximum and minimum values in a data set.
five point summary
The numbers needed to construct a box-and-whisker plot are called the five-point summary. The five points are the minimum, the lower median (Q1), the median, the upper median (Q3), and the maximum.
line of fit
A line of fit is a straight or continuously curved line representing the trend of changes in the comparison of two data sets (or one set of bivariate data).
Median
The median of a data set is the middle value of an organized data set.
observed data
Observed data are the values that result from computations performed on the input variable.
Outlier
In statistics, an outlier is a data value that is far from other data values.
Quartile
A quartile is each of four equal groups that a data set can be divided into.
skewed
As with the horizontal skewing of a histogram, stem plots with an obvious skew toward one end or the other tend to indicate an increased number of outliers either lesser than or greater than the mode.
statistical correlation
Statistical correlation is a representation of possible related changes in values between the two sets of data.
trends
Trends in data sets or samples are indicators found by reviewing the data from a general or overall standpoint.
uniform
A uniform shaped histogram indicates data that is very consistent; the frequency of each class is very similar to that of the others.
Learning Objectives
Here you'll learn another way to graphically display a data set, called a box-and-whisker plot. You'll also learn how to interpret such displays and how to determine the effect of outliers on a data set.