# Box-and-Whisker Plots

Plotting the five-number summary for ascending data.

Have you ever thrown a shot put? Take a look at this dilemma.

A track and field coach, Mr. Watson, was measuring shot put distances for his varsity and junior varsity teams. Here is his data, in feet, which he put in order from least to greatest.

Varsity: 36.8, 43.5, 45.8, 46.2, 49.1, 50.7, 52.7, 54.3, 54.4, 55.8, 56.0, 58.5

Junior Varsity: 33.2, 35.4, 36.2, 37.0, 37.6, 39.4, 40.6, 40.8, 41.3, 42.1, 44.5, 50.3

Mr. Watson wants to present this information to both of his teams and compare them. How do the teams compare? How can Mr. Watson create a display that will communicate what he wants to tell his teams?

To accomplish this task, you will need to know about box-and-whisker plots. Pay close attention and you will be able to help Mr. Watson at the end of the Concept.

### Guidance

At times it is useful to get a general idea of how data clusters together.

Box-and-whisker plots display the distribution of data items along a number line.

The data are divided into four equal parts, separated by points called quartiles.

You can also see the smallest data point, the extreme minimum, and the largest data point, the extreme maximum.

A box-and-whisker plot is created by determining five points.

First we’ll place the data in order from smallest to largest.

Then, we create a number line that shows the range of the data using equal intervals. We’ll use the median as our middle point on the box-and-whisker plot and to split the data in half.

The median of each half, called a quartile, is then calculated. The quartiles, together with the median, separate the data into quarters.

Finally, we’ll use the highest datum and the lowest datum as our endpoints, or extremes. Boxes are drawn between the quartiles and whiskers are drawn to the extremes.

Now let's apply these steps with a dilemma.

Draw a box-and-whisker plot for the given data.

16, 51, 32, 16, 24, 37, 7, 22, 19, 40, 10, 31, 29, 38, 21, 11

Step 1: Put the data in order from smallest to largest.

7, 10, 11, 16, 16, 19, 21, 22, 24, 29, 31, 32, 37, 38, 40, 51

Step 2: Draw a number line that includes your extremes, 7 and 51. In this case, we will use a number line from 5 to 55 using intervals of 5.

Step 3: Determine the median of the data. The middle points in the data are 22 and 24 so the median is 23. Mark the median with a point beneath the number line.

Step 4: The median separates the data into two groups as shown below:

$7, 10, 11, 16, 16, 19, 21, 22 \qquad 24, 29, 31, 32, 37, 38, 40, 51$

Find the median in each of these groups. These are the quartiles which are 16 and 34.5. These divide the data into four groups. Mark the quartiles as you did the median, with a point.

Step 5: Draw boxes between the quartiles and the median.

Step 6: Mark the extremes, the smallest and largest numbers, with points. In this case, the extremes are 7 and 51.

Step 7: Draw whiskers, or horizontal lines, to connect the quartiles to the extremes.

You can see from the box-and-whisker plot that half of the data lies between the first quartile and the third quartile. A quarter of the data is between the minimum and the first quartile, and the last quarter is between the third quartile and the maximum. The median marks the halfway point of the data.

In this example, the upper half of the data (23 to 51) is spread over a wider interval than the lower half (7 to 23), and about half of the data lies between 15 and 35.
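The steps above can also be checked with a short program. This is only a sketch: the function name `five_number_summary` is made up for illustration, and it assumes the same rule used in this Concept, where each quartile is the median of one half of the ordered data.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two values.

    Assumption: quartiles are the medians of the lower and upper halves,
    as in the steps of this Concept. Other conventions exist.
    """
    data = sorted(values)        # Step 1: order the data
    half = len(data) // 2
    lower = data[:half]          # values before the middle position
    # With an odd count, the median itself belongs to neither half.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

scores = [16, 51, 32, 16, 24, 37, 7, 22, 19, 40, 10, 31, 29, 38, 21, 11]
summary = five_number_summary(scores)
print(summary)  # (7, 16.0, 23.0, 34.5, 51)
```

The five values match the points marked in Steps 3 through 6: extremes of 7 and 51, quartiles of 16 and 34.5, and a median of 23.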

We can make double plots or graphs when there are two factors that we are comparing. A double box-and-whisker plot can be made by drawing the second factor beneath the first factor. This will allow us to look at both factors on the same plot.

Use this box-and-whisker plot to answer the following questions.

#### Example A

What is the minimum extreme of this box-and-whisker plot?

Solution: $34$

#### Example B

What is the maximum extreme of this box-and-whisker plot?

Solution: $58$

#### Example C

What is the median?

Solution: $49$

Now let's go back to the dilemma from the beginning of the Concept.

Make a double box-and-whisker plot of this data. How does the data compare?

| | Varsity | Junior Varsity |
|---|---|---|
| Extremes | 36.8 and 58.5 | 33.2 and 50.3 |
| Median | 51.7 | 40.0 |
| First and third quartiles | 46.0 and 55.1 | 36.6 and 41.7 |

From this box-and-whisker plot, the coach can tell that the teams’ results are what he expected—the varsity is generally better than the junior varsity. There are a number of players whose results overlap—the highest junior varsity player is better than the entire first quartile of the varsity team. Perhaps some adjustments need to be made. However, the coach must also consider their results in other events before switches are made. The lowest varsity player is also the best long distance runner. It is also apparent that the results are more dispersed, or spread out, in the varsity team than in the junior varsity team.
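Here is one way the comparison could be checked numerically. It is a sketch, not the coach's method: the helper `five_number_summary` is a hypothetical name, the quartiles are taken as the medians of each half as in this Concept, and the interquartile range (Q3 minus Q1) is used as the measure of spread.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of each half."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

varsity = [36.8, 43.5, 45.8, 46.2, 49.1, 50.7, 52.7, 54.3, 54.4, 55.8, 56.0, 58.5]
junior_varsity = [33.2, 35.4, 36.2, 37.0, 37.6, 39.4, 40.6, 40.8, 41.3, 42.1, 44.5, 50.3]

summaries = {
    "Varsity": five_number_summary(varsity),
    "Junior Varsity": five_number_summary(junior_varsity),
}

for team, (low, q1, med, q3, high) in summaries.items():
    # The interquartile range is the width of the box; the range is the full span.
    print(f"{team}: extremes {low} and {high}, median {med:.1f}, "
          f"quartiles {q1:.1f} and {q3:.1f}, "
          f"IQR {q3 - q1:.1f}, range {high - low:.1f}")

# Count the varsity throws that the best junior varsity throw beats.
best_jv = summaries["Junior Varsity"][4]
beaten = [distance for distance in varsity if distance < best_jv]
print(f"The best junior varsity throw ({best_jv}) beats {len(beaten)} varsity throws.")
```

The varsity box is about 9.1 feet wide and the junior varsity box about 5.1 feet wide, which supports the observation that the varsity results are more spread out.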

### Guided Practice

Here is one for you to try on your own.

The data values in the table below show the number of televisions sold at a department store each month for nine months. Create a box-and-whisker plot to display the data.

| April | May | June | July | August | September | October | November | December |
|---|---|---|---|---|---|---|---|---|
| 110 | 98 | 91 | 102 | 89 | 95 | 108 | 118 | 152 |

Solution

Step 1: To determine the median of the set of data, arrange the data in order from least to greatest. Identify the data value in the middle of the data set. For this set of data, 102 is the median.

89, 91, 95, 98, 102, 108, 110, 118, 152

Step 2: Identify the median of the lower half of the data, which is the first quartile. Because the median, 102, is a data value, it is left out of both halves. Two data values share the middle position of the lower half, so find their mean. The first quartile is 93.

$\begin{aligned} & \underline{89, \ 91, \ 95, \ 98}, \ 102, \ 108, \ 110, \ 118, \ 152 \\ & 91 + 95 = 186 \\ & 186 \div 2 = 93 \end{aligned}$

Step 3: Identify the median of the upper half of the data, which is the third quartile. Again, two data values share the middle position, so find their mean. The third quartile is 114.

$\begin{aligned} & 89, \ 91, \ 95, \ 98, \ 102, \ \underline{108, \ 110, \ 118, \ 152} \\ & 110 + 118 = 228 \\ & 228 \div 2 = 114 \end{aligned}$

Step 4: Draw a number line. The first value on the number line should be near the smallest number in the data set. In this case, the smallest number is 89. Therefore, the number line will start at 80. The last value on the number line should be near the largest number in the set of data. The largest number in the data set is 152. Therefore, the number line will end at 160. In this case, label the number line by tens. Then mark the five values found above (89, 93, 102, 114, and 152) on the number line, draw the box from 93 to 114 with the median marked at 102, and draw whiskers out to 89 and 152.
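To see roughly what the finished plot looks like, here is a small sketch that prints it as a row of text. Both function names are invented for this illustration, and the layout (one character per television, starting at 80) is an assumption chosen to match the number line in Step 4.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of each half."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # With nine values, the median (the fifth value) is left out of both halves.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

def text_box_plot(summary, start):
    """Render a box plot as one text row, one character per unit from `start`.

    Assumes `start` is no greater than the minimum of the data.
    """
    low, q1, med, q3, high = (round(x - start) for x in summary)
    row = [" "] * (high + 1)
    row[low:q1] = ["-"] * (q1 - low)            # left whisker
    row[q1:q3 + 1] = ["="] * (q3 - q1 + 1)      # box between the quartiles
    row[q3 + 1:high + 1] = ["-"] * (high - q3)  # right whisker
    for position in (low, q1, q3, high):
        row[position] = "|"                     # extremes and quartiles
    row[med] = "M"                              # median
    return "".join(row)

sales = [110, 98, 91, 102, 89, 95, 108, 118, 152]
summary = five_number_summary(sales)
print(summary)  # (89, 93.0, 102, 114.0, 152)
print(text_box_plot(summary, start=80))
```

In the printed row, the long right whisker shows how far December's 152 sits from the rest of the data.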

### Explore More

Directions: Use each data set or plot to answer the questions that follow it.

90, 104, 98, 156, 140, 85, 122, 129, 142, 138, 131, 81, 151, 147, 130, 156

1. Create a box-and-whisker plot for the data.
2. Identify the minimum extreme.
3. Identify the maximum extreme.
4. Identify the median.

The weight of bears varies between species. Weight also varies within species as a result of habitat and diet. The box-and-whisker plot was created after recording the weight (in pounds) of several black bears across the country. Use the box-and-whisker plot to answer the questions below.

1. What is the minimum extreme?
2. What is the maximum extreme?
3. What is the median?
4. What is the value of the first quartile?
5. What is the value of the third quartile?

A group of dog sled drivers collected the following data about the number of dogs who lead sled teams. Here is the data in a box-and-whisker plot.

1. What is the minimum extreme?
2. What is the maximum extreme?
3. What is the median?
4. What is the value of the first quartile?
5. What is the value of the third quartile?
6. How many dogs do most sled teams have?

### Vocabulary

arithmetic mean

The arithmetic mean, also called the average, is the sum of the data values divided by the number of values.

back-to-back stem plots

A back-to-back stem plot is a modified stem-and-leaf plot with the stem in the center and the leaves on either side. It is used to compare two different but related sets of data (bivariate data).

bell shaped

A bell-shaped histogram is a histogram with a prominent ‘mound’ in the center and similar tapering to the left and right.

bins

Bins are groups of data plotted on the x-axis.

bivariate data

Bivariate data consists of two paired sets of data.

box-and-whisker plot

A box-and-whisker plot is a graphic display of quantitative data that shows the five-number summary.

calculated data

Calculated data has values that are the result of computations performed on the input variable.

dependent variable

The dependent variable is the output variable in an equation or function, commonly represented by $y$ or $f(x)$.

explanatory variables

Explanatory variables are another name for independent variables.

extreme outliers

Extreme outliers are data points that lie more than 3 times the interquartile range (the spread of the middle half of the data) above the third quartile or below the first quartile.

Extremes

The extremes are the maximum and minimum values in a data set.

five-point summary

The numbers needed to construct a box-and-whisker plot are called the five-point summary, also known as the five-number summary. The five points are the minimum, the lower median (Q1), the median, the upper median (Q3), and the maximum.

independent variable

The independent variable is the input variable in an equation or function, commonly represented by $x$.

input variables

Input variables are another name for independent variables.

Interquartile range

The interquartile range is the difference between the third quartile and the first quartile (Q3 - Q1).

Leaf

The leaves of a stem-and-leaf plot are the rightmost digits of each of the original data values.

line of best fit

A line of best fit is a straight line drawn on a scatter plot such that the sums of the distances to the points on either side of the line are approximately equal and such that there are an equal number of points above and below the line.

line of fit

A line of fit is a straight or continuously curved line representing the trend of changes in the comparison of two data sets (or one set of bivariate data).

linear regression

In statistics, linear regression is a process that attempts to model the relationship between two variables by fitting a linear equation to the data.

lower median

The lower median is the first quartile (Q1) in the box-and-whisker plot.

Median

The median of a data set is the middle value when the data are arranged in order. When the number of values is even, it is the mean of the two middle values.

mild outliers

Mild outliers are data points that lie more than 1.5 times the interquartile range (the spread of the middle half of the data) above the upper quartile or below the lower quartile.

modified box-plot

A modified box plot has whiskers that extend to the highest and lowest non-outlier values.

normally distributed

If data is normally distributed, the data set creates a symmetric histogram that looks like a bell.

observed data

Observed data are the values that are measured or recorded directly, rather than produced by computations on the input variable.

Outlier

In statistics, an outlier is a data value that is far from other data values.

output variables

Output variables are another name for dependent variables.

Quartile

A quartile is one of the three values (Q1, the median, and Q3) that divide an ordered data set into four equal groups. The term is also used for each of those four groups.

range

The range of a set of data is the difference in value between the least and greatest values in the set.

response variables

Response variables are another name for dependent variables.

skewed

As with the horizontal skewing of a histogram, stem plots with an obvious skew toward one end or the other tend to indicate an increased number of outliers either less than or greater than the mode.

statistical correlation

Statistical correlation is a representation of possible related changes in values between two sets of data.

stem

A stem in a stem plot is a value or column of values that represents the greatest place value(s) in a set of data.

Stem-and-leaf plot

A stem-and-leaf plot is a way of organizing data values from least to greatest using place value. Usually, the last digit of each data value becomes the "leaf" and the other digits become the "stem".

trends

Trends in data sets or samples are indicators found by reviewing the data from a general or overall standpoint.

uniform

A uniform-shaped histogram indicates data that is very consistent; the frequency of each class is very similar to that of the others.

upper median

The upper median is the third quartile (Q3) in the box-and-whisker plot.
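
The vocabulary entries for mild and extreme outliers can be tried out on the television data from the Guided Practice. The sketch below is illustrative only: the function names are hypothetical, and it assumes the common rule that the cutoffs sit 1.5 and 3 interquartile ranges beyond the quartiles, using the quartiles found earlier (93 and 114).

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs for mild and extreme outliers."""
    iqr = q3 - q1  # interquartile range: the spread of the middle half
    return {
        "mild": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "extreme": (q1 - 3 * iqr, q3 + 3 * iqr),
    }

def classify(value, q1, q3):
    """Label a value as an extreme outlier, a mild outlier, or neither."""
    fences = outlier_fences(q1, q3)
    lower, upper = fences["extreme"]
    if value < lower or value > upper:
        return "extreme outlier"
    lower, upper = fences["mild"]
    if value < lower or value > upper:
        return "mild outlier"
    return "not an outlier"

# Television sales from the Guided Practice, with Q1 = 93 and Q3 = 114.
sales = [89, 91, 95, 98, 102, 108, 110, 118, 152]
labels = {value: classify(value, q1=93, q3=114) for value in sales}

print(outlier_fences(93, 114))
for value, label in labels.items():
    print(value, label)
```

Under these assumptions the interquartile range is 21, so the mild cutoffs are 61.5 and 145.5. December's 152 falls above the upper cutoff and would count as a mild outlier, so a modified box plot would end its right whisker at 118 and mark 152 as a separate point.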