11.11: Box-and-Whisker Plots
Suppose that in the first 20 screenings of a movie, the numbers of paying customers at a theater were as follows: 16, 34, 22, 19, 59, 33, 60, 45, 50, 27, 75, 38, 49, 52, 20, 40, 13, 15, 26, 21. Could you analyze this data with a box-and-whisker plot? If so, what would you need to do first? In this Concept, you'll learn how to use box-and-whisker plots to analyze data sets like this one.
Guidance
A box-and-whisker plot is another type of graph used to display data. It shows how the data are dispersed around a median, but it does not show specific values in the data. It does not show a distribution in as much detail as does a stem-and-leaf plot or a histogram.
A box-and-whisker plot is a graph based upon medians. It shows the minimum value, the lower median, the median, the upper median, and the maximum value of a data set. It is also known as a box plot.
This type of graph is often used when the number of data values is large or when two or more data sets are being compared.
Example A
You have a summer job working at Paddy’s Pond. Your job is to measure as many salmon as possible and record the results. Here are the lengths (in inches) of the first 15 fish you found: 13, 14, 6, 9, 10, 21, 17, 15, 15, 7, 10, 13, 13, 8, 11
Create a box-and-whisker plot.
Solution:
Since a box-and-whisker plot is based on medians, the first step is to organize the data in order from smallest to largest.
\begin{align*}6, \ 7, \ 8, \ 9, \ 10, \ 10, \ 11, \ \boxed{13}, \ 13, \ 13, \ 14, \ 15, \ 15, \ 17, \ 21\end{align*}
Step 1: Find the median: \begin{align*}median=13\end{align*}.
Step 2: Find the lower median.
The lower median is the median of the lower half of the data. It is also called the lower quartile or \begin{align*}Q_1\end{align*}.
\begin{align*}6, \ 7, \ 8, \ & \boxed{9}, \ 10, \ 10, \ 11\\ Q_1& =9\end{align*}
Step 3: Find the upper median.
The upper median is the median of the upper half of the data. It is also called the upper quartile or \begin{align*}Q_3\end{align*}.
\begin{align*}13, \ 13, \ 14, \ \boxed{15}, \ 15, \ 17, \ 21\end{align*}
\begin{align*}Q_3=15\end{align*}
Step 4: Draw the box plot. The numbers needed to construct a box-and-whisker plot are called the five-number summary.
The five-number summary consists of the minimum value, \begin{align*}Q_1\end{align*}, the median, \begin{align*}Q_3\end{align*}, and the maximum value.
\begin{align*}Minimum=6; \ Q_1=9;\ median=13; \ Q_3=15; \ maximum=21\end{align*}
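The steps above can be sketched in Python. This is a minimal illustration, not part of the original lesson: the function names `median` and `five_number_summary` are hypothetical, and the sketch assumes that when the number of values is odd, the overall median is left out of both halves, as in the worked example.

```python
def median(sorted_values):
    """Return the median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    # Even count: average the two middle values.
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of halves.

    Assumption: with an odd count, the overall median belongs to neither
    half. Needs at least two data values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]


salmon = [13, 14, 6, 9, 10, 21, 17, 15, 15, 7, 10, 13, 13, 8, 11]
print(five_number_summary(salmon))  # (6, 9, 13, 15, 21)
```

Other textbooks and software packages use slightly different quartile rules, so their results may differ a little from the ones shown here.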
The three medians divide the data into four equal parts. In other words:
- One-quarter of the data values are located between 6 and 9.
- One-quarter of the data values are located between 9 and 13.
- One-quarter of the data values are located between 13 and 15.
- One-quarter of the data values are located between 15 and 21.
Outliers (unusual data values that can be either low or high) are easy to spot on a box-and-whisker plot: an outlier makes one of the whiskers very long.
Each whisker contains 25% of the data and the remaining 50% of the data is contained within the box. It is easy to see the range of the values as well as how these values are distributed around the middle value. The smaller the box, the more consistent the data values are with the median of the data.
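A rough text rendering of the plot for Example A can make the box and whiskers concrete. The sketch below is only an illustration: the function name `text_box_plot` and the symbols it uses (`|` for the extremes, `[` and `]` for the quartiles, `M` for the median) are choices made here, not a standard notation.

```python
def text_box_plot(minimum, q1, median, q3, maximum, width=41):
    """Draw a one-line box plot scaled to `width` characters."""
    span = maximum - minimum

    def column(value):
        # Map a data value to a character position from 0 to width - 1.
        return round((value - minimum) * (width - 1) / span)

    cells = [" "] * width
    for i in range(column(minimum), column(maximum) + 1):
        cells[i] = "-"   # whiskers
    for i in range(column(q1), column(q3) + 1):
        cells[i] = "="   # box
    cells[column(minimum)] = "|"
    cells[column(maximum)] = "|"
    cells[column(q1)] = "["
    cells[column(q3)] = "]"
    cells[column(median)] = "M"
    return "".join(cells)


# Five-number summary from Example A; two characters per inch.
print(text_box_plot(6, 9, 13, 15, 21, width=31))
# |-----[=======M===]-----------|
```

The right whisker is the longest of the four sections, which shows that the longest quarter of the fish are spread over the widest range of lengths.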
Example B
After one month of growing, the heights of 24 parsley seed plants were measured and recorded. The measurements (in inches) are given here: 6, 22, 11, 25, 16, 26, 28, 37, 37, 38, 33, 40, 34, 39, 23, 11, 48, 49, 8, 26, 18, 17, 27, 14.
Construct a box-and-whisker plot to represent the data.
Solution:
To begin, organize your data in ascending order. There is an even number of data values, so the median is the mean of the two middle values (the 12th and 13th). \begin{align*}Med=\frac{26+26}{2}=26\end{align*}. The lower quartile is the median of the lower 12 values; it lies between the 6th and 7th positions and is the average of 16 and 17, or 16.5. The upper quartile is the median of the upper 12 values; it also lies between the 6th and 7th positions of that half and is the average of 37 and 37, or 37. The smallest number is 6, and the largest number is 49.
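Written out in the same way as Example A, the ordered data and the three medians are:

\begin{align*}& 6, \ 8, \ 11, \ 11, \ 14, \ 16, \ 17, \ 18, \ 22, \ 23, \ 25, \ 26, \ 26, \ 27, \ 28, \ 33, \ 34, \ 37, \ 37, \ 38, \ 39, \ 40, \ 48, \ 49\\ & Q_1=\frac{16+17}{2}=16.5; \ median=\frac{26+26}{2}=26; \ Q_3=\frac{37+37}{2}=37\end{align*}

\begin{align*}Minimum=6; \ Q_1=16.5;\ median=26; \ Q_3=37; \ maximum=49\end{align*}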
Creating Box-and-Whisker Plots Using a Graphing Calculator
The TI-83 can also be used to create a box-and-whisker plot. The five-number summary values can be determined by using the trace function of the calculator.
Example C
Make a box-and-whisker plot of the data from the previous example on your calculator.
Solution:
Enter the data into \begin{align*}[L_1]\end{align*}.
Change the [STATPLOT] to a box plot instead of a histogram.
Box-and-whisker plots are useful when comparing multiple sets of data. The graphs are plotted, one above the other, to visualize the median comparisons.
Guided Practice
Using the data from the previous Concept, determine whether the additive improved the gas mileage.
Regular gasoline:
540 | 550 | 555 | 570 | 570 |
---|---|---|---|---|
580 | 585 | 587 | 588 | 590 |
591 | 610 | 615 | 640 | 660 |
Premium gasoline:
500 | 589 | 618 | 619 | 629 |
---|---|---|---|---|
633 | 635 | 637 | 638 | 639 |
659 | 664 | 689 | 694 | 709 |
Solution:
Statistic | Regular Gasoline | Premium Gasoline |
---|---|---|
Smallest # | 540 | 500 |
\begin{align*}Q_1\end{align*} | 570 | 619 |
Median | 587 | 637 |
\begin{align*}Q_3\end{align*} | 610 | 664 |
Largest # | 660 | 709 |
From the above box-and-whisker plots, where the blue one represents the regular gasoline and the yellow one the premium gasoline, it is reasonable to conclude that the additive in the premium gasoline increases the mileage: the premium lower quartile (619) lies above the regular upper quartile (610), so the two boxes do not overlap. However, the value of 500 seems to be an outlier.
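The comparison can be checked numerically. The sketch below recomputes both five-number summaries and then tests the value of 500 with the 1.5 × IQR rule. That rule is a common convention for flagging outliers, but it is an assumption here, since this Concept does not state a formal outlier test. The function names are hypothetical.

```python
from statistics import median

regular = [540, 550, 555, 570, 570, 580, 585, 587, 588, 590,
           591, 610, 615, 640, 660]
premium = [500, 589, 618, 619, 629, 633, 635, 637, 638, 639,
           659, 664, 689, 694, 709]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of halves."""
    data = sorted(values)
    half = len(data) // 2
    return (data[0], median(data[:half]), median(data),
            median(data[-half:]), data[-1])


def outliers(values):
    """Values more than 1.5 * IQR beyond the quartiles.

    Assumption: the 1.5 * IQR fence is a common convention, not a rule
    taken from this Concept.
    """
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1  # inter-quartile range
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


print(five_number_summary(regular))  # (540, 570, 587, 610, 660)
print(five_number_summary(premium))  # (500, 619, 637, 664, 709)
print(outliers(regular), outliers(premium))  # [] [500]
```

Under this convention the lower fence for the premium data is 619 − 1.5 × 45 = 551.5, so 500 is flagged as an outlier, while no regular-gasoline value is flagged.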
Practice
Sample explanations for some of the practice exercises below are available by viewing the following video. Note that there is not always a match between the number of the practice exercise in the video and the number of the practice exercise listed in the following exercise set. However, the practice exercise is the same in both. CK-12 Basic Algebra: Box-and-Whisker Plots (13:14)
- Describe a five-number summary.
- What is the purpose of a box-and-whisker plot? When is it useful?
- What are some disadvantages to representing data with a box-and-whisker plot?
- The following is the data that represents the amount of money that males spent on prom night. \begin{align*}&25 && 60 && 120 && 64 && 65 && 28 && 110 && 60\\ &70 && 34 && 35 && 70 && 58 && 100 && 55 && 95\\ &55 && 95 && 93 && 50 && 75 && 35 && 40 && 75\\ &90 && 40 && 50 && 80 && 85 && 50 && 80 && 47\\ &50 && 80 && 90 && 42 && 49 && 84 && 35 && 70\end{align*} Construct a box-and-whisker graph to represent the data.
- Forty students took a college algebra entrance test and the results are summarized in the box-and-whisker plot below. How many students would be allowed to enroll in the class if the pass mark were set at:
- 65%
- 60%
- Harika is rolling three dice and adding the scores together. She records the total score for 50 rolls, and the scores she gets are shown below. Display the data in a box-and-whisker plot, and find both the range and the inter-quartile range. 9, 10, 12, 13, 10, 14, 8, 10, 12, 6, 8, 11, 12, 12, 9, 11, 10, 15, 10, 8, 8, 12, 10, 14, 10, 9, 7, 5, 11, 15, 8, 9, 17, 12, 12, 13, 7, 14, 6, 17, 11, 15, 10, 13, 9, 7, 12, 13, 10, 12
- The box-and-whisker plots below represent the times taken by a school class to complete a 150-yard obstacle course. The times have been separated into boys and girls. The boys and the girls both think that they did best. Determine the five-number summary for both the boys and the girls and give a convincing argument for each of them.
- Draw a box-and-whisker plot for the following unordered data. 49, 57, 53, 54, 49, 67, 51, 57, 56, 59, 57, 50, 49, 52, 53, 50, 58
- A simulation of a large number of runs of rolling three dice and adding the numbers results in the following five-number summary: 3, 8, 10.5, 13, 18. Make a box-and-whisker plot for the data.
- The box-and-whisker plots below represent the percentage of people living below the poverty line by county in both Texas and California. Determine the five-number summary for each state, and comment on the spread of each distribution.
- The five-number summary for the average daily temperature in Atlantic City, NJ (given in Fahrenheit) is 31, 39, 52, 68, 76. Draw the box-and-whisker plot for this data and use it to determine which of the following would be considered an outlier if it were included in the data.
- January’s record-high temperature of \begin{align*}78^\circ\end{align*}
- January’s record-low temperature of \begin{align*}-8^\circ\end{align*}
- April’s record-high temperature of \begin{align*}94^\circ\end{align*}
- The all-time record high of \begin{align*}106^\circ\end{align*}
- In 1887, Albert Michelson and Edward Morley conducted an experiment to determine the speed of light. The data for the first ten runs (five results in each run) is given below. Each value represents how many kilometers per second over 299,000 km/sec were measured. Create a box-and-whisker plot of the data. Be sure to identify outliers and plot them as such. 900, 840, 880, 880, 800, 860, 720, 720, 620, 860, 970, 950, 890, 810, 810, 820, 800, 770, 850, 740, 900, 1070, 930, 850, 950, 980, 980, 880, 960, 940, 960, 940, 880, 800, 850, 880, 760, 740, 750, 760, 890, 840, 780, 810, 760, 810, 790, 810, 820, 850
- Using the following box-and-whisker plot, list three pieces of information you can determine from the graph.
- In a recent survey done at a high school cafeteria, a random selection of males and females were asked how much money they spent each month on school lunches. The following box-and-whisker plots compare the responses of males to those of females. The lower one is the response by males.
- How much money did the middle 50% of each gender spend on school lunches each month?
- What is the significance of the value of $42 for females and $46 for males?
- What conclusions can be drawn from the above plots? Explain.
- Multiple Choice. The following box-and-whisker plot shows final grades last semester. How would you best describe a typical grade in that course? A. Students typically got between 82 and 88. B. Students typically got between 41 and 82. C. Students typically got around 62. D. Students typically got between 58 and 82.
Mixed Review
- Find the mean, median, mode, and range for the following salaries in an office building: 63,450; 45,502; 63,450; 51,769; 63,450; 35,120; 45,502; 63,450; 31,100; 42,216; 49,108; 63,450; 37,904
- Graph \begin{align*}g(x)=2\sqrt{x-1}-3\end{align*}.
- Translate into an algebraic sentence: The square root of a number plus six is less than 18.
- Solve for \begin{align*}y\end{align*}: \begin{align*}6(y-11)+9=\frac{1}{3} (27+3y)-16\end{align*}.
- A fundraiser is selling two types of items: pizzas and cookie dough. The club earns $5 for each pizza sold and $4 for each container of cookie dough. They want to earn more than $550.
- Write this situation as an inequality.
- Give four combinations that will make this sentence true.
- Find the equation for a line parallel to \begin{align*}x+2y=10\end{align*} containing the point (2, 1).
Extremes
The extremes are the maximum and minimum values in a data set.
Five-number summary
The numbers needed to construct a box-and-whisker plot are called the five-number summary (also called the five-point summary). The five numbers are the minimum, the lower median (Q1), the median, the upper median (Q3), and the maximum.
Line of fit
A line of fit is a straight or continuously curved line representing the trend of changes in the comparison of two data sets (or one set of bivariate data).
Median
The median of a data set is the middle value of an organized data set.
Observed data
Observed data are the values that result from computations performed on the input variable.
Outlier
In statistics, an outlier is a data value that is far from other data values.
Quartile
A quartile is each of four equal groups that a data set can be divided into.
Skewed
As with the horizontal skewing of a histogram, stem plots with an obvious skew toward one end or the other tend to indicate an increased number of outliers either lesser than or greater than the mode.
Statistical correlation
Statistical correlation is a representation of possible related changes in values between the two sets of data.
Trends
Trends in data sets or samples are indicators found by reviewing the data from a general or overall standpoint.
Uniform
A uniform shaped histogram indicates data that is very consistent; the frequency of each class is very similar to that of the others.