Have you ever thought about how sales can change? Take a look at this dilemma about the sales of television sets.
The data values in the table below show the number of televisions sold at a department store each month for nine months. Create a box-and-whisker plot to display the data.
Do you know how to create this data display? Pay attention and this Concept will teach you all that you need to know.
At times it is useful to get a general idea of how data cluster together.
Box-and-whisker plots display the distribution of data items along a number line.
The data are divided into four equal parts, separated by points called quartiles.
You can also see the smallest data point, the extreme minimum, and the largest data point, the extreme maximum.
A box-and-whisker plot is created by determining five points.
First, we place the data in order from smallest to largest. Then we create a number line that shows the range of the data using equal intervals. We use the median as the middle point of the box-and-whisker plot, which splits the data in half. Next, we calculate the median of each half; these two values are the quartiles, and together with the median they separate the data into quarters. Finally, we use the lowest datum and the highest datum as our endpoints, or extremes. Boxes are drawn between the quartiles, and whiskers are drawn to the extremes.
In order to construct a box-and-whisker plot, you must calculate several statistical measures. However, a box-and-whisker plot that is already constructed can quickly supply statistical measures by looking at the five points.
The first and last points give you the extremes of the data. The third, or middle, point gives you the median. The second and fourth points, between the median and the extremes, give you the quartiles.
The interquartile range is the range between the first quartile and the third quartile. This shows you where the middle half of the data is. It can be calculated by subtracting the first quartile from the third quartile. Finally, the outliers, data items that are far away from the general trend, can be located as extremes that cause the whiskers to be exceptionally long. Data do not always have outliers. If there isn't a single point that is exceptionally far from the other points, then an outlier doesn't exist.
a) The extremes in this data set are approximately 35 and 129.
b) The median is approximately 95.
c) The first quartile is approximately 82 and the third quartile approximately 104.
d) The interquartile range, then, is 104 – 82 or 22.
e) Finally, the extreme minimum, 35, appears to be an outlier as the left whisker is very long compared to the rest of the plot.
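The judgment in part e) is made by eye. One common convention, which this Concept does not state and which is used here only as an assumption, flags a value as an outlier when it lies more than 1.5 times the interquartile range beyond a quartile. Here is a minimal Python sketch that applies that rule to the approximate values from parts a) through d); the names are hypothetical.

```python
# Approximate values read from the plot in parts a) through d).
minimum, q1, median, q3, maximum = 35, 82, 95, 104, 129

iqr = q3 - q1  # interquartile range: 104 - 82 = 22

# Assumed convention (not stated in this Concept): the 1.5 * IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

def is_outlier(value):
    """Return True when value falls outside the two fences."""
    return value < lower_fence or value > upper_fence

print("IQR:", iqr)
print("Fences:", lower_fence, upper_fence)
print("Minimum is an outlier:", is_outlier(minimum))
print("Maximum is an outlier:", is_outlier(maximum))
```

Under this assumed rule the fences fall at 49 and 137, so the minimum of 35 is flagged and the maximum of 129 is not, which agrees with the reading of the long left whisker.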
As you know, outliers are points that are unusually large or small compared to the rest of the data. When we discuss measures of central tendency like mean, median, and mode, we must also remember that in the real world there are many exceptions. Sometimes when we consider data, we might choose to remove the outliers in order to draw better conclusions based on the data. Take a look at how removing an outlier can affect the interpretation of the data.
Shanda runs on her school’s track team. They recently ran a 100 meter dash at a track meet and recorded official times. These are the results in seconds: 11.7, 10.8, 11.1, 10.9, 11.7, 11.6, 12.0, 19.6, 12.2, 11.6, 11.5, 11.6, 11.0, 12.0, 11.6, 11.5, 11.7, 11.3, 12.3, 10.1.
Shanda’s time was 11.1 and she wants to know how she compares to the rest of her team. She will use a box-and-whisker plot to help figure this out. Here are the steps to this process.
Step 1: She places the data in order.
10.1, 10.8, 10.9, 11.0, 11.1, 11.3, 11.5, 11.5, 11.6, 11.6, 11.6, 11.6, 11.7, 11.7, 11.7, 12.0, 12.0, 12.2, 12.3, 19.6
Step 2: She draws a number line that includes the extremes.
Step 3: She finds the median, 11.6, and places a point on the number line.
Step 4: She finds the first and third quartiles, 11.2 and 11.85.
Step 5: She draws boxes between the quartiles and the median.
Step 6: She places the extremes, 10.1 and 19.6, on the number line with points.
Step 7: She draws whiskers from the quartiles to the extremes.
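The numbers in Steps 1 through 6 can be checked with a short Python sketch. It assumes the method described earlier, where each quartile is the median of one half of the ordered data, and it relies on the count being even so that the data split into two halves of ten. The variable names are hypothetical.

```python
from statistics import median

# Official 100 meter dash times in seconds, as recorded at the meet.
times = [11.7, 10.8, 11.1, 10.9, 11.7, 11.6, 12.0, 19.6, 12.2, 11.6,
         11.5, 11.6, 11.0, 12.0, 11.6, 11.5, 11.7, 11.3, 12.3, 10.1]

ordered = sorted(times)           # Step 1: place the data in order
half = len(ordered) // 2          # 20 values split into two halves of 10

middle = median(ordered)          # Step 3: median of all the data
q1 = median(ordered[:half])       # Step 4: median of the lower half
q3 = median(ordered[half:])       # Step 4: median of the upper half
low, high = ordered[0], ordered[-1]  # Step 6: the extremes

# Round to two decimals so floating-point noise does not clutter the display.
summary = [round(v, 2) for v in (low, q1, middle, q3, high)]
print(summary)
```

The five values it prints are the five points of Shanda's plot: the lower extreme, the first quartile, the median, the third quartile, and the upper extreme.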
When Shanda analyzes the box-and-whisker plot, she finds that her time, 11.1 seconds, is just below the first quartile. She knows that her friend Teresa is super fast; Teresa has already been offered track scholarships from major universities. Shanda doesn’t think she can realistically catch up to Teresa. Another teammate, Lisa, fell during the race but got up and continued to the finish line. Shanda believes that neither Teresa’s nor Lisa’s time is useful in gauging her own speed. She decides to look at the same data with those two outliers removed.
Here’s her new data:
10.8, 10.9, 11.0, 11.1, 11.3, 11.5, 11.5, 11.6, 11.6, 11.6, 11.6, 11.7, 11.7, 11.7, 12.0, 12.0, 12.2, 12.3
She recalculates her statistical measures and creates a new box-and-whisker plot:
Extremes: 10.8 and 12.3
First and third quartiles: 11.3 and 11.7
When the two outliers are removed, Shanda can see that most of the data is grouped closely together. Her time, 11.1, is still in the first quartile. However, her competition is tight because the rest of the team isn’t far behind. She is proud of her time and motivated to keep ahead of the crowd.
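To see how removing the two outliers affects the measures of central tendency, the mean and the median can be computed for both versions of the data. This Python sketch is an added illustration and is not part of Shanda's own steps. The variable names are hypothetical.

```python
from statistics import mean, median

# All 20 official times in order, in seconds.
all_times = [10.1, 10.8, 10.9, 11.0, 11.1, 11.3, 11.5, 11.5, 11.6, 11.6,
             11.6, 11.6, 11.7, 11.7, 11.7, 12.0, 12.0, 12.2, 12.3, 19.6]

# Drop the fastest time (10.1) and the slowest time (19.6).
trimmed = all_times[1:-1]

for label, data in (("all 20 times", all_times), ("outliers removed", trimmed)):
    print(label, "| mean:", round(mean(data), 2), "| median:", round(median(data), 2))
```

For this data the mean falls from about 11.89 seconds to about 11.56 seconds, while the median stays at 11.6 seconds. The mean is pulled toward the 19.6 second time, but the median depends only on the middle of the ordered list.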
Answer each question about box-and-whisker plots.
What is a value called when it is found very far away from the median?
Solution: An outlier
Will removing an outlier change the median or the mean?
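Solution: It will always change the mean, because the outlier is no longer part of the average. The median may change too, but it is far less sensitive to outliers and sometimes stays the same. In Shanda's data, for example, the median is 11.6 both before and after the two outliers are removed.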
Does a box-and-whisker plot always have quartiles?
Solution: Yes. It is organized around the quartiles and the median.
Now let's go back to the dilemma from the beginning of the Concept.
Step 1: To determine the median of the set of data, arrange the data in order from least to greatest. Identify the data value in the middle of the data set. For this set of data, 102 is the median.
89, 91, 95, 98, 102, 108, 110, 118, 152
Step 2: Identify the median of the lower quartile, the values below 102. Since two data values, 91 and 95, share the middle position, find their mean. The median of the lower quartile is 93.
Step 3: Identify the median of the upper quartile, the values above 102. Again, find the mean of the two data values that share the middle position, 110 and 118. The median of the upper quartile is 114.
Step 4: Draw a number line. The first value on the number line should be near the smallest number in the data set. In this case, the smallest number is 89. Therefore, the number line will start at 80. The last value on the number line should be near the largest number in the set of data. The largest number in the data set is 152. Therefore, the number line will end at 160. In this case, label the number line by tens.
The smallest value, 89, is marked with an “I” at the end of the whisker in the lower quartile. The largest value, 152, is marked with an “I” at the end of the whisker in the upper quartile.
The median and the medians of the lower and upper quartiles are each marked with a “+.”
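The layout described in Step 4 can be approximated as a text drawing. This Python sketch is only a rough stand-in for a hand-drawn plot. It assumes one character per television sold, uses the “I” and “+” markers described above, and adds “-” for the whiskers and “=” for the box as hypothetical choices.

```python
# Five points from Steps 1 to 3: the extremes, the quartile medians, and the median.
minimum, q1, median, q3, maximum = 89, 93, 102, 114, 152

# Number line from Step 4: starts at 80, ends at 160, labeled by tens.
START, END = 80, 160

def column(value):
    """Column of a value when one character stands for one unit."""
    return value - START

# Start with a blank row, then draw whiskers, box, and markers over it.
row = [" "] * (END - START + 1)
for c in range(column(minimum), column(maximum) + 1):
    row[c] = "-"                  # whiskers run between the extremes
for c in range(column(q1), column(q3) + 1):
    row[c] = "="                  # box runs between the quartile medians
for value in (minimum, maximum):
    row[column(value)] = "I"      # extremes are marked with "I"
for value in (q1, median, q3):
    row[column(value)] = "+"      # median and quartile medians are marked with "+"

# Axis row: a label every ten units, each padded to ten characters.
axis = "".join(str(tick).ljust(10) for tick in range(START, END + 1, 10))

plot = "".join(row)
print(plot)
print(axis)
```

The right whisker comes out much longer than the left one, which reflects how far 152 sits from the rest of the monthly sales.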
Here is one for you to try on your own.
The town hall held its annual 5k. Here are the times of the finishers: 12 minutes, 13 minutes, 14 minutes, 15 minutes, 16 minutes, 17 minutes, 18 minutes, 19 minutes, 21 minutes, 23 minutes, and 26 minutes.
Create a box-and-whisker plot to show the data.
First, let's analyze the data.
17 is the median time.
12 - 16 is the lower quartile with 14 being the median of that quartile.
18 - 26 is the upper quartile with 21 being the median of that quartile.
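This analysis leaves the median, 17, out of both halves because the number of data values is odd. A minimal Python sketch of that splitting rule follows. The function name `five_number_summary` is hypothetical, and the code assumes the median-of-halves method used in this Concept rather than any other quartile convention.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method.

    When the count is odd, the middle value is left out of both halves.
    """
    ordered = sorted(data)
    n = len(ordered)
    lower = ordered[: n // 2]         # values before the middle position
    upper = ordered[(n + 1) // 2:]    # values after the middle position
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Finishing times in minutes from the 5k.
finish_times = [12, 13, 14, 15, 16, 17, 18, 19, 21, 23, 26]
print(five_number_summary(finish_times))
```

With eleven values, `lower` holds 12 through 16 and `upper` holds 18 through 26, so the function returns the same five points found in the analysis. When the count is even, the two slices simply split the data in half.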
Here is our box-and-whisker plot.
Directions: Define the following terms.
- Interquartile Range
Directions: Use the box-and-whisker plot to answer the following questions.
- What is the median value?
- Identify the quartiles.
- Identify the interquartile range.
- Identify any extremes.
- Identify any outliers.
Directions: Use the data set to answer each question.
26, 27, 29, 30, 32, 35, 41, 42, 44
- What is the median value?
- Identify the median of the lower quartile.
- Identify the median of the upper quartile.
- Identify the lower extreme.
- Identify the upper extreme.