## Introduction

*Team Shirts*

“I can’t believe it!” Jacob exclaimed, trying on his new long-sleeved team shirt for the track team.

“What’s the matter?” his friend Mattias asked.

“This shirt doesn’t fit, and this always happens to me. I am going to figure out why!” Jacob said, taking off the shirt whose sleeves were, once again, too short.

After Jacob’s anger had subsided, he started to think about this question. Was he the only one with this problem? Jacob decided to find out by measuring his peers’ heights and arm lengths. He measured in inches and created a table like this:

**Now that Jacob has collected his data, he needs to create a display. Which one should he create? Think about this question throughout this lesson, and at the end you will help Jacob create the appropriate display for his data.**

*What You Will Learn*

- Associate given conclusions about sets of data with given line graphs, scatterplots, circle graphs, bar graphs, stem-and-leaf plots, box-and-whisker plots and histograms.
- Select and justify appropriate displays of given categorical and numerical data.
- Collect, organize and compare displays of real-world categorical and numerical data.

*Teaching Time*

I. **Associate Given Conclusions about Sets of Data with Given Line Graphs, Scatterplots, Circle Graphs, Bar Graphs, Stem-and-Leaf Plots, Box-and-Whisker Plots and Histograms**

As you have seen, data can exist in many forms. A frequent goal of collecting data is drawing conclusions based on the data. The best conclusions correspond with trends that the data shows. Depending on the data you have, certain types of displays are more appropriate or more effective than others, so we must choose a display that presents the data logically. In a world so full of data, data must also be collected and organized carefully to support sound decision-making.

Sometimes two people look at the same graph and draw completely different conclusions. Graphs can show us many things, but the conclusions that we draw from them are often a matter of opinion. Part of the purpose of a graph is to support inferences, and those inferences must be based on the data.

Let’s look at an example.

Example

Some scientists from the EPA were studying the amount of dissolved oxygen in a lake over several weeks. This graph was created from the data they collected.

They studied the graph and came up with the following conclusions:

- The amount of dissolved oxygen fluctuated over the 5 weeks.
- The average amount of dissolved oxygen has been about 110 parts per million over 5 weeks.
- The dissolved oxygen in Week 6 will be about 60 parts per million.

**Do you agree with their conclusions?**

**The first two conclusions are clearly shown by the data. However, the prediction about Week 6, conclusion number 3, is not convincingly shown. The level of dissolved oxygen does seem to fluctuate and has gone slightly higher over time and then lower, but it is not enough evidence to be sure that the Week 6 dissolved oxygen will go even lower.**

Example

The boss at an office took a survey of people’s preference for lunch because he wanted to treat the office to a lunch for the holidays. His data is shown below.

He ponders the following conclusions:

- A lot of people like Chinese food.
- Nobody likes Italian food.
- If I order sandwiches, then 80% of the staff will be unhappy.
- If I get some pizzas and some Chinese food, the majority will have their preference.

**Do you agree with his conclusions?**

**According to the graph, Conclusion 1 is supported because the largest number of people selected Chinese food.**

**The graph does not support Conclusion 2. Perhaps Italian food was not an option on the survey. Also, just because you have a preference does not mean you don’t like the other choices.**

**For the same reason, Conclusion 3 is not supported, either. Just because you may prefer Chinese food does not mean you do not like sandwiches.**

**Conclusion 4 is supported because 25% prefer pizza and 35% prefer Chinese food, so 60%, a majority, will have their preference.**

Example

A tally of the animals at a local shelter was taken so that children visiting on a field trip could see. Here are the results.

When the children looked at the bar graph, they shouted out:

Bobby: “Nobody likes dogs!”

Lisa: “I didn’t know that hamsters and rats were the same thing!”

Miguel: “Everyone must have taken all the turtles!”

Mona: “They must have mostly food for dogs and cats!”

The teacher responded to their comments patiently:

“Bobby, just because they have a lot of dogs doesn’t mean people don’t like them. Since dogs are the most common pet, it makes sense that there would be more dogs at the shelter.”

“Lisa, it looks like hamsters and rats were tallied in the same category, maybe because they are kept in the same cage or given similar food. This bar graph does not mean they are the same, though.”

“Miguel, since turtles are a less common pet, the shelter probably has fewer turtles. It doesn’t mean that they had a lot that were taken already.”

“Yes, Mona. It does make sense that having so many dogs and cats compared to other animals requires much more food for them than for the other animals. Also, they are bigger animals than the others, generally, so they eat more, too. Don’t they?”

**Once again, a data display was used to make connections. The children used the graph from the animal shelter to draw conclusions.**

Example

A market research firm collected data on the ages of customers at a surf shop. Look at the stem-and-leaf plot they created based on the data:

These are the recommendations they made for marketing:

- Most of your customers are in the teens and twenties so focus advertising on them.
- People older than 60 don’t like surfing. Forget about them.
- Men like surfing more than women so advertise in men’s magazines.

Sally looked at the same graph but didn’t agree with all of their conclusions.

- Yes, most of my customers are younger than 30, so I should focus advertising efforts on them. However, I may also want to draw in customers from age ranges that are not already part of my clientele.
- This data does not show that people over 60 do not like surfing. Not all of my customers are represented here. Also, although generally surfing may be too strenuous for the elderly, it doesn’t mean that they don’t like it.
- This tally does not consider gender. The conclusion about women not liking surfing is not based on the tally but on the ridiculous bias of the marketing company.

**There are a couple of things that we can decide after looking at these examples. First, you can see how useful different displays of data are depending on the information that has been collected. Then you can see that we can use many different displays. The key is to think about what information you have gathered and how you want the information displayed. Then move on to building the correct visual. In the next section you will see how selecting the appropriate display is done.**

II. **Select and Justify Appropriate Displays of Given Categorical or Numerical Data**

There are many ways to display data, so how do you know which is the best way to display a given data set? Some choices are simply a matter of preference, but most types of data have displays that suit them best.

**Types of Data**

**Two major types of data are categorical data and numerical data.** *Categorical data* **refers to data in which the independent variable is assigned a name, not a number.** For example, you may collect data by month (May, June, July, and August), or you may tally people as male or female. Sometimes numbers are used to name categories. For example, players on a team are given numbers on their shirts. Those numbers only identify who is who; it would not make sense to use mathematical operations with them. Generally, categorical data is simply tallied.

The second type of data is numerical. *Numerical data* **measures some characteristic of the variable. Examples of data that are measured numerically include time, height, weight, length, volume, density, and force. Anything that can be measured with a numerical system is numerical data.**
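To make the distinction concrete, here is a minimal Python sketch. The birth months and heights are made-up values for illustration only, not data from this lesson. The categorical values are tallied, while the numerical values are averaged.

```python
from collections import Counter
from statistics import mean

# Hypothetical data, invented for illustration only.
birth_months = ["May", "June", "May", "August", "July", "May"]  # categorical
heights_in = [60, 62, 58, 65, 61]                               # numerical, in inches

# Categorical data is tallied: count how often each name appears.
month_tally = Counter(birth_months)

# Numerical data can be used in calculations, such as an average.
average_height = mean(heights_in)

print(month_tally)
print(average_height)
```

Notice that averaging the months would be meaningless, just as averaging the numbers on team shirts would be.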

**Types of Displays**

In the previous section, we considered conclusions based on line graphs, scatterplots, circle graphs, bar graphs, stem-and-leaf plots, box-and-whisker plots, and histograms.

There are many more types of displays of data, but let’s stick with these for now. Although there are few exact rules about data displays, each type of display has certain instances for which it is ideal. Also, there are instances where certain displays are inappropriate. Furthermore, the best display of data depends on what information you hope to get from it.

**Line graphs** are generally used to show change over time. **Scatterplots** are used to show a trend or a relationship (correlation) between two variables. **Circle graphs** are best for data that represents one whole, or one hundred percent, of something. **Bar graphs** are excellent for categorical data. **Stem-and-leaf plots** are useful to represent ranges and can be used to illustrate ranges of two variables. **Box-and-whisker plots** are used to show how spread out data is and where the bulk of the data lies. **Histograms** show how often numerical data falls within each of a set of equal intervals.

*Write down each example of a data display and the best use for each.*
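One way to keep these guidelines handy is a simple lookup table. The sketch below is only a summary of the paragraph above; the goal labels and the function name `suggest_display` are hypothetical names chosen for this example.

```python
# Guidelines from the paragraph above, stored as goal -> display.
# The goal labels are hypothetical wording chosen for this sketch.
BEST_DISPLAY = {
    "change over time": "line graph",
    "relationship between two variables": "scatterplot",
    "parts of one whole": "circle graph",
    "categorical data": "bar graph",
    "ranges of values": "stem-and-leaf plot",
    "spread of the data": "box-and-whisker plot",
}

def suggest_display(goal):
    """Return the suggested display for a goal, or None if the goal is not listed."""
    return BEST_DISPLAY.get(goal)

print(suggest_display("change over time"))
print(suggest_display("categorical data"))
```

These are guidelines rather than exact rules, so a goal that is not in the table simply returns `None`.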

III. **Collect, Organize and Compare Displays of Real-World Categorical and Numerical Data**

Collecting data is not a task for only scientists, mathematicians, and researchers. You can do it, too. First, it is important to organize the data, which is usually done in a table. Second, it is important to know what type of measure(s) and unit(s) you will be using. Is your variable categorical or numerical? Then you can decide what type of display is best for your data and analyze your results.

Example

Your teacher complains about having more students than all of the other classes. You wonder if this is true and ask your friends to tally the number of students in their classes. At lunchtime they tell you the following information.

Room 301 has 32 students. Room 302 has 27 students. Room 303 has 36 students. Room 304 has 30 students. Your room, Room 305, has 34 students.

This data is hard to compare if it is not organized in some fashion. You decide to make a table.

**This is categorical data, and it is best displayed as a bar graph.**

Is the teacher right about his claim? Not exactly. Three classes have fewer students than his, but one class has more. Although one class has only 27 students, the class enrollments are not that different.
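The comparison above can also be checked with a short Python sketch. It uses the class sizes from the example; the variable names and the text-based bar graph are just one possible way to organize and display the data.

```python
# Class sizes reported at lunchtime (from the example above).
class_sizes = {
    "Room 301": 32,
    "Room 302": 27,
    "Room 303": 36,
    "Room 304": 30,
    "Room 305": 34,
}

my_room = "Room 305"
my_size = class_sizes[my_room]

# Compare every other room with Room 305.
smaller = [room for room, n in class_sizes.items() if n < my_size]
larger = [room for room, n in class_sizes.items() if n > my_size]

# A simple text bar graph: one '#' per student.
for room, n in class_sizes.items():
    print(f"{room} | {'#' * n} {n}")

print("Rooms with fewer students:", smaller)
print("Rooms with more students:", larger)
```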

**Now let’s look at the problem from the introduction and work on solving it.**

## Real-Life Example Completed

*Team Shirts*

**Here is the original problem once again. Reread it and then create a data display.**

“I can’t believe it!” Jacob exclaimed, trying on his new long-sleeved team shirt for the track team.

“What’s the matter?” his friend Mattias asked.

“This shirt doesn’t fit, and this always happens to me. I am going to figure out why!” Jacob said, taking off the shirt whose sleeves were, once again, too short.

After Jacob’s anger had subsided, he started to think about this question. Was he the only one with this problem? Jacob decided to find out by measuring his peers’ heights and arm lengths. He measured in inches and created a table like this:

*Now create a scatterplot for the data display.*

*Solution to Real-Life Example*

**By using a scatterplot, Jacob can compare the two variables, which are both numerical data, at once to see if there is a relationship. Here are the results.**

**Jacob’s measurements were (62, 27). It looks like his measurements are slightly different from those of the typical student. For this reason, his shirts don’t seem to fit quite right.**
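Here is a rough Python sketch of how a relationship like this could be checked with numbers. Only Jacob’s point (62, 27) comes from the lesson; the peer measurements are invented for illustration and are not Jacob’s actual table. The correlation coefficient and the best-fit line are standard tools that this lesson does not cover, so treat them as an assumption about how one might go further.

```python
from math import sqrt

# Hypothetical peer measurements (height, arm length) in inches.
# These values are invented for illustration; they are not Jacob's table.
peers = [(58, 24.0), (60, 25.0), (61, 25.5), (63, 26.0), (64, 27.0), (66, 28.0)]
jacob = (62, 27)  # Jacob's measurements, from the lesson

def summarize(points):
    """Return (r, slope, intercept) for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    r = sxy / sqrt(sxx * syy)           # correlation coefficient
    slope = sxy / sxx                   # slope of the best-fit line
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

r, slope, intercept = summarize(peers)

# Arm length the best-fit line gives for someone of Jacob's height.
typical_arm = slope * jacob[0] + intercept

print(f"correlation r = {r:.2f}")
print(f"typical arm length at {jacob[0]} in: {typical_arm:.1f} in")
print(f"Jacob's arm length: {jacob[1]} in")
```

With these invented values, the correlation is strongly positive and Jacob’s arm length is about an inch longer than the line gives for his height, which matches the story of sleeves that are too short.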

## Vocabulary

Here are the vocabulary words that are found in this lesson.

- **Numerical Data**: Any data that is measured in numbers.
- **Categorical Data**: Data that is assigned a name and not a number.

## Time to Practice

Directions: Answer each question about data displays.

- What is considered numerical data?
- What is considered categorical data?
- If you were looking for a relationship between two values, would you use a scatterplot or a line graph?
- If there was a relationship between the data would you have a positive correlation or a negative correlation?
- The words positive correlation and negative correlation are associated with which type of data display?
- If you had an outlier, then would you have a scatterplot or a box-and-whisker plot?
- What is a quartile?
- Which type of data display is a quartile associated with?
- If you were watching a trend over time would you use a line graph or a scatterplot?
- If you were comparing two trends and their results which data display would make the most sense?
- What is the mean?
- What is the median?
- What is the mode?

Directions: Answer each question.

Do the conclusions fit the graph? Explain your reasoning.

- Conclusion 1: Rats are the most feared creature. Conclusion 2: Rats are the most dangerous creature. Conclusion 3: Nobody is afraid of bats.
- Conclusion 1: Prices have increased every year for 10 years. Conclusion 2: Prices of gasoline increased more rapidly after 2000. Conclusion 3: Prices will be even higher in 2008.

Create a graph for the following sets of data. Explain your choice.

- A class survey shows the following chocolate preferences: 16 milk chocolate, 10 dark chocolate, 4 white chocolate

Brainstorm ideas of data that you can take in your class or in your life. Take the data and create a data display. Then answer:

- How did you organize the data?
- What units did you use?
- Was it categorical or numerical?
- Why did you choose that data display?
- What conclusions can be drawn based on your data display?