# 10.4: Scatterplots

**At Grade** Created by: CK-12

## Introduction

*Height and Speed in the 800 m*

Mr. Watson believes that there is a correlation between a student's height and running speed. He is so sure of it that he gathered data to support his claim. Looking at the students who run the 800 meters, he recorded the following heights and times.

\begin{align*}5^\prime 3^{\prime \prime} &= 2.26.11\\ 5^\prime 2^{\prime \prime} &= 3.01.11\\ 5^\prime 4^{\prime \prime} &= 2.23.20\\ 5^\prime 5^{\prime \prime} &= 2.20.01\\ 5^\prime 6^{\prime \prime} &= 2.18.23\\ 5^\prime 6^{\prime \prime} &= 2.18.25\end{align*}

Can Mr. Watson support his claim given this data?

**To test his claim, Mr. Watson will need to create a visual display of the data and look for a correlation between height and time. This is the perfect place for scatterplots. Use what you learn in this lesson to help Mr. Watson prove or disprove his claim.**

*What You Will Learn*

By the end of this lesson, you will be able to demonstrate the following skills.

- Make a scatterplot to display paired data sets given as an input-output table.
- Interpret scatterplots as showing positive, negative or no relationship between two sets of data.
- Use a scatterplot to draw a trend line and estimate unknown outputs for given inputs.
- Use a graphing calculator to find the best fitting line for a scatterplot.

*Teaching Time*

I. **Make a Scatterplot to Display Paired Data Sets Given as an Input-Output Table**

In the real world, many things are related to each other. For example, the more you smoke, the lower your life expectancy. Or, the more years you spend in college, the greater your future income is likely to be. Many fields try to find relationships between two variables. One tool that helps us accomplish this is the scatterplot. **A *scatterplot* is a type of graph where corresponding values from a set of data are placed as points on a coordinate plane. The relationship between the points may turn out to be positive or negative, and strong or weak.** Sometimes a scatterplot shows that there is no relationship at all. Aside from finding relationships, scatterplots are useful for predicting values based on the relationship that was revealed.

Let’s look at how a scatterplot can be applied in an example.

Example

A student had a hypothesis for a science project. He believed that the more students studied math, the better their math scores would be. He took a poll in which he asked students the average number of hours that they studied per week during a given semester. He then found out the overall percent that they received in their math classes. His data is shown in the table below:

| Study time (hours) | 4 | 3.5 | 5 | 2 | 3 | 6.5 | 0.5 | 3.5 | 4.5 | 5 | 1 | 1.5 | 3 | 5.5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Math grade (percent) | 82 | 81 | 90 | 74 | 77 | 97 | 51 | 78 | 86 | 88 | 62 | 75 | 79 | 90 |

**In order to understand this data, he decided to make a scatterplot.** In this case, our independent variable, or *input data*, is the study time because the hypothesis is that the math grade depends on the study time. That means that the math grade is the dependent variable, or *output data*. **We will place the input data on the \begin{align*}x\end{align*}-axis and the output data on the \begin{align*}y\end{align*}-axis. Also, the scales and intervals on the axes will be determined by the data. Since the greatest value on the \begin{align*}x\end{align*}-axis is 6.5, we can use intervals of 1 until we reach 7. On the \begin{align*}y\end{align*}-axis, the greatest value is 97 and, since it is a percent, it makes sense to go up to 100. Our intervals could be 10 since that would give the scatterplot a workable shape.**

Now we can graph the points on the scatterplot. In order to plot the points, we will show each one as an ordered pair (hours, percent). The first ordered pair, then, is (4, 82). Draw each of the 14 points. Remember, it takes two pieces of data to make a single point.
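The plotting steps can also be sketched in Python. This sketch is an illustration and not part of the original lesson: the function name `text_scatterplot` and its parameters are hypothetical, and it draws a rough text grid instead of a real graph. It assumes finer intervals than the ones suggested above (0.5 on the \begin{align*}x\end{align*}-axis and 5 on the \begin{align*}y\end{align*}-axis) so that nearby points are less likely to share a cell. Points that still land in the same cell are drawn as a single mark.

```python
# Paired data from the example: study time in hours and math grade in percent.
hours = [4, 3.5, 5, 2, 3, 6.5, 0.5, 3.5, 4.5, 5, 1, 1.5, 3, 5.5]
grades = [82, 81, 90, 74, 77, 97, 51, 78, 86, 88, 62, 75, 79, 90]

# It takes two pieces of data to make one point: (hours, percent).
points = list(zip(hours, grades))


def text_scatterplot(points, x_max=7, y_max=100, x_step=0.5, y_step=5):
    """Return a rough text scatterplot (hypothetical helper for illustration)."""
    n_cols = int(x_max / x_step) + 1
    n_rows = int(y_max / y_step) + 1
    grid = [[" "] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = round(x / x_step)   # nearest x interval
        row = round(y / y_step)   # nearest y interval
        grid[row][col] = "*"
    lines = []
    for row in range(n_rows - 1, -1, -1):  # largest y value goes on top
        label = row * y_step
        lines.append(f"{label:>3} | " + " ".join(grid[row]))
    lines.append("    +" + "--" * n_cols)
    return "\n".join(lines)


print("First ordered pair:", points[0])
print(text_scatterplot(points))
```

Running the sketch prints the first ordered pair, (4, 82), followed by a grid in which the marks rise from the lower left toward the upper right.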

You can see that there is a relationship between the independent and dependent values of the chart. Let’s look at how we can examine these relationships.

II. **Interpret Scatterplots as Showing Positive, Negative, or No Relationship Between Two Sets of Data**

As you may have guessed, the scatterplots are useful because their shapes may indicate a relationship between the variables. Consider, for example, the following: what happens to people’s heating bills as the temperature outside goes up? Or, what happens to the gasoline consumption in a vehicle as the miles traveled goes up?

For the first question, you might have thought that as the temperature outside goes up, people’s heating bills go down because they use their heaters less. As one variable goes up, the other goes down. How about the second question? You might imagine that as the miles traveled in a car go up, the amount of gasoline consumed also goes up. So, as one variable goes up, the other goes up, too.

**We can say that these variables correlate or are connected together. When we look at a scatterplot, we can determine the different variables and their correlation.**

In the examples above, the first illustrates a negative relationship: as one variable goes up, the other goes down. The second illustrates a positive relationship: as one variable goes up, the other goes up, too. Sometimes there is no relationship at all: as one variable goes up, the other may go up, go down, or stay the same, because the second variable is independent of the first. An example is the number of blue cars on a given road and the number of accidents on that road.

These three trends (positive, negative, and no relationship) are evident on scatterplots. This is what they look like:

Positive Relationship

As the \begin{align*}x\end{align*}-values increase, the \begin{align*}y\end{align*}-values increase. Some points may not follow an exact pattern but the overall *trend*, the general tendency or movement, is clearly from the lower left to the upper right of the plot.

Negative Relationship

In this case, as the \begin{align*}x\end{align*}-values increase, the \begin{align*}y\end{align*}-values decrease. You may notice that the slope is not as steep as in the previous plot. However, the general tendency is evident. This graph moves from the upper left to the lower right.

No Relationship

At times, there is no relationship between variables. The scatterplots of these situations will show no trend. In other words, there seems to be no definite pattern with the points; you cannot see any particular direction that they take.
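The lesson judges the type of relationship by eye. As a rough numerical cross-check, one option is the correlation coefficient, which is positive when the points rise from left to right and negative when they fall. The sketch below is an assumption added for illustration: the function names and the cutoff of 0.3 for "no clear relationship" are hypothetical choices, not rules from this lesson. It uses the study-time data from the example above and the first data table from the practice section.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: this fails with a division by zero if all x or all y values are equal.
    return sxy / sqrt(sxx * syy)


def describe(xs, ys, cutoff=0.3):
    """Label the relationship; the cutoff is an arbitrary illustrative choice."""
    r = correlation(xs, ys)
    if r >= cutoff:
        return "positive"
    if r <= -cutoff:
        return "negative"
    return "no clear relationship"


# Study time (hours) and math grade (percent) from the example.
hours = [4, 3.5, 5, 2, 3, 6.5, 0.5, 3.5, 4.5, 5, 1, 1.5, 3, 5.5]
grades = [82, 81, 90, 74, 77, 97, 51, 78, 86, 88, 62, 75, 79, 90]

# First data table from the practice section.
practice_x = [23, 18, 30, 24, 29, 45, 10, 17, 27, 39, 32, 40, 21, 14]
practice_y = [62, 72, 54, 60, 57, 30, 79, 65, 55, 34, 48, 41, 68, 76]

print(describe(hours, grades))
print(describe(practice_x, practice_y))
```

A number like this supports what the scatterplot shows but does not replace it. Always look at the plot as well.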

III. **Use a Scatterplot to Draw a Trend Line and Estimate Unknown Outputs for Given Inputs**

Scatterplots are as useful for finding a relationship between variables as they are for making predictions. Here, we will make a *trend line*, **or a line that best describes the data on a scatterplot**, in order to estimate unknown outputs for given inputs.

A trend line is a straight line that best represents the points on a scatterplot. The trend line may go through some points but need not go through them all. The trend line is used to show the pattern of the data. This trend line may show a positive trend or a negative trend. However, if there is no relationship, then no trend line can be adequately drawn.

Your trend line is your best approximation so it may be different from others’.

**The line on this graph is the *trend line*; it is the line that best describes the data. About half of the points should be on each side of the line.**

**You may notice that outliers are practically ignored when a trend line is drawn. This trend line goes from the lower left to the upper right and shows a positive relationship.**

**Notice that this trend goes down and indicates a negative correlation or relationship. You could also see that it goes off of the chart. Therefore, we could use a chart like this one to predict the trend. It is likely that the trend will continue to go down.**

IV. **Use a Graphing Calculator to Find the Best-Fitting Line for a Scatterplot**

You probably recognize this coordinate plane. In previous lessons, you used points on a graph to find the equation of a line. A scatterplot is created on a coordinate plane and its trend line, just like any non-vertical line on a graph, has a function rule. Using the same methodology, you can write the equation of a trend line. You can use slope and \begin{align*}y\end{align*}-intercept, for instance.
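As a worked illustration, suppose a hand-drawn trend line for the study-time example passes through the points (2, 70) and (6, 94). These two points are an assumption chosen for illustration; your own trend line may pass through different points and give a slightly different equation.

\begin{align*}m &= \frac{94-70}{6-2} = \frac{24}{4} = 6\\ b &= 70 - 6(2) = 58\\ y &= 6x+58\end{align*}

With this equation, an input of 4 hours gives an estimated output of \begin{align*}6(4)+58=82\end{align*} percent.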

Graphing Calculators

If you got the basic idea, you see that scatterplots are powerful tools. I’m sure you know that computers and technology are even more powerful at times. Scientists in the real world rarely create scatterplots on a piece of paper and compute equations by hand. They use computer programs that can approximate the trend line much more accurately than you and I can with our eyes.

You can make scatterplots on your graphing calculator, if you have one. Then, you can compute the trend line, which is called the *linear regression* in some models. Your graphing calculator can put the equation in \begin{align*}y=mx+b\end{align*} form or \begin{align*}Ax+By=C\end{align*} form, depending on the mode you choose. Then, you can choose any input values for which your calculator will tell you the output values.

Every graphing calculator is different. The key combinations needed for these operations are shown in your guidebook. When you study more advanced math, these operations will be required. Try it now and see how close the line from your graphing calculator comes to the one you computed by hand.
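If you do not have a graphing calculator, a short program can do a similar job. The sketch below assumes the common least-squares method for linear regression. The function names `least_squares_line` and `predict` are hypothetical, and the data are the study-time values from the example earlier in this lesson.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Estimate the output for a given input using the trend line."""
    return slope * x + intercept


# Study time (hours) and math grade (percent) from the example.
hours = [4, 3.5, 5, 2, 3, 6.5, 0.5, 3.5, 4.5, 5, 1, 1.5, 3, 5.5]
grades = [82, 81, 90, 74, 77, 97, 51, 78, 86, 88, 62, 75, 79, 90]

m, b = least_squares_line(hours, grades)
print(f"y = {m:.2f}x + {b:.2f}")
print(f"Estimated grade for 4 hours of study: {predict(m, b, 4):.1f}")
```

For this data, the slope comes out a little above 6 and the \begin{align*}y\end{align*}-intercept a little above 57. In other words, each extra hour of weekly study goes with roughly 6 more percentage points in this sample.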

## Real-Life Example Completed

*Height and Speed in the 800 m*

**Here is the original problem from the introduction. Reread it and then create a scatterplot to prove that there is or is not a correlation between height and speed.**

Mr. Watson believes that there is a correlation between a student's height and running speed. He is so sure of it that he gathered data to support his claim. Looking at the students who run the 800 meters, he recorded the following heights and times.

\begin{align*}5^\prime 3^{\prime \prime} &= 2.26.11\\ 5^\prime 2^{\prime \prime} &= 3.01.11\\ 5^\prime 4^{\prime \prime} &= 2.23.20\\ 5^\prime 5^{\prime \prime} &= 2.20.01\\ 5^\prime 6^{\prime \prime} &= 2.18.23\\ 5^\prime 6^{\prime \prime} &= 2.18.25\end{align*}

Can Mr. Watson support his claim given this data?

*Remember there are two parts to your answer.*

*Solution to Real-Life Example*

**First, let’s put the data in a table so that we can see it clearly.**

| Height | Time |
|---|---|
| \begin{align*}5^\prime 6^{\prime \prime}\end{align*} | 2.18.23 |
| \begin{align*}5^\prime 6^{\prime \prime}\end{align*} | 2.18.25 |
| \begin{align*}5^\prime 5^{\prime \prime}\end{align*} | 2.20.01 |
| \begin{align*}5^\prime 4^{\prime \prime}\end{align*} | 2.23.20 |
| \begin{align*}5^\prime 3^{\prime \prime}\end{align*} | 2.26.11 |
| \begin{align*}5^\prime 2^{\prime \prime}\end{align*} | 3.01.11 |

**Looking at this data, you can suspect that there will be a correlation between speed and height. Let’s look at a scatterplot of the data.**

**You can see that as height increases, the 800 meters time decreases, so the taller runners in this data set were faster. This supports a positive correlation between height and speed in the 800 meters event.**
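To plot this data, the heights and times first need to be turned into plain numbers. The sketch below shows one way to do that. It assumes that a time such as 2.26.11 means 2 minutes and 26.11 seconds, which the lesson does not state. The function names are hypothetical, and the values are the ones given in the problem statement.

```python
def to_inches(feet, inches):
    """Convert a height in feet and inches to inches."""
    return 12 * feet + inches


def to_seconds(time_text):
    """Convert 'minutes.seconds.hundredths' to seconds (assumed format)."""
    minutes, seconds, hundredths = time_text.split(".")
    return 60 * int(minutes) + int(seconds) + int(hundredths) / 100


# (feet, inches) and 800 m time, as listed in the problem statement.
runners = [
    ((5, 3), "2.26.11"),
    ((5, 2), "3.01.11"),
    ((5, 4), "2.23.20"),
    ((5, 5), "2.20.01"),
    ((5, 6), "2.18.23"),
    ((5, 6), "2.18.25"),
]

heights = [to_inches(*height) for height, _ in runners]
times = [to_seconds(time_text) for _, time_text in runners]

# Slope of the least-squares trend line for time (seconds) against height (inches).
mean_h = sum(heights) / len(heights)
mean_t = sum(times) / len(times)
sxy = sum((h - mean_h) * (t - mean_t) for h, t in zip(heights, times))
sxx = sum((h - mean_h) ** 2 for h in heights)
slope = sxy / sxx

for h, t in sorted(zip(heights, times)):
    print(f"{h} in -> {t:.2f} s")
print("Trend:", "time falls as height rises" if slope < 0 else "time rises with height")
```

Under these assumptions the slope is negative: time goes down as height goes up, which is another way of saying that speed goes up with height. With only six runners, this is a small sample, so treat the result as support for the claim and not as a guarantee.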

## Vocabulary

Here are the vocabulary words that are found in this lesson.

- **Scatterplot**: a graph where corresponding values are placed on the coordinate plane and the relationship between the values can be determined.
- **Input Value**: the \begin{align*}x\end{align*} value - it is the independent value
- **Output Value**: the \begin{align*}y\end{align*} value - it is the dependent value
- **Positive Correlation**: a scatterplot where the points plotted go up from left to right.
- **Negative Correlation**: a scatterplot where the points plotted go down from left to right.
- **No Correlation**: a scatterplot where there isn’t a clear relationship between the dependent and independent values.

## Time to Practice

Directions: Use what you have learned to answer each question or complete each task.

- Make a scatterplot to display the data set in the table:

| x | 23 | 18 | 30 | 24 | 29 | 45 | 10 | 17 | 27 | 39 | 32 | 40 | 21 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| y | 62 | 72 | 54 | 60 | 57 | 30 | 79 | 65 | 55 | 34 | 48 | 41 | 68 | 76 |

What type of relationship would you predict for the following pairs of variables: positive, negative, or no relationship?

- altitude vs. the amount of oxygen in the atmosphere
- number of customers vs. profit
- number of siblings vs. grade point average

What type of relationship is shown in the following scatterplots?

- Use the following table to make a scatterplot.

| x | 3 | 6 | 8 | 14 | 18 | 23 | 29 | 32 | 37 |
|---|---|---|---|---|---|---|---|---|---|
| y | 55 | 50 | 46 | 40 | 37 | 18 | 26 | 20 | 18 |

- Draw a trend line.
- Identify the type of relationship.
- Then predict the output values for the input values 10 and 40.

A zoologist studied the relationship between the distance from a lake in kilometers and the number of felines per 100 square kilometers. She found the following data:

| Distance from lake (km) | 3 | 1 | 4 | 3 | 4.5 | 5 | 0.5 | 2 | 2.5 | 3.5 | 8 | 6 | 5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of felines | 5 | 10 | 2 | 8 | 6 | 5 | 8 | 8 | 6 | 6 | 0 | 2 | 4 |

- Make a scatterplot that illustrates this data.
- Draw a trend line.
- Use slope and \begin{align*}y\end{align*}-intercept to calculate the equation of your trend line.
- Use your equation to estimate the number of felines 1.5 kilometers from a lake.
- Use your graphing calculator to create a scatterplot and calculate the linear regression.

Directions: Define the following terms.

- Input value
- Output value
- Positive correlation
- Negative correlation
- No correlation