
Scatter Plots

Identify positive, negative, and no correlation relations

Make a Scatterplot to Represent Data

Have you ever thought about height and speed? Take a look at this dilemma.

Mr. Watson believes that there is a correlation between a student's height and running speed. He is so sure of it that he gathered data to support his claim. For the students who run the 800 meters, he recorded the following heights and times.

5'3" = 2.26.11
5'2" = 3.01.11
5'4" = 2.23.20
5'5" = 2.20.01
5'6" = 2.18.23
5'6" = 2.18.25

Can you create a scatterplot of this data? You will learn how to do this in this Concept.

Guidance

In the real world, many things are related to each other. For instance, the more you smoke, the lower your life expectancy tends to be, and the more years you spend in college, the greater your future income tends to be. Many fields try to find relationships between two variables.

One tool that helps us accomplish this is the scatterplot.

A scatterplot is a type of graph where corresponding values from a set of data are placed as points on a coordinate plane. A relationship between the points is sometimes shown to be positive, negative, strong, or weak.

Sometimes a scatterplot shows that there is no relationship at all. Aside from finding relationships, scatterplots are useful in predicting values based on the relationship that was revealed.

Let’s look at how a scatterplot can be applied to a situation.

A student had a hypothesis for a science project. He believed that the more students studied math, the better their math scores would be. He took a poll in which he asked students the average number of hours that they studied per week during a given semester. He then found out the overall percent that they received in their math classes. His data is shown in the table below:

Study time (hours): 4, 3.5, 5, 2, 3, 6.5, 0.5, 3.5, 4.5, 5, 1, 1.5, 3, 5.5
Math grade (percent): 82, 81, 90, 74, 77, 97, 51, 78, 86, 88, 62, 75, 79, 90

In order to understand this data, he decided to make a scatterplot.

In this case, our independent variable, or input data, is the study time because the hypothesis is that the math grade depends on the study time. That means that the math grade is the dependent variable, or output data. We will place the input data on the x-axis and the output data on the y-axis. Also, the scales and intervals on the axes will be determined by the data. Since the greatest value on the x-axis is 6.5, we can use intervals of 1 until we reach 7. On the y-axis, the greatest value is 97 and, since it is a percent, it makes sense to go up to 100. Our intervals could be 10 since that would give the scatterplot a workable shape.
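The scale choices described above can be checked with a short calculation. The following Python sketch is only an illustration and not part of the student's project: the helper name `axis_ticks` is hypothetical, and it assumes each axis starts at 0 and ends at the first multiple of the interval that is at least as large as the greatest data value.

```python
import math

# Data from the study-time table in this Concept.
study_hours = [4, 3.5, 5, 2, 3, 6.5, 0.5, 3.5, 4.5, 5, 1, 1.5, 3, 5.5]
math_grades = [82, 81, 90, 74, 77, 97, 51, 78, 86, 88, 62, 75, 79, 90]

def axis_ticks(values, interval):
    """Hypothetical helper: tick marks from 0 up to the first multiple of
    `interval` that is at least as large as the greatest value."""
    top = math.ceil(max(values) / interval) * interval
    return list(range(0, top + interval, interval))

x_ticks = axis_ticks(study_hours, 1)    # intervals of 1 on the x-axis
y_ticks = axis_ticks(math_grades, 10)   # intervals of 10 on the y-axis

print(x_ticks)  # ends at 7, the first whole number above 6.5
print(y_ticks)  # ends at 100, the first multiple of 10 above 97
```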

Now we can graph the points on the scatterplot. In order to plot the points, we will show each one as an ordered pair (hours, percent). The first ordered pair, then, is (4, 82). Draw each of the 14 points. Remember, it takes two pieces of data to make a single point.
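As a small illustration of the idea that two pieces of data make one point, here is a Python sketch that pairs each study time with its math grade. The variable names are chosen for this example only.

```python
# Data from the study-time table in this Concept.
study_hours = [4, 3.5, 5, 2, 3, 6.5, 0.5, 3.5, 4.5, 5, 1, 1.5, 3, 5.5]
math_grades = [82, 81, 90, 74, 77, 97, 51, 78, 86, 88, 62, 75, 79, 90]

# zip pairs the first hour value with the first grade, the second with
# the second, and so on, giving ordered pairs (hours, percent).
points = list(zip(study_hours, math_grades))

print(points[0])    # (4, 82), the first ordered pair
print(len(points))  # 14 points in all
```

Each tuple in `points` is one dot on the scatterplot: the first number is the distance along the x-axis and the second is the height on the y-axis.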

You can see that there is a relationship between the independent and dependent values of the chart.

Graphing Calculators

If you got the basic idea, you can see that scatterplots are powerful tools. Computers and other technology can be even more powerful. Scientists in the real world rarely create scatterplots on paper or compute equations by hand. They use computer programs that can approximate the trend line much more accurately than we can by eye.

You can make scatterplots on your graphing calculator, if you have one. Then, you can compute the trend line, which some models call the linear regression. Your graphing calculator can put the equation in y = mx + b form or Ax + By = C form, depending on the mode you choose. Then, you can enter any input value and your calculator will tell you the corresponding output value.

Every graphing calculator is different. The key combination necessary for these operations is shown in your guidebook. When you study more advanced math, these operations will be required. Try it now and see how close the result from your graphing calculator is to the one you found by hand.
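If you do not have a graphing calculator, a short program can do a similar job. The Python sketch below fits a trend line in y = mx + b form to the study-time data using ordinary least squares. That method is an assumption here: the lesson does not say which method a given calculator uses, and the function name `fit_line` is made up for this example.

```python
# Data from the study-time table in this Concept.
study_hours = [4, 3.5, 5, 2, 3, 6.5, 0.5, 3.5, 4.5, 5, 1, 1.5, 3, 5.5]
math_grades = [82, 81, 90, 74, 77, 97, 51, 78, 86, 88, 62, 75, 79, 90]

def fit_line(xs, ys):
    """Return slope m and intercept b of the least-squares line y = mx + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x  # the line passes through (mean_x, mean_y)
    return m, b

m, b = fit_line(study_hours, math_grades)
print(f"y = {m:.2f}x + {b:.2f}")

# Use the line to predict an output value for a chosen input value.
hours = 4
print(f"Predicted grade for {hours} hours: {m * hours + b:.1f}")
```

The slope comes out positive, which agrees with the upward trend of the points. A prediction from the line is only an estimate, and it is most trustworthy for input values inside the range of the data.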

Example A

If the points on a scatterplot do not show a pattern, is there a connection between the data?

Solution: No, there isn't one.

Example B

If the points on a scatterplot trend up to the right, is there a connection between the data?

Solution: Yes, it is called a positive correlation.

Example C

If the points on a scatterplot trend down to the right, is there a connection between the data?

Solution: Yes, it is called a negative correlation.
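The three examples above judge the pattern by eye. One common way to put a number on it is the Pearson correlation coefficient r, which this Concept does not cover, so treat the Python sketch below as a supplementary illustration. The cutoff of 0.3 for "no clear correlation" is an arbitrary assumption made for this example, and the "percent missed" list is simply 100 minus each math grade from the study-time table.

```python
import math

# Data from the study-time table in this Concept.
study_hours = [4, 3.5, 5, 2, 3, 6.5, 0.5, 3.5, 4.5, 5, 1, 1.5, 3, 5.5]
math_grades = [82, 81, 90, 74, 77, 97, 51, 78, 86, 88, 62, 75, 79, 90]
# Derived for illustration only: percent of points missed by each student.
percent_missed = [100 - g for g in math_grades]

def pearson_r(xs, ys):
    """Pearson correlation coefficient, a value between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(xs, ys, cutoff=0.3):
    """Label the trend; `cutoff` is an assumed, arbitrary threshold."""
    r = pearson_r(xs, ys)
    if r >= cutoff:
        return "positive correlation"
    if r <= -cutoff:
        return "negative correlation"
    return "no clear correlation"

print(describe(study_hours, math_grades))     # points trend up to the right
print(describe(study_hours, percent_missed))  # points trend down to the right
```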

Now let's go back to the dilemma from the beginning of the Concept.

First, let’s put the data in a table so that we can see it clearly.

height time
5'6" 2.18.23
5'6" 2.18.25
5'5" 2.20.01
5'4" 2.23.20
5'3" 2.26.11
5'2" 3.01.01

Here is the scatterplot of the data.
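Before heights and times like these can be plotted, each one has to become a single number. The Python sketch below shows one way to do that. It assumes that a time such as 2.18.23 means 2 minutes, 18 seconds, and 23 hundredths of a second, which the lesson does not state, and the helper names are made up for this example.

```python
def height_to_inches(feet, inches):
    """Convert a height in feet and inches to total inches."""
    return 12 * feet + inches

def time_to_seconds(text):
    """Convert 'minutes.seconds.hundredths' to seconds.
    The meaning of the three parts is an assumption."""
    minutes, seconds, hundredths = (int(part) for part in text.split("."))
    return 60 * minutes + seconds + hundredths / 100

# (feet, inches) and time for each runner, copied from the table above.
runners = [
    ((5, 6), "2.18.23"),
    ((5, 6), "2.18.25"),
    ((5, 5), "2.20.01"),
    ((5, 4), "2.23.20"),
    ((5, 3), "2.26.11"),
    ((5, 2), "3.01.01"),
]

# Ordered pairs (height in inches, time in seconds) ready for plotting.
points = [(height_to_inches(*h), time_to_seconds(t)) for h, t in runners]
for point in points:
    print(point)
```

With height on the x-axis and time on the y-axis, the taller runners in this data have the lower times.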

Vocabulary

Scatterplot
a graph where corresponding values are placed on the coordinate plane and the relationship between the values can be determined.
Input Value
the x value - it is the independent value
Output Value
the y value - it is the dependent value
Positive Correlation
a scatterplot where the points plotted go up from left to right.
Negative Correlation
a scatterplot where the points plotted go down from left to right.
No Correlation
a scatterplot where there isn’t a clear relationship between the dependent and independent values.

Guided Practice

Here is one for you to try on your own.

Create a scatterplot of the data.

After a circus class, the following data was collected. It tracks the number of people who could balance on a tightrope for specific lengths of time.

1 person = 7 minutes

3 people = 15 minutes

7 people = 20 minutes

9 people = 25 minutes

14 people = 32 minutes

18 people = 39 minutes

Solution

Here is the scatterplot of the data.
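As a rough stand-in for a drawn graph, the Python sketch below prints a simple text version of a scatterplot for the circus class data. Putting the number of people on the horizontal axis and the minutes on the vertical axis is an assumption made for this illustration, and the function name `text_scatterplot` is hypothetical.

```python
# (people, minutes) pairs from the circus class data.
points = [(1, 7), (3, 15), (7, 20), (9, 25), (14, 32), (18, 39)]

def text_scatterplot(points, x_max=20, y_max=40, y_step=5):
    """Draw points as '*' on a character grid.
    Each row stands for y_step minutes; a point goes in the nearest row."""
    rows = y_max // y_step + 1
    grid = [[" "] * (x_max + 1) for _ in range(rows)]
    for x, y in points:
        grid[round(y / y_step)][x] = "*"
    lines = []
    # Print the highest row first so the y-axis increases upward.
    for r in range(rows - 1, -1, -1):
        lines.append(f"{r * y_step:>3} |" + "".join(grid[r]))
    lines.append("    +" + "-" * (x_max + 1))
    return "\n".join(lines)

chart = text_scatterplot(points)
print(chart)
```

Because each row covers 5 minutes, the stars are only approximately placed, but the upward trend from left to right is still visible.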

Practice

Directions: Use what you have learned to answer each question or complete each task.

1. Make a scatterplot to display the data set in the table:

x: 23, 18, 30, 24, 29, 45, 10, 17, 27, 39, 32, 40, 21, 14
y: 62, 72, 54, 60, 57, 30, 79, 65, 55, 34, 48, 41, 68, 76

Directions: What type of relationship would you predict for each pair of variables: positive, negative, or no relationship?

1. altitude vs. the amount of oxygen in the atmosphere
2. number of customers vs. profit
3. number of siblings vs. grade point average
4. hours of studying vs. test score
5. hours of driving vs. distance traveled
6. speed of a car vs. distance traveled
7. hours at work vs. amount of money made
8. age of person vs. intelligence

Directions: Use this scatterplot to answer the following questions.

1. True or false. This data shows a positive correlation.
2. True or false. This data shows a negative correlation.
3. True or false. This data shows no correlation.
4. True or false. Data is graphed on a scatterplot by using ordered pairs.
5. True or false. An example of a negative correlation could be the amount of time in school and an increase in intelligence.

Vocabulary

correlation
Correlation is a statistical method used to determine if there is a connection or a relationship between two sets of data.
line of best fit
A line of best fit is a straight line drawn on a scatter plot such that the sums of the distances to the points on either side of the line are approximately equal and such that there are an equal number of points above and below the line.
scatter plot
A scatter plot is a plot of the dependent variable versus the independent variable and is used to investigate whether or not there is a relationship or connection between two sets of data.