
# Scatter Plots and Linear Correlation

## Plot points and estimate the line that best represents them

Interpreting Scatter Plots and Line Graphs

#### Objective

Here you will practice recognizing and using some of the primary information available from a scatter plot or line graph.

#### Concept

Steve is playing a stock market simulation game in his social studies class. He has chosen to invest in Apple Inc., Amazon.com Inc., Walt Disney Company, and Microsoft. He originally bought all four stocks on the 10th day of the month. Now he needs to choose one of them to sell. Based on the recent performance of each, as shown in the line graphs below, which would you recommend he sell, and why?

Seem a bit daunting? It certainly can be! We will return to this question at the end of the lesson to review the situation.

#### Watch This

http://youtu.be/PE_BpXTyKCE vcefurthermaths – Maths Tutorial: Interpreting Scatterplots (statistics)

#### Guidance

The primary use for scatter plots and line graphs is to demonstrate or evaluate the correlation between two variables. Though the two are similar in many ways, there are distinct differences, and specific situations in which one is appropriate and the other is not.

• A scatter plot is generally used when displaying data from two variables that may or may not be directly related, and when neither of the variables is under the direct control of the researcher. The primary function of a scatter plot is to visualize the strength of correlation between the two plotted variables. The number of sunburned swimmers at the local pool each day for a month would be an example of a data set that would best be displayed as a scatter plot, since neither the weather nor the number of swimmers present is under the control of the researcher.
• A line graph is appropriate when comparing two variables that are believed to be related, and when one of the variables is under the direct control of the researcher. The primary use of a line graph is to determine the trend between the two graphed variables. The gas mileage of a particular car compared to its speed of travel would be a good example, since mileage is related to speed and the speed can be directly controlled by the researcher.

In later lessons we will discuss methods of quantifying the level of correlation between two variables and calculating a line of best fit, but for now we will focus on identifying specific examples of weak or strong correlation and identifying different types of trends.

• SCATTER PLOTS:
  • Two variables with a strong correlation will appear as a number of points occurring in a clear and recognizable line-like pattern. The pattern does not need to follow a straight line, but it should be consistent and not exactly horizontal or vertical.
  • Two variables with a weak correlation will appear as a much more scattered field of points, with only a little indication of points falling into a line of any sort.
• LINE GRAPHS:
  • A linear relationship appears as a straight line either rising or falling as the independent variable values increase. If the line rises to the right, it indicates a direct relationship. If the line falls to the right, it indicates an inverse relationship.
  • A non-linear relationship may take the form of any number of curved lines, and may indicate a squared relationship (dependent variable is the square of the independent), a square root relationship (dependent variable is the square root of the independent), an inverse square (dependent variable is one divided by the square of the independent), or many other possibilities.
• BOTH:
  • A positive correlation appears as a recognizable line with a positive slope. A line has a positive slope when an increase in the independent variable is accompanied by an increase in the dependent variable (the line rises as you move to the right).
  • A negative correlation appears as a recognizable line with a negative slope. As the independent variable increases, the dependent variable decreases (the line falls as you move to the right).
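
The ideas of strong versus weak and positive versus negative correlation can be previewed numerically. The sketch below is one possible illustration, not the method this lesson uses: it computes the Pearson correlation coefficient, which only measures how closely points follow a straight line. The sample lists and the 0.7 cutoff for "strong" are assumptions chosen purely for demonstration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r, strong_cutoff=0.7):
    """Label r by strength and direction (the cutoff is an arbitrary, illustrative choice)."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "no"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    return f"{strength} {direction} correlation"

# Made-up sample data, for illustration only.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # points fall exactly on a rising line
falling = [10, 8, 6, 4, 2]    # points fall exactly on a falling line
scattered = [3, 1, 4, 2, 5]   # points only loosely rise to the right

print(describe(pearson_r(xs, rising)))     # strong positive correlation
print(describe(pearson_r(xs, falling)))    # strong negative correlation
print(describe(pearson_r(xs, scattered)))  # weak positive correlation
```

A coefficient near 1 or -1 corresponds to points that hug a line, while a value near 0 corresponds to a scattered field of points.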

Example A

What type of relationship is indicated by the line graph below?



Solution: The line is straight, indicating a linear relationship. It rises from left to right, meaning that the dependent variable increases as the independent variable increases, indicating a positive correlation.

Example B

Which image shows a non-linear graph with a negative correlation?



Solution:

• The first image is a curved line that rises from left to right. This is a non-linear positive correlation.
• The second image is a straight line that falls from left to right. This is a linear negative correlation.
• The third image is a curved line that falls from left to right. This is a non-linear negative correlation, so it is the image described by the question.

Example C

1. Which graph(s) indicate(s) a weak correlation?
2. Which one(s) indicate(s) a strong correlation?
3. Which graph(s) indicate positive correlation(s)?



Solution:

• Only graph 2 indicates a weak correlation, since it is the only one with points that are not clearly arranged in a linear fashion.
• Graphs 1, 3, and 4 all indicate strong correlations, as evidenced by the high percentage of points clearly organized in a line. Graph 4 shows a very strong correlation, since a clear line that is neither horizontal nor vertical connects all of the points.
• Graphs 1, 2, and 4 are all positive correlations, as all three rise from left to right. Another way to put it is that those three graphs have a positive slope. Graph 4 does not have a consistent slope, but the slope would still be positive wherever it is estimated along the curve.

Concept Problem Revisited

Which stock(s) should Steve sell if he needs to make a profit right away?

By looking at the lines on each of the four graphs, we can see why it is important to note that Steve purchased the stocks on the 10th: only Walt Disney Co. and Amazon are currently valued more highly than they were on the 10th. Both Apple and Microsoft are going up in value now, at the end of the month, but neither has made it back up to where it was on the 10th.

If Steve wants to make a profit right now, he should sell Walt Disney or Amazon or both.
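
The reasoning above boils down to comparing each stock's latest price with its price on the purchase day. The sketch below shows that comparison under stated assumptions: the function name `profitable_now`, the generic stock names, and every price are hypothetical placeholders, not values read from the graphs.

```python
def profitable_now(prices, buy_day):
    """Return the names of stocks whose latest price exceeds the price on buy_day.

    prices maps a stock name to a {day_of_month: price} dictionary.
    """
    winners = []
    for name, history in prices.items():
        latest_day = max(history)  # most recent day on record
        if history[latest_day] > history[buy_day]:
            winners.append(name)
    return winners

# Placeholder prices invented for illustration; NOT taken from the lesson's graphs.
example_prices = {
    "Stock A": {10: 50.0, 20: 46.0, 30: 48.0},  # recovering, but still below day 10
    "Stock B": {10: 30.0, 20: 31.0, 30: 33.5},  # above its day-10 price
}

print(profitable_now(example_prices, buy_day=10))  # ['Stock B']
```

Note that Stock A is rising at the end of the month, yet selling it would still lose money, which mirrors the point made about stocks that have not climbed back to their purchase price.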

#### Vocabulary

A trend is an estimation of the tendency of data points to move in a certain direction. A trend line, also known as a line of fit, is a line drawn on a graph to indicate how the data points generally increase or decrease.

A strong correlation means that the values of the output variable are closely tied to the values of the input variable. A strong correlation is indicated on a graph by a large percentage of data points lying in an apparent line, either straight or curved.

A linear relationship means that the output values change at a constant rate as the input values change (for example, when the output is a simple multiple of the input), and it appears as a straight line when graphed.

A non-linear relationship appears as a curved line on a graph. It indicates an output variable that is a power, a root, or some other more complex function of the input.

A direct relationship means that the variables increase and decrease together, resulting in a positive correlation and a line of fit that rises from left to right, whereas an inverse relationship is a negative correlation, meaning that the output decreases as the input increases, and vice versa.

A slope is a description of the rate at which the output variable increases or decreases compared to the input variable. This is referred to as the slope because the rate of increase or decrease affects the angle of the line on a graph.
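
As a small illustration of this definition, slope can be computed from any two points on a line as the change in the output divided by the change in the input. The function name `slope` and the sample points below are assumptions made for this sketch.

```python
def slope(p1, p2):
    """Slope of the line through points p1 and p2: change in y over change in x."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)

print(slope((0, 1), (2, 5)))  # 2.0, the line rises to the right
print(slope((0, 4), (2, 0)))  # -2.0, the line falls to the right
```

A positive result matches a direct relationship and a negative result matches an inverse relationship.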

#### Guided Practice

1. Describe the relationship indicated by the graph:

2. Describe the relationship indicated by the graph:



3. Describe the graph that would result from a strongly correlated positive non-linear relationship. Give an example of a function that could result in such a graph.

4. Which scatter plot below indicates the most strongly correlated variables?



5. Which plot below indicates a weakly correlated positive linear relationship?



Solutions:

1. This is a very strongly correlated (all the points connected by a line), negative (the line falls from left to right), nonlinear (not a straight line) relationship.
2. This is a weakly correlated (significant scattering of the points), positive (points generally increase in value from left to right), linear (a straight line of fit could be drawn) relationship.
3. A strongly correlated positive non-linear relationship would appear as a well-defined curve of points rising from left to right. One function that could produce such a graph is $y = x^2$ for positive values of $x$.
4. The center plot is the most strongly correlated, evidenced by the much cleaner line formed by the data points. Incidentally, this is a negative linear relationship.
5. The left-hand plot is weakly correlated, but negative. The right-hand plot is positive, but strongly correlated. The center plot is weakly correlated and positive, so it is the one that matches the question.
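
Guided Practice question 3 asks what a positive non-linear relationship looks like. As a minimal sketch, the code below samples the squaring function (an illustrative choice, one of many possible) and checks two things: the output always rises as the input rises, and the size of each rise is not constant, so the points cannot lie on a straight line.

```python
xs = [0, 1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]  # illustrative choice of function: y = x squared

# Change in y between consecutive points (the x step is always 1).
steps = [b - a for a, b in zip(ys, ys[1:])]

is_positive = all(step > 0 for step in steps)  # rises from left to right
is_linear = len(set(steps)) == 1               # a constant step would mean a straight line

print(steps)        # [1, 3, 5, 7, 9]
print(is_positive)  # True
print(is_linear)    # False
```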

#### Practice

1. What sort of trend is shown in the scatter plot below?



2. A door-to-door vacuum cleaner salesman plots a scatter diagram of how much he has earned over the years. In which year was his income the highest?



3. The number of children in two different day care centers and the types of lunch they eat are represented in the table below. Pick the appropriate scatter plot for the data.

| Lunch Served | Center 1 (Yellow) | Center 2 (Blue) |
|---|---|---|
| Hamburger | 20 | 25 |
| Mac and Cheese | 30 | 28 |
| Pizza | 40 | 35 |
| Tuna Salad | 60 | 45 |
| Burritos | 80 | 65 |

4. The plot shown gives the relationship between the demand and price of a trendy consumer good. What trend does the plot follow?



5. The plot represents the relationship between the price and supply of an item. What type of trend does the graph illustrate?

6. Katie recorded the following data relating to how long it took to fill up a horse trough. She measured the depth every two minutes after she began filling it, until it was full. Which scatter plot accurately represents the data?

| Time (in minutes) | Depth (in inches) |
|---|---|
| 2 | 7 |
| 4 | 8 |
| 6 | 13 |
| 8 | 19 |
| 10 | 20 |
| 12 | 24 |
| 14 | 32 |
| 16 | 37 |
| 18 | 38 |
| 20 | 41 |
| 22 | 47 |
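
For readers curious about how a trend line could be estimated from Katie's measurements, the sketch below fits a least-squares line to the table's values and computes the Pearson correlation coefficient. This is one common technique chosen for illustration; the lesson itself does not prescribe it, and the variable names are assumptions.

```python
from math import sqrt

times = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]     # minutes, from the table
depths = [7, 8, 13, 19, 20, 24, 32, 37, 38, 41, 47]  # inches, from the table

n = len(times)
mean_t = sum(times) / n
mean_d = sum(depths) / n

# Sums of products of deviations from the means.
s_td = sum((t - mean_t) * (d - mean_d) for t, d in zip(times, depths))
s_tt = sum((t - mean_t) ** 2 for t in times)
s_dd = sum((d - mean_d) ** 2 for d in depths)

slope = s_td / s_tt                  # inches of depth gained per minute
intercept = mean_d - slope * mean_t  # fitted depth at time zero
r = s_td / sqrt(s_tt * s_dd)         # strength and direction of the linear trend

print(f"line of fit: depth = {slope:.2f} * time + {intercept:.2f}")  # depth = 2.07 * time + 1.18
print(f"r = {r:.3f}")                                                # r = 0.993
```

The positive slope and a coefficient close to 1 agree with what the table suggests: the depth rises steadily, by roughly two inches per minute.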

7. The scatter plot below question 8 shows the number of DVDs sold (in millions) from 2001 to 2007. Based on the data, about how many DVDs will be sold in 2009?

8. What sort of trend is shown in the scatter plot?



9. The table below shows a relationship between the weight of a car and its average gas mileage. Which plot best represents the data?

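| Type of Car | Weight | MPG |
|---|---|---|
| 1 | 3750 | 29 |
| 2 | 4125 | 23 |
| 3 | 3100 | 33 |
| 4 | 5082 | 18 |
| 5 | 3690 | 20 |
| 6 | 4640 | 21 |
| 7 | 5380 | 14 |
| 8 | 3241 | 25 |
| 9 | 3895 | 31 |
| 10 | 4669 | 17 |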
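
The direction of the trend in this table can also be checked numerically. The sketch below computes the Pearson correlation coefficient for the ten cars as an illustration only; the helper name `pearson_r` is an assumption, and the weights are used exactly as listed, since the table does not state their units.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Values copied from the table (weight units are not stated there).
weights = [3750, 4125, 3100, 5082, 3690, 4640, 5380, 3241, 3895, 4669]
mpg = [29, 23, 33, 18, 20, 21, 14, 25, 31, 17]

r = pearson_r(weights, mpg)
print(f"r = {r:.2f}")
print("negative trend" if r < 0 else "positive trend")
```

The coefficient comes out at roughly -0.8, so heavier cars in this table tend to get fewer miles per gallon, and the matching scatter plot should fall from left to right.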

10. Which scatter plot shows no relationship between the test scores Greg received and the temperature of the classroom while he took each test? Why?



11. Which scatter plot shows a positive relationship between the weight of a mango and the number of seeds it contains?



Roy was doing research for a paper. He questioned students throughout his high school, asking them how much time they spent doing homework and how much time they spent watching TV the previous evening. The following scatter plot shows his results. Based on the information, answer the questions that follow.

Choose the best of the four points, A, B, C, or D, to represent each student's statement below.

12. “I worked on homework almost all night. I only had time to watch my favorite sitcom.”

13. “Last night was about half and half for me.”

14. “Last night didn’t have anything on the screen I wanted to watch, and homework was so light that I ended up going out.”

15. Write a statement that corresponds to the fourth point.

### Vocabulary

bivariate

Bivariate data has two variables.

correlation

Correlation is a statistical method used to determine if there is a connection or a relationship between two sets of data.

curvilinear relationships

Non-linear relationships are called curvilinear relationships.

direct relationship

If the line on a line graph rises to the right, it indicates a direct relationship.

homogeneity

When a group is homogeneous, or possesses similar characteristics, the range of scores on either or both of the variables is restricted.

indirect relationship

If the line on a line graph falls to the right, it indicates an indirect relationship.

linear relationship

A linear relationship appears as a straight line either rising or falling as the independent variable values increase.

negative correlation

A negative correlation appears as a recognizable line with a negative slope.

non-linear relationship

A non-linear relationship may take the form of any number of curved lines but is not a straight line.

positive correlation

A positive correlation appears as a recognizable line with a positive slope.

scatter plot

A scatter plot is a plot of the dependent variable versus the independent variable and is used to investigate whether or not there is a relationship or connection between 2 sets of data.

slope

Slope is a measure of the steepness of a line. A line can have positive, negative, zero (horizontal), or undefined (vertical) slope. The slope of a line can be found by calculating “rise over run” or “the change in the $y$ over the change in the $x$.” The symbol for slope is $m$.

strong correlation

Two variables with a strong correlation will appear as a number of points occurring in a clear and recognizable linear pattern.

trends

Trends in data sets or samples are indicators found by reviewing the data from a general or overall standpoint.

weak correlation

Two variables with a weak correlation will appear as a much more scattered field of points, with only a little indication of points falling into a line of any sort.