Often, when real-world data is plotted, the result is a linear pattern. The general direction of the data can be seen, but the data points do not all fall on a line. This type of graph is called a scatter plot. A scatter plot is often used to investigate whether or not there is a relationship or connection between 2 sets of data. The data is plotted on a graph such that one quantity is plotted on the
The following scatter plot shows the price of peaches and the number sold:
The connection is obvious.
The following scatter plot shows the sales of a weekly newspaper and the temperature:
There is no connection between the number of newspapers sold and the temperature.
Another term used to describe 2 sets of data that have a connection or a relationship is correlation. The correlation between 2 sets of data can be positive or negative, and it can be strong or weak. The following scatter plots will help to enhance this concept.
If you look at the 2 sketches that represent a positive correlation, you will notice that the points are around a line that slopes upward to the right. When the correlation is negative, the line slopes downward to the right. The 2 sketches that show a strong correlation have points that are bunched together and appear to be close to a line that is in the middle of the points. When the correlation is weak, the points are more scattered and not as concentrated.
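The lesson judges direction and strength by eye. As a rough numerical counterpart, the sketch below computes Pearson's correlation coefficient r, which is a standard statistic but is not introduced in this lesson, and turns it into one of the phrases used above. The function names, the sample points, and the cutoffs 0.7 and 0.2 are assumptions chosen for illustration; there is no single agreed boundary between "strong" and "weak".

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe_correlation(r, strong_cutoff=0.7, none_cutoff=0.2):
    """Turn r into a phrase. Both cutoffs are illustrative assumptions."""
    if abs(r) < none_cutoff:
        return "no correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


# Made-up points for illustration only (not data from this lesson).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 2), describe_correlation(r))
```

A value of r near 1 corresponds to points bunched around a line that slopes upward to the right, a value near -1 to points bunched around a line that slopes downward, and a value near 0 to points with no linear pattern.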
When correlation exists on a scatter plot, a line of best fit can be drawn on the graph. The line of best fit should be drawn so that the sums of the distances to the points on either side of the line are approximately equal and so that there are roughly equal numbers of points above and below the line. Using a clear plastic ruler makes it easier to meet both of these conditions when drawing the line. Another useful tool is a stick of spaghetti, since it can be easily rolled and moved on the graph until you are satisfied with its location. The edge of the spaghetti can then be traced to produce the line of best fit. A line of best fit can be used to make estimations from the graph, but you must remember that a hand-drawn line of best fit is simply a sketch of where the line should appear on the graph. As a result, any values that you read from this line are estimates and are not very accurate.
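The ruler or spaghetti method places the line by eye. A common calculated alternative, which this lesson does not cover, is the least-squares line. The sketch below is one minimal way to compute it, assuming made-up sample points and hypothetical function names. It also checks the balancing idea described above: for a least-squares line, the signed vertical distances from the points to the line add up to zero, so the total distance above the line matches the total distance below it. Least squares does not guarantee equal numbers of points on each side.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate(slope, intercept, x):
    """Read an estimated y-value off the line for a given x."""
    return slope * x + intercept


def signed_distances(xs, ys, slope, intercept):
    """Vertical distance of each point from the line (positive means above)."""
    return [y - estimate(slope, intercept, x) for x, y in zip(xs, ys)]


# Made-up points for illustration only (not data from this lesson).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 6, 9, 11]
slope, intercept = line_of_best_fit(xs, ys)
balance = sum(signed_distances(xs, ys, slope, intercept))
print(slope, intercept, estimate(slope, intercept, 6), balance)
```

Values read from a calculated line are still estimates, since the points themselves do not all fall on the line.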
In the example of newspaper sales and temperature, there was no connection between the 2 data sets. The following sketches represent some other possible outcomes when there is no correlation between data sets:
Plotting Points on a Scatter Plot
Plot the following points on a scatter plot, with
Describe the correlation, if any, in the following scatter plot:
In the above scatter plot, there is a strong positive correlation.
Drawing a Line of Best Fit
The following table consists of the marks achieved by 9 students on chemistry and math tests:
Plot the above marks on a scatter plot, with the chemistry marks on the
If Student I had taken the math test, his or her mark would have been between 32 and 37.
Points to Consider
- Can the equation for the line of best fit be used to calculate values?
- Is any other graphical representation of data used for estimations?
The following table represents the sales of Volkswagen Beetles in Iowa between 1994 and 2003:
Create a scatter plot and draw the line of best fit for the data. Hint: Let 0 = 1994, 1 = 1995, etc.
Use the graph to predict the number of Beetles that will be sold in Iowa in the year 2007.
The year 2007 would actually be the number 13 on the
Describe the correlation for the above graph.
The correlation of this graph is strong and positive.
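The year coding from the hint (0 = 1994, 1 = 1995, etc.) can be sketched in code as follows. The function names are hypothetical, and the slope and intercept in the usage lines are placeholder numbers, not values taken from the Beetle sales table, so the printed prediction only illustrates the mechanics.

```python
BASE_YEAR = 1994  # from the hint: 0 = 1994, 1 = 1995, etc.


def year_to_x(year, base_year=BASE_YEAR):
    """Convert a calendar year to its coded position on the horizontal axis."""
    return year - base_year


def predict_for_year(slope, intercept, year):
    """Read a value off a line y = slope * x + intercept at the coded year."""
    return slope * year_to_x(year) + intercept


print(year_to_x(1994))  # 0
print(year_to_x(2007))  # 13

# Placeholder line for illustration only (NOT fitted to the lesson's data).
placeholder_slope = 2.0
placeholder_intercept = 1.0
print(predict_for_year(placeholder_slope, placeholder_intercept, 2007))
```

Since the table ends at 2003 (coded as 9), a prediction for 2007 extends the line beyond the plotted points, so it should be treated as a rough estimate.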
- What is the correlation of a scatter plot that has few points that are not bunched together?
- no correlation
- What term is used to define the connection between 2 data sets?
- scatter plot
- Describe the correlation of each of the following graphs:
- Plot the following points on a scatter plot, with m as the independent variable and n as the dependent variable. Number both axes from 0 to 20. If a correlation exists between the values of m and n, describe the correlation (strong negative, weak positive, etc.).

m: 5, 14, 2, 10, 16, 4, 18, 2, 8, 11
n: 6, 13, 4, 10, 15, 7, 16, 5, 8, 12

m: 13, 3, 18, 9, 20, 15, 6, 10, 2, 14
n: 7, 14, 9, 16, 7, 13, 10, 13, 3, 19
The following scatter plot shows the closing prices of 2 stocks at various points in time. A line of best fit has been drawn. Use the scatter plot to answer the following questions.
- How would you describe the correlation between the prices of the 2 stocks?
- If the price of stock A is $12.00, what would you expect the price of stock B to be?
- If the price of stock B is $47.75, what would you expect the price of stock A to be?
The following scatter plot shows the hours of exercise per week and resting heart rates for various 30-year-old males. A line of best fit has been drawn. Use the scatter plot to answer the following questions.
- How would you describe the correlation between hours of exercise per week and resting heart rate?
- If a 30-year-old male exercises 2 hours per week, what would you expect his resting heart rate to be?
- If a 30-year-old male has a resting heart rate of 65 beats per minute, how many hours would you expect him to exercise per week?
To view the Review answers, open this PDF file and look for section 7.3.