7.1: Line Graphs and Scatter Plots
Learning Objectives
 Represent data that has a linear pattern on a graph.
 Represent data using a broken-line graph and represent two sets of data using a double line graph.
 Understand the difference between continuous data and discrete data as it applies to a line graph.
 Represent data that has no definite pattern as a scatter plot.
 Draw a line of best fit on a scatter plot.
 Use technology to create both line graphs and scatter plots.
Introduction
Each year the school has a fundraising event to collect money to support the school sports teams. This year the committee has decided that each class will make friendship bracelets and sell them for \begin{align*}\$2.00\end{align*} each. To buy the necessary supplies to make the bracelets, each class is given \begin{align*}\$40.00\end{align*} as a start-up fee. Create a table of values and draw a graph to represent the sale of 10 bracelets. If the class sells ten bracelets, how much profit will be made?
We will revisit this problem later in the lesson.
When data is collected from surveys or experiments, it can be displayed in different ways: tables of values, graphs, and box-and-whisker plots. The most common graphs that are used in statistics are line graphs, scatter plots, bar graphs, histograms, and frequency polygons. Graphs are the most common way of displaying data because they are visual and allow you to get a quick impression of the data and determine if there are any trends in the data. You have probably noticed that graphs of different types are found regularly in newspapers, on websites, and in many textbooks.
If we think of independent and dependent variables in terms of the variables in an input/output machine, we can see that the input variable is independent of anything around it, but the output variable is completely dependent on what we put into the machine. The input variable is the \begin{align*}{\color{blue}x}\end{align*} variable and the output variable is the \begin{align*}{\color{red}y}\end{align*} (or the \begin{align*}f(x)\end{align*}) variable.
If we apply this theory to graphing a straight line on a rectangular coordinate system, we must first determine which variable is the dependent variable and which one is the independent variable. Once this has been established, the ordered pairs can be plotted.
Example 1: If you had a job where you earned \begin{align*}\$9.00\end{align*} an hour for every hour you worked up to a maximum of 30 hours, represent your earnings on a graph by plotting the money earned against the time worked.
Solution: The dependent variable is the money earned and the independent variable is the number of hours worked. Therefore, money is on the \begin{align*}y\end{align*}-axis and time is on the \begin{align*}x\end{align*}-axis. The first step is to create a table of values that represents the problem. The number pairs in the table of values will be the ordered pairs to be plotted on the graph.
Time Worked (Hours)  Money Earned 

0  \begin{align*}\$0\end{align*} 
1  \begin{align*}\$9.00\end{align*} 
2  \begin{align*}\$18.00\end{align*} 
3  \begin{align*}\$27.00\end{align*} 
4  \begin{align*}\$36.00\end{align*} 
5  \begin{align*}\$45.00\end{align*} 
6  \begin{align*}\$54.00\end{align*} 
Now that the points have been plotted, the decision has to be made as to whether or not to join them. Between every two points plotted on the graph are an infinite number of values. If these values are meaningful to the problem, then the plotted points can be joined. This data is called continuous data. If the values between the two plotted points are not meaningful to the problem, then the points should not be joined. This data is called discrete data. In the above problem, it is possible to earn \begin{align*}\$4.50\end{align*} for working one-half hour and this value is meaningful for our problem. Therefore the data is continuous and the points should be joined.
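The table of values in Example 1 follows the rule "money earned = 9 × hours worked". As a minimal sketch (Python and the helper name `earnings` are illustrative assumptions, not part of the lesson), the same ordered pairs can be generated and the half-hour value checked:

```python
HOURLY_RATE = 9.00   # dollars per hour, from Example 1
MAX_HOURS = 30       # maximum number of hours stated in the problem


def earnings(hours):
    """Money earned for a given time worked (hypothetical helper)."""
    if not 0 <= hours <= MAX_HOURS:
        raise ValueError("hours must be between 0 and 30")
    return HOURLY_RATE * hours


# Ordered pairs (time worked, money earned) for the table of values.
table = [(h, earnings(h)) for h in range(7)]
for h, money in table:
    print(f"{h} h -> ${money:.2f}")

# A value between two plotted points is still meaningful here,
# which is why the data is treated as continuous.
half_hour = earnings(0.5)
print(f"0.5 h -> ${half_hour:.2f}")
```

Because `earnings` accepts any value from 0 to 30 and not only whole numbers, it mirrors the decision to join the plotted points.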
Now you know how to graph a straight line from a table of values. It is just as important to be able to graph a straight line from a linear function that models a problem. The equation of a straight line can be written in the form \begin{align*}y = mx + b\end{align*}, where \begin{align*}m\end{align*} is the slope of the line and \begin{align*}b\end{align*} is the \begin{align*}y\end{align*}-intercept.
Example 2: Draw a graph to model the linear function \begin{align*}y = 2x + 5\end{align*}.
Solution:
The slope of the line is \begin{align*}\frac{change \ in \ y}{change \ in \ x}.\end{align*}
The slope of this line is \begin{align*}\frac{2}{1}\end{align*}. The \begin{align*}y\end{align*}-intercept is (0, 5). To graph this line, begin by plotting the \begin{align*}y\end{align*}-intercept. From the \begin{align*}y\end{align*}-intercept, move to the right one and up two. Plot this point. You can continue to move right one and up two in order to create more points on the line. Join the points with a straight line by using a straightedge (ruler).
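The "right one, up two" procedure can be sketched in a few lines of Python. The function name `points_from_slope` and its parameters are hypothetical, chosen only for this illustration:

```python
def points_from_slope(intercept, rise, run, count):
    """Start at the y-intercept (0, intercept), then repeatedly
    move right by `run` and up by `rise`, recording each point."""
    x, y = 0, intercept
    points = [(x, y)]
    for _ in range(count):
        x += run
        y += rise
        points.append((x, y))
    return points


# y = 2x + 5: y-intercept 5, slope 2/1 (rise 2, run 1).
points = points_from_slope(intercept=5, rise=2, run=1, count=4)
print(points)  # [(0, 5), (1, 7), (2, 9), (3, 11), (4, 13)]
```

Every point produced this way satisfies \begin{align*}y = 2x + 5\end{align*}, so the stepping method and the equation describe the same line.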
If you found this difficult to do, you could make a table of values for the function by substituting values for \begin{align*}x\end{align*} into the equation to determine values for \begin{align*}y\end{align*}. Then you would plot the ordered pairs on the graph. Whichever way you plotted the points, the result would be a straight line graph. Let’s apply this method to an everyday problem.
Example 3: Your school is having a teenage dance on Friday night. The dance will begin at 8:00 p.m. and will end at midnight. A DJ is hired to play the music. The cost of hiring the DJ is \begin{align*}\$100\end{align*} plus an additional \begin{align*}\$20.00\end{align*} an hour. Using either a table of values or an equation, draw a graph that would represent the cost of hiring the DJ for the dance. How much would the school pay the DJ for playing music for the dance?
Solution: An equation that would model this problem is \begin{align*}y = 20x + 100\end{align*}. To make the equation match the problem, \begin{align*}y\end{align*} can be replaced with \begin{align*}c\end{align*} (cost) and \begin{align*}x\end{align*} can be replaced with \begin{align*}h\end{align*} (number of hours). Now the equation \begin{align*}y = 20x + 100\end{align*} becomes \begin{align*}c = 20h + 100\end{align*}.
The dance runs from 8:00 p.m. to midnight, so the DJ will play 4 hours of music and will be paid \begin{align*}c = 20(4) + 100 = \$180.00\end{align*}.
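As a quick check on Example 3, the equation \begin{align*}c = 20h + 100\end{align*} can be evaluated for each hour of the dance. This is only a sketch; the function name `dj_cost` is an assumption made for the illustration:

```python
def dj_cost(hours):
    """Cost of hiring the DJ: $100 flat fee plus $20 per hour."""
    return 20 * hours + 100


# Table of values for 0 to 4 hours (8:00 p.m. to midnight is 4 hours).
cost_table = [(h, dj_cost(h)) for h in range(5)]
for h, c in cost_table:
    print(f"h = {h}, c = ${c}")

total = dj_cost(4)
print(f"Cost for the whole dance: ${total}")  # $180
```

The value at \begin{align*}h = 0\end{align*} is the flat fee of \begin{align*}\$100\end{align*}, which is the \begin{align*}y\end{align*}-intercept of the graph.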
Example 4: The total cost to lease a car is mostly dependent on the number of months you have the lease. The table of values below shows the cost and number of months for ten months of a lease. Plot the data points on a properly labeled set of \begin{align*}xy\end{align*}-axes. Draw the line all the way to the \begin{align*}y\end{align*}-axis so that you can find the \begin{align*}y\end{align*}-intercept. What could the \begin{align*}y\end{align*}-intercept represent in this problem?
\begin{align*}& \text{x(months)} && 2 && 4 && 6 && 8 && 10\\ & y(\$) && 2100 && 2700 && 3300 && 3900 && 4500\end{align*}
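The lesson leaves Example 4 as an exercise. As a hedged sketch of how the \begin{align*}y\end{align*}-intercept could be found from the table rather than read off a graph, the slope can be computed from two of the points and the line extended back to \begin{align*}x = 0\end{align*}. The variable names are illustrative only:

```python
months = [2, 4, 6, 8, 10]
cost = [2100, 2700, 3300, 3900, 4500]

# Slope = change in y / change in x, using the first and last points.
slope = (cost[-1] - cost[0]) / (months[-1] - months[0])

# Extend the line back to x = 0 to get the y-intercept.
intercept = cost[0] - slope * months[0]

# Confirm that every point in the table lies on y = slope * x + intercept.
on_line = all(c == slope * m + intercept for m, c in zip(months, cost))

print(f"slope = {slope} dollars per month")
print(f"y-intercept = {intercept} dollars")
print(f"all points on the line: {on_line}")
```

Under these calculations the cost rises by \begin{align*}\$300\end{align*} per month and the line meets the \begin{align*}y\end{align*}-axis at \begin{align*}\$1500\end{align*}. One plausible reading is that this is a cost paid at month 0, such as an up-front payment, but the lesson itself does not state the interpretation.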
We will now return to the fundraising event that was presented in the introduction. You should be able to solve this problem now.
Solution:
Number of Bracelets  Total Money 

0  \begin{align*}\$40\end{align*} 
1  \begin{align*}\$42\end{align*} 
2  \begin{align*}\$44\end{align*} 
3  \begin{align*}\$46\end{align*} 
4  \begin{align*}\$48\end{align*} 
5  \begin{align*}\$50\end{align*} 
6  \begin{align*}\$52\end{align*} 
7  \begin{align*}\$54\end{align*} 
8  \begin{align*}\$56\end{align*} 
9  \begin{align*}\$58\end{align*} 
10  \begin{align*}\$60\end{align*} 
In this case the data is discrete. The graph shows that only whole numbers are meaningful for this problem and that selling ten bracelets would mean a profit of \begin{align*}\$20.00\end{align*}. The sales indicate a total of \begin{align*}\$60.00\end{align*}, but this includes the start-up money of \begin{align*}\$40.00\end{align*}. Therefore \begin{align*}\$60.00 - \$40.00 = \$20.00\end{align*} is the profit.
In all of the above examples, the type of line graph that was used was one that described a definite linear pattern. There is another type of line graph that is used when it is necessary to show change over time. This type of line graph is called a broken line graph. A line is used to join the values but the line has no defined slope.
Example 5: Joey has an independent project to do for his Physical Active Lifestyle class. He has decided to do a poster that shows the times recorded for running the 100 meter dash event over the last fifteen years. He has collected the following information from the local library.
Year  Time (seconds)  Year  Time (seconds) 

1995  11.3  2002  11.0 
1996  11.2  2003  10.9 
1997  11.2  2004  10.9 
1998  11.2  2005  10.9 
1999  11.2  2006  10.8 
2000  11.2  2007  10.7 
2001  11.2  2008  10.7 
2009  10.5 
Display the information that Joey has collected on a graph that he might use on his poster.
Solution:
From this graph, you can answer the following questions:
Questions
 What was the fastest time for the 100m dash in the year 2000?
 Between what two years was there the greatest decrease in the fastest time to complete the 100m dash?
 As the years pass, why do you think runners are completing the race in a faster time?
Answers
 11.2 seconds
 Between 2001 and 2002; Between 2008 and 2009
 One possible reason is that the runners are living a healthier and more active lifestyle.
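The first two answers can be checked directly against Joey's table. The sketch below is an illustration only (the dictionary name `times` and the rounding to one decimal place are choices made here, not part of the lesson); it computes the year-to-year change and picks out the largest drop:

```python
# Fastest 100 m dash times (seconds) from Joey's table.
times = {
    1995: 11.3, 1996: 11.2, 1997: 11.2, 1998: 11.2, 1999: 11.2,
    2000: 11.2, 2001: 11.2, 2002: 11.0, 2003: 10.9, 2004: 10.9,
    2005: 10.9, 2006: 10.8, 2007: 10.7, 2008: 10.7, 2009: 10.5,
}

years = sorted(times)

# Change from each year to the next, rounded to the table's precision
# so that floating-point noise does not affect the comparison.
changes = {
    (a, b): round(times[b] - times[a], 1)
    for a, b in zip(years, years[1:])
}

biggest_drop = min(changes.values())
steepest = [pair for pair, change in changes.items() if change == biggest_drop]

print(f"Time in 2000: {times[2000]} s")
print(f"Largest decrease: {abs(biggest_drop)} s, between {steepest}")
```

On a broken-line graph these two intervals are the steepest downward segments.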
A broken line graph can be extended to include two broken lines. This type of line graph is very useful when you have two sets of data that relate to the same topic but are from two different sources. For example, the deaths in a small town over the past ten years can be graphed on a broken line graph. To extend this data, natural deaths could be plotted along with the deaths that were the result of traffic accidents. With both lines on the same graph, comparing them would be made easier.
Example 6: Jane has operated an ice cream parlor for many years. She has decided to retire and is anxious to sell her business. In order to show interested buyers the ice cream sales for the past two years, she has decided to show these sales on a double line graph. She will use the graph to show buyers which month had the highest sales, when the greatest change in sales occurs, and when an unexpected increase in sales occurs. Following is the information that Jane has recorded for the monthly sales during the years 2008 and 2009. Can you help Jane by using the double line graph to answer the questions?
Solution:
The month of August had the highest sales for both years. Between the months of August and September there is a great decrease in the ice cream sales. However, the month of December shows an unexpected increase in sales. This could be due to the holiday season.
Scatter Plots
Often, when real-world data is plotted, the result is a linear pattern. The general direction of the data can be seen, but the data points do not all fall on a line. This type of graph is a scatter plot. A scatter plot is often used to investigate the relationship (if one exists) between two sets of data. The data is plotted on a graph such that one quantity is plotted on the \begin{align*}x\end{align*}-axis and one quantity is plotted on the \begin{align*}y\end{align*}-axis. If a relationship does exist between the two sets of data, it will be visible when the data is plotted.
Example 1: The following graph represents the relationship between the price per pound of lobster and the number of lobsters sold. Although the points cannot be joined to form a straight line, the graph does suggest a linear pattern. What is the relationship between the cost per pound and the number of lobsters sold?
Solution:
From the graph, it is obvious that a relationship does exist between the cost per pound and the number of lobsters sold. When the cost per pound was low, the number of lobsters sold was high.
Example 2: The following scatter plot represents the sale of lottery tickets and the temperature.
Is there a relationship between the number of lottery tickets sold and the temperature?
Solution:
From the graph, it is clearly seen that there is no relationship between the number of lottery tickets sold and the temperature of the surrounding environment.
Example 3: The table below represents the height of ten children in inches and their shoe size.
\begin{align*}& \text{Height(in)} && 51 && 53 && 61 && 59 && 63 && 47 && 53 && 66 && 55 && 49\\ & \text{Shoe Size} && 2 && 4 && 6 && 5 && 7 && 1 && 3 && 9 && 4 && 2 \end{align*}
The information from the table can be displayed on a scatter plot.
Solution:
Yes, there is a relationship between the shoe size and the height of the child. Children who are short wear small-sized shoes and those who are taller wear larger shoes.
In this case, there is a direct relationship (correlation) between the shoe size and the height of the children. Correlation refers to the relationship or connection between two sets of data. The correlation between two sets of data can be weak, strong, negative, or positive, or in some cases there can be no correlation. The characteristics of the correlation between two sets of data can be readily seen from the scatter plot.
The scatter plot of the shoe sizes and the heights of the children shows a strong, positive correlation. The scatter plot of the lottery tickets and the temperature showed no correlation.
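The lesson judges correlation by eye from the scatter plot. One common numerical measure, which this lesson does not introduce and which is used here only as an illustrative assumption, is the Pearson correlation coefficient \begin{align*}r\end{align*}: values near \begin{align*}+1\end{align*} suggest a strong positive correlation, values near \begin{align*}-1\end{align*} a strong negative one, and values near 0 little or no linear correlation. A minimal sketch using the height and shoe size table:

```python
from math import sqrt

heights = [51, 53, 61, 59, 63, 47, 53, 66, 55, 49]   # inches
shoe_sizes = [2, 4, 6, 5, 7, 1, 3, 9, 4, 2]


def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = correlation(heights, shoe_sizes)
print(f"r = {r:.2f}")  # about 0.98
```

A value this close to 1 agrees with the description of the plot as showing a strong, positive correlation.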
If there is a correlation between the two sets of data on a scatter plot, then a straight line can be drawn so that the plotted points are either on the line or very close to it. This line is called the line of best fit. A line of best fit is drawn on a scatter plot so that it joins as many points as possible and shows the general direction of the data. When constructing the line of best fit, it is also important to keep, approximately, an equal number of points above and below the line. To determine where the line of best fit should be drawn, a piece of spaghetti can easily be rolled across the graph with the plotted points still being visible.
Returning to the scatter plot that shows the relationship between shoe sizes and the height of children, a line of best fit can be drawn to define this relationship.
In a later lesson, we will determine the equation of this line manually and by using technology.
Lesson Summary
In this lesson you learned how to represent data by graphing three types of line graphs: a straight line of the form \begin{align*}y = mx + b\end{align*}, a broken-line graph, and a double line graph. You also learned about scatter plots and the meaning of correlation as it applies to a scatter plot. In addition, you saw the result of drawing a line of best fit on a scatter plot.
Points to Consider
 Is a double line graph the only representation used to compare two sets of data?
 Does the line of best fit have an equation that would model the data?
 Is there another representation that could be used instead of a broken line graph?
Review Questions
 On the following graph circle the independent and dependent variables. Write a sentence to describe how the independent (input) variable is related to the dependent (output) variable in each graph.
 Ten people were interviewed for a job at the local grocery store. Mr. Neal and Mrs. Green awarded each of the ten people points as shown in the following table: \begin{align*}& \text{Mr. Neal} && 30 && 22 && 25 && 17 && 17 && 39 && 33 && 38 && 27 && 33\\ & \text{Mrs. Green} && 25 && 20 && 21 && 15 && 16 && 35 && 30 && 32 && 23 && 22 \end{align*} Draw a scatter plot to represent the above data. (You may use technology to do this.)
 The following data represents the fuel consumption of cars with the same size engine, when driven at various speeds. \begin{align*}& \text{Speed (km/h)} && 48 && 99 && 64 && 128 && 112 && 88 && 120 && 106\\
& \text{Fuel Consumption (km/L)} && 7 && 14 && 9 && 18 && 16 && 13 && 17 && 15\end{align*}
 Plot the data values.
 Draw in the line of best fit.
 Estimate the fuel consumption of a car travelling at a speed of 72 km/h.
 Estimate the speed of a car that has a fuel consumption of 12 km/L.
 Answer the questions by using the following graph that represents the temperature in \begin{align*}^\circ F\end{align*} for the first 20 days in July.
 What was the coldest day?
 What was the temperature on the hottest day? (Approximately)
 What days appeared to have no change in temperature?
 Answer the questions by using the following graph that represents the temperature in \begin{align*}^\circ F\end{align*} for the first 20 days in July in New York and in Seattle.
 Which city has the warmest temperatures in July?
 Which of the two cities seems to have temperatures that appear to be rising as the month progresses?
 Approximately, what is the difference in the daily temperatures between the two cities?
 The following graphs represent continuous and discrete data. Are the graphs labeled correctly with respect to these types of data? Justify your answer.
 A car rental agency is advertising March Break specials. The company will rent a car for \begin{align*}\$10\end{align*} a day plus a down payment of \begin{align*}\$65\end{align*}. Create a table of values for this problem and plot the points on a graph. Using the graph, what would be the cost of renting the car for one week?
 What type of graph would you use to display each of the following types of data?
 The number of hours you spend doing Math homework each week for the first semester.
 The marks you received in all your home assignments in English this year and the marks you received in all your home assignments in English last year
 The cost of riding in a taxi cab that charges a base rate of \begin{align*}\$5.00\end{align*} plus \begin{align*}\$0.25\end{align*} for every mile you go.
 The time in minutes that it takes you to walk to work each day for 10 days.
Review Answers
The dependent variable (distance) is increasing as the independent variable (time) is increasing.

Using the TRACE function will give the coordinates of the points
 The fuel consumption of a car travelling at a speed of 72 km/h is approximately 10 km/L.
 The speed of a car that has a fuel consumption of 12 km/L is approximately 85 km/h.
 The coldest day was July \begin{align*}7^{th}\end{align*}.
 The hottest day was July \begin{align*}19^{th}\end{align*}.
 There does not appear to be a change in temperature on July \begin{align*}1^{st}\end{align*} and \begin{align*}2^{nd}\end{align*}, July \begin{align*}10^{th}\end{align*} and \begin{align*}11^{th}\end{align*}, July \begin{align*}17^{th}\end{align*} and \begin{align*}18^{th}\end{align*}.
 Seattle
 Both cities appear to have rising temperature as the month progresses, but Seattle seems to have more hot days and on the \begin{align*}20^{th}\end{align*}, the temperature is still rising. The temperature in New York seemed to rise on the \begin{align*}19^{th}\end{align*} but on the \begin{align*}20^{th}\end{align*} the temperature appears to drop off.
 There appears to be a difference of approximately 10 degrees between the temperatures of the two cities.
 The first graph is labeled correctly as being continuous data. The amount of fuel remaining in your gas tank is plotted for each hour you drive. However, the amount of fuel in your gas tank decreases every minute/second you drive. All values on the graph are meaningful and therefore can be joined. This is continuous data. The second graph is also labeled correctly as being discrete data. The cost of CDs is plotted for each CD you purchase. The cost to you changes only when another CD is purchased. The values between the plotted points are not meaningful and therefore are not joined. This is discrete data.

\begin{align*}& \text{Number of Days} && 1 && 2 && 3 && 4 && 5\\
& \text{Cost(\$)} && \$75 && \$85 && \$95 && \$105 && \$115\end{align*} The cost of renting the car for one week (7 days) would be \begin{align*}\$135.00\end{align*}. This is indicated on the graph by the horizontal line that is drawn from the \begin{align*}7^{th}\end{align*} day to the cost axis.
 A scatter plot
 A double line graph
 A line graph
 A broken-line graph
Answer Key for Review Questions (even numbers)
2.
Using the TRACE function will give the coordinates of the points
4.a. The coldest day was July \begin{align*}7^{th}\end{align*}.
b. The hottest day was July \begin{align*}19^{th}\end{align*}.
c. There does not appear to be a change in temperature on July \begin{align*}1^{st}\end{align*} and \begin{align*}2^{nd}\end{align*}, July \begin{align*}10^{th}\end{align*} and \begin{align*}11^{th}\end{align*}, July \begin{align*}17^{th}\end{align*} and \begin{align*}18^{th}\end{align*}.
6. The first graph is labeled correctly as being continuous data.
The amount of fuel remaining in your gas tank is plotted for each hour you drive. However, the amount of fuel in your gas tank decreases every minute/second you drive. All values on the graph are meaningful and therefore can be joined. This is continuous data.
The second graph is also labeled correctly as being discrete data. The cost of CDs is plotted for each CD you purchase. The cost to you changes only when another CD is purchased. The values between the plotted points are not meaningful and therefore are not joined. This is discrete data.
8. a. A scatter plot
b. A double line graph
c. A line graph
d. A broken-line graph