7.4: Linear Regression Equations
Suppose you have a large database that includes the scores on physics exams and calculus exams from high school students across your state who took both tests. You want to find out whether there is a correlation between these two sets of scores. What tools could you use to find out this information in an efficient way?
Watch This
First watch this video to learn about linear regression equations.
CK-12 Foundation: Chapter7LinearRegressionEquationsA
Then watch this video to see some examples.
CK-12 Foundation: Chapter7LinearRegressionEquationsB
Watch this video for more help.
James Sousa Linear Regression on the TI84 - Example 1
Guidance
Scatter plots and lines of best fit can also be drawn by using technology. The TI-83 can graph a scatter plot and then draw the line of best fit on top of it. The calculator can also find the correlation coefficient \begin{align*}(r)\end{align*} and the coefficient of determination \begin{align*}(r^2)\end{align*} for the linear regression equation.
The correlation coefficient always has a value between \begin{align*}-1\end{align*} and 1, inclusive. The closer the correlation coefficient is to \begin{align*}-1\end{align*} or 1, the stronger the linear correlation. A negative correlation coefficient indicates a negative correlation, and a positive correlation coefficient indicates a positive correlation. The coefficient of determination is just the correlation coefficient squared, so it is never negative and always lies between 0 and 1. The closer the coefficient of determination is to 1, the stronger the correlation.
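For reference, the calculator's \begin{align*}r\end{align*} is the standard (Pearson) correlation coefficient. The lesson itself does not state the formula, so the notation below is an assumption: \begin{align*}\bar{x}\end{align*} and \begin{align*}\bar{y}\end{align*} stand for the means of the two data lists. The second line is a small worked illustration of why squaring removes the sign.

\begin{align*}r & = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum (x_i-\bar{x})^2 \ \sum (y_i-\bar{y})^2}}\\ r & = -0.9 \ \Rightarrow \ r^2 = 0.81, \qquad r = 0.9 \ \Rightarrow \ r^2 = 0.81\end{align*}

Because both values of \begin{align*}r\end{align*} give the same \begin{align*}r^2\end{align*}, the coefficient of determination tells you how strong the linear relationship is, but not its direction.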
Example A
The following table consists of the marks achieved by 9 students on chemistry and math tests. Create a scatter plot for the data with your calculator.
Student | A | B | C | D | E | F | G | H | I |
---|---|---|---|---|---|---|---|---|---|
Chemistry Marks | 49 | 46 | 35 | 58 | 51 | 56 | 54 | 46 | 53 |
Math Marks | 29 | 23 | 10 | 41 | 38 | 36 | 31 | 24 | ? |
Example B
Draw a line of best fit for the data that you plotted in Example A. Use the line of best fit to calculate the predicted value for Student I's math test mark.
The calculator can now be used to determine a linear regression equation for the given values. The equation can be entered into the calculator, and the line will be plotted on the scatter plot.
From the line of best fit, the calculated value for Student I's math test mark is 33.6.
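If you want to check the calculator's result away from the TI-83, the least-squares line can be computed by hand. The following is a minimal Python sketch, not the calculator's own procedure. The function name `least_squares_line` and the variable names are illustrative assumptions. The data are the marks for Students A through H from the table in Example A.

```python
# Marks for Students A-H from Example A (Student I's math mark is unknown).
chemistry = [49, 46, 35, 58, 51, 56, 54, 46]
math_marks = [29, 23, 10, 41, 38, 36, 31, 24]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(chemistry, math_marks)

# Student I scored 53 in chemistry; use the line to predict the math mark.
predicted_math_for_I = slope * 53 + intercept

print(f"y = {slope:.4f}x {intercept:+.4f}")
print(f"Predicted math mark for Student I: {predicted_math_for_I:.1f}")
```

With unrounded coefficients the prediction comes out at about 33.7. Rounding the equation to \begin{align*}y=1.3x-35.3\end{align*} before substituting 53 gives 33.6, which is consistent with the value quoted above.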
Example C
Determine the correlation coefficient and the coefficient of determination for the linear regression equation that you found in Example B. Is the linear regression equation a good fit for the data?
The correlation coefficient and the coefficient of determination are found in the same way as the linear regression equation itself. After entering the data into your calculator, press \begin{align*}\boxed{\text{STAT}}\end{align*}, go to the CALC menu, and choose LinReg(ax + b). After pressing \begin{align*}\boxed{\text{ENTER}}\end{align*} to choose LinReg(ax + b), press \begin{align*}\boxed{\text{ENTER}}\end{align*} again. The resulting screen lists the values of \begin{align*}a\end{align*} and \begin{align*}b\end{align*} for the regression equation, along with \begin{align*}r^2\end{align*} and \begin{align*}r\end{align*}. If \begin{align*}r\end{align*} and \begin{align*}r^2\end{align*} do not appear, turn the diagnostics on first by selecting DiagnosticOn from the CATALOG menu.
You can see that \begin{align*}r\end{align*}, or the correlation coefficient, is equal to 0.9486321738, while \begin{align*}r^2\end{align*}, or the coefficient of determination, is equal to 0.8999030012. This means that the linear regression equation is a good fit for the data, though not a perfect one: roughly 90% of the variation in the math marks is accounted for by the linear relationship with the chemistry marks.
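The two values can also be reproduced directly from the data. The sketch below is one way to do it in Python, assuming the standard Pearson formula for \begin{align*}r\end{align*}. The function name `correlation_coefficient` is an illustrative choice, not something taken from the calculator.

```python
import math

# Marks for Students A-H from Example A.
chemistry = [49, 46, 35, 58, 51, 56, 54, 46]
math_marks = [29, 23, 10, 41, 38, 36, 31, 24]


def correlation_coefficient(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = correlation_coefficient(chemistry, math_marks)
r_squared = r ** 2  # coefficient of determination

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

The printed values, 0.9486 and 0.8999, agree with the calculator values quoted above to four decimal places.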
Guided Practice
The data below give the fuel efficiency, in miles per gallon (m/gal), of cars with the same-sized engines when driven at various speeds, in miles per hour (m/h).
Speed (m/h) | 32 | 64 | 77 | 42 | 82 | 57 | 72 |
---|---|---|---|---|---|---|---|
Fuel Efficiency (m/gal) | 40 | 27 | 24 | 37 | 22 | 36 | 28 |
a. Draw a scatter plot and a line of best fit using technology. What is the equation of the line of best fit?
b. What are the correlation coefficient and the coefficient of determination of the linear regression equation? Is the linear regression equation a good fit for the data?
c. If a car were traveling at a speed of 47 m/h, estimate the fuel efficiency of the car.
d. If a car has a fuel efficiency of 29 m/gal, estimate the speed of the car.
Answer:
a. From the calculator's LinReg(ax + b) screen, the equation of the line of best fit is approximately \begin{align*}y=-0.36x+52.6\end{align*}.
b. On the LinReg(ax + b) screen used in part a, the correlation coefficient is \begin{align*}-0.9534582451\end{align*} (negative, because fuel efficiency falls as speed rises), while the coefficient of determination is 0.9090826251. This means that the linear regression equation is a good fit for the data, though not a perfect one.
c. Using the TI-83 to calculate the value, the fuel efficiency of a car traveling at a speed of 47 m/h would be approximately 35.6 m/gal.
d. From the calculator, the equation of the line of best fit is approximately \begin{align*}y=-0.36x+52.6\end{align*}, where \begin{align*}y\end{align*} represents the fuel efficiency of the car and \begin{align*}x\end{align*} represents the speed of the car.
Using this equation:
\begin{align*}y & = -0.36x+52.6\\ 29 & = -0.36x+52.6\\ 29-52.6& = -0.36x+52.6-52.6\\ \frac{-23.6}{-0.36} & = \frac{-0.36x}{-0.36}\\ 65.6 \text{ m/h} & = x\end{align*}
The speed of the car would be approximately 65.6 miles per hour.
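The whole Guided Practice can be checked with a short script. This is a sketch under the same assumptions as the hand formulas for least squares and the Pearson correlation coefficient. The helper `linear_regression` and the variable names are hypothetical and chosen only for this illustration.

```python
import math

# Guided Practice data: speed in miles per hour, fuel efficiency in miles per gallon.
speed = [32, 64, 77, 42, 82, 57, 72]
fuel_efficiency = [40, 27, 24, 37, 22, 36, 28]


def linear_regression(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r = s_xy / math.sqrt(s_xx * s_yy)
    return slope, intercept, r


slope, intercept, r = linear_regression(speed, fuel_efficiency)

# Part c: predict fuel efficiency at 47 miles per hour.
efficiency_at_47 = slope * 47 + intercept

# Part d: solve 29 = slope * x + intercept for the speed x.
speed_at_29 = (29 - intercept) / slope

print(f"y = {slope:.4f}x {intercept:+.4f}")
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}")
print(f"Fuel efficiency at 47 m/h: {efficiency_at_47:.1f} m/gal")
print(f"Speed at 29 m/gal: {speed_at_29:.1f} m/h")
```

With unrounded coefficients, part d comes out at about 65.2 miles per hour rather than 65.6. The difference comes only from rounding the slope and intercept to \begin{align*}-0.36\end{align*} and 52.6 before solving.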
Practice
- Which of the following calculations will create the line of best fit on the TI-83?
  - quadratic regression
  - cubic regression
  - exponential regression
  - linear regression \begin{align*}(ax + b)\end{align*}
The linear regression below was performed on a data set with a TI calculator. Use the information shown on the screen to answer the following questions:
- What is the linear regression equation?
- What are the correlation coefficient and the coefficient of determination? Is the linear regression equation a good fit for the data?
- According to the linear regression equation, what would be the approximate value of y when x = 3?
The linear regression below was performed on a data set with a TI calculator. Use the information shown on the screen to answer the following questions:
- What is the linear regression equation?
- What are the correlation coefficient and the coefficient of determination? Is the linear regression equation a good fit for the data?
- According to the linear regression equation, what would be the approximate value of y when x = 10?
The linear regression below was performed on a data set with a TI calculator. Use the information shown on the screen to answer the following questions:
- What is the linear regression equation?
- What are the correlation coefficient and the coefficient of determination? Is the linear regression equation a good fit for the data?
- According to the linear regression equation, what would be the approximate value of x when y = 8?
The correlation coefficient and the coefficient of determination are both standard quantitative measures of best fit. The correlation coefficient has values from \begin{align*}-1\end{align*} to 1, and the closer the value is to \begin{align*}-1\end{align*} or 1, the better the fit. If the correlation coefficient is negative, the correlation is negative, and if it is positive, the correlation is positive. The coefficient of determination is the correlation coefficient squared and has values from 0 to 1. The closer the value is to 1, the better the fit.
coefficient of determination
The coefficient of determination is the square of the correlation coefficient and therefore has values from 0 to 1.
correlation coefficient
The correlation coefficient is a standard quantitative measure of best fit of a line. It has the symbol \begin{align*}r\end{align*} and has values from \begin{align*}-1\end{align*} to 1.
Here you'll learn how to use a Texas Instruments calculator to create a scatter plot and to determine the equation of the line of best fit. You'll also learn how to determine if a linear regression equation is a good fit for the data.