# 6.4: More Least-Squares Regression

**At Grade** Created by: Bruce DeWitt

The learning objectives for this lesson are the same as those in the previous section. See section 6.3 for examples.

### Learning Objectives

- Construct scatterplots using technology
- Calculate and graph the least-squares regression line using technology
- Calculate the correlation coefficient using technology
- Use the LSRL to make predictions
- Understand interpolation and extrapolation
- Interpret the slope and the y-intercept of the LSRL
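The objectives above all rest on the same few computations. As a minimal sketch, with Python standing in for a graphing calculator, the slope, y-intercept, and correlation coefficient can be computed from the usual summary formulas. The data values and the function name `least_squares_line` are hypothetical and are not taken from this lesson's exercises.

```python
import math

def least_squares_line(xs, ys):
    """Return (intercept, slope, r) for the LSRL y-hat = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The LSRL always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

# Hypothetical data, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

intercept, slope, r = least_squares_line(xs, ys)
print(f"y-hat = {intercept:.3f} + {slope:.3f}x, r = {r:.3f}")
```

A calculator's linear regression output should agree with these values up to rounding, so a sketch like this can serve as a check on the entries you typed in.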

#### Multimedia Links

For an introduction to what a least-squares regression line represents, see bionicturtle.com, Introduction to Linear R (5:15).

http://www.youtube.com/watch?v=ocGEhiLwDVc

For an applet that will calculate correlation and the least-squares regression line, see

http://illuminations.nctm.org/lessonDetail.aspx?ID=L456

### Problem Set 6.4

#### Section 6.4 Exercises

1) Mr. Exercise wanted to know whether or not customers continued to use their equipment after they purchased it. He contacted an SRS of his customers who had purchased an exercise machine during the past 18 months. His findings are summarized in the following table. We began to look at his data in section 6.1. We are now going to analyze it further.

License: CC BY-NC 3.0

a) Construct a scatterplot. Calculate the LSRL and add it to your graph. Sketch your graph and report your equation. Be sure to identify your variables.

b) Identify and interpret the slope in the context of the problem.

c) Identify and interpret the y-intercept in the context of the problem.

d) What is the correlation coefficient? What are the two things that this statistic tells about the relationship between these two variables?

e) Based on your model, how many hours per week would you predict for a person who has owned the machine for 12 months? For 5 months?

f) Based on your model, if a person claims to exercise 9 hours per week, how long would you suspect that they had owned the machine?
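Part (f) runs the model backwards: the response is given and the explanatory value is wanted. As a sketch, write the LSRL from part (a) as $\hat{y} = a + bx$, where $x$ is months owned and $\hat{y}$ is predicted hours per week. The symbols $a$ and $b$ are placeholders for your own intercept and slope, not values from this text. Solving for $x$ (assuming $b \neq 0$) and substituting $\hat{y} = 9$ gives:

```latex
\hat{y} = a + bx
\quad\Longrightarrow\quad
x = \frac{\hat{y} - a}{b},
\qquad\text{so}\qquad
x = \frac{9 - a}{b}.
```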

2) A college professor was becoming annoyed by how many of his students were absent during his 8:00 a.m. section of Philosophy 103. He decided to analyze whether these absences were affecting how well students learned the material. He assigned his TA the task of keeping track of attendance. At the end of the semester he compared each student's grade on the final exam (100 points possible) with the number of times he or she had been absent. His findings are displayed in the following graph.

License: CC BY-NC 3.0

a) Identify the explanatory and response variables.

b) Describe the relationship between these two variables (S.C.O.F.D.).

c) Jeremy was absent 25 times. What would you predict his score on the final exam to be? Lucy overslept and missed 43 classes. What would you predict for her score on the final?

d) Calculate the correlation coefficient (r). What two things does this statistic tell you about the association between these two variables?

(Hint: you were given $R^2$.)
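The hint works because, for a line fitted with one explanatory variable, $r$ is a square root of $R^2$ and takes the same sign as the slope. Here is a minimal sketch, assuming that the -1.654 mentioned in part (e) is the slope shown on the graph. The $R^2$ value of 0.64 is hypothetical, and the function name `r_from_r_squared` is chosen only for illustration.

```python
import math

def r_from_r_squared(r_squared, slope):
    """Recover r from R^2; r takes the same sign as the LSRL slope."""
    magnitude = math.sqrt(r_squared)
    return math.copysign(magnitude, slope)

# Hypothetical R^2, for illustration only; read the real value from the graph.
# The slope -1.654 is assumed to be the coefficient referred to in part (e).
example_r = r_from_r_squared(0.64, slope=-1.654)
print(round(example_r, 3))
```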

e) Interpret the meaning of -1.654 in the context of this problem.

3) The following table shows the grade level and reading level for 5 students. Treat grade level as the explanatory variable as you do the following.

License: CC BY-NC 3.0

a) Create a scatterplot. Then calculate the LSRL and the correlation coefficient for this data. Report your findings.

What if it was found that student E was actually in grade 8? How would this affect the LSRL and/or the correlation?

License: CC BY-NC 3.0

b) Create a new scatterplot. Then calculate the LSRL and the correlation coefficient for the changed data. Report your findings.

c) What changes do you notice between your answers to (a) and (b)? Explain why these changes occurred.

4) The table below shows the nutritional information for Taco Bell burritos as reported on the website http://www.tacobell.com. Choose two of the variables to analyze (avoid using trans fat and sugars).

License: CC BY-NC 3.0

a) What will you be using as your explanatory and response variables?

b) Construct a scatterplot. Label your axes.

c) Describe the association (S.C.O.F.D.).

d) Calculate the LSRL and the correlation. Report them. Be sure to define your variables. Add the line to your graph in part (b).

e) Use your model to make a prediction that involves interpolation.

f) Use your model to make a prediction that involves extrapolation.

5) *Interpret the calculator output*. The lifeguard at the Swimtastic Pool & Water-Slides decided to keep track of how many people came to the pool each day and compare this to the predicted high temperature for that day. The temperatures ranged from 82° to 96° during his data collection time period. He used the number of people as the response variable. Use this scatterplot and regression output from a TI-84 Plus to answer the questions that follow.

License: CC BY-NC 3.0

a) Write the regression equation. Define your variables.

b) Identify and interpret the slope in the context of the problem.

c) What are the two things that the correlation tells us in this situation?

d) Based on this model, how many people would you predict on a 91° day? How about a 45° day? Are both of these predictions reasonable? Why or why not?
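Whether a prediction counts as interpolation or extrapolation depends only on whether the input falls inside the range of the observed explanatory values, here 82° to 96°. The sketch below makes that check explicit. The intercept and slope are hypothetical stand-ins (the calculator output is not reproduced here), and the function names are chosen only for illustration.

```python
def predict(intercept, slope, x):
    """Predicted response from the LSRL y-hat = intercept + slope * x."""
    return intercept + slope * x

def is_extrapolation(x, x_min, x_max):
    """True when x lies outside the range of the observed explanatory values."""
    return x < x_min or x > x_max

# Observed temperature range, taken from the problem statement
TEMP_MIN, TEMP_MAX = 82, 96

# Hypothetical coefficients, for illustration only; use the calculator output instead
intercept, slope = -1000.0, 14.0

for temp in (91, 45):
    kind = "extrapolation" if is_extrapolation(temp, TEMP_MIN, TEMP_MAX) else "interpolation"
    print(f"{temp} degrees: predicted {predict(intercept, slope, temp):.0f} people ({kind})")
```

With these made-up coefficients the 45° prediction comes out negative, which illustrates why a prediction made far outside the observed range should not be trusted.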

#### Review Exercises

Here are the hourly salaries for the employees at Greezy's Burger Boy: $7.35, $7.85, $7.25, $8.90, $8.25, $7.25, $10.05, $7.70, $16.90, $8.30, $7.75, and $7.55. Use this salary data to answer the following questions.

6) Calculate the mean and standard deviation for the salaries.

7) Calculate the five number summary for the salaries.

8) Construct an accurate box plot.

9) Which numerical summary of center and spread (mean & standard deviation or median & IQR) would be more appropriate in this situation? Explain why.

10) Describe the distribution. Include Shape, Outliers, Context, Center, & Spread (S.O.C.C.S.).
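For exercises 6 through 9, the summaries can be checked with a short script. This is a sketch under stated assumptions: the standard deviation is the sample version (dividing by n - 1), quartiles use the median-of-halves rule, and potential outliers are flagged with the 1.5 × IQR rule. None of these conventions is specified in this section, and your calculator or course may use different ones. The function name `five_number_summary` is chosen only for illustration.

```python
import statistics

# Hourly salaries from the review exercises
salaries = [7.35, 7.85, 7.25, 8.90, 8.25, 7.25,
            10.05, 7.70, 16.90, 8.30, 7.75, 7.55]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For an odd count, the middle value is left out of both halves
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

mean = statistics.mean(salaries)
sd = statistics.stdev(salaries)  # sample standard deviation (assumption)
minimum, q1, median, q3, maximum = five_number_summary(salaries)
iqr = q3 - q1

# 1.5 * IQR rule for flagging potential outliers (assumption)
outliers = [v for v in salaries if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"five-number summary: {minimum}, {q1:.2f}, {median:.2f}, {q3:.2f}, {maximum}")
print(f"IQR = {iqr:.2f}, flagged outliers: {outliers}")
```

Comparing the mean with the median, and the standard deviation with the IQR, before and after setting aside any flagged value is one way to approach exercise 9.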

### Image Attributions

- **[1]** License: CC BY-NC 3.0
- **[2]** License: CC BY-NC 3.0
- **[3]** License: CC BY-NC 3.0
- **[4]** License: CC BY-NC 3.0
- **[5]** License: CC BY-NC 3.0
- **[6]** License: CC BY-NC 3.0