# 6.5: Chapter 6 Review

Created by: Bruce DeWItt

In this chapter, we learned that when working with bivariate, numerical data, it is important to first identify whether there is an explanatory and response relationship between the two variables. Often one of the variables, the explanatory (independent) variable, can be identified as having an impact on the value of the other variable, the response (dependent) variable. The explanatory variable should be placed on the horizontal axis, and the response variable should be placed on the vertical axis. Next we learned how to construct a scatterplot, a visual representation that helps us see what association, if any, there is between the two variables. An association is strong if the points follow a very distinct pattern and weak if the points appear more randomly scattered. If the values of the response variable generally increase as the values of the explanatory variable increase, the data has a positive association. If the response variable generally decreases as the explanatory variable increases, the data has a negative association. The scatterplot also shows the form of the pattern, if any, such as linear or curved.

For data that looks reasonably linear, we learned how to use technology to calculate the least-squares regression line and the correlation coefficient. The least-squares regression line is often useful for making predictions for linear data. However, we now know to beware of extrapolating beyond the range of our actual data. Correlation is a measure of the strength and direction of the linear relationship between two variables; it does not tell us that one variable causes the other. For example, a third variable or a combination of other things may be causing the two correlated variables to relate as they do. We also learned how to interpret the linear correlation coefficient and that it can be greatly affected by outliers and influential points. A high correlation between two variables does not, by itself, mean that they have a cause-and-effect relationship. Correlation ≠ Causation!
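The calculations that technology performs for the least-squares regression line and the correlation coefficient can be sketched in a few lines of Python. This is a minimal illustration, not the calculator procedure used in the chapter. The data (hours studied and test scores) and all function names are hypothetical, invented only for this example.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sums_of_squares(xs, ys):
    """Return (Sxx, Syy, Sxy): sums of squared and cross deviations from the means."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares regression line y-hat = a + b*x."""
    sxx, _, sxy = sums_of_squares(xs, ys)
    b = sxy / sxx                      # slope
    a = mean(ys) - b * mean(xs)        # y-intercept
    return a, b


def correlation(xs, ys):
    """Return the linear correlation coefficient r."""
    sxx, syy, sxy = sums_of_squares(xs, ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: explanatory variable = hours studied, response = test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 80, 87]

a, b = least_squares_line(hours, scores)
r = correlation(hours, scores)
print(f"y-hat = {a:.1f} + {b:.1f}x, r = {r:.3f}")

# Adding one unusual (hypothetical) point shows how strongly r reacts to an outlier.
r_with_outlier = correlation(hours + [6], scores + [40])
print(f"r with outlier = {r_with_outlier:.3f}")
```

With these made-up values, the first five points lie close to a line and r is near 1. After the single unusual point is added, r falls much closer to 0, which is why outliers and influential points deserve a mention in any description.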

Beyond constructing graphs and calculating statistics, we learned how to describe the relationship between the two variables in context. The acronym we learned to help us remember what to include in our descriptions is *S.C.O.F.D.* This tells us to describe the strength of the association, to be sure that our description is in context, to mention any outliers or influential points that we observe, and to describe the form and the direction of the relationship. We also learned how to interpret the slope and y-intercept of the least-squares regression line in context. Even when the calculations are easy, statistics is never about meaningless arithmetic, and we should always be thinking about what a particular statistical measure means in the real context of the data.
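As a worked illustration of interpreting the slope and y-intercept in context, suppose (hypothetically; these numbers are not from the chapter) that the least-squares regression line for predicting a test score from the number of hours studied is

\begin{align*}\hat{y} = 43 + 9x\end{align*}

where x is the number of hours studied and \begin{align*}\hat{y}\end{align*} is the predicted test score. The slope, 9, means that for each additional hour studied, the predicted test score increases by 9 points on average. The y-intercept, 43, means that a student who studies 0 hours is predicted to score 43 points. The y-intercept interpretation is only meaningful when x = 0 is within or near the range of the data; otherwise it is an extrapolation.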

### Chapter 6 Review Exercises

**Answer the following as TRUE or FALSE.**

1) A negative relationship between two variables means that for the most part, as the x variable increases, the y variable increases.

2) A correlation of -1 implies a perfect linear relationship between the variables.

3) The equation of the regression line used in statistics is \begin{align*}\hat{y}= a + bx\end{align*}

4) When the correlation is high, one can assume that x causes y.

**Complete the following statements with the best answer.**

5) The symbol for the correlation coefficient is _____.

6) A statistical graph of two variables is called a(n) ______________________.

7) The ____________________ variable is plotted along the x-axis.

8) The range of r is from _____ to ______.

9) The sign of r and ___________ will always be the same.

10) LSRL stands for _______________________________________________________.

11) If all the points fall on a straight line, the value of r will be ________ or _________.

12) If r = -0.86, then r² = _______.

13) If r² = 0.77, then r = ______ or ______.

14) Using an LSRL to make predictions outside the range of our original data is called ________________.

15) Using an LSRL to make predictions within the range of our original data is called ________________.

16) When describing the relationship visible in a scatterplot, the acronym S.C.O.F.D. stands for _____________________________________________________________________.

17) Suppose that a scatterplot shows a strong, linear, positive relationship, and the correlation coefficient is very high. However, both of the variables are actually increasing due to some outside lurking variable. This relationship suffers from ____________________.

18) Suggest possible lurking variables to explain the high correlations between the following variables. Consider whether common response, confounding, or coincidence may be involved.

a) The number of cell phones being made has been increasing over the past 15 years. So has the number of starving children. Do cell phones cause starvation?

b) The stress level of all of the employees at a certain company has been going up consistently over the past year. During this time, they have received three pay bumps. Does this mean that higher pay is causing the stress?

c) Suppose that a study shows that the number of hours of sleep a person gets is negatively correlated with the number of cigarettes a person smokes. Does this mean that not sleeping causes a person to smoke more cigarettes?

19) Some researchers wanted to determine how well the number of beers consumed can predict what a person's blood alcohol content (BAC) will be after a given length of time. They set up an experiment in which several volunteers each drank a randomly selected number of beers during a given time period. The volunteers were between 21 and 25 years of age, but varied in gender and weight. Exactly three hours after they began to drink the beers, each volunteer's BAC level was measured three times. The three measurements were averaged, and the results are given in the following table. *(This is fictitious data, but is based on calculations from the BAC calculator at: http://www.dot.wisconsin.gov)*

a) Identify the explanatory and response variables and construct a scatterplot (be neat & label your axes).

b) Calculate the LSRL and correlation. Report the equation and add the line to your scatterplot. Identify your variables (report what x and y stand for).

c) Identify and interpret the slope in context.

d) Identify and interpret the y-intercept in context.

e) If a person drinks 6 beers during this time period, on average what do you predict the person’s BAC will be?

f) If a person drinks 15 beers during this time period, on average what do you predict the person’s BAC will be?

g) Are you confident in both of the previous answers? Why or why not?

20) When investigating car crashes, it is often necessary to try to determine the speed at which a vehicle was traveling at the time of the accident. Investigators are able to do this by measuring the length of the skid mark left by the vehicle in question. The following table lists several speeds (mph) based on the skid length (feet), according to the Forensic Dynamics website: http://forensicdynamics.com.

a) Identify the explanatory and response variables and construct a scatterplot (be neat & label your axes).

b) Calculate the LSRL and add it to your scatterplot. Report the equation and identify your variables.

c) Describe the relationship you see in the scatterplot (S.C.O.F.D.).

Be thorough & use complete sentences! Be sure that you explain the relationship in the context of the problem (overall trend between the two variables).

d) What is the correlation coefficient? Based on your scatterplot and the value of r, how well do you feel that your model fits this data? Explain.

e) What is the predicted speed if the skid mark is 157 feet? If it were 36 feet?

f) Would you expect predictions beyond 250 feet to generally over-estimate or under-estimate the actual speed of the vehicle? Why?
