Correlation measures the strength and direction of the linear relationship between the two variables in a bivariate data set. Scatterplots display bivariate data and provide a visual representation of the relationship between the variables.
Examining a scatterplot gives us some idea of the relationship between the two variables:
- lower-left-to-upper-right pattern --> positive correlation
- upper-left-to-lower-right pattern --> negative correlation
- straight line --> perfect correlation
- no linear trend --> zero correlation or a near-zero correlation
The value of a perfect positive correlation is 1.0, while the value of a perfect negative correlation is -1.0.
When there is no linear relationship between two variables, the correlation coefficient is 0. Note: It is important to remember that a correlation coefficient of 0 indicates that there is no linear relationship, but there may still be a strong relationship between the two variables. For example, there could be a quadratic relationship between them.
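The point about a zero coefficient can be checked numerically. The following is a minimal sketch, assuming the standard Pearson product-moment definition of the correlation coefficient; the helper name `pearson_r` and the small data sets are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: the same x values paired with three different y patterns
xs = [-2, -1, 0, 1, 2]
rising = [2 * x + 1 for x in xs]    # points lie exactly on a rising line
falling = [-3 * x for x in xs]      # points lie exactly on a falling line
quadratic = [x ** 2 for x in xs]    # y is fully determined by x, but not linearly

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
r_quadratic = pearson_r(xs, quadratic)

print(r_rising, r_falling, r_quadratic)
```

In the quadratic case every y value is determined exactly by x, yet the coefficient comes out as zero because the positive and negative cross-products cancel.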
Calculating the Regression Line
Linear regression involves using data to calculate a line that best fits that data and then using that line to predict scores. In linear regression, we use one variable (the predictor variable) to predict the outcome of another (the outcome variable, or criterion variable).
Least squares regression is a method of fitting a line to the data so that the sum of the squared differences between the observations and the line is as small as possible. In the example below, you can see the calculated distances, or residual values, from each of the observations to the regression line.
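As you can see, the regression line is a straight line that expresses the relationship between two variables. When predicting one score by using another, we use an equation such as the following, which is equivalent to the slope-intercept form of the equation for a straight line:
$$\hat{Y} = bX + a$$
where $\hat{Y}$ is the predicted score, $b$ is the regression coefficient (the slope), and $a$ is the regression constant (the $Y$-intercept).
We use the following formula to calculate the regression coefficient:
$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}$$
We use the following formula to calculate the regression constant:
$$a = \frac{\sum Y - b\sum X}{n}$$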
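The calculation can be sketched in a few lines of code. This is an illustrative example only, assuming the standard computational formulas for the least squares slope and intercept; the function name `fit_line` and the five data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (b, a): the least squares slope and intercept for y on x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Regression coefficient (slope)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Regression constant (intercept)
    a = (sum_y - b * sum_x) / n
    return b, a

# Hypothetical observations of a predictor (xs) and an outcome (ys)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = fit_line(xs, ys)

# Predicted scores and residuals (observed minus predicted)
predicted = [b * x + a for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]
sse = sum(e ** 2 for e in residuals)  # the quantity least squares minimizes

print(f"slope={b:.3f} intercept={a:.3f} SSE={sse:.3f}")
```

For a least squares line that includes an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.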
Hypothesis Testing for Linear Relationships
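To test whether a linear relationship exists in the population, we test the null hypothesis that the population correlation coefficient is zero ($H_0: \rho = 0$). The test statistic for this hypothesis test is calculated as follows:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
where $r$ is the sample correlation coefficient and $n$ is the number of paired observations. The statistic follows a $t$-distribution with $n - 2$ degrees of freedom.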
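As a rough illustration, the sketch below computes a t statistic for a sample correlation. It assumes the usual test of the null hypothesis that the population correlation is zero, with t = r * sqrt(n - 2) / sqrt(1 - r^2) and n - 2 degrees of freedom. The sample values (r = 0.6, n = 27) are hypothetical, and the critical value is the two-tailed 0.05 entry for 25 degrees of freedom from a standard t table.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: population correlation = 0.

    r is the sample correlation coefficient, n the number of paired
    observations. Returns (t, degrees_of_freedom).
    """
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Hypothetical sample: r = 0.6 from n = 27 paired observations
t_value, df = correlation_t_statistic(0.6, 27)

# Two-tailed critical value at alpha = 0.05 for df = 25, read from a t table
critical_value = 2.060
reject_null = abs(t_value) > critical_value

print(f"t={t_value:.3f} df={df} reject H0: {reject_null}")
```

Because the computed statistic exceeds the critical value, this hypothetical sample would lead us to reject the null hypothesis of no linear relationship.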
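When predicting values using multiple regression, we first use the standard score form of the regression equation, in which every variable is expressed as a standard score and each predictor has its own standardized regression coefficient. This form is shown below:
$$\hat{Z}_Y = \beta_1 Z_1 + \beta_2 Z_2 + \cdots + \beta_k Z_k$$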
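A small sketch of the standard score form with two predictors follows. It assumes the textbook two-predictor formulas for the standardized coefficients (beta weights), which are computed from the pairwise correlations. The function names, correlation values, and z-scores are all hypothetical.

```python
def beta_weights(r_y1, r_y2, r_12):
    """Standardized coefficients for a two-predictor regression.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2

def predict_z(beta1, beta2, z1, z2):
    """Predicted standard score of the outcome from two predictor z-scores."""
    return beta1 * z1 + beta2 * z2

# Hypothetical correlations among the outcome and the two predictors
beta1, beta2 = beta_weights(r_y1=0.5, r_y2=0.4, r_12=0.2)

# Hypothetical case: one SD above the mean on predictor 1,
# half an SD below the mean on predictor 2
z_hat = predict_z(beta1, beta2, z1=1.0, z2=-0.5)

print(f"beta1={beta1:.4f} beta2={beta2:.4f} predicted z={z_hat:.4f}")
```

Since all variables are in standard score form, the equation has no constant term, and the predicted value is itself a z-score that must be converted back to raw units if a raw score is needed.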