Scatter plots and line graphs are the most common ways to display bivariate data (data with two variables).
- A scatter plot is generally used when displaying data from two variables that may or may not be directly related, and when neither of the variables is under the direct control of the researcher. The primary function of a scatter plot is to visualize the strength of correlation between the two plotted variables. The number of sunburned swimmers at the local pool each day for a month would be an example of a data set that would best be displayed as a scatter plot, since neither the weather nor the number of swimmers present is under the control of the researcher.
- A line graph is appropriate when comparing two variables that are believed to be related, and when one of the variables is under the direct control of the researcher. The primary use of a line graph is to determine the trend between the two graphed variables. The mileage of a particular car compared to speed of travel would be a good example, since the mileage is certainly correlated to the speed and the speed can be directly controlled by the researcher.
To create a line graph or scatter plot, you must first identify your two variables as either independent or dependent. An independent (or input) variable may also be referred to as the explanatory variable, and has values that are assigned to it. A dependent (or output) variable may also be called the response variable, and has values that depend on, or result from, the values of the input variable. By convention, the independent variable is plotted on the horizontal axis, and the dependent variable is plotted on the vertical axis.
Then, you must organize your data so that it is easy to see how a given input value relates to a given output value. By convention this is done with a ‘T’ chart or a two-column table, with the input values on the left and the output values on the right, or with a two-row table, with the inputs on the top and the outputs on the bottom.
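The table described above can be sketched in a few lines of Python. This is only an illustration: the function name `t_chart` and the speed and mileage numbers are invented for the example and do not come from a real data set.

```python
# Hypothetical (input, output) pairs: speed in mph -> mileage in mpg.
# These numbers are invented purely for illustration.
pairs = [(30, 32), (40, 35), (50, 34), (60, 30), (70, 26)]


def t_chart(pairs, input_label="Input", output_label="Output"):
    """Return a two-column table (input left, output right) as text lines."""
    # Make the left column wide enough for the label and every input value.
    width = max(len(input_label), max(len(str(x)) for x, _ in pairs))
    lines = [f"{input_label:>{width}} | {output_label}"]
    lines.append("-" * width + "-+-" + "-" * len(output_label))
    for x, y in pairs:
        lines.append(f"{x:>{width}} | {y}")
    return lines


for line in t_chart(pairs):
    print(line)
```

Keeping each input next to its output, as one pair, is what makes the later plotting step mechanical: every row of the table becomes exactly one point.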
Once you have the table constructed, start with the first pair of values and move across your horizontal axis to the first input value and up the vertical axis to the associated output value. Continue the process until all of your points have been graphed.
Once all of your points have been plotted, if you are creating a scatter plot, you’re done! If you are creating a line graph, start at your minimum input value and connect the points as you move to the right on the input axis.
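The connecting step for a line graph can be sketched as follows. The helper name `line_segments` and the sample points are assumptions made for this illustration. The idea is simply to sort the points by input value and join each point to the next one.

```python
def line_segments(points):
    """Sort points by input value and pair each point with its successor."""
    ordered = sorted(points)  # tuples sort by input first, then by output
    # zip stops at the shorter sequence, so n points give n - 1 segments.
    return list(zip(ordered, ordered[1:]))


# Hypothetical points, deliberately listed out of order.
points = [(3, 4), (1, 2), (2, 5)]
for start, end in line_segments(points):
    print(start, "->", end)
```

Sorting first matters: if the points were joined in the order they were recorded, the line could double back on itself instead of moving steadily to the right.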
Note 1: A broken-line graph is a type of line graph that is used when it is necessary to show change over time. A line is used to join the values, but the line has no single defined slope, since the slope can change from one segment to the next. However, the points themselves are meaningful, and each one represents an important part of the graph.
Note 2: A double line graph is a type of line graph that is used to show a comparison. To create a double line graph simply create two line graphs on the same set of axes, one for each data set.
Construct a scatter plot from the given values.
Solution: The data here is already organized into associated input and output values, so you simply need to create a graph with a horizontal and vertical axis on which to plot the points.
Notice that the axes show only positive values, since all of the values in the table are positive.
Now we just plot the points from the table, starting with the first vertical pair: Input = 1, Output = 2. Incidentally, when describing a single point of bivariate data, the conventional method of writing it is in the form (input, output) or (x, y).
Now we fill in the values on the graph, starting with (1, 2). Beginning at the lower-left corner, which represents (0, 0), move 1 unit to the right and 2 units up. The second point is 3 units to the right and 4 units up. Continue until all 10 points are graphed. Since the question asks specifically for a scatter plot, once the individual points are plotted, we are done.
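The plotting procedure can be mimicked with a small text grid. This is a rough sketch, not a replacement for graph paper or plotting software. The function name `text_scatter` and the 4-by-4 grid size are assumptions, and only the two points stated in the solution, (1, 2) and (3, 4), are used.

```python
def text_scatter(points, width, height):
    """Return rows of a character grid with '*' at each (input, output) point.

    (0, 0) is the lower-left corner: inputs run right, outputs run up.
    Points must have whole-number coordinates within the grid.
    """
    grid = [["."] * (width + 1) for _ in range(height + 1)]
    for x, y in points:
        # Row 0 is printed first (the top), so flip the vertical coordinate.
        grid[height - y][x] = "*"
    return ["".join(row) for row in grid]


# The first two points from the worked example.
for row in text_scatter([(1, 2), (3, 4)], width=4, height=4):
    print(row)
```

Each point is placed by moving right by its input value and up by its output value, exactly as in the written steps, and no points are joined because the result is a scatter plot.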
Interpreting Scatter Plots and Line Graphs
- Two variables with a strong correlation will appear as a number of points falling close to a clear and recognizable line. The line does not need to be perfectly straight, but it should be consistent and not exactly horizontal or vertical.
- Two variables with a weak correlation will appear as a much more scattered field of points, with only a little indication of points falling into a line of any sort.
- A linear relationship appears as a straight line either rising or falling as the independent variable values increase. If the line rises to the right, it indicates a direct relationship. If the line falls to the right, it indicates an inverse relationship.
- A non-linear relationship may take the form of any number of curved lines, and may indicate a squared relationship (dependent variable is the square of the independent), a square root relationship (dependent variable is the square root of the independent), an inverse square (dependent variable is one divided by the square of the independent), or many other possibilities.
- A positive correlation appears as a recognizable line with a positive slope. A line has a positive slope when an increase in the independent variable is accompanied by an increase in the dependent variable (the line rises as you move to the right).
- A negative correlation appears as a recognizable line with a negative slope. As the independent variable increases, the dependent variable decreases (the line falls as you move to the right).
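The strength and direction described in the list above can also be summarized with a single number. One common choice, which this lesson does not itself name, is the Pearson correlation coefficient r. Values near 1 or -1 suggest a strong linear pattern, values near 0 suggest a weak one, and the sign matches the direction of the slope. The sketch below is a minimal version using only the standard library, and the sample data are invented.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # If either variable never changes (an exactly horizontal or vertical
    # line of points), sxx * syy is 0 and r is undefined; this raises
    # ZeroDivisionError in that case.
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: a perfectly rising line and a perfectly falling line.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

Note that r only measures how closely the points follow a straight line, so a strongly curved (non-linear) relationship can still produce a value of r that is far from 1 or -1.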