11.3: The Two-Way ANOVA Test
Learning Objectives
- Understand the differences in situations that allow for one-way or two-way ANOVA methods.
- Know the procedure of two-way ANOVA and its application through technological tools.
- Understand completely randomized and randomized block methods of experimental design and their relation to appropriate ANOVA methods.
Introduction
In the previous section, we discussed the one-way ANOVA method, which is the procedure for testing the null hypothesis that the population means of the groups defined by a single independent variable are equal. Sometimes, however, we are interested in the effects of more than one independent variable. Say, for example, that a researcher is interested in determining the effects of different dosages of a dietary supplement on the performance of both males and females on a physical endurance test. The three different dosages of the supplement are low, medium, and high, and the genders are male and female. Analyses of situations with two independent variables, like the one just described, are called two-way ANOVA tests.
Gender | Low Dosage | Medium Dosage | High Dosage | Total |
---|---|---|---|---|
Female | 35.6 | 49.4 | 71.8 | 52.3 |
Male | 55.2 | 92.2 | 110.0 | 85.8 |
Total | 45.2 | 70.8 | 90.9 | |
There are several questions that can be answered by a study like this, such as, "Does the supplement improve physical endurance, as measured by the test?" and "Do males and females respond in the same way to the supplement?"
While there are similar steps in performing one-way and two-way ANOVA tests, there are also some major differences. In the following sections, we will explore the differences in situations that allow for the one-way or two-way ANOVA methods, the procedure of two-way ANOVA, and the experimental designs associated with this method.
The Differences in Situations that Allow for One-way or Two-Way ANOVA
As mentioned in the previous lesson, one-way ANOVA allows us to examine the effect of a single independent variable on a dependent variable (e.g., the effectiveness of a reading program on student achievement). With two-way ANOVA, we are able to study not only the effect of two independent variables (e.g., the effect of dosage and gender on the results of a physical endurance test), but also the interaction between these variables. An example of interaction between the two variables gender and dosage is a finding that men and women respond differently to the supplement.
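To make the idea of interaction concrete, here is a minimal sketch that uses the values from the dietary supplement table in the introduction. The dictionary layout and the helper name `gender_gap` are assumptions made for illustration. If there were no interaction at all, the gap between males and females would be about the same at every dosage.

```python
# Values from the dietary supplement table (rows: gender, columns: dosage).
cell_means = {
    "Female": {"Low": 35.6, "Medium": 49.4, "High": 71.8},
    "Male": {"Low": 55.2, "Medium": 92.2, "High": 110.0},
}


def gender_gap(dose):
    """Male value minus female value at one dosage level (hypothetical helper)."""
    return round(cell_means["Male"][dose] - cell_means["Female"][dose], 1)


# A gap that changes from one dosage to the next hints at an interaction.
gaps = {dose: gender_gap(dose) for dose in ("Low", "Medium", "High")}
for dose, gap in gaps.items():
    print(f"{dose:<6} male - female = {gap}")
```

The gaps here are not constant across dosages, which suggests a possible interaction. Whether that pattern is statistically significant can only be decided with the interaction F-test, not from the table alone.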
We could conduct two separate one-way ANOVA tests to study the effect of two independent variables, but there are several advantages to conducting a two-way ANOVA test.
Efficiency. With simultaneous analysis of two independent variables, the ANOVA test is really carrying out two separate research studies at once.
Control. When including an additional independent variable in the study, we are able to control for that variable. For example, say that we included IQ in the earlier example about the effects of a reading program on student achievement. By including this variable, we are able to determine the effects of various reading programs, the effects of IQ, and the possible interaction between the two.
Interaction. With a two-way ANOVA test, it is possible to investigate the interaction of two or more independent variables. In most real-life scenarios, variables do interact with one another. Therefore, the study of the interaction between independent variables may be just as important as studying the relationship between the independent and dependent variables.
When we perform two separate one-way ANOVA tests, we run the risk of losing these advantages.
Two-Way ANOVA Procedures
There are two kinds of variables in all ANOVA procedures: dependent and independent variables. In one-way ANOVA, we were working with one independent variable and one dependent variable. In two-way ANOVA, there are two independent variables and a single dependent variable. Changes in the dependent variable are assumed to be the result of changes in the independent variables.
In one-way ANOVA, we calculated a single ratio, the \begin{align*}F\end{align*}-ratio, which compared the variation between groups to the variation within groups. In two-way ANOVA, we need to calculate three such ratios: one for each of the two independent variables, and one for the interaction between the two independent variables.
Before, when we performed the one-way ANOVA, we calculated the total variation by determining the variation within groups and the variation between groups. Calculating the total variation in two-way ANOVA is similar, but since we have an additional variable, we need to calculate two more types of variation. Determining the total variation in two-way ANOVA includes calculating: variation within the group (within-cell variation), variation in the dependent variable attributed to one independent variable (variation among the row means), variation in the dependent variable attributed to the other independent variable (variation among the column means), and variation between the independent variables (the interaction effect).
The formulas that we use to calculate these types of variation are very similar to the ones that we used in the one-way ANOVA. For each type of variation, we want to calculate a sum of squared deviations (also known as the sum of squares). After we find each sum of squares, we divide it by its number of degrees of freedom to arrive at the mean square, which allows us to calculate our final ratios. We could do these calculations by hand, but we have technological tools, such as computer programs like Microsoft Excel and graphing calculators, that can compute these figures much more quickly and accurately than we could manually. In order to perform a two-way ANOVA with a TI-83/84 calculator, you must download a calculator program at the following site: http://www.wku.edu/~david.neal/statistics/advanced/anova2.htm.
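As a sketch of how these pieces fit together, the partition of the total variation can be written out for a balanced design (the same number of observations in every cell). The symbols are assumptions introduced here for illustration and are not notation from the text: \begin{align*}R\end{align*} is the number of rows, \begin{align*}C\end{align*} is the number of columns, and \begin{align*}n\end{align*} is the number of observations in each cell.

\begin{align*}
SS_{\text{total}} &= SS_{\text{rows}} + SS_{\text{columns}} + SS_{\text{interaction}} + SS_{\text{within}} \\
df_{\text{rows}} &= R-1, \quad df_{\text{columns}} = C-1, \quad df_{\text{interaction}} = (R-1)(C-1), \quad df_{\text{within}} = RC(n-1) \\
MS &= \frac{SS}{df}, \quad F_{\text{rows}} = \frac{MS_{\text{rows}}}{MS_{\text{within}}}, \quad F_{\text{columns}} = \frac{MS_{\text{columns}}}{MS_{\text{within}}}, \quad F_{\text{interaction}} = \frac{MS_{\text{interaction}}}{MS_{\text{within}}}
\end{align*}

The four degrees of freedom add up to \begin{align*}RCn-1\end{align*}, the total degrees of freedom, just as the four sums of squares add up to the total sum of squares.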
The process for determining and evaluating the null hypothesis for the two-way ANOVA is very similar to the same process for the one-way ANOVA. However, for the two-way ANOVA, we have additional hypotheses, due to the additional variables. For two-way ANOVA, we have three null hypotheses:
- In the population, the means for the rows equal each other. In the example above, we would say that the mean for males equals the mean for females.
- In the population, the means for the columns equal each other. In the example above, we would say that the means for the three dosages are equal.
- In the population, there is no interaction between the two variables. In the example above, we would say that there is no interaction between gender and amount of dosage, or that all interaction effects equal 0.
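For the dietary supplement example, the three null hypotheses can be sketched in symbols. The notation is an assumption introduced here, not taken from the text: \begin{align*}\mu_{ij}\end{align*} is the population mean for gender \begin{align*}i\end{align*} at dosage \begin{align*}j\end{align*}, \begin{align*}\mu_{i\cdot}\end{align*} and \begin{align*}\mu_{\cdot j}\end{align*} are the corresponding row and column means, and \begin{align*}\mu\end{align*} is the grand mean.

\begin{align*}
H_0 \text{ (rows)}: \quad & \mu_{\text{female}} = \mu_{\text{male}} \\
H_0 \text{ (columns)}: \quad & \mu_{\text{low}} = \mu_{\text{medium}} = \mu_{\text{high}} \\
H_0 \text{ (interaction)}: \quad & \mu_{ij} - \mu_{i\cdot} - \mu_{\cdot j} + \mu = 0 \text{ for every cell } (i,j)
\end{align*}

The third line says that each cell mean is fully explained by its row effect plus its column effect, which is what "no interaction" means.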
Let’s take a look at an example of a data set and see how we can interpret the summary tables produced by technological tools to test our hypotheses.
Example: Say that a gym teacher is interested in the effects of the length of an exercise program on the flexibility of male and female students. The teacher randomly selected 48 students (24 males and 24 females) and assigned them to exercise programs of varying lengths (1, 2, or 3 weeks). At the end of the programs, she measured the students' flexibility and recorded the following results. Each cell represents the score of a student:
Gender | Length of Program: 1 Week | Length of Program: 2 Weeks | Length of Program: 3 Weeks |
---|---|---|---|
Females | 32 | 28 | 36 |
Females | 27 | 31 | 47 |
Females | 22 | 24 | 42 |
Females | 19 | 25 | 35 |
Females | 28 | 26 | 46 |
Females | 23 | 33 | 39 |
Females | 25 | 27 | 43 |
Females | 21 | 25 | 40 |
Males | 18 | 27 | 24 |
Males | 22 | 31 | 27 |
Males | 20 | 27 | 33 |
Males | 25 | 25 | 25 |
Males | 16 | 25 | 26 |
Males | 19 | 32 | 30 |
Males | 24 | 26 | 32 |
Males | 31 | 24 | 29 |
Do gender and the length of an exercise program have an effect on the flexibility of students?
Solution:
From these data, we can calculate the following summary statistics:
Gender | Statistic | 1 Week | 2 Weeks | 3 Weeks | Total |
---|---|---|---|---|---|
Females | \begin{align*}n\end{align*} | 8 | 8 | 8 | 24 |
Females | Mean | 24.6 | 27.4 | 41.0 | 31.0 |
Females | St. Dev. | 4.24 | 3.16 | 4.34 | 8.23 |
Males | \begin{align*}n\end{align*} | 8 | 8 | 8 | 24 |
Males | Mean | 21.9 | 27.1 | 28.3 | 25.8 |
Males | St. Dev. | 4.76 | 2.90 | 3.28 | 4.56 |
Totals | \begin{align*}n\end{align*} | 16 | 16 | 16 | 48 |
Totals | Mean | 23.3 | 27.3 | 34.6 | 28.4 |
Totals | St. Dev. | 4.58 | 2.93 | 7.56 | 7.10 |
As we can see from the tables above, it appears that females have more flexibility than males and that the longer programs are associated with greater flexibility. Also, we can take a look at the standard deviation of each group to get an idea of the variance within groups. This information is helpful, but it is necessary to calculate the test statistic to more fully understand the effects of the independent variables and the interaction between these two variables.
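The summary statistics can be reproduced with a few lines of code. The following is a minimal sketch using only Python's standard library. The dictionary layout and the helper names `pooled` and `summarize` are assumptions made for illustration, and the standard deviation is the sample standard deviation (dividing by \begin{align*}n-1\end{align*}).

```python
from statistics import mean, stdev

# Flexibility scores from the example, keyed by (gender, weeks in program).
scores = {
    ("Female", 1): [32, 27, 22, 19, 28, 23, 25, 21],
    ("Female", 2): [28, 31, 24, 25, 26, 33, 27, 25],
    ("Female", 3): [36, 47, 42, 35, 46, 39, 43, 40],
    ("Male", 1): [18, 22, 20, 25, 16, 19, 24, 31],
    ("Male", 2): [27, 31, 27, 25, 25, 32, 26, 24],
    ("Male", 3): [24, 27, 33, 25, 26, 30, 32, 29],
}


def pooled(gender=None, weeks=None):
    """Collect every score that matches the given gender and/or program length."""
    return [
        x
        for (g, w), values in scores.items()
        if (gender is None or g == gender) and (weeks is None or w == weeks)
        for x in values
    ]


def summarize(values):
    """Return (n, mean, sample standard deviation) for a list of scores."""
    return len(values), mean(values), stdev(values)


# None means "all genders" or "all program lengths" (the Total rows and column).
for gender in ("Female", "Male", None):
    for weeks in (1, 2, 3, None):
        n, m, s = summarize(pooled(gender, weeks))
        label = f"{gender or 'All'}, {weeks or 'all'} wk"
        print(f"{label:<16} n={n:2d}  mean={m:6.2f}  sd={s:5.2f}")
```

The printed means carry two decimal places, so they agree with the table once rounded to one decimal place.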
Technology Note: Calculating a Two-Way ANOVA with Excel
Here is the procedure for performing a two-way ANOVA with Excel using this set of data.
- Copy and paste the above table into an empty Excel worksheet, without the labels 'Length of program' and 'Gender'.
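- Select 'Data Analysis' from the Tools menu and choose 'Anova: Two-Factor With Replication' from the list that appears, since each combination of gender and program length contains several scores.
- Place the cursor in the 'Input Range' field and select the entire table. Enter 8 in the 'Rows per sample' field, since each gender has 8 rows of scores.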
- Place the cursor in the 'Output Range' field and click somewhere in a blank cell below the table.
- Click 'Labels' only if you have also included the labels in the table. This will cause the names of the predictor variables to be displayed in the table.
- Click 'OK', and the results shown below will be displayed.
Using technological tools, we can generate the following summary table:
Source | \begin{align*}SS\end{align*} | \begin{align*}df\end{align*} | \begin{align*}MS\end{align*} | \begin{align*}F\end{align*} | Critical Value of \begin{align*}F\end{align*} |
---|---|---|---|---|---|
Rows (gender) | 330.75 | 1 | 330.75 | 22.36* | 4.07 |
Columns (length) | 1,065.5 | 2 | 532.75 | 36.02* | 3.22 |
Interaction | 350 | 2 | 175 | 11.83* | 3.22 |
Within-cell | 621 | 42 | 14.79 | | |
Total | 2,367.25 | 47 | | | |

*Statistically significant at \begin{align*}\alpha=0.05\end{align*}.
From this summary table, we can see that all three \begin{align*}F\end{align*}-ratios exceed their respective critical values.
This means that we can reject all three null hypotheses and conclude that:
In the population, the mean for males differs from the mean of females.
In the population, the means for the three exercise programs differ.
There is an interaction between the length of the exercise program and the student’s gender.
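For readers who want to see where the numbers in the summary table come from, the following is a minimal sketch of the calculation in plain Python. It assumes a balanced design (8 scores in every cell) and finds the interaction sum of squares by subtraction. The variable names are illustrative, and the critical values are copied from the table rather than computed.

```python
from statistics import mean

# Flexibility scores from the example, keyed by (gender, weeks in program).
scores = {
    ("Female", 1): [32, 27, 22, 19, 28, 23, 25, 21],
    ("Female", 2): [28, 31, 24, 25, 26, 33, 27, 25],
    ("Female", 3): [36, 47, 42, 35, 46, 39, 43, 40],
    ("Male", 1): [18, 22, 20, 25, 16, 19, 24, 31],
    ("Male", 2): [27, 31, 27, 25, 25, 32, 26, 24],
    ("Male", 3): [24, 27, 33, 25, 26, 30, 32, 29],
}
genders = ("Female", "Male")
lengths = (1, 2, 3)
n = 8  # scores per cell; the formulas below assume every cell has this many

all_scores = [x for values in scores.values() for x in values]
grand_mean = mean(all_scores)
row_means = {g: mean([x for w in lengths for x in scores[(g, w)]]) for g in genders}
col_means = {w: mean([x for g in genders for x in scores[(g, w)]]) for w in lengths}
cell_means = {cell: mean(values) for cell, values in scores.items()}

# Sums of squares for rows, columns, and within-cell variation.
ss = {
    "rows": n * len(lengths) * sum((m - grand_mean) ** 2 for m in row_means.values()),
    "columns": n * len(genders) * sum((m - grand_mean) ** 2 for m in col_means.values()),
    "within": sum((x - cell_means[c]) ** 2 for c, v in scores.items() for x in v),
}
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
# Interaction is whatever part of the total the other three do not explain.
ss["interaction"] = ss_total - ss["rows"] - ss["columns"] - ss["within"]

df = {
    "rows": len(genders) - 1,
    "columns": len(lengths) - 1,
    "interaction": (len(genders) - 1) * (len(lengths) - 1),
    "within": len(genders) * len(lengths) * (n - 1),
}
ms = {source: ss[source] / df[source] for source in ss}
f_ratio = {s: ms[s] / ms["within"] for s in ("rows", "columns", "interaction")}

# Critical values at alpha = 0.05, copied from the summary table in the text.
critical = {"rows": 4.07, "columns": 3.22, "interaction": 3.22}
for s in f_ratio:
    decision = "reject H0" if f_ratio[s] > critical[s] else "fail to reject H0"
    print(f"{s:<12} SS={ss[s]:8.2f} df={df[s]} MS={ms[s]:7.2f} F={f_ratio[s]:6.2f} {decision}")
```

The sums of squares match the table. The F-ratios can differ from the table in the second decimal place, because the table divides by the within-cell mean square after rounding it to 14.79, while the sketch keeps full precision.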
Technology Note: Two-Way ANOVA on the TI-83/84 Calculator
A program to do a two-way ANOVA on the TI-83/84 calculator is available at http://www.wku.edu/~david.neal/statistics/advanced/anova2.html.
Experimental Design and its Relation to the ANOVA Methods
Experimental design is the process of taking the time and the effort to organize an experiment so that the data are readily available to answer the questions that are of most interest to the researcher. When conducting an experiment using the ANOVA method, there are several ways that we can design an experiment. The design that we choose depends on the nature of the questions that we are exploring.
In a completely randomized design, the subjects or objects are assigned to treatment groups completely at random. For example, a teacher might randomly assign students into one of three reading programs to examine the effects of the different reading programs on student achievement. Often, the person conducting the experiment will use a computer to randomly assign subjects.
In a randomized block design, subjects or objects are first divided into homogeneous categories before being randomly assigned to a treatment group. For example, if an athletic director was studying the effect of various physical fitness programs on males and females, he would first categorize the randomly selected students into homogeneous categories (males and females) before randomly assigning them to one of the physical fitness programs that he was trying to study.
In ANOVA, we use both completely randomized and randomized block designs. In one-way ANOVA, we typically use a completely randomized design. By using this design, we can assume that the observed changes are caused by changes in the independent variable. In two-way ANOVA, since we are evaluating the effect of two independent variables, we typically use a randomized block design. Since the subjects are first grouped by one variable and then randomly assigned to the levels of the other, we are able to evaluate the effects of both variables and the interaction between the two.
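The difference between the two designs can be sketched in code. This is a minimal illustration, not a prescribed procedure: the subject IDs, the function names `completely_randomized` and `randomized_block`, and the fixed random seed are all assumptions made so that the example is self-contained and repeatable.

```python
import random


def completely_randomized(subjects, treatments, rng):
    """Shuffle all subjects together, then deal them out evenly to treatments."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {t: shuffled[i::len(treatments)] for i, t in enumerate(treatments)}


def randomized_block(blocks, treatments, rng):
    """Randomize separately within each block (for example, within each gender)."""
    return {
        name: completely_randomized(members, treatments, rng)
        for name, members in blocks.items()
    }


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
treatments = ["1 week", "2 weeks", "3 weeks"]

# Hypothetical subject IDs: 24 females and 24 males.
blocks = {
    "Female": [f"F{i:02d}" for i in range(1, 25)],
    "Male": [f"M{i:02d}" for i in range(1, 25)],
}

design = randomized_block(blocks, treatments, rng)
for gender, groups in design.items():
    for treatment, members in groups.items():
        print(gender, treatment, len(members))
```

Blocking on gender guarantees 8 females and 8 males in every program. A completely randomized assignment of all 48 students would balance the program sizes but could leave the genders unevenly spread across programs.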
Lesson Summary
With two-way ANOVA, we are not only able to study the effect of two independent variables, but also the interaction between these variables. There are several advantages to conducting a two-way ANOVA, including efficiency, control of variables, and the ability to study the interaction between variables. Determining the total variation in two-way ANOVA includes calculating the following:
Variation within the group (within-cell variation)
Variation in the dependent variable attributed to one independent variable (variation among the row means)
Variation in the dependent variable attributed to the other independent variable (variation among the column means)
Variation between the independent variables (the interaction effect)
It is easier and more accurate to use technological tools, such as computer programs like Microsoft Excel, to calculate the figures needed to evaluate our hypothesis tests.
Review Questions
- In two-way ANOVA, we study not only the effect of two independent variables on the dependent variable, but also the ___ between the two independent variables.
- We could conduct multiple \begin{align*}t\end{align*}-tests between pairs of means, but there are several advantages when we conduct a two-way ANOVA. These include:
- Efficiency
- Control over additional variables
- The study of interaction between variables
- All of the above
- Calculating the total variation in two-way ANOVA includes calculating ___ types of variation.
- 1
- 2
- 3
- 4
- A researcher is interested in determining the effects of different doses of a dietary supplement on the performance of both males and females on a physical endurance test. The three different doses of the supplement are low, medium, and high, and the genders are male and female. He assigns 48 people, 24 males and 24 females, to one of the three levels of the supplement dosage and gives a standardized physical endurance test. Using technological tools, he generates the following summary ANOVA table:
Source | \begin{align*}SS\end{align*} | \begin{align*}df\end{align*} | \begin{align*}MS\end{align*} | \begin{align*}F\end{align*} | Critical Value of \begin{align*}F^*\end{align*} |
---|---|---|---|---|---|
Rows (gender) | 14.832 | 1 | 14.832 | 14.94 | 4.07 |
Columns (dosage) | 17.120 | 2 | 8.560 | 8.62 | 3.22 |
Interaction | 2.588 | 2 | 1.294 | 1.30 | 3.22 |
Within-cell | 41.685 | 42 | 0.992 | | |
Total | 76.226 | 47 | | | |

\begin{align*}^* \alpha=0.05\end{align*}
(a) What are the three hypotheses associated with the two-way ANOVA method?
(b) What are the three null hypotheses for this study?
(c) What are the critical values for each of the three hypotheses? What do these tell us?
(d) Would you reject the null hypotheses? Why or why not?
(e) In your own words, describe what these results tell us about this experiment.
On the Web
http://www.ruf.rice.edu/~lane/stat_sim/two_way/index.html Two-way ANOVA applet that shows how the sums of square total is divided between factors \begin{align*}A\end{align*} and \begin{align*}B\end{align*}, the interaction of \begin{align*}A\end{align*} and \begin{align*}B\end{align*}, and the error.
http://tinyurl.com/32qaufs Shows partitioning of sums of squares in a one-way analysis of variance.
http://tinyurl.com/djob5t Understanding ANOVA visually. There are no numbers or formulas.
Keywords
- ANOVA method
- Analysis of Variance (ANOVA), which allows us to test the hypothesis that multiple population means and variances of scores are equal.
- Experimental design
- Experimental design is the process of taking the time and the effort to organize an experiment so that the data are readily available to answer the questions that are of most interest to the researcher.
- \begin{align*}F\end{align*}-distribution
- To test hypotheses about variance, we use a statistical tool called the \begin{align*}F\end{align*}-distribution.
- \begin{align*}F\end{align*}-Max test
- When we test the hypothesis that two variances of populations from which random samples were selected are equal, \begin{align*}H_0:\sigma^2_1=\sigma^2_2\end{align*} (or in other words, that the ratio of the variances \begin{align*}\frac{\sigma^2_1}{\sigma^2_2}=1\end{align*}), we call this test the \begin{align*}F\end{align*}-Max test.
- \begin{align*}F\end{align*}-ratio test statistic
- We use the \begin{align*}F\end{align*}-ratio test statistic when testing the hypothesis that there is no difference between population variances.
- Grand mean
- The grand mean is the mean of all the scores from all the groups combined. In ANOVA, we analyze the total variation of the scores around the grand mean, which includes the variation of the scores within the groups and the variation between the group means.
- Mean squares between groups
- The mean square between groups compares the means of the groups to the grand mean: \begin{align*}\frac{SS_{\text{between}}}{K-1}\end{align*}, where \begin{align*}K\end{align*} is the number of groups. If the means across groups are close together, this number will be small.
- Mean squares within groups
- The mean squares within groups calculation is also called the pooled estimate of the population variance.
- Pooled estimate of the population variance
- When we square the standard deviation of a sample, we are estimating the population variance. The pooled estimate combines the sample variances of all the groups into a single estimate.
- \begin{align*}SS_B\end{align*}
- The sum of squares between groups, which is the sum of the squared differences between each group mean and the grand mean, weighted by the number of scores in the group.
- \begin{align*}SS_W\end{align*}
- \begin{align*}SS_W=\sum^m_{k=1} \sum^{n_k}_{i=1} x^2_{ik}-\sum^m_{k=1} \frac{T^2_k}{n_k}\end{align*}
- Two-way ANOVA
- In two-way ANOVA, there are two independent variables and a single dependent variable. Changes in the dependent variable are assumed to be the result of changes in the independent variables.
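The computational formula for \begin{align*}SS_W\end{align*} in the keyword list can be checked against the gym teacher example. The sketch below treats each of the six gender-by-length cells as one group and reads \begin{align*}T_k\end{align*} as the total of the scores in group \begin{align*}k\end{align*} and \begin{align*}n_k\end{align*} as the number of scores in that group. That reading of the symbols is an assumption, since the keyword list does not define them.

```python
# Flexibility scores from the gym teacher example; each list is one
# gender-by-length cell, treated here as one of m = 6 groups.
groups = [
    [32, 27, 22, 19, 28, 23, 25, 21],  # females, 1 week
    [28, 31, 24, 25, 26, 33, 27, 25],  # females, 2 weeks
    [36, 47, 42, 35, 46, 39, 43, 40],  # females, 3 weeks
    [18, 22, 20, 25, 16, 19, 24, 31],  # males, 1 week
    [27, 31, 27, 25, 25, 32, 26, 24],  # males, 2 weeks
    [24, 27, 33, 25, 26, 30, 32, 29],  # males, 3 weeks
]

# Computational formula: sum of all squared scores minus the sum of T_k^2 / n_k.
sum_of_squares = sum(x * x for g in groups for x in g)
correction = sum(sum(g) ** 2 / len(g) for g in groups)
ss_w_computational = sum_of_squares - correction

# Definitional form: squared deviations of each score from its own group mean.
ss_w_definitional = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

print(ss_w_computational, ss_w_definitional)
```

Both forms come to 621, which is the within-cell sum of squares reported in the two-way ANOVA summary table for this example.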