# 11.1: The F-Distribution and Testing Two Variances

Created by: CK-12

## Learning Objectives

- Understand the differences between the \begin{align*}F\end{align*}- and the Student’s \begin{align*}t\end{align*}-distributions.
- Calculate a test statistic as a ratio of values derived from sample variances.
- Use random samples to test hypotheses about multiple independent population variances.
- Understand the limits of inferences derived from these methods.

## Introduction

In previous lessons we learned how to conduct hypothesis tests examining the relationship between two variables. Most of these tests simply evaluated the relationship of the **means** of two variables. However, sometimes we also want to test the **variance**, or the degree to which observations are spread out within a distribution. In the figure below, we see three samples with identical means (the samples in red, green and blue) but with very different variances.

So why would we want to conduct a hypothesis test on variance? Let’s consider an example. Say that a teacher wants to examine the effectiveness of two reading programs. She randomly assigns her students to two groups, uses a different reading program with each group and gives her students an achievement test. In deciding which reading program is more effective, it would be helpful to look not only at the mean scores of each of the groups, but also at the “spreading out” of the achievement scores. To test hypotheses about variance, we use a statistical tool called the **\begin{align*}F\end{align*}-distribution.**

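In this lesson we will examine the difference between the \begin{align*}F\end{align*}- and the Student’s \begin{align*}t\end{align*}-distributions, calculate a test statistic as a ratio of sample variances, and test hypotheses about independent population variances. We will also look at the limits of inferences drawn from this test.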

## Differences between the F- and Student’s t-Distributions

As a review, we use the Student’s \begin{align*}t\end{align*}-distribution when the population variance is *not* known and it is necessary to estimate it by using the variance of the sample. Using the variance of a sample to estimate the population variance can be inappropriate, especially if we have a small sample size. For estimating the population variance from a small sample we use a statistical tool called the **Student’s \begin{align*}t\end{align*}-distribution.**

The Student’s

The **\begin{align*}F\end{align*}-Max test.**

Since we are testing ratios, the

**F-Max Test: Calculating the Sample Test Statistic**

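We use the **\begin{align*}F\end{align*}-ratio** test statistic when testing the hypothesis that there is no difference between population variances. To calculate this ratio, we only need the variance from each of the samples. It is recommended that the larger sample variance be placed in the numerator of the \begin{align*}F\end{align*}-ratio and the smaller sample variance in the denominator, so that the ratio is always at least 1:

\begin{align*}F = \frac{s_1{^2}}{s_2{^2}}\end{align*}

where \begin{align*}s_1{^2}\end{align*} is the larger sample variance and \begin{align*}s_2{^2}\end{align*} is the smaller.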
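The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names `f_ratio` and `f_ratio_from_samples` and the raw scores are hypothetical, while the two variances (42.30 and 18.80) are taken from the SAT preparatory course exercise in the Review Questions.

```python
import statistics


def f_ratio(variance_a, variance_b):
    """Return the F-ratio with the larger sample variance in the numerator."""
    if variance_a <= 0 or variance_b <= 0:
        raise ValueError("sample variances must be positive")
    return max(variance_a, variance_b) / min(variance_a, variance_b)


def f_ratio_from_samples(sample_a, sample_b):
    """Compute each sample variance (n - 1 in the denominator), then the F-ratio."""
    return f_ratio(statistics.variance(sample_a), statistics.variance(sample_b))


# Variances from the SAT preparatory course exercise; argument order does not matter.
print(round(f_ratio(18.80, 42.30), 2))  # 2.25

# Hypothetical raw scores, used only to show the call.
group_a = [41, 52, 38, 47, 60, 35]
group_b = [44, 46, 43, 47, 45, 48]
print(f_ratio_from_samples(group_a, group_b))
```

Because the larger variance always goes on top, the result is never below 1, so only the upper critical value is needed when the ratio is compared with a table.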

**Example:**

Suppose a teacher administered two different reading programs to two groups of students and collected the following achievement score data:

What is the \begin{align*}F\end{align*}-ratio for these data?

**Solution:**

## F-Max Test: Testing Hypotheses about Multiple Independent Population Variances

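As mentioned, in certain situations we are interested in determining if there is a difference in the population variances between two independent samples. We can conduct a hypothesis test of no difference between the population variances with the null hypothesis \begin{align*}H_0: \sigma_1{^2} = \sigma_2{^2}\end{align*} and the alternative hypothesis \begin{align*}H_a: \sigma_1{^2} \ne \sigma_2{^2}\end{align*}.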

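Establishing the critical values in an \begin{align*}F\end{align*}-test is slightly more involved than in the tests we have seen so far, because we need the degrees of freedom from **each** of the samples to determine the critical values.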

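Say, for example, that we are trying to determine the critical values for the scenario above and we set the level of significance at \begin{align*}\alpha = .02\end{align*}. Using the degrees of freedom from the two samples, the \begin{align*}F\end{align*}-table gives a critical value of \begin{align*}2.20\end{align*}.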

Once we set our critical values and calculate our test statistic, we perform the hypothesis test the same way we do with the hypothesis tests using the normal and the Student’s \begin{align*}t\end{align*}-distributions.
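When a printed table is not at hand, a critical value can also be approximated numerically. The sketch below is one way to do this using only Python's standard library: it integrates the \begin{align*}F\end{align*} density with Simpson's rule and searches for the critical value by bisection. The function names are hypothetical, and the zero-density shortcut at \begin{align*}x = 0\end{align*} assumes the numerator degrees of freedom exceed 2. Placing \begin{align*}\alpha/2\end{align*} in the upper tail is an assumption for a two-sided alternative; it agrees with the answer key for the SAT preparatory course exercise in the Review Questions (\begin{align*}n = 31\end{align*}, \begin{align*}n = 21\end{align*}, \begin{align*}\alpha = .10\end{align*}).

```python
import math


def f_pdf(x, df_num, df_den):
    """Density of the F-distribution at x (assumes df_num > 2)."""
    if x <= 0.0:
        return 0.0
    log_density = (
        math.lgamma((df_num + df_den) / 2)
        - math.lgamma(df_num / 2)
        - math.lgamma(df_den / 2)
        + (df_num / 2) * math.log(df_num / df_den)
        + (df_num / 2 - 1) * math.log(x)
        - ((df_num + df_den) / 2) * math.log(1 + df_num * x / df_den)
    )
    return math.exp(log_density)


def f_upper_tail(x, df_num, df_den, steps=2000):
    """P(F > x), from a Simpson's-rule integral of the density over [0, x]."""
    if x <= 0.0:
        return 1.0
    h = x / steps  # steps must be even for Simpson's rule
    total = f_pdf(0.0, df_num, df_den) + f_pdf(x, df_num, df_den)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * f_pdf(i * h, df_num, df_den)
    return 1.0 - total * h / 3


def f_critical(tail_area, df_num, df_den):
    """Value c with P(F > c) = tail_area, located by bisection."""
    low, high = 0.0, 1.0
    while f_upper_tail(high, df_num, df_den) > tail_area:
        high *= 2  # widen the bracket until it contains the critical value
    for _ in range(50):
        mid = (low + high) / 2
        if f_upper_tail(mid, df_num, df_den) > tail_area:
            low = mid
        else:
            high = mid
    return (low + high) / 2


# Degrees of freedom are n - 1 for each sample; the larger-variance sample goes first.
df_num, df_den = 31 - 1, 21 - 1
alpha = 0.10
critical = f_critical(alpha / 2, df_num, df_den)  # assumed: alpha / 2 in the upper tail
f_ratio = 42.30 / 18.80
print(round(critical, 2), round(f_ratio, 2), f_ratio > critical)  # 2.04 2.25 True
```

A statistics package would normally be used instead of hand-rolled integration. The point here is only that the table value depends on the chosen tail area and on both degrees of freedom.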

**Example:**

Using our example above, suppose a teacher administered two different reading programs to two different groups of students and was interested if one program produced a greater variance in scores. Perform a hypothesis test to answer her question.

**Solution:**

In the example above, we calculated an \begin{align*}F\end{align*} ratio of \begin{align*}2.909\end{align*} and found a critical value of \begin{align*}2.20\end{align*}.

Since the observed test statistic exceeds the critical value, we reject the null hypothesis. If the population variances were equal, a ratio of sample variances this large would occur by chance less than \begin{align*}2\% (.02)\end{align*} of the time. We can conclude that the variance of the student achievement scores for the second sample is less than the variance for the first sample. We can also see that the achievement test means are practically equal, so the difference in the variance of student achievement scores may help the teacher in her selection of a program.

## The Limits of Using the F-Distribution to Test Variance

The test of the null hypothesis \begin{align*}H_0: \sigma_1{^2} = \sigma_2{^2}\end{align*} using the \begin{align*}F\end{align*}-distribution is only appropriate when it can be safely assumed that the population is normally distributed. When testing the equality of variances (or standard deviations) between two samples, it is important to remember that the \begin{align*}F\end{align*}-test is extremely sensitive to departures from normality. Therefore, if the data display even small departures from the normal distribution, including non-linearity or outliers, the test is unreliable and should not be used. In the next lesson, we will introduce several tests that we can use when the data are not normally distributed.

## Lesson Summary

- We use the \begin{align*}F\end{align*}-Max test and the \begin{align*}F\end{align*}-distribution when testing if two variances from independent samples are equal.
- The \begin{align*}F\end{align*}-distribution differs from the Student’s \begin{align*}t\end{align*}-distribution. Unlike the normal and the \begin{align*}t\end{align*}-distributions, the \begin{align*}F\end{align*}-distributions are not symmetrical and go from zero to infinity \begin{align*}(\infty)\end{align*}, not from \begin{align*}-\infty\end{align*} to \begin{align*}\infty\end{align*} as the others do.
- When testing the variances from independent samples, we calculate the \begin{align*}F\end{align*}-ratio, which is the ratio of the variances of the independent samples.
- When we reject the null hypothesis \begin{align*}H_0: \sigma_1{^2} = \sigma_2{^2}\end{align*} we conclude that the variances of the two populations are not equal.
- The test of the null hypothesis \begin{align*}H_0: \sigma_1{^2} = \sigma_2{^2}\end{align*} using the \begin{align*}F\end{align*}-distribution is only appropriate when it can be safely assumed that the population is normally distributed.

**Supplemental Links**

- Distribution Tables http://www.statsoft.com/textbook/sttable.html

## Review Questions

- We use the \begin{align*}F\end{align*}-Max test to examine the differences in the ___ between two independent samples.
- List two differences between the \begin{align*}F\end{align*}- and the Student’s \begin{align*}t\end{align*}-distributions.
- When we test the differences between the variance of two independent samples, we calculate the ___.
- When calculating the \begin{align*}F\end{align*}-ratio, it is recommended that the sample with the ___ sample variance be placed in the numerator and the sample with the ___ sample variance be placed in the denominator.

Suppose a guidance counselor tested the means of two student achievement samples from different SAT preparatory courses. She found that the two independent samples had similar means, but she also wants to test the variances associated with the samples. She collected the following data:

\begin{align*}& \text{SAT Prep Course}\ \#1 & & \text{SAT Prep Course}\ \#2\\ & n = 31 & & n = 21\\ & s^2 = 42.30 & & s^2 = 18.80\end{align*}

- What are the null and alternative hypotheses for this scenario?
- What is the critical value at \begin{align*}\alpha =.10\end{align*}?
- Calculate the \begin{align*}F\end{align*}-ratio.
- Would you reject or fail to reject the null hypothesis? Explain your reasoning.
- Interpret the results and what the guidance counselor can conclude from this hypothesis test.
- True or False: The test of the null hypothesis \begin{align*}H_0: \sigma_1{^2} = \sigma_2{^2}\end{align*} using the \begin{align*}F\end{align*}-distribution is only appropriate when it can be safely assumed that the population is normally distributed.

## Review Answers

- Variance
- Answers may vary but could include:
  - We use the \begin{align*}t\end{align*}-distribution when testing the difference between the means of two independent samples and the \begin{align*}F\end{align*}-distribution when testing the difference between the variances of two independent samples.
  - The \begin{align*}t\end{align*}-distribution is based on one value for degrees of freedom and the \begin{align*}F\end{align*}-distribution is based on two.
  - \begin{align*}F\end{align*}-distributions are not symmetrical; \begin{align*}t\end{align*}-distributions are.
  - \begin{align*}t\end{align*}-values range from \begin{align*}-\infty\end{align*} to \begin{align*}\infty\end{align*}, while \begin{align*}F\end{align*}-ratios range from zero to \begin{align*}\infty\end{align*}.

- \begin{align*}F\end{align*}-ratio
- larger, smaller
- \begin{align*}H_0: \sigma_1{^2} = \sigma_2{^2}\end{align*} or \begin{align*}\sigma_1{^2}/\sigma_2{^2} = 1, H_a: \sigma_1{^2} \ne \sigma_2{^2}\end{align*} or \begin{align*}\sigma_1{^2}/\sigma_2{^2} \ne 1\end{align*}
- \begin{align*}2.04\end{align*}
- \begin{align*}2.25\end{align*}
- We would reject the null hypothesis because the calculated \begin{align*}F\end{align*} ratio \begin{align*}(2.25)\end{align*} exceeds the critical value \begin{align*}(2.04)\end{align*}.
- We can conclude that the variance of the student achievement scores for the second sample is less than the variance for the students in the first sample. Since the achievement test means are practically equal, the variance in student achievement scores may help the guidance counselor in her selection of a preparatory program.
- True