# 10.3: Testing One Variance

**At Grade** Created by: CK-12

## Learning Objectives

- Test a hypothesis about a single variance using the Chi-Square distribution.
- Calculate a confidence interval for a population variance based on a sample standard deviation.

## Introduction

In the previous lesson we learned how the Chi-Square test can help us assess the relationship between two variables. But the Chi-Square test can also help us test hypotheses surrounding **variance,** which is the measure of the variation, or scattering, of scores in a distribution. Often when we test variance we are assessing whether or not a **sample variance** differs from a hypothesized population variance by more than we would expect due to chance. This test is somewhat similar to the z-test, where we measure the likelihood that a single observation came from a population, but a bit different since we are using **samples** instead of **individual** observations.

There are several different tests that we can use to assess the variance of a sample. The most common tests used to assess variance are the single-sample Chi-Square test, the F-test and the *An*alysis *o*f *Va*riance (ANOVA). Both the Chi-Square test and the F-test are extremely sensitive to non-normality (or when the populations do not have a normal distribution) so the ANOVA test is used most often for this analysis. However, in this section we will examine the testing of a single variance using the Chi-Square test in greater detail.

## Testing a Single Variance Hypothesis Using the Chi-Square Test

Suppose that we want to test whether a sample comes from a population with a known or hypothesized variance. This kind of variance testing is used quite frequently in the manufacturing of food, parts and medications since it is necessary for individual products of each of these types to be very similar in size and chemical make-up.

To test a hypothesis about a single variance using the Chi-Square distribution, we need several pieces of information. First, as mentioned, we should check to make sure that the population has a normal distribution. Next, we need to determine the number of observations in the sample. The remaining pieces of information that we need are the sample standard deviation and the hypothesized population variance, which we learned how to calculate in previous lessons. For the purposes of this exercise, we will assume that we are given the sample standard deviation and the population variance.

Using these key pieces of information, we use the following formula to calculate the Chi-Square value for testing a hypothesis about a single variance:

\begin{align*}X^2 = \frac{df \cdot s^2} {\sigma^2}\end{align*}

where:

\begin{align*}X^2 =\end{align*} Chi-Square statistical value

\begin{align*}df =\end{align*} degrees of freedom \begin{align*}= N-1\end{align*}, where \begin{align*}N =\end{align*} size of the sample

\begin{align*}s^2=\end{align*} sample variance

\begin{align*}\sigma^2 =\end{align*} population variance
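As a minimal sketch, the formula above can be written as a small Python function. The function name `chi_square_statistic` and its parameter names are hypothetical, chosen only for illustration, and the numbers in the usage line are made up rather than taken from the lesson.

```python
def chi_square_statistic(n, sample_variance, hypothesized_variance):
    """Return X^2 = df * s^2 / sigma^2, with df = n - 1."""
    if n < 2:
        raise ValueError("need at least two observations")
    if hypothesized_variance <= 0:
        raise ValueError("hypothesized variance must be positive")
    df = n - 1  # degrees of freedom
    return df * sample_variance / hypothesized_variance


# Hypothetical values: 10 observations, s^2 = 4, hypothesized sigma^2 = 2.
example_value = chi_square_statistic(10, 4.0, 2.0)
```

With these made-up inputs the degrees of freedom are 9, so the statistic is 9 times 4 divided by 2.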

Similar to the \begin{align*}z\end{align*}-test, we want to test the hypothesis that the sample comes from a population with a variance greater than the hypothesized variance. Let’s take a look at an example to help clarify.

**Example:** Suppose we have a sample of \begin{align*}41\end{align*} female gymnasts from Mission High School. We want to know if their heights are truly a random sample of the general high school population, with respect to variance. We know from a previous study that the standard deviation for height of high school women is \begin{align*}2.2\end{align*}.

To test this question, we first need to generate null and alternative hypotheses. Our null hypothesis states that the sample comes from the population that has a variance of \begin{align*}4.84\end{align*} (\begin{align*}\sigma^2 =\end{align*} the standard deviation of the overall population squared or \begin{align*}4.84\end{align*}). Therefore:

*Null Hypothesis* \begin{align*}H_0: \sigma^2 \le 4.84\end{align*} *(the variance of the population from which the sample was drawn is less than or equal to 4.84)*

*Alternative Hypothesis* \begin{align*}H_a: \sigma^2 > 4.84\end{align*} *(the variance of the population from which the sample was drawn is greater than 4.84)*

Using the sample of the \begin{align*}41\end{align*} gymnasts, we compute the standard deviation and find it to be \begin{align*}1.2 (s = 1.2)\end{align*}. Using the information from above, we can calculate our Chi-Square value and find that:

\begin{align*}X^2 = \frac{df \cdot s^2} {\sigma^2} = \frac{40 \cdot 1.2^2} {4.84} = 11.90\end{align*}

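To decide, we compare this value with the critical value from the Chi-Square distribution table. For a one-tailed test at the \begin{align*}0.05\end{align*} significance level with \begin{align*}40\end{align*} degrees of freedom, the critical value is \begin{align*}55.76\end{align*}. Since \begin{align*}11.90\end{align*} is less than \begin{align*}55.76\end{align*}, we fail to reject the null hypothesis and therefore cannot conclude that this sample of female gymnasts has significantly higher variance in height than the general female high school population.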
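The decision in the gymnast example can be sketched in a few lines of Python. This is only an illustration: the function name `upper_tailed_variance_test` is hypothetical, and the critical value is passed in by hand from the Chi-Square table rather than computed.

```python
def upper_tailed_variance_test(n, sample_sd, hypothesized_variance, critical_value):
    """Compare X^2 = df * s^2 / sigma^2 with a table critical value.

    Returns the statistic and True if the null hypothesis
    (sigma^2 <= hypothesized_variance) is rejected.
    """
    df = n - 1
    statistic = df * sample_sd ** 2 / hypothesized_variance
    reject = statistic > critical_value
    return statistic, reject


# Gymnast example: n = 41, s = 1.2, sigma = 2.2, table critical value 55.76.
statistic, reject = upper_tailed_variance_test(41, 1.2, 2.2 ** 2, 55.76)
print(f"X^2 = {statistic:.2f}, reject H0: {reject}")
```

Here the statistic is about 11.90, which is below 55.76, so `reject` is `False` and we fail to reject the null hypothesis.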

## Calculating a Confidence Interval for a Population Variance

Once we know how to test a hypothesis about a single variance, calculating a confidence interval for a population variance is relatively easy. Again, it is important to remember that this test is dependent on the normality of the population. For non-normal populations, it is best to use the ANOVA test which we will cover in greater detail in another lesson.

Similar to constructing confidence intervals in other types of tests, we construct a confidence interval for a population variance to identify a range that we think will encompass the variance. To construct it, we need three pieces of information: the number of observations in the sample, the variance of the sample, and the desired confidence level. With the desired confidence level (most often this is set at \begin{align*}90\end{align*} or \begin{align*}95 \%\end{align*}), we can determine the upper and lower limits of the interval.

To construct the upper limit of the confidence interval, we use the Chi-Square value at \begin{align*}\alpha/2\end{align*}, where \begin{align*}\alpha\end{align*} (the Greek letter alpha) is the probability that the variance is *not in* the interval, and for the lower limit we use the Chi-Square value at \begin{align*}1 - (\alpha/2)\end{align*}. Therefore, when constructing a \begin{align*} 90 \%\end{align*} confidence interval \begin{align*}(\alpha = 0.1)\end{align*}, we would find that the two limits of the confidence interval would be at \begin{align*}0.05\ (\alpha/2)\end{align*} and \begin{align*} 0.95\ (1 - (\alpha/2))\end{align*}. Similarly, a \begin{align*}98 \%\end{align*} confidence interval \begin{align*}(\alpha = 0.02)\end{align*} would have limits set at \begin{align*}0.01\end{align*} and \begin{align*}0.99\end{align*}. Using these limits and the number of degrees of freedom from the sample, we can use the standard Chi-Square distribution table to look up actual values to construct our confidence interval for population variance. Let’s look at an example to help clarify.
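The table lookup described above can also be approximated numerically. The sketch below is not the lesson's method: it swaps the printed table for the standard series expansion of the regularized incomplete gamma function plus a simple bisection search, using only Python's standard library. The function names (`tail_probabilities`, `chi2_cdf`, `chi2_quantile`) are hypothetical, and the code is meant only for the modest degrees of freedom found in printed tables.

```python
import math


def tail_probabilities(confidence_level):
    """Return (alpha / 2, 1 - alpha / 2) for a confidence level such as 0.90."""
    alpha = 1.0 - confidence_level
    return alpha / 2.0, 1.0 - alpha / 2.0


def chi2_cdf(x, df):
    """P(X <= x) for a Chi-Square variable with df degrees of freedom.

    Uses the series for the regularized lower incomplete gamma
    function P(a, z) with a = df / 2 and z = x / 2.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return math.exp(-z + a * math.log(z) - math.lgamma(a)) * total


def chi2_quantile(p, df):
    """Find x with chi2_cdf(x, df) = p by bisection (0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    lo, hi = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# 90% confidence level with 29 degrees of freedom.
lower_p, upper_p = tail_probabilities(0.90)
lower_critical = chi2_quantile(lower_p, 29)
upper_critical = chi2_quantile(upper_p, 29)
print(f"{lower_critical:.2f} {upper_critical:.2f}")
```

For 29 degrees of freedom, the two values should land close to the table entries 17.71 and 42.56.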

**Example:** We randomly select \begin{align*}30\end{align*} samples of Coca Cola and measure the amount of sugar in each sample. Using the formula that we learned earlier, we calculate that the variance of the sample is \begin{align*}5.20\end{align*}. What is the \begin{align*}90 \%\end{align*} confidence interval for the population variance? In other words, assuming the samples are drawn at random from a normal population, what range of values do we expect to contain the population variance?

To construct this 90% confidence interval, we first need to determine our upper and lower limits. The formula to construct this confidence interval and calculate the population variance \begin{align*}(\sigma^2)\end{align*} is:

\begin{align*}X^2_{0.05} \le \frac{df \cdot s^2} {\sigma^2} \le X^2_{0.95}\end{align*}
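As a sketch of the algebra, this inequality can be rearranged to isolate \begin{align*}\sigma^2\end{align*}. Assuming all quantities are positive, taking reciprocals reverses the direction of the inequalities, and multiplying through by \begin{align*}df \cdot s^2\end{align*} gives the interval:

\begin{align*}X^2_{0.05} \le \frac{df \cdot s^2} {\sigma^2} \le X^2_{0.95} \quad \Longrightarrow \quad \frac{1} {X^2_{0.95}} \le \frac{\sigma^2} {df \cdot s^2} \le \frac{1} {X^2_{0.05}} \quad \Longrightarrow \quad \frac{df \cdot s^2} {X^2_{0.95}} \le \sigma^2 \le \frac{df \cdot s^2} {X^2_{0.05}}\end{align*}

Note that the larger critical value ends up in the denominator of the lower limit and the smaller critical value in the denominator of the upper limit.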

Using our standard Chi-Square distribution table, we can look up the critical \begin{align*}X^2\end{align*} values for \begin{align*}0.05\end{align*} and \begin{align*}0.95\end{align*} at \begin{align*}29\end{align*} degrees of freedom. Using our \begin{align*}X^2\end{align*} distribution table, we find that \begin{align*}X^2_{0.95} = 42.56\end{align*} and that \begin{align*}X^2_{0.05} = 17.71\end{align*}. Since we know the number of observations and the variance for this sample, we can then solve for \begin{align*}\sigma^2\end{align*}:

\begin{align*}\frac{df \cdot s^2} {42.56} & \le \sigma^2 \le \frac{df \cdot s^2} {17.71} \\ \frac{29 \cdot 5.20} {42.56} & \le \sigma^2 \le \frac{29 \cdot 5.20} {17.71} \\ \frac{150.80} {42.56} & \le \sigma^2 \le \frac{150.80} {17.71} \\ 3.54 & \le \sigma^2 \le 8.51\end{align*}

In other words, we are \begin{align*}90 \%\end{align*} confident that the population variance is between \begin{align*}3.54\end{align*} and \begin{align*}8.51\end{align*}.
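The interval calculation can be sketched in Python as follows. The function name `variance_confidence_interval` and its parameter names are hypothetical, and the two critical values are assumed to be read from the Chi-Square table by hand.

```python
def variance_confidence_interval(n, sample_variance, chi2_lower, chi2_upper):
    """Return (lower, upper) limits for the population variance.

    chi2_lower is the table value at alpha / 2 and chi2_upper the value
    at 1 - alpha / 2, both for df = n - 1 degrees of freedom.
    """
    if not 0 < chi2_lower < chi2_upper:
        raise ValueError("expected 0 < chi2_lower < chi2_upper")
    df = n - 1
    numerator = df * sample_variance
    # Dividing by the larger critical value gives the lower limit.
    return numerator / chi2_upper, numerator / chi2_lower


# Coca Cola example: n = 30, s^2 = 5.20, table values 17.71 and 42.56.
low, high = variance_confidence_interval(30, 5.20, 17.71, 42.56)
print(f"{low:.2f} <= sigma^2 <= {high:.2f}")
```

For the example above, this yields roughly 3.54 for the lower limit and 8.51 for the upper limit.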

## Lesson Summary

1. We can also use the Chi-Square distribution to test hypotheses about population variance. Variance is the measure of the variation or scattering of scores in a distribution and we often use this test to assess the likelihood that a population variance is within a certain range.

2. To test the variance using the Chi-Square statistic, we use the formula

\begin{align*}X^2 = \frac{df \cdot s^2} {\sigma^2}\end{align*}

where:

\begin{align*}X^2 =\end{align*} Chi-Square statistical value

\begin{align*}df =\end{align*} degrees of freedom \begin{align*}= N-1\end{align*}, where \begin{align*}N =\end{align*} size of the sample

\begin{align*}s^2=\end{align*} sample variance

\begin{align*}\sigma ^2 =\end{align*} population variance

This formula gives us a Chi-Square statistic which we can compare to values taken from the Chi-Square distribution table to test our hypothesis.

3. We can construct a confidence interval, which is a range of values that includes the population variance with a given degree of confidence. To find this interval, we use the formula:

\begin{align*}X^2_{\frac{\alpha} {2}} \le \frac{df \cdot s^2} {\sigma^2} \le X^2_{1 - \frac{\alpha} {2}}\end{align*}

For example, if \begin{align*}\alpha = 0.1\end{align*}, the interval is a \begin{align*}90 \%\end{align*} confidence interval, with critical values taken at \begin{align*}0.05\end{align*} and \begin{align*}0.95\end{align*}. We then say that the probability is \begin{align*}10 \%\end{align*} that the population variance is not in the resulting interval.

## Review Questions

1. We use the Chi-Square distribution for the:
    - (a) Goodness-of-Fit test
    - (b) Test for Independence
    - (c) Testing a hypothesis of single variance
    - (d) All of the above
2. True or False: We can test a hypothesis about a single variance using the Chi-Square distribution for a non-normal population.
3. In testing variance, our null hypothesis states that the two populations that we are testing are:
    - (a) equal with respect to variance
    - (b) not equal
    - (c) none of the above
4. In the formula for calculating the Chi-Square statistic for single variance, \begin{align*}\sigma^2\end{align*} is the:
    - (a) standard deviation
    - (b) number of observations
    - (c) hypothesized population variance
    - (d) Chi-Square statistic
5. If we knew the number of observations in the sample, the standard deviation of the sample and the hypothesized variance of the population, what additional information would we need to solve for the Chi-Square statistic?
    - (a) the Chi-Square distribution table
    - (b) the population size
    - (c) the standard deviation of the population
    - (d) no additional information needed
6. We want to test a hypothesis about a single variance using the Chi-Square distribution. We weighed \begin{align*}30\end{align*} bars of Dial soap and this sample had a standard deviation of \begin{align*}1.1\end{align*}. We want to test if this sample comes from the general factory population, which we know from a previous study to have an overall variance of \begin{align*}3.22\end{align*}. What is our null hypothesis?
7. Compute \begin{align*}X^2\end{align*} for Question 6.
8. Given the information in Questions 6 and 7, would you reject or fail to reject the null hypothesis?
9. Let’s assume that our population variance for this problem is unknown. We want to construct a \begin{align*}90 \%\end{align*} confidence interval around the population variance \begin{align*}(\sigma^2)\end{align*}. If our critical values at a \begin{align*}90 \%\end{align*} confidence interval \begin{align*}(\alpha = 0.1)\end{align*} are \begin{align*}17.71\end{align*} and \begin{align*}42.56,\end{align*} what is the range for \begin{align*}\sigma^2\end{align*}?
10. What statement would you make about this confidence interval?

## Review Answers

1. D
2. False
3. A
4. C
5. D
6. The null hypothesis states that the sample comes from a population with a variance less than or equal to the population variance of \begin{align*}3.22\end{align*} \begin{align*}(H_0: \sigma^2 \le 3.22)\end{align*}
7. \begin{align*}X^2 = \frac{df \cdot s^2} {\sigma^2} = \frac{29 \times 1.1^2} {3.22} = 10.90\end{align*}
8. Fail to reject the null hypothesis. Assuming a \begin{align*}0.05\end{align*} significance level, the critical value at \begin{align*}29\end{align*} degrees of freedom is \begin{align*}42.56\end{align*}, and \begin{align*}10.90 < 42.56\end{align*}. We cannot conclude that the variance of the population from which the sample was drawn is greater than \begin{align*}3.22\end{align*}.
9. Between \begin{align*}0.82\end{align*} and \begin{align*}1.98\end{align*}: \begin{align*}\frac{df \cdot s^2} {42.56} & \le \sigma^2 \le \frac{df \cdot s^2} {17.71} \\ \frac{35.09} {42.56} & \le \sigma^2 \le \frac{35.09} {17.71} \\ 0.82 & \le \sigma^2 \le 1.98\end{align*}
10. We are \begin{align*}90 \%\end{align*} confident that the overall population variance \begin{align*}(\sigma^2)\end{align*} is between \begin{align*}0.82\end{align*} and \begin{align*}1.98\end{align*}.