10.1: The Goodness-of-Fit Test
Learning Objectives
- Understand the difference between the Chi-Square distribution and the Student’s t-distribution.
- Identify the conditions which must be satisfied when using the Chi-Square test.
- Understand the features of experiments that allow Goodness-of-Fit tests to be used.
- Evaluate a hypothesis using the Goodness-of-Fit test.
Introduction
In previous lessons, we learned that there are several different tests that we can use to analyze data and test hypotheses. The type of test that we choose depends on the data available and what question we are trying to answer. For example:
- We analyze simple descriptive statistics such as the mean, median, mode and standard deviation to give us an idea of the distribution and to remove outliers, if necessary;
- We calculate probabilities to determine the likelihood of something happening; and
- We use regression analysis to examine the relationship between two or more continuous variables.
But what test do we run if we are trying to examine patterns between distinct categories such as gender, political candidates, locations or preferences? To analyze patterns like these we use the Chi-Square test.
The Chi-Square test is a statistical test used to examine patterns in distinct or categorical variables, which we learned about in the earlier chapter entitled Planning and Conducting an Experiment or Study. This test is used in:
1. Estimating how closely a sample matches the expected distribution (also known as the Goodness-of-Fit test), and
2. Estimating if two random variables are independent of one another (also known as the Test of Independence; see Chapter 9).
In this lesson we will learn more about the Goodness-of-Fit test and how to create and evaluate hypotheses using this test.
The Chi-Square Distribution
The Chi-Square Goodness-of-Fit test is used to compare the observed values of a categorical variable with the expected values of that same variable. For example, we would use this test to analyze surveys that contained categorical variables (for example, gender, city of origin, or locations that people preferred to visit on vacation) to determine if there are in fact relationships between certain items.
Example: We would use the Chi-Square Goodness-of-Fit test to evaluate if there was a preference in the types of lunch that \begin{align*}11^{th}\end{align*} grade students bought in the cafeteria. For this type of comparison it helps to make a table to visualize the problem. We could construct the following table to compare the observed and expected values.
Research Question: Do \begin{align*}11^{th}\end{align*} grade students prefer a certain type of lunch?
Using a sample of \begin{align*}11^{th}\end{align*} grade students, we recorded the following information:
Type of Lunch | Observed Frequency | Expected Frequency |
---|---|---|
Salad | \begin{align*}21\end{align*} | \begin{align*}25\end{align*} |
Sub Sandwich | \begin{align*}29\end{align*} | \begin{align*}25\end{align*} |
Daily Special | \begin{align*}14\end{align*} | \begin{align*}25\end{align*} |
Brought Own Lunch | \begin{align*}36\end{align*} | \begin{align*}25\end{align*} |
If there is no difference in which type of lunch is preferred, we would expect the students to prefer each type of lunch equally. To calculate the expected frequency of each category as if school lunch preferences were distributed equally, we divide the number of observations by the number of categories. Since there are \begin{align*}100\end{align*} observations and \begin{align*}4\end{align*} categories, the expected frequency of each category is \begin{align*}100/4\end{align*} or \begin{align*}25\end{align*}.
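The expected frequencies in the table can be reproduced with a few lines of Python. This is a minimal sketch, assuming equal preference across categories; the variable names are illustrative and not part of the lesson.

```python
# Observed counts copied from the lunch table.
observed = {
    "Salad": 21,
    "Sub Sandwich": 29,
    "Daily Special": 14,
    "Brought Own Lunch": 36,
}

total = sum(observed.values())      # number of observations
num_categories = len(observed)      # number of categories

# Under equal preference, every category has the same expected count.
expected = {lunch: total / num_categories for lunch in observed}

for lunch, count in expected.items():
    print(f"{lunch}: expected {count:g}")
```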
The value that summarizes the comparison between the observed and expected frequencies is called the Chi-Square statistic. If the observed frequencies are close to the expected frequencies, the Chi-Square statistic will be small. Conversely, if the differences between the two sets of frequencies are big, the Chi-Square statistic will be large.
To calculate the Chi-Square statistic \begin{align*}(X^2)\end{align*}, we use the formula:
\begin{align*}X^2=\sum_i \frac{(O_i-E_i)^2}{E_i}\end{align*} where:
\begin{align*}X^2=\end{align*} Chi-Square statistical value
\begin{align*}O_i=\end{align*} observed frequency value for each event
\begin{align*}E_i=\end{align*} expected frequency value for each event
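As a rough illustration, the formula translates directly into a short Python function. The function name `chi_square_statistic` is our own choice for this sketch, and the two lists are the observed and expected columns of the lunch table.

```python
def chi_square_statistic(observed, expected):
    """Sum (O_i - E_i)^2 / E_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Salad, Sub Sandwich, Daily Special, Brought Own Lunch
observed = [21, 29, 14, 36]
expected = [25, 25, 25, 25]

x2 = chi_square_statistic(observed, expected)
print(f"X^2 = {x2:.2f}")
```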
Once calculated, we take this Chi-Square value along with the degrees of freedom (this will be discussed later) and look up the Chi-Square value on a standard Chi-Square distribution table. The Chi-Square distribution allows us to determine the probability that a sample fits an expected pattern. In contrast, the t-distribution tests how likely it is that the means of two different samples will differ. Please see the table below for more details.
Type of Distribution | Tells Us | Everyday Example | Data Needed to Determine Value |
---|---|---|---|
Chi-Square | The relationship between two or more categorical variables. | Analyzing survey data with categorical variables. | Observed and expected frequencies for categorical variables, degrees of freedom. |
Student’s t-Test | The differences between the means of two groups with respect to a continuous variable. | Determining if there is a difference in the mean of the SAT scores between schools. | The mean values for samples from two populations, degrees of freedom. |
Features of the Goodness-of-Fit Test
As mentioned, the Goodness-of-Fit test is used to determine patterns of distinct or categorical variables. As we learned in Lesson 6, a categorical variable is one that is not continuous and has observations in separate categories. Examples of categorical variables include:
- gender (male or female)
- preferences (agreed, neutral or disagreed)
- behaviors (got sent to the office or didn’t get sent to the office)
- physical traits (straight, wavy or curly hair)
Categorical variables are not the same as measurement or continuous variables. The following are normally not categorical variables:
- height
- weight
- test scores
- distance
- income
It is important to note that most of these continuous variables could in fact be converted to categorical variables. For example, you could turn distance into a categorical variable with two values such as "Less than \begin{align*}10 \;\mathrm{miles}\end{align*}" and "Greater than \begin{align*}10 \;\mathrm{miles}\end{align*}."
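One way to picture this conversion is the short Python sketch below. The distances are made up for illustration, and the lesson does not say where a distance of exactly 10 miles belongs, so the sketch assumes it goes in the upper group and labels that group accordingly.

```python
from collections import Counter

# Hypothetical distances in miles, invented for this illustration.
distances = [2.5, 14.0, 7.2, 31.8, 9.9, 10.4, 4.1]


def distance_category(miles):
    # Assumption: exactly 10 miles is placed in the upper group.
    return "Less than 10 miles" if miles < 10 else "10 miles or more"


# Count how many observations fall in each category.
frequencies = Counter(distance_category(d) for d in distances)

for category, count in frequencies.items():
    print(f"{category}: {count}")
```

The resulting counts are observed frequencies, which is the form of data a Goodness-of-Fit test needs.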
In addition to categorical variables, a Goodness-of-Fit test also requires:
- data obtained through a random sample
- a calculation of the Chi-Square statistic using the formula explained in the last section
- the calculation of the Degrees of Freedom. For a Chi-Square Goodness-of-Fit test, the Degrees of Freedom are equal to the number of categories minus one, or \begin{align*}df=c-1\end{align*}.
In our example about preferences for types of school lunches there are four categories, so \begin{align*}df=3\end{align*}.
\begin{align*}\text{df} & = \#\ \text{of categories} - 1 \\ 3 & = 4 - 1\end{align*}
There are many situations that use the Goodness-of-Fit test, including surveys, taste tests and analysis of behaviors. Interestingly, Goodness-of-Fit tests are also used in casinos to determine if there is cheating in games of chance such as cards and dice. For example, if a certain card or number on a die shows up more than expected (a high observed frequency compared to the expected frequency), officials use the Goodness-of-Fit test to determine the likelihood that the player may be cheating or the game may not be fair.
Evaluating Hypotheses Using the Goodness-of-Fit Test
Let’s use our original example to create and test a hypothesis using the Goodness-of-Fit Chi-Square test. First, we will need to state the null and alternative hypotheses for our research question. Since our research question states “Do \begin{align*}11^{th}\end{align*} grade students prefer a certain type of lunch?” our null hypothesis for the Chi-Square test would state that there is no difference between the observed and the expected frequencies. Therefore, our alternative hypothesis would state that there is a significant difference between the observed and expected frequencies.
Null Hypothesis \begin{align*}H_0: O = E\end{align*} (there is no statistically significant difference between observed and expected frequencies)
Alternative Hypothesis \begin{align*}H_a: O \neq E\end{align*} (there is a statistically significant difference between observed and expected frequencies)
Using an alpha level of \begin{align*}.05\end{align*}, we look under the column for \begin{align*}.05\end{align*} and the row for Degrees of Freedom (remember the Degrees of Freedom = Number of categories \begin{align*}- 1 = 3\end{align*}). Using the standard Chi-Square distribution table, we see that the critical value for Chi-Square is \begin{align*}7.81\end{align*}. Therefore we would reject the null hypothesis if the Chi-Square statistic is greater than \begin{align*}7.81\end{align*}.
Reject \begin{align*}H_0\end{align*} if \begin{align*}X^2 > 7.81\end{align*}
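The table lookup can be mimicked in Python with a small dictionary. This is only a sketch: the dictionary holds the first five rows of the \begin{align*}.05\end{align*} column of a standard Chi-Square table, rounded to two decimals, and the names `CRITICAL_VALUES_05`, `degrees_of_freedom` and `critical_value` are our own.

```python
# Critical values for alpha = 0.05, keyed by degrees of freedom.
# Copied from a standard Chi-Square table and rounded to two decimals.
CRITICAL_VALUES_05 = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49, 5: 11.07}


def degrees_of_freedom(num_categories):
    """Degrees of freedom for a Goodness-of-Fit test: df = c - 1."""
    return num_categories - 1


def critical_value(num_categories):
    """Look up the alpha = 0.05 critical value for this many categories."""
    return CRITICAL_VALUES_05[degrees_of_freedom(num_categories)]


# The lunch example has 4 categories, so df = 3.
print(degrees_of_freedom(4), critical_value(4))
```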
Using the table from above, we can calculate the Chi-Square statistic with relative ease.
Type of Lunch | Observed Frequency | Expected Frequency | \begin{align*}(O-E)^2 /E\end{align*} |
---|---|---|---|
Salad | \begin{align*}21\end{align*} | \begin{align*}25\end{align*} | \begin{align*}0.64\end{align*} |
Sub Sandwich | \begin{align*}29\end{align*} | \begin{align*}25\end{align*} | \begin{align*}0.64\end{align*} |
Daily Special | \begin{align*}14\end{align*} | \begin{align*}25\end{align*} | \begin{align*}4.84\end{align*} |
Brought Own Lunch | \begin{align*}36\end{align*} | \begin{align*}25\end{align*} | \begin{align*}4.84\end{align*} |
Total (chi-square) | | | \begin{align*}10.96\end{align*} |
\begin{align*}X^2=\sum \frac{(O-E)^2}{E} = 0.64 + 0.64 + 4.84 + 4.84 = 10.96\end{align*}
Since our Chi-Square statistic of \begin{align*}10.96\end{align*} is greater than \begin{align*}7.81\end{align*}, we reject the null hypothesis and accept the alternative hypothesis. Therefore we can conclude that there is a significant difference between the types of lunches that \begin{align*}11^{th}\end{align*} grade students prefer.
As a review, we use the following steps to formulate and evaluate hypotheses:
- State the null and alternative hypotheses for the research question.
- Select the significance level and use the Chi-Square distribution table to write a rule for rejecting the null hypothesis.
- Calculate the value of the Chi-Square statistic.
- Determine whether to reject or fail to reject the null hypothesis and write a summary statement based on the results.
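The steps above can be sketched as a single Python function. This is an illustrative outline rather than a standard routine: the name `goodness_of_fit` is hypothetical, and the critical value is passed in by hand after being read from a Chi-Square table.

```python
def goodness_of_fit(observed, expected, critical_value):
    """Return the Chi-Square statistic and whether to reject H0."""
    # Calculate the value of the Chi-Square statistic.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Apply the rejection rule: reject H0 if the statistic exceeds the critical value.
    return statistic, statistic > critical_value


# Lunch example: H0 says all four lunch types are equally preferred.
observed = [21, 29, 14, 36]
expected = [sum(observed) / len(observed)] * len(observed)

# alpha = 0.05 with df = 3 gives a critical value of 7.81.
statistic, reject = goodness_of_fit(observed, expected, critical_value=7.81)

decision = "reject" if reject else "fail to reject"
print(f"X^2 = {statistic:.2f}; {decision} the null hypothesis")
```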
Lesson Summary
1. We use the Chi-Square test to examine patterns between categorical variables such as gender, political candidates, locations or preferences.
2. There are two types of Chi-Square tests: the Goodness-of-Fit test and the Test for Independence. We use the Goodness-of-Fit test to estimate how closely a sample matches the expected distribution.
3. To test for significance, it helps to make a table detailing the observed and expected frequencies of the data sample. Using the standard Chi-Square distribution table, we are able to write a rule for rejecting or failing to reject the null hypothesis for our research question.
4. To test the null hypothesis it is necessary to calculate the Chi-Square statistic. To calculate the Chi-Square statistic \begin{align*}(X^2)\end{align*}, we use the formula:
\begin{align*}X^2=\sum_i \frac{(O_i-E_i)^2}{E_i}\end{align*}
where:
\begin{align*}X^2 =\end{align*} Chi-Square statistical value
\begin{align*}O_i =\end{align*} observed frequency value for each event
\begin{align*}E_i =\end{align*} expected frequency value for each event
5. Using the Chi-Square statistic and the level of significance, we are able to determine whether to reject or fail to reject the null hypothesis and write a summary statement based on these results.
Supplemental Links
Distribution Tables (including the Student’s t-distribution and Chi-Square distribution)
http://www.statsoft.com/textbook/stathome.html?sttable.html&1
Review Questions
- What is the name of the statistical test used to analyze the patterns between two categorical variables?
- the Student’s t-test
- the ANOVA test
- the Chi-Square test
- the z-score
- There are two types of Chi-Square tests. Which type of Chi-Square test estimates how closely a sample matches an expected distribution?
- the Goodness-of-Fit test
- the Test for Independence
- Which of the following is considered a categorical variable?
- income
- gender
- height
- weight
- If there were \begin{align*}250\end{align*} observations in a data set and \begin{align*}2\end{align*} uniformly distributed categories that were being measured, the expected frequency for each category would be:
- \begin{align*}125\end{align*}
- \begin{align*}500\end{align*}
- \begin{align*}250\end{align*}
- \begin{align*}5\end{align*}
- What is the formula for calculating the Chi-Square statistic?
The principal is planning a field trip. She samples a group of \begin{align*}100\end{align*} students to see if they prefer a sporting event, a play at the local college or a science museum. She records the following results:
Type of Field Trip | Number Preferring |
---|---|
Sporting Event | \begin{align*}53\end{align*} |
Play | \begin{align*}18\end{align*} |
Science Museum | \begin{align*}29\end{align*} |
- What is the observed frequency value for the Science Museum category?
- What is the expected frequency value for the Sporting Event category?
- What would be the null hypothesis for the situation above?
- There is no preference between the types of field trips that students prefer
- There is a preference between the types of field trips that students prefer
- What would be the Chi-Square statistic for the research question above?
- If the critical Chi-Square value at the chosen level of significance was \begin{align*}5.99\end{align*}, would you reject or fail to reject the null hypothesis?
Review Answers
- C
- A
- B
- A
- \begin{align*}X^2=\sum \frac{(O-E)^2}{E}\end{align*}
- \begin{align*}29\end{align*}
- \begin{align*}33.33\end{align*}
- A
- \begin{align*}19.2\end{align*} (see table below; the total is computed from unrounded values)
Type of Field Trip | Observed Frequency | Expected Frequency | Chi-Square |
---|---|---|---|
Sporting Event | \begin{align*}53\end{align*} | \begin{align*}33.33\end{align*} | \begin{align*}11.6\end{align*} |
Play | \begin{align*}18\end{align*} | \begin{align*}33.33\end{align*} | \begin{align*}7.1\end{align*} |
Science Museum | \begin{align*}29\end{align*} | \begin{align*}33.33\end{align*} | \begin{align*}0.6\end{align*} |
Chi-Square Total | | | \begin{align*}19.2\end{align*} |
- Reject the Null Hypothesis