10.1: The Goodness-of-Fit Test
Learning Objectives
 Understand the difference between the chi-square distribution and Student’s t-distribution.
 Identify the conditions which must be satisfied when using the chi-square test.
 Understand the features of experiments that allow goodness-of-fit tests to be used.
 Evaluate a hypothesis using the goodness-of-fit test.
Introduction
In previous lessons, we learned that there are several different tests that we can use to analyze data and test hypotheses. The type of test that we choose depends on the data available and what question we are trying to answer. We analyze simple descriptive statistics, such as the mean, median, mode, and standard deviation to give us an idea of the distribution and to remove outliers, if necessary. We calculate probabilities to determine the likelihood of something happening. Finally, we use regression analysis to examine the relationship between two or more continuous variables.
However, there is another test that we have yet to cover. To analyze patterns between distinct categories, such as genders, political candidates, locations, or preferences, we use the chi-square test.
This test is used when estimating how closely a sample matches the expected distribution (also known as the goodness-of-fit test) and when estimating if two random variables are independent of one another (also known as the test of independence).
In this lesson, we will learn more about the goodness-of-fit test and how to create and evaluate hypotheses using this test.
The Chi-Square Distribution
The chi-square distribution can be used to perform the goodness-of-fit test, which compares the observed values of a categorical variable with the expected values of that same variable.
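Example: We would use the chi-square goodness-of-fit test to evaluate if there was a preference in the type of lunch that students prefer.
Research Question: Do students prefer a certain type of lunch?
Using a sample of 100 students, we recorded the following information: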
Type of Lunch  Observed Frequency  Expected Frequency 

Salad  21  25 
Sub Sandwich  29  25 
Daily Special  14  25 
Brought Own Lunch  36  25 
If there is no difference in which type of lunch is preferred, we would expect the students to prefer each type of lunch equally. To calculate the expected frequency of each category when assuming school lunch preferences are distributed equally, we divide the number of observations by the number of categories. Since there are 100 observations and 4 categories, the expected frequency of each category is $\frac{100}{4} = 25$.
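The division described above can be sketched in a few lines of Python. This is only an illustration using the counts from the table; the variable names are chosen here for readability and are not part of the lesson.

```python
# Observed counts from the lunch table above.
observed = {
    "Salad": 21,
    "Sub Sandwich": 29,
    "Daily Special": 14,
    "Brought Own Lunch": 36,
}

total_observations = sum(observed.values())  # 100 students in the sample
number_of_categories = len(observed)         # 4 types of lunch

# Under the assumption of equal preference, every category
# has the same expected frequency.
expected_frequency = total_observations / number_of_categories
print(expected_frequency)  # 25.0
```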
The value that indicates the comparison between the observed and expected frequency is called the chi-square statistic. The idea is that if the observed frequency is close to the expected frequency, then the chi-square statistic will be small. On the other hand, if there is a substantial difference between the two frequencies, then we would expect the chi-square statistic to be large.
To calculate the chi-square statistic, we use the following formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where:
$\chi^2$ is the chi-square test statistic.
$O$ is the observed frequency value for each category.
$E$ is the expected frequency value for each category.
We compare the value of the test statistic to a tabled chi-square value to determine the probability that a sample fits an expected pattern.
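As a rough sketch, the statistic can be computed by summing $(O - E)^2 / E$ over all categories. The function name `chi_square_statistic` below is hypothetical and chosen for this illustration. The second call shows the idea that a sample matching the expected frequencies exactly produces a statistic of zero.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Lunch data from the table above: Salad, Sub Sandwich,
# Daily Special, Brought Own Lunch.
lunch_observed = [21, 29, 14, 36]
lunch_expected = [25, 25, 25, 25]

statistic = chi_square_statistic(lunch_observed, lunch_expected)
perfect_fit = chi_square_statistic(lunch_expected, lunch_expected)

print(f"{statistic:.2f}")  # 10.96
print(perfect_fit)         # 0.0
```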
Features of the Goodness-of-Fit Test
As mentioned, the goodness-of-fit test is used to determine patterns of distinct categorical variables. The test requires that the data are obtained through a random sample. The number of degrees of freedom associated with a particular chi-square test is equal to the number of categories minus one. That is, $df = c - 1$, where $c$ is the number of categories.
Example: Using our example about the preferences for types of school lunches, we calculate the degrees of freedom as follows: $df = 4 - 1 = 3$.
On the Web
http://tinyurl.com/3ypvj2h Follow this link to a table of chi-square values.
There are many situations that use the goodness-of-fit test, including surveys, taste tests, and analysis of behaviors. Interestingly, goodness-of-fit tests are also used in casinos to determine if there is cheating in games of chance, such as cards or dice. For example, if a certain card or number on a die shows up more than expected (a high observed frequency compared to the expected frequency), officials use the goodness-of-fit test to determine the likelihood that the player may be cheating or that the game may not be fair.
Evaluating Hypotheses Using the Goodness-of-Fit Test
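Let’s use our original example to create and test a hypothesis using the goodness-of-fit chi-square test. First, we will need to state the null and alternative hypotheses for our research question. Since our research question asks, “Do students prefer a certain type of lunch?” our hypotheses for the chi-square test are as follows:
Null Hypothesis $H_0$: Students show no preference in the type of lunch they eat; that is, the distribution of lunch preferences is uniform.
Alternative Hypothesis $H_a$: Students show a preference in the type of lunch they eat; that is, the distribution of lunch preferences is not uniform.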
Also, the number of degrees of freedom for this test is 3.
Using an alpha level of 0.05, we look under the column for 0.05 and the row for degrees of freedom, which, again, is 3. According to the standard chi-square distribution table, we see that the critical value for chi-square is 7.815. Therefore, we would reject the null hypothesis if the chi-square statistic is greater than 7.815.
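The table lookup and the rejection rule can be sketched as follows. This is a minimal illustration, not a full statistical routine: the dictionary holds only the first few rows of the 0.05 column. The value for 3 degrees of freedom comes from this lesson, while the other rows are copied from a standard chi-square table and are included here as an assumption for completeness. The function names are hypothetical.

```python
# Critical chi-square values at alpha = 0.05, keyed by degrees of freedom.
# Only the df = 3 entry appears in the lesson; the rest are standard table values.
CRITICAL_VALUES_ALPHA_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def degrees_of_freedom(number_of_categories):
    """Degrees of freedom for a goodness-of-fit test: categories minus one."""
    return number_of_categories - 1


def reject_null(chi_square_statistic, number_of_categories):
    """Return True when the statistic exceeds the critical value at alpha = 0.05."""
    df = degrees_of_freedom(number_of_categories)
    critical_value = CRITICAL_VALUES_ALPHA_05[df]
    return chi_square_statistic > critical_value


df = degrees_of_freedom(4)  # four types of lunch
critical = CRITICAL_VALUES_ALPHA_05[df]
print(df, critical)  # 3 7.815
```

Any statistic above 7.815 makes `reject_null` return `True` for four categories, and any statistic at or below it returns `False`.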
Note that we can calculate the chi-square statistic with relative ease.
Type of Lunch  Observed Frequency  Expected Frequency  $\frac{(O - E)^2}{E}$ 

Salad  21  25  0.64 
Sub Sandwich  29  25  0.64 
Daily Special  14  25  4.84 
Brought Own Lunch  36  25  4.84 
Total (chi-square)  10.96 
Since our chi-square statistic of 10.96 is greater than 7.815, we reject the null hypothesis and accept the alternative hypothesis. Therefore, we can conclude that there is a significant difference between the types of lunches that students prefer.
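The whole worked example can be reproduced with a short script. This is a sketch under the assumptions of the lesson (equal expected frequencies of 25 and a critical value of 7.815 at an alpha level of 0.05 with 3 degrees of freedom); the variable names are illustrative only.

```python
categories = ["Salad", "Sub Sandwich", "Daily Special", "Brought Own Lunch"]
observed = [21, 29, 14, 36]
expected = [25, 25, 25, 25]
critical_value = 7.815  # chi-square table: alpha = 0.05, df = 3

# Contribution of each category: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

# Print one row per category, then the total.
for name, o, e, c in zip(categories, observed, expected, contributions):
    print(f"{name:<18} {o:>3} {e:>3} {c:>6.2f}")
print(f"{'Total (chi-square)':<18} {chi_square:>14.2f}")

decision = "reject" if chi_square > critical_value else "fail to reject"
print(f"Decision: {decision} the null hypothesis")
```

The two largest contributions come from the Daily Special and Brought Own Lunch categories, which are the ones farthest from the expected frequency of 25.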
Lesson Summary
We use the chi-square test to examine patterns between categorical variables, such as genders, political candidates, locations, or preferences.
There are two types of chi-square tests: the goodness-of-fit test and the test for independence. We use the goodness-of-fit test to estimate how closely a sample matches the expected distribution.
To test for significance, it helps to make a table detailing the observed and expected frequencies of the data sample. Using the standard chi-square distribution table, we are able to create criteria for accepting the null or alternative hypotheses for our research questions.
To test the null hypothesis, it is necessary to calculate the chi-square statistic, $\chi^2 = \sum \frac{(O - E)^2}{E}$,
where:
$\chi^2$ is the chi-square test statistic.
$O$ is the observed frequency value for each category.
$E$ is the expected frequency value for each category.
Using the chi-square statistic and the level of significance, we are able to determine whether to reject or fail to reject the null hypothesis and write a summary statement based on these results.
Review Questions
 What is the name of the statistical test used to analyze the patterns between two categorical variables?
 Student’s t-test
 the ANOVA test
 the chi-square test
 the z-score
 There are two types of chi-square tests. Which type of chi-square test estimates how closely a sample matches an expected distribution?
 the goodness-of-fit test
 the test for independence
 Which of the following is considered a categorical variable?
 income
 gender
 height
 weight
 If there were 250 observations in a data set and 2 uniformly distributed categories that were being measured, the expected frequency for each category would be:
 125
 500
 250
 5
 What is the formula for calculating the chi-square statistic?
 A principal is planning a field trip. She samples a group of 100 students to see if they prefer a sporting event, a play at the local college, or a science museum. She records the following results:
Type of Field Trip  Number Preferring 

Sporting Event  53 
Play  18 
Science Museum  29 
(a) What is the observed frequency value for the Science Museum category?
(b) What is the expected frequency value for the Sporting Event category?
(c) What would be the null hypothesis for the situation above?
(i) There is no preference between the types of field trips that students prefer.
(ii) There is a preference between the types of field trips that students prefer.
(d) What would be the chi-square statistic for the research question above?
(e) If the critical chi-square value at the chosen level of significance was 5.99, would you reject or fail to reject the null hypothesis?
On the Web
http://onlinestatbook.com/stat_sim/chisq_theor/index.html Explore what happens when you are using the chi-square statistic when the underlying population from which you are sampling does not follow a normal distribution.