Hypothesis Testing for Dependent and Independent Samples
We have learned about hypothesis testing for proportions and means with both large and small samples. However, the examples in those lessons involved only one sample. In this lesson we will apply the principles of hypothesis testing to situations involving two samples. There are many situations in everyday life where we would perform statistical analysis involving two samples. For example, suppose that we wanted to test a hypothesis about the effect of two medications on curing an illness. Or we may want to test the difference between the mean SAT scores of males and females. In both of these cases, we would analyze both samples, and the hypothesis would address the difference between the two means.
In this Concept, we will identify situations with dependent and independent samples, learn to calculate the pooled estimate of population variance for two samples, and calculate the test statistic for hypotheses about the difference of proportions or means between samples.
Dependent and Independent Samples
When we are working with one sample, we know that we need to select a random sample from the population, measure that sample statistic, and then make a hypothesis about the population based on that sample. When we work with two independent samples, we assume that if the samples are selected at random (or, in the case of medical research, the subjects are randomly assigned to a group), the two samples will vary only by chance and the difference will not be statistically significant. In short, when we have independent samples we assume that the scores of one sample do not affect the other.
Independent samples can occur in two scenarios.
- When testing the difference of the means between two fixed populations, we test the differences between samples from each population. When both samples are randomly selected, we can make inferences about the populations.
- When working with subjects (people, pets, etc.), if we select a random sample and then randomly assign half of the subjects to one group and half to another, we can make inferences about the populations.
Dependent samples are a bit different. Two samples of data are dependent when each score in one sample is paired with a specific score in the other sample. In short, these types of samples are related to each other. Dependent samples can occur in two scenarios. In one, a group may be measured twice such as in a pretest-posttest situation (scores on a test before and after the lesson). The other scenario is one in which an observation in one sample is matched with an observation in the second sample.
To distinguish between tests of hypotheses for independent and dependent samples, we use a different symbol for hypotheses with dependent samples. For dependent sample hypotheses, we use the delta symbol \begin{align*}\delta\end{align*} to symbolize the difference between the two samples. Therefore, in our null hypothesis we state that the difference of scores across the two measurements is equal to 0, that is, \begin{align*}\delta=0\end{align*}, or:
\begin{align*}H_0: \delta=\mu_1-\mu_2=0\end{align*}
Calculating the Pooled Estimate of Population Variance
When testing a hypothesis about two independent samples, we follow a similar process as when testing one random sample. However, when computing the test statistic, we need to calculate the estimated standard error of the difference between sample means, \begin{align*}s_{\bar{x}_1-\bar{x}_2}=\sqrt{s^2 \left(\frac{1}{n_1}+\frac{1}{n_2}\right)}.\end{align*}
where \begin{align*}n_1\end{align*} and \begin{align*}n_2\end{align*} are the sizes of the two samples and \begin{align*}s^2\end{align*} is the pooled sample variance, which is computed as \begin{align*}s^2=\frac{\sum(x_1-\bar{x}_1)^2+\sum(x_2-\bar{x}_2)^2}{n_1+n_2-2}\end{align*}. Often, the top part of this formula is simplified by substituting the symbol \begin{align*}SS\end{align*} for the sum of the squared deviations. Therefore, the formula is often expressed as \begin{align*}s^2=\frac{SS_1+SS_2}{n_1+n_2-2}.\end{align*}
Calculating \begin{align*}s^2\end{align*}
Suppose we have two independent samples of student reading scores.
The data are as follows:
Sample 1 | Sample 2 |
---|---|
7 | 12 |
8 | 14 |
10 | 18 |
4 | 13 |
6 | 11 |
 | 10 |
From this sample, we can calculate a number of descriptive statistics that will help us solve for the pooled estimate of variance:
Descriptive Statistic | Sample 1 | Sample 2 |
---|---|---|
Number \begin{align*}n\end{align*} | 5 | 6 |
Sum of Observations \begin{align*}\sum x\end{align*} | 35 | 78 |
Mean of Observations \begin{align*}\bar{x}\end{align*} | 7 | 13 |
Sum of Squared Deviations \begin{align*}\sum^n_{i=1} (x_i-\bar{x})^2\end{align*} | 20 | 40 |
Using the formula for the pooled estimate of variance, we find that
\begin{align*}s^2=\frac{SS_1+SS_2}{n_1+n_2-2}=\frac{20+40}{5+6-2}=\frac{60}{9} \approx 6.67\end{align*}
We will use this information to calculate the test statistic needed to evaluate the hypotheses.
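As a rough check on the arithmetic above, the pooled variance can be computed directly from the raw scores. The sketch below uses only Python's standard library; the function names `sum_of_squares` and `pooled_variance` are illustrative choices, not part of the lesson.

```python
from statistics import mean


def sum_of_squares(sample):
    """Sum of squared deviations from the sample mean (SS)."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample)


def pooled_variance(sample1, sample2):
    """Pooled estimate of variance: (SS1 + SS2) / (n1 + n2 - 2)."""
    ss1 = sum_of_squares(sample1)
    ss2 = sum_of_squares(sample2)
    return (ss1 + ss2) / (len(sample1) + len(sample2) - 2)


# Reading scores from the table above (5 scores and 6 scores).
sample1 = [7, 8, 10, 4, 6]
sample2 = [12, 14, 18, 13, 11, 10]

s2 = pooled_variance(sample1, sample2)
print(round(s2, 2))  # 6.67
```

The two sums of squares (20 and 40) are weighted implicitly by sample size, so the larger sample contributes more to the pooled estimate.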
Testing Hypotheses with Independent Samples
When testing hypotheses with two independent samples, we follow similar steps as when testing one random sample:
- State the null and alternative hypotheses.
- Choose the significance level \begin{align*}\alpha\end{align*}.
- Set the criterion (critical values) for rejecting the null hypothesis.
- Compute the test statistic.
- Make a decision: reject or fail to reject the null hypothesis.
- Interpret the decision within the context of the problem.
When stating the null hypothesis, we assume there is no difference between the means of the two independent samples. Therefore, our null hypothesis in this case would be:
\begin{align*}H_0: \mu_1=\mu_2 \ \text{or} \ H_0: \mu_1-\mu_2=0\end{align*}
Similar to the one-sample test, the critical values that we set to evaluate these hypotheses depend on our alpha level and our decision regarding the null hypothesis is carried out in the same manner. However, since we have two samples, we calculate the test statistic a bit differently and use the formula:
\begin{align*}t=\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{s.e.(\bar{x}_1-\bar{x}_2)}\end{align*}
where:
\begin{align*}\bar{x}_1-\bar{x}_2\end{align*} is the difference between the sample means
\begin{align*}\mu_1-\mu_2\end{align*} is the difference between the hypothesized population means
\begin{align*}s.e.(\bar{x}_1-\bar{x}_2)\end{align*} is the standard error of the difference between sample means
Evaluating the Difference Between Two Samples
The head of the English department is interested in the difference in writing scores between remedial freshman English students who are taught by different teachers. The incoming freshmen needing remedial services are randomly assigned to one of two English teachers and are given a standardized writing test after the first semester. We take a sample of eight students from one class and nine from the other. Is there a difference in achievement on the writing test between the two classes? Use a 0.05 significance level.
First, we would generate our hypotheses based on the two samples.
\begin{align*}H_0: \mu_1 &= \mu_2\\ H_a: \mu_1 & \neq \mu_2\end{align*}
This is a two-tailed test. For this example, we have two independent samples from the population and a total of 17 students that we are examining. Because our samples are small, we use the \begin{align*}t\end{align*}-distribution. In this example, we have 15 degrees of freedom (the total number in the two samples minus 2). With a 0.05 significance level and the \begin{align*}t\end{align*}-distribution, we find that our critical values are 2.131 standard scores above and below the mean.
To calculate the test statistic, we first need to find the pooled estimate of variance from our sample. The data from the two groups are as follows:
Sample 1 | Sample 2 |
---|---|
35 | 52 |
51 | 87 |
66 | 76 |
42 | 62 |
37 | 81 |
46 | 71 |
60 | 55 |
55 | 67 |
53 |
From this sample, we can calculate several descriptive statistics that will help us solve for the pooled estimate of variance:
Descriptive Statistic | Sample 1 | Sample 2 |
---|---|---|
Number \begin{align*}n\end{align*} | 9 | 8 |
Sum of Observations \begin{align*}\sum x\end{align*} | 445 | 551 |
Mean of Observations \begin{align*}\bar{x}\end{align*} | 49.44 | 68.875 |
Sum of Squared Deviations \begin{align*}\sum^n_{i=1}(x_i-\bar{x})^2\end{align*} | 862.22 | 1058.88 |
Therefore:
\begin{align*}s^2=\frac{SS_1+SS_2}{n_1+n_2-2}=\frac{862.22+1058.88}{9+8-2} \approx 128.07\end{align*}
and the standard error of the difference of the sample means is:
\begin{align*}s_{\bar{x}_1-\bar{x}_2}=\sqrt{s^2 \left(\frac{1}{n_1}+\frac{1}{n_2} \right)}=\sqrt{128.07 \left(\frac{1}{9}+\frac{1}{8}\right)} \approx 5.50\end{align*}
Using this information, we can finally solve for the test statistic:
\begin{align*}t=\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{s.e.(\bar{x}_1-\bar{x}_2)}=\frac{(49.44-68.875)-(0)}{5.50} \approx -3.53\end{align*}
Since the test statistic of -3.53 is less than the lower critical value of -2.131, we reject the null hypothesis and conclude that there is a significant difference in the achievement of the students assigned to different teachers.
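The whole calculation for this example can be reproduced in a few lines. This is a minimal sketch using only the standard library; `pooled_t_statistic` is a hypothetical helper name, and the critical value is copied from the t table rather than computed.

```python
from math import sqrt
from statistics import mean


def pooled_t_statistic(sample1, sample2, hypothesized_diff=0.0):
    """Return (t, standard_error, df) for a pooled-variance two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    s2 = (ss1 + ss2) / df                  # pooled variance
    se = sqrt(s2 * (1 / n1 + 1 / n2))      # standard error of the difference
    t = ((m1 - m2) - hypothesized_diff) / se
    return t, se, df


# Writing scores from the table above.
class_1 = [35, 51, 66, 42, 37, 46, 60, 55, 53]
class_2 = [52, 87, 76, 62, 81, 71, 55, 67]

t, se, df = pooled_t_statistic(class_1, class_2)
critical = 2.131  # two-tailed t table value for df = 15, alpha = 0.05

print(round(se, 2), round(t, 2), df)  # 5.5 -3.53 15
print(abs(t) > critical)              # True: reject the null hypothesis
```

Comparing `abs(t)` with the positive critical value is equivalent to checking whether t falls below -2.131 or above 2.131 in a two-tailed test.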
Testing Hypotheses about the Difference in Proportions between Two Independent Samples
Suppose we want to test if there is a difference between the proportions of two independent samples. As discussed in the previous lesson, proportions are used extensively in polling and surveys, especially by people trying to predict election results. It is possible to test a hypothesis about the proportions of two independent samples by using a method similar to the one described above. We might perform these hypothesis tests in the following scenarios:
- When examining the proportion of children living in poverty in two different towns.
- When investigating the proportions of freshman and sophomore students who report test anxiety.
- When testing if the proportion of high school boys and girls who smoke cigarettes is equal.
In testing hypotheses about the difference in proportions of two independent samples, we state the hypotheses and set the criterion for rejecting the null hypothesis in similar ways as in the other hypothesis tests. In these types of tests we set the two population proportions equal to each other in the null hypothesis \begin{align*}H_0: p_1=p_2\end{align*} and use the appropriate standard table to determine the critical values (remember, for small samples we generally use the \begin{align*}t\end{align*}-distribution and for samples over 30 we generally use the \begin{align*}z\end{align*}-distribution).
When solving for the test statistic in large samples, we use the formula:
\begin{align*}z=\frac{(\hat{p}_1-\hat{p}_2)-(p_1-p_2)}{se(\hat{p}_1-\hat{p}_2)}\end{align*}
where:
\begin{align*}\hat{p}_1, \hat{p}_2\end{align*} are the observed sample proportions
\begin{align*}p_1, p_2\end{align*} are the population proportions under the null hypothesis
\begin{align*}se(\hat{p}_1-\hat{p}_2)\end{align*} is the standard error of the difference between independent proportions
Similar to the standard error of the difference between independent samples, we need to do a bit of work to calculate the standard error of the difference between independent proportions. To find the standard error under the null hypothesis we assume that \begin{align*}p_1=p_2=p\end{align*} and we use all the data to estimate \begin{align*}p\end{align*}.
\begin{align*}\hat{p}=\frac{n_1 \hat{p}_1+n_2 \hat{p}_2}{n_1+n_2}\end{align*}
Now the standard error of the difference is \begin{align*}\sqrt{\hat{p}(1-\hat{p}) \left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\end{align*}
The test statistic is now \begin{align*}z=\frac{(\hat{p}_1-\hat{p}_2)-(0)}{\sqrt{\hat{p}(1-\hat{p}) \left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}\end{align*}
Determining Statistical Difference
Suppose that we are interested in finding out which of two cities is more satisfied with the services provided by its city government. We take a survey and find the following results:
Number Satisfied | City 1 | City 2 |
---|---|---|
Yes | 122 | 84 |
No | 78 | 66 |
Sample Size | \begin{align*}n_1=200\end{align*} | \begin{align*}n_2=150\end{align*} |
Proportion who said Yes | 0.61 | 0.56 |
Is there a statistical difference in the proportions of citizens that are satisfied with the services provided by the city government? Use a 0.05 level of significance.
First, we establish the null and alternative hypotheses:
\begin{align*}H_0: p_1 &= p_2\\ H_a:p_1 &\neq p_2\end{align*}
Since we have a large sample size we will use the \begin{align*}z-\end{align*}distribution. At a .05 level of significance, our critical values are \begin{align*}\pm 1.96\end{align*}. To solve for the test statistic, we must first solve for the standard error of the difference between proportions.
\begin{align*}\hat{p} &= \frac{200(.61)+150(.56)}{350}=.589\\ se(\hat{p}_1-\hat{p}_2) &= \sqrt{0.589(.411) \left(\frac{1}{200}+\frac{1}{150}\right)} \approx 0.053\end{align*}
Therefore, the test statistic is:
\begin{align*}z=\frac{(0.61-0.56)-(0)}{0.053} \approx 0.94\end{align*}
Since 0.94 does not exceed the critical value of 1.96, the null hypothesis is not rejected. Therefore, we conclude that the difference in the sample proportions could have occurred by chance; we do not have sufficient evidence of a difference in the level of satisfaction between citizens of the two cities.
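The same test can be sketched in code from the raw survey counts. The helper name `two_proportion_z` is an assumption made for illustration; the critical value comes from the standard normal distribution in Python's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(yes1, n1, yes2, n2):
    """Return (z, standard_error) for H0: p1 = p2 using the pooled proportion."""
    p1_hat, p2_hat = yes1 / n1, yes2 / n2
    # Pooled estimate; identical to (n1*p1_hat + n2*p2_hat) / (n1 + n2).
    p_pooled = (yes1 + yes2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se, se


alpha = 0.05
z, se = two_proportion_z(122, 200, 84, 150)      # City 1 and City 2 counts
critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value

print(round(se, 3), round(z, 2), round(critical, 2))  # 0.053 0.94 1.96
print(abs(z) > critical)                              # False: fail to reject
```

Working from counts rather than rounded proportions avoids carrying rounding error into the pooled estimate.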
Testing Hypotheses with Dependent Samples
When testing a hypothesis about two dependent samples, we follow the same process as when testing one random sample or two independent samples:
- State the null and alternative hypotheses.
- Choose the level of significance.
- Set the criterion (critical values) for rejecting the null hypothesis.
- Compute the test statistic.
- Make a decision: reject or fail to reject the null hypothesis.
- Interpret our results.
As mentioned in the section above, our null hypothesis for two dependent samples states that there is no difference between the scores across the two samples: \begin{align*}H_0: \delta=\mu_1-\mu_2=0\end{align*}. We set the criterion for evaluating the hypothesis in the same way that we do with our other examples: by first establishing an alpha level and then finding the critical values by using the \begin{align*}t\end{align*}-distribution table. Calculating the test statistic for dependent samples is a bit different since we are dealing with paired data. The quantity that we first need to calculate is \begin{align*}\bar{d}\end{align*}, the mean of the paired differences, which equals the difference in the means of the two samples. Therefore, \begin{align*}\bar{d}=\bar{x}_1-\bar{x}_2\end{align*}. We also need to know the standard error of the mean difference. Since the population variance of the differences is unknown, we estimate it by using the formulas for the sample variance and standard deviation of the differences:
\begin{align*}s^2_d &= \frac{\sum (d-\bar{d})^2}{n-1}\\ s_d &= \sqrt{\frac{\sum d^2-\frac{\left (\sum d \right )^2}{n}}{n-1}}\end{align*}
where:
\begin{align*}s^2_d\end{align*} is the sample variance of the differences
\begin{align*}d\end{align*} is the difference between corresponding pairs within the sample
\begin{align*}\bar{d}\end{align*} is the mean of the differences, which equals the difference between the means of the two samples
\begin{align*}n\end{align*} is the number of pairs in the sample
\begin{align*}s_d\end{align*} is the sample standard deviation of the differences
With the standard deviation, we can calculate the standard error using the following formula:
\begin{align*}s_{\bar{d}}=\frac{s_d}{\sqrt{n}}\end{align*}
After we calculate the standard error, we can use the general formula for the test statistic:
\begin{align*}t=\frac{\bar{d}-\delta}{s_{\bar{d}}}\end{align*}
Evaluating the Relationship Between Two Samples
A math teacher wants to determine the effectiveness of her statistics lesson and gives a pre-test and a post-test to 9 students in her class. Our null hypothesis is that there is no difference between the means of the two samples, and our alternative hypothesis is that the means of the two samples are not equal. In other words, we are testing whether or not the mean difference between the paired scores is zero:
\begin{align*}H_0: \delta &= \mu_1-\mu_2=0\\ H_a: \delta &= \mu_1-\mu_2 \neq 0\end{align*}
The results for the pre-and post-tests are below:
Subject | Pre-test Score | Post-test Score | \begin{align*}d\end{align*} difference | \begin{align*}d^2\end{align*} |
---|---|---|---|---|
1 | 78 | 80 | 2 | 4 |
2 | 67 | 69 | 2 | 4 |
3 | 56 | 70 | 14 | 196 |
4 | 78 | 79 | 1 | 1 |
5 | 96 | 96 | 0 | 0 |
6 | 82 | 84 | 2 | 4 |
7 | 84 | 88 | 4 | 16 |
8 | 90 | 92 | 2 | 4 |
9 | 87 | 92 | 5 | 25 |
Sum | 718 | 750 | 32 | 254 |
Mean | 79.8 | 83.3 | 3.6 | |
Using the information from the table above, first solve for the standard deviation of the differences, then the standard error of the mean difference, and finally the test statistic.
Standard Deviation:
\begin{align*}s_d=\sqrt{\frac{\sum d^2-\frac{(\sum d)^2}{n}}{n-1}}=\sqrt{\frac{254-\frac{(32)^2}{9}}{8}} \approx 4.19\end{align*}
Standard Error of the Difference:
\begin{align*}s_{\bar{d}}=\frac{s_d}{\sqrt{n}}=\frac{4.19}{\sqrt{9}}=1.40\end{align*}
Test Statistic (\begin{align*}t-\end{align*}Test)
\begin{align*}t=\frac{\bar{d}-\delta}{s_{\bar{d}}}=\frac{3.6-0}{1.40} \approx 2.57\end{align*}
With 8 degrees of freedom (number of pairs - 1) and a significance level of .05, we find our critical values to be \begin{align*}\pm 2.306\end{align*}. Since our test statistic exceeds the upper critical value, we can reject the null hypothesis that the mean difference is zero and conclude that the lesson had an effect on student achievement.
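The paired calculation can also be checked with a short script. This is a sketch using only the standard library, with the critical value taken from the t table. Because it carries full precision through every step, it gives a test statistic of about 2.55 rather than the 2.57 obtained above from rounded intermediate values; the decision is the same either way.

```python
from math import sqrt
from statistics import mean, stdev

pre = [78, 67, 56, 78, 96, 82, 84, 90, 87]
post = [80, 69, 70, 79, 96, 84, 88, 92, 92]

# Paired differences (post-test minus pre-test), one per student.
d = [after - before for before, after in zip(pre, post)]

n = len(d)
d_bar = mean(d)       # mean of the differences
s_d = stdev(d)        # sample standard deviation of the differences (n - 1 divisor)
se = s_d / sqrt(n)    # standard error of the mean difference
t = (d_bar - 0) / se  # hypothesized difference is 0 under the null hypothesis

critical = 2.306      # two-tailed t table value for df = 8, alpha = 0.05

print(round(s_d, 2), round(se, 2), round(t, 2))  # 4.19 1.4 2.55
print(abs(t) > critical)                         # True: reject the null hypothesis
```

Note that only the list of differences enters the test; once `d` is formed, the procedure is a one-sample t-test on those differences.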
Example
Example 1
You have obtained the number of years of education from one random sample of 38 police officers from City A and the number of years of education from a second random sample of 30 police officers from City B. The average years of education for the sample from City A is 15 years with a standard deviation of 2 years. The average years of education for the sample from City B is 14 years with a standard deviation of 2.5 years. Is there a statistically significant difference between the education levels of police officers in City A and City B?
First, find the test statistic:
\begin{align*} t=\frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s^2_2}{n_2}}}= \frac{15-14}{\sqrt{\frac{2^2}{38}+\frac{2.5^2}{30}}}=\frac{1}{\sqrt{0.3136}} \approx 1.79\end{align*}
This is a \begin{align*}t\end{align*}-statistic with 66 degrees of freedom. This is a two-sided test, with a p-value of approximately 0.08. Since this is greater than .05, we fail to reject the null hypothesis. This means that we do not have sufficient evidence of a statistically significant difference between the education levels of police officers in the two cities.
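When only summary statistics are available, as in this example, the test statistic can be computed without the raw data. The sketch below assumes the unpooled standard error used in the formula above, with the sample sizes of 38 and 30 given in the problem; `unpooled_t` is a hypothetical helper name.

```python
from math import sqrt


def unpooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic using the unpooled standard error sqrt(s1^2/n1 + s2^2/n2)."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se


# City A: mean 15, sd 2, n = 38. City B: mean 14, sd 2.5, n = 30.
t = unpooled_t(15, 2, 38, 14, 2.5, 30)
print(round(t, 2))  # 1.79
```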
Review
- In hypothesis testing, we have scenarios that have both dependent and independent samples. Give an example of an experiment with (1) dependent samples and (2) independent samples.
- True or False: When we test the difference between the means of males and females on the SAT, we are using independent samples.
- A study is conducted on the effectiveness of a drug on the hyperactivity of laboratory rats. Two random samples of rats are used for the study and one group is given Drug A and the other group is given Drug B and the number of times that they push a lever is recorded. The following results for this test were calculated:
 | Drug A | Drug B |
---|---|---|
\begin{align*}\bar{X}\end{align*} | 75.6 | 72.8 |
\begin{align*}n\end{align*} | 18 | 24 |
\begin{align*}s^2\end{align*} | 12.25 | 10.24 |
\begin{align*}s\end{align*} | 3.5 | 3.2 |
(a) Does this scenario involve dependent or independent samples? Explain.
(b) What would the hypotheses be for this scenario?
(c) Compute the pooled estimate for population variance.
(d) Calculate the estimated standard error for this scenario.
(e) What is the test statistic and at an alpha level of .05 what conclusions would you make about the null hypothesis?
- A survey is conducted on attitudes towards drinking. A random sample of eight married couples is selected, and the husbands and wives respond to an attitude-toward-drinking scale. The scores are as follows:
Husbands | Wives |
---|---|
16 | 15 |
20 | 18 |
10 | 13 |
15 | 10 |
8 | 12 |
19 | 16 |
14 | 11 |
15 | 12 |
(a) What would be the hypotheses for this scenario?
(b) Calculate the estimated standard deviation for this scenario.
(c) Compute the standard error of the difference for these samples.
(d) What is the test statistic and at an alpha level of .05 what conclusions would you make about the null hypothesis?
- In a random sample of 160 couples, the difference between the husband and wife’s ages had a mean of 2.24 years and a standard deviation of 4.1 years. Test the hypothesis that men are significantly older than their wives, on average.
- For each of the following determine if a paired t-test or a two-sample t-test is appropriate:
- The weights of marathon runners were taken before and after a run to test if runners lose dangerous levels of fluid.
- Do levels of knowledge about current events differ between freshmen and juniors in college?
- Calculate the value of the test statistic t in each of the following situations. In each case the null hypothesis is the same:
- \begin{align*} \bar{x_1}=35, s_1=10, n_1=100, \bar{x_2}=33, s_2=9, n_2=81\end{align*}
- The difference between the sample means is 52, the standard error of the difference between the sample means is 24.
- Consider the following data. Assume the data comes from appropriate random samples: \begin{align*} &\text{Data set A:} && 188.5 && 183 && 194.5 && 185 && 214 && 205.5 && 187 && 183.5 \\ &\text{Data set B:} && 188 && 185.5 && 207 && 188.5 && 196.5 && 204.5 && 180 && 187 \\ \end{align*} Test the hypothesis that the means of the two populations are equal versus that they are not equal.
- A sociologist is interested in determining if the life expectancy of people in Asia is greater than the life expectancy of people in Africa. In a sample of 42 Asians the mean life expectancy was 65.2 years with a standard deviation of 9.3 years. In the sample of 53 Africans the mean life expectancy was 55.3 years with a standard deviation of 8.1 years. Test the hypothesis at the .01 level of significance.
- In each of the following determine whether the alternative hypothesis was the difference in means is greater than zero, or the difference in means is less than zero, or the difference in means is not equal to zero.
- \begin{align*}H_0:\mu_1-\mu_2=0, \ t=2.33, \ df=8, \ p\text{-value}=0.048\end{align*}
- \begin{align*}H_0:\mu_1-\mu_2=0, \ t=-2.33, \ df=8, \ p\text{-value}=0.024\end{align*}
- \begin{align*}H_0:\mu_1-\mu_2=0, \ t=-2.33, \ df=8, \ p\text{-value}=0.976\end{align*}
- A manufacturer is testing two different designs for an air tank. This involves observing how much pressure the tank can withstand before it bursts. For design A, four tanks are sampled and the average pressure to failure was 1500 psi with a standard deviation 250 psi. For design B, six tanks were sampled and had an average pressure to failure of 1610 psi with a standard deviation of 240 psi. Test for a difference in mean pressure to failure for the two designs at the 10% level of significance. Assume the two populations are normally distributed and have the same variance.
- Researchers were studying whether the administration of a growth hormone affects weight gain in pregnant rats. For 6 rats receiving the growth hormone the mean weight gain was 60.8 with a standard deviation of 16.4. For the 6 control rats the mean weight gain was 41.8 with a standard deviation of 7.6. Is the weight gain for rats receiving the hormone significantly higher than the weight gain in the control group? (source: V.T. Sara, Science 186)
- Do two types of music, type-I and type-II, have different effects upon the ability of college students to perform a series of mental tasks requiring concentration? Thirty college students were randomly divided into two groups of 15 students each. They were asked to perform a series of mental tasks under conditions that are identical in every respect except one: namely, that group A has music of type-I playing in the background, while group B has music of type-II. Following are the results showing how many of the 40 components the students were able to complete.
Group A: music of type-I | Group B: music of type-II |
---|---|
26 21 22 | 18 23 21 |
26 19 22 | 20 20 29 |
26 25 24 | 20 16 20 |
21 23 23 | 26 21 25 |
18 29 22 | 17 18 19 |
Complete the hypothesis test to determine if the two types of music have different effects upon the ability of college students to perform a series of mental tasks requiring concentration. (source: Vassar College)
- The campus bookstore asked a random sample of sophomores and juniors how much they spent on textbooks. The bookstore believes the two groups spend the same amount on textbooks. Fifty sophomores had a mean expenditure of $40 with a sample variance of $500 and the 70 juniors sampled had a mean expenditure of $45 with a sample variance of $800. Based on this information is the bookstore’s belief accurate?
- In 1988 Wood, et al, did a study. Eighty-nine sedentary men were given one of two treatments. Forty-two of the men were placed on a diet while forty-seven of them were put on an exercise program. The group on the diet lost an average of 7.2 kg, with a standard deviation of 3.7 kg. The men who exercised lost an average of 4 kg, with a standard deviation of 3.9 kg. Test the hypothesis that the mean weight loss would be different under the two different programs.
- Do the minutes spent exercising in a week differ between men and women in college? To answer this question a random sample of students was taken and the time each spent exercising for a week was recorded. Following is the data that was collected: \begin{align*} & Women: && 65 &&243 && 0 && 365 && 455 && 210 && 100 && 72 && 246 && 0 && 64 && 370 && 190 && 310 && 0 && 280\\ & Men: && 190 && 310 && 70 && 490 && 0 && 95 && 310 && 17 && 620 && 370 && 130 && 0 && 250 \\ \end{align*} Conduct a test to determine if the mean amount of exercise differs for men and women.