# 4.4: Statistical Conclusions

### Learning Objectives

- Understand when valid statistical conclusions can be made.
- Calculate an estimated margin of error and 95% confidence interval.
- Make confidence statements.

### Statistical Conclusions

Remember that when you collect information from every unit in a population, it is called a census. In doing a census, we can be certain that the numbers we have calculated really do represent the entire population. But, because a census is often impractical, we generally take a representative sample of the population, and use that sample to try to make conclusions about the entire population. The downside to sampling is that we can never be completely, 100% sure that we have captured the truth about the entire population.

For example, imagine taking a random sample of 100 from a large population. Return those members to the population, choose another sample of 100, and repeat this many times. Each sample of size 100 will include a different combination of 100 members of the population, so each sample will produce slightly different statistics. This natural variation from sample to sample is expected, and it is called random sampling error. To take it into account, researchers generally report their findings with a **margin of error**, or as a range of possible values. This range is called a **confidence interval.** For example, the President's approval rating might be reported as, *"The approval rating for the President is 43.2%, with a margin of error of ±3%."* This could also be reported as, *"The approval rating for the President is between 40.2% and 46.2%."*
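The repeated-sampling idea can be sketched in a few lines of Python. This is only an illustration: the 43% "true" approval rate, the function name `sample_proportions`, and the fixed seed are all assumptions made for the example, and each sampled person is modeled as an independent draw, which is reasonable only when the population is much larger than the sample.

```python
import random

def sample_proportions(true_p, sample_size, num_samples, seed=0):
    """Draw repeated random samples and return each sample's proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        # Each sampled person "approves" with probability true_p.
        approvals = sum(1 for _ in range(sample_size) if rng.random() < true_p)
        proportions.append(approvals / sample_size)
    return proportions

# Hypothetical population in which 43% approve; five samples of 100 each.
results = sample_proportions(true_p=0.43, sample_size=100, num_samples=5)
print(results)
print("spread between samples:", max(results) - min(results))
```

Each run of five samples gives five somewhat different proportions even though the population never changes. That spread is the random sampling error that a margin of error is meant to describe.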

Using a statistic to make a conclusion about a population is called statistical inference. This is an introductory course, so we will only briefly touch on this idea. In a future statistics class, you will learn much more about statistical inference and calculations. It is important to note that statistical conclusions are meaningless when poor sampling techniques have been used. If the data were collected from a voluntary response sample, or you had a low response rate, or an incomplete sampling frame was used, then don't waste your time performing inference on your statistics. Random sampling error is the only type of error or bias that the margin of error accounts for.

### 95% Confidence Intervals

Once a statistic is calculated for a sample, it is used as an estimate of the actual parameter. We do not know whether our statistic is close to the population parameter, too high, or too low, so we build our interval around the statistic: we add the margin of error to, and subtract the margin of error from, our statistic. We then report this range of values as our **confidence interval**, the interval that we are fairly confident contains the true parameter. In a more formal course you will learn how to calculate the margin of error more precisely, and for various levels of confidence (such as 90% or 99%). In this course we will use a simple formula that estimates the margin of error for a 95% confidence interval. We will also make a **95% confidence statement**, which explains our conclusion regarding the population parameter in context. The formulas for an estimated 95% margin of error and confidence interval, where $n$ is the sample size and $\hat{p}$ is the sample proportion, are:

$$\text{m.e.} \approx \frac{1}{\sqrt{n}}$$

$$\text{95\% confidence interval: } \hat{p} \pm \frac{1}{\sqrt{n}}$$

*Note: To get a smaller margin of error, and therefore a narrower confidence interval, one must increase the size of the sample.*
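As a rough sketch of how the estimate behaves, the Python below assumes the quick formula m.e. ≈ 1/√n, which is consistent with the value 0.0894 used for n = 125 in Example 1. The function names `estimated_margin_of_error` and `confidence_interval` are made up for this illustration.

```python
import math

def estimated_margin_of_error(n):
    """Quick 95% margin of error estimate, 1 / sqrt(n), for sample size n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1 / math.sqrt(n)

def confidence_interval(p_hat, n):
    """Return (low, high) for the estimated 95% confidence interval."""
    me = estimated_margin_of_error(n)
    return p_hat - me, p_hat + me

# Larger samples give smaller margins of error.
for n in (100, 400, 1600):
    print(n, round(estimated_margin_of_error(n), 4))
```

Under this estimate, the margins for samples of 100, 400, and 1600 are 0.1, 0.05, and 0.025. Cutting the margin of error in half therefore takes four times as many people, not twice as many.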

Once you have found the range of numbers for your confidence interval, you are going to state your conclusion in context. Such a statement is called a **confidence statement**. The confidence interval refers to the population, not the sample. We are 100% certain of our sample statistic; it is the population parameter that we are estimating. Writing a confidence statement can be confusing at first, so you can use the following template:

“We are 95% confident that the true proportion of _____(parameter of interest)____ will be between ___(low value of CI)____ and __(high value of CI)______.”
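One way to get a feel for what "95% confident" means is to simulate many samples from a population whose true proportion is known and count how often the interval built around each sample captures it. The sketch below is only an illustration under stated assumptions: the true proportion of 0.43, the quick margin estimate 1/√n, the function name `coverage_rate`, and the fixed seed are all choices made for this example.

```python
import math
import random

def coverage_rate(true_p, n, trials, seed=0):
    """Fraction of simulated samples whose interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    margin = 1 / math.sqrt(n)  # assumed quick estimate of the 95% margin of error
    hits = 0
    for _ in range(trials):
        successes = sum(1 for _ in range(n) if rng.random() < true_p)
        p_hat = successes / n
        # Does this sample's interval capture the true proportion?
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials

rate = coverage_rate(true_p=0.43, n=100, trials=2000)
print(rate)
```

In a real survey the true proportion is unknown, so we cannot tell whether one particular interval is a hit or a miss. The confidence level describes how often the method succeeds over many samples.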

#### Example 1

A random sample of 125 union members was surveyed to see whether or not the union members would support a strike. Sixty-four of those surveyed said that they would support a strike unless safety conditions were improved. Identify each of the following:

a) population of interest

b) parameter of interest

c) sample

d) statistic

e) margin of error

f) 95% confidence interval

g) confidence statement.

#### Solution

a) Population of Interest: All members of this union

b) Parameter of Interest: The percent of the union members who would support a strike

c) Sample: The 125 union members who were surveyed

d) Statistic: $\hat{p} = \frac{64}{125} = 0.512$ (p-hat)

e) Margin of Error: $\text{m.e.} \approx \frac{1}{\sqrt{125}} \approx 0.0894$

f) 95% Confidence Interval: 0.512 + 0.0894 = 0.6014 and 0.512 - 0.0894 = 0.4226

[0.4226 to 0.6014] or [42.26% to 60.14%]

g) Confidence Statement: "We are 95% confident that the true proportion of union members who would support a strike is between 42.26% and 60.14%."
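The arithmetic in this solution can be checked with a short Python sketch. It assumes the quick estimate m.e. ≈ 1/√n, which reproduces the 0.0894 used above; the variable names are chosen only for this illustration.

```python
import math

n = 125          # union members surveyed
supporters = 64  # members who said they would support a strike

p_hat = supporters / n     # sample statistic
margin = 1 / math.sqrt(n)  # estimated 95% margin of error (assumed formula)
low, high = p_hat - margin, p_hat + margin

statement = (
    "We are 95% confident that the true proportion of union members "
    f"who would support a strike is between {low:.2%} and {high:.2%}."
)
print(round(p_hat, 4), round(margin, 4))
print(statement)
```

The printed statement gives the same interval as the solution, 42.26% to 60.14%.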

### Problem Set 4.4

#### Section 4.4 Exercises

1. A survey was done to determine the texting habits of MBHS students. An SRS of 270 students was asked several questions related to texting and cell phone usage. Of particular interest to the researchers was the proportion of students who text while in class. Of those surveyed, 178 said that they text during class at least ten times per week. Identify each of the following as specifically as possible.

a) Population of Interest

b) Parameter of Interest

c) Sample

d) Statistic

e) Margin of Error

f) 95% Confidence Interval

g) Confidence Statement

h) Do you personally feel that this is too high or too low of an estimate of the proportion of teens at your high school who text during class?

2. To predict the outcome of an upcoming mayoral election, a random sample of 814 voters was selected. These people were asked several questions regarding the election. One question asked whether they were "...leaning Republican, Democratic, Independent, or other/undecided?" Based on this question, 38.2% of respondents said that they were "...leaning Democratic...". Identify each of the following as specifically as possible.

a) Population of Interest

b) Parameter of Interest

c) Sample

d) Statistic

e) Margin of Error

f) 95% Confidence Interval

g) Confidence Statement

3. In the same survey, 42.3% said that they were "...leaning Republican...".

a) Calculate an estimated 95% confidence interval.

b) Is this enough evidence to "call" the election in favor of the Republicans? Why or why not?

4. The quality control officer at Spaz Cola uses a systematic random sampling method to select cans of Spaz Cola to determine whether the machines are maintaining the correct recipe. Among the 480 cans analyzed today, 43 cans contained less sugar than the Spaz recipe requires! Identify each of the following as specifically as possible.

a) Population of Interest

b) Parameter of Interest

c) Sample

d) Statistic

e) Margin of Error

f) 95% Confidence Interval

g) Confidence Statement

h) Do you think that the company should be concerned? Why or why not?

#### Review Exercises

5) Marcus got 18 points correct, out of 42 possible points, on his science test. On his history test, Marcus got 31 points out of 55 possible points. On which test did Marcus do better? Explain or show how you know.

6) Lydia got 15 points correct on her probability quiz (out of 23 possible). Then she earned 37 points, of the 48 possible points on her probability test. On which of these assessments did Lydia do better? Explain or show how you know.

7) The figure below is a dartboard. Suppose that a dart is thrown at it randomly. What is the probability that the dart will land on the shaded area?

8) Sketch two different "dart boards" such that the probability of hitting the shaded area is equal to one-third.