12.8: Surveys and Samples
One of the most important applications of statistics is collecting information. Statistical studies are done for many purposes:
- To find out more about animal behaviors;
- To determine which presidential candidate is favored;
- To figure out what type of chip product is most popular;
- To determine the gas consumption of cars.
In most cases except the Census, it is not possible to survey everyone in the population, so a sample is taken. It is essential that the sample be representative of the population being studied. For example, if we are trying to determine the effect of a drug on teenage girls, it would make no sense to include males in our sample, nor would it make sense to include women who are not teenagers.
The two types of sampling methods studied in this book are:
- Random Sampling
- Stratified Sampling
Random Samples
Random sampling is a method in which people are chosen “out of the blue.” In a true random sample, everyone in the population must have the same chance of being chosen. In other words, no person or group may be more likely to be picked than any other.
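As a minimal sketch of this idea, the Python code below draws a simple random sample from a roster. The roster of 500 student ID numbers, the sample size of 25, and the function name `simple_random_sample` are assumptions made up for illustration; they do not come from this lesson.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members; each member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable for checking
    return rng.sample(population, sample_size)

# Hypothetical roster: student ID numbers 1 through 500.
roster = list(range(1, 501))
chosen = simple_random_sample(roster, 25, seed=1)
print(sorted(chosen))
```

Because `random.sample` chooses without replacement, no student can be picked twice, and no ID number is favored over another.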
Stratified Samples
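Stratified sampling is a method that actively seeks to poll people from many different backgrounds. The population is first divided into different categories (or strata) and the number of members in each category is determined. Members are then chosen at random from each category, usually in proportion to that category’s share of the population.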
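The sketch below shows one possible way to carry out a stratified sample in Python: split the sample across the strata in proportion to their sizes, then pick randomly within each stratum. The school enrollment figures, the function names, and the largest-remainder rounding rule are all assumptions chosen for illustration, not a method prescribed by this lesson.

```python
import random

def allocate_proportionally(stratum_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes."""
    total = sum(stratum_sizes.values())
    # divmod gives the whole number of seats and the leftover for each stratum.
    shares = {name: divmod(sample_size * size, total)
              for name, size in stratum_sizes.items()}
    allocation = {name: whole for name, (whole, _) in shares.items()}
    seats_left = sample_size - sum(allocation.values())
    # Largest-remainder rounding (an assumed rule): extra seats go to the
    # strata whose proportional share was rounded down the most.
    by_leftover = sorted(shares, key=lambda name: shares[name][1], reverse=True)
    for name in by_leftover[:seats_left]:
        allocation[name] += 1
    return allocation

def stratified_sample(strata, sample_size, seed=None):
    """Randomly pick members from each stratum according to its allocation."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = allocate_proportionally(sizes, sample_size)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}

# Hypothetical school of 1,000 students, identified by made-up labels.
school = {
    "grade 9": [f"9-{i}" for i in range(300)],
    "grade 10": [f"10-{i}" for i in range(260)],
    "grade 11": [f"11-{i}" for i in range(240)],
    "grade 12": [f"12-{i}" for i in range(200)],
}
picked = stratified_sample(school, 60, seed=1)
counts = {grade: len(students) for grade, students in picked.items()}
print(counts)
```

With these made-up enrollment numbers, a sample of 60 is split into 18, 16, 14, and 12 students for grades 9 through 12.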
Sample Size
In order to lessen the chance of a misleading result, the sample size must be large enough. The larger the sample size is, the more precise the estimate is. However, the larger the sample size, the more expensive and time-consuming the statistical study becomes.
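One way to put numbers on this trade-off is the standard error of a sample proportion, a standard statistical result that this lesson does not derive. The sketch below assumes a simple random sample and a hypothetical survey in which about half of the population would answer “yes.” The function name and the sample sizes are chosen only for illustration.

```python
import math

def standard_error_of_proportion(p, n):
    """Typical spread of a sample proportion when n people are sampled at random."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical survey in which about half of the population says "yes".
for n in (25, 100, 400):
    spread = standard_error_of_proportion(0.5, n)
    print(f"sample size {n}: spread of about {spread:.3f}")
```

Under these assumptions, each fourfold increase in sample size only halves the spread, which is why very large samples quickly become expensive relative to the precision they add.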
Example 1: For a class assignment you have been asked to find if students in your school are planning to attend university after graduating high school. Students can respond with “yes,” “no,” or “undecided.” How will you choose those you wish to interview if you want your results to be reliable?
Solution:
The stratified sampling method would be the best option. By randomly picking a certain number of students in each grade, you will get the most accurate results.
Biased Samples
If the sample ends up with one or more sub-groups that are either over-represented or under-represented, then we say the sample is biased. We would not expect the results of a biased sample to represent the entire population, so it is important to avoid selecting a biased sample.
Some surveys deliberately seek a biased sample in order to obtain a particular viewpoint. For example, if a group of students were trying to petition the school to allow eating candy in the classroom, they might only survey students immediately before lunchtime when students are hungry. The practice of polling only those who you believe will support your cause is sometimes referred to as cherry picking.
Many surveys may have a non-response bias. In this case, a survey that is simply handed out gains few responses when compared to the number of surveys given out. People who are either too busy or simply not interested will be excluded from the results. Non-response bias may be reduced by conducting face-to-face interviews.
Self-selected respondents who tend to have stronger opinions on subjects than others and are more motivated to respond may also cause bias. For this reason phone-in and online polls also tend to be poor representations of the overall population. Even though it appears that both sides are responding, the poll may disproportionately represent extreme viewpoints from both sides, while ignoring more moderate opinions that may, in fact, be the majority view. Self-selected polls are generally regarded as unscientific.
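The effect of self-selection can be sketched with a small calculation. All of the numbers below are hypothetical: a population in which most people hold moderate views, and a phone-in poll where people with strong opinions are assumed to be ten times as likely to respond.

```python
def respondent_shares(population_share, response_rate):
    """Return each group's share of the people who actually respond."""
    responding = {group: population_share[group] * response_rate[group]
                  for group in population_share}
    total = sum(responding.values())
    return {group: amount / total for group, amount in responding.items()}

# Hypothetical population and response rates, invented for illustration.
population_share = {"strongly for": 0.20, "moderate": 0.60, "strongly against": 0.20}
response_rate = {"strongly for": 0.50, "moderate": 0.05, "strongly against": 0.50}

shares = respondent_shares(population_share, response_rate)
for group, share in shares.items():
    print(f"{group}: {share:.0%} of respondents")
```

In this made-up case, moderates are 60% of the population but only about 13% of the respondents, so the poll results would be dominated by the two extremes.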
Example 2: Determine whether the following survey is biased. Explain your reasoning.
“Asking people shopping at a farmer’s market if they think locally grown fruit and vegetables are healthier than supermarket fruits and vegetables”
Solution: This would be a biased sample because people shopping at a farmer’s market are generally interested in buying fresher fruits and vegetables than a regular supermarket provides. The study can be improved by interviewing an equal number of people coming out of a supermarket, or by interviewing people in a more neutral environment such as the post office.
Biased Questions
Although your sample may be a good representation of the population, the way questions are worded in the survey can still provoke a biased result. There are several ways to identify biased questions.
- They may use polarizing language, words, and phrases that people associate with emotions. Example: “How much of your time do you waste on TV every week?”
- They may refer to a majority or to a supposed authority. Example: “Would you agree with the American Heart and Lung Association that smoking is bad for your health?”
- They may be phrased so as to suggest the person asking the question already knows the answer to be true, or to be false. Example: “You wouldn’t want criminals free to roam the streets, would you?”
- They may be phrased in an ambiguous way (often with double negatives), which may confuse people. Example: “Do you disagree with people who oppose the ban on smoking in public places?”
Design and Conduct a Survey
The way in which you design and conduct a survey is crucial to its accuracy. A survey is a set of questions that the sample answers. The data is compiled to form results, or findings. When designing a survey, be aware of the following recommendations.
- Determine the goal of your survey. What question do you want to answer?
- Identify the sample population. Who will you interview?
- Choose an interviewing method: face-to-face interview, phone interview, self-administered paper survey, or internet survey.
- Conduct the interview and collect the information.
- Analyze the results by making graphs and drawing conclusions.
Surveys can be conducted in several ways.
Face-to-face interviews
Advantages:
- Fewer misunderstood questions
- High response rate
- Additional information can be collected from respondents
Disadvantages:
- Time-consuming
- Expensive
- Can be biased based upon the attitude or appearance of the surveyor
Self-administered surveys
Advantages:
- Respondents can complete the survey in their free time
- Less expensive than face-to-face interviews
- Anonymity encourages more honest responses
Disadvantages:
- Lower response rate
Example 3: Martha wants to construct a survey that shows which sports students at her school like to play the most.
- List the goal of the survey.
- What population sample should she interview?
- How should she administer the survey?
- Create a data collection sheet that she can use to record her results.
Solution: The goal of the survey is to find the answer to the question: “Which sports do students at Martha’s school like to play the most?”
- An appropriate sample would be a random sample of the students in Martha’s school. A good strategy would be to randomly select students (using dice or a random number generator) as they walk into an all-school assembly.
- Face-to-face interviews are a good choice in this case since the survey consists of only one question, which can be quickly answered and recorded.
- In order to collect the data for this simple survey, Martha can design a data collection sheet such as the one below:
Sport | Tally |
---|---|
baseball | |
basketball | |
football | |
soccer | |
volleyball | |
swimming | |
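If the answers are recorded electronically, a data collection sheet like this one can be filled in automatically. The sketch below is one possible approach in Python; the list of responses is invented for illustration, and the function names are hypothetical.

```python
from collections import Counter

SPORTS = ["baseball", "basketball", "football", "soccer", "volleyball", "swimming"]

def tally_responses(responses, categories=SPORTS):
    """Count how many times each sport on the sheet was named."""
    counts = Counter(responses)
    # Sports that nobody named still appear on the sheet, with a tally of 0.
    return {sport: counts[sport] for sport in categories}

def format_sheet(tally):
    """Lay the tally out in the same two-column form as the sheet above."""
    lines = ["Sport | Tally |", "---|---|"]
    for sport, count in tally.items():
        lines.append(f"{sport} | {count} |")
    return "\n".join(lines)

# Hypothetical answers from eight students, invented for illustration.
responses = ["soccer", "baseball", "soccer", "swimming",
             "basketball", "soccer", "football", "baseball"]
sheet = tally_responses(responses)
print(format_sheet(sheet))
```

Keeping the list of categories fixed means every sport appears on the sheet even when no one chooses it, which makes the results easier to graph later.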
Display, Analyze, and Interpret Survey Data
This textbook has shown you several ways to display data. These graphs are also useful when displaying survey results. Survey data can be displayed as:
- A bar graph
- A histogram
- A pie chart
- A tally sheet
- A box-and-whisker plot
- A stem-and-leaf plot
The method you choose to display your data will depend upon your survey results and the audience to whom you plan to present the data.
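Several of these displays rest on a small amount of arithmetic, which can be sketched in Python. The code below computes the percentages and angles behind a pie chart and the five-number summary behind a box-and-whisker plot. The data are hypothetical, and the quartile rule used here (the medians of the lower and upper halves of the data, leaving out the middle value when the count is odd) is one common textbook convention; other conventions can give slightly different quartiles.

```python
from statistics import median

def pie_chart_slices(tally):
    """Return {category: (percent, angle in degrees)} for a pie chart."""
    total = sum(tally.values())
    return {name: (100 * count / total, 360 * count / total)
            for name, count in tally.items()}

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two values."""
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]              # values before the middle position
    upper_half = data[len(data) - half:]  # values after the middle position
    return (data[0], median(lower_half), median(data), median(upper_half), data[-1])

# Hypothetical survey results, invented for illustration only.
travel_tally = {"walk": 12, "bus": 18, "car": 10}
slices = pie_chart_slices(travel_tally)
for name, (percent, angle) in slices.items():
    print(f"{name}: {percent:.0f}% of the circle, {angle:.0f} degrees")

hours = [2, 4, 4, 5, 7, 8, 10, 12, 15]
summary = five_number_summary(hours)
print(summary)
```

For the made-up tally, the slices are 30%, 45%, and 25% of the circle; for the made-up hours, the summary has a minimum of 2, quartiles of 4 and 11, a median of 7, and a maximum of 15.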
Practice Set
Sample explanations for some of the practice exercises below are available by viewing the following video. Note that there is not always a match between the number of the practice exercise in the video and the number of the practice exercise listed in the following exercise set. However, the practice exercise is the same in both.
CK-12 Basic Algebra: Surveys and Samples (12:09)
- Explain the most common types of sampling methods. If you needed to survey a city about a new road project, which sampling method would you choose? Explain.
- What is a biased survey? How can bias be avoided?
- How are surveys conducted, according to this text? List one advantage and one disadvantage of each. List one additional method that can be used to conduct surveys.
- What are some keys to recognizing biased questions? What could you do if you were presented with a biased question?
- For a class assignment, you have been asked to find out how students get to school. Do they take public transportation, drive themselves, have their parents drive them, use a carpool, or walk/bike? You decide to interview a sample of students. How will you choose those you wish to interview if you want your results to be reliable?
- Comment on the way the following samples have been chosen. For the unsatisfactory cases, suggest a way to improve the sample choice.
- You want to find whether wealthier people have more nutritious diets by interviewing people coming out of a five-star restaurant.
- You want to find if a pedestrian crossing is needed at a certain intersection by interviewing people walking by that intersection.
- You want to find out if women talk more than men by interviewing an equal number of men and women.
- You want to find whether students in your school get too much homework by interviewing a stratified sample of students from each grade level.
- You want to find out whether there should be more public buses running during rush hour by interviewing people getting off the bus.
- You want to find out whether children should be allowed to listen to music while doing their homework by interviewing a stratified sample of male and female students in your school.
- Raoul wants to construct a survey that shows how many hours per week the average student at his school works.
- List the goal of the survey.
- What population sample will he interview?
- How would he administer the survey?
- Create a data collection sheet that Raoul can use to record his results.
- Raoul found that 30% of the students at his school are in \begin{align*}9^{th}\end{align*} grade, 26% of the students are in the \begin{align*}10^{th}\end{align*} grade, 24% of the students are in \begin{align*}11^{th}\end{align*} grade, and 20% of the students are in the \begin{align*}12^{th}\end{align*} grade. He surveyed a total of 60 students using these proportions as a guide for the number of students he interviewed from each grade. Raoul recorded the following data.
Grade Level | Recorded Number of Hours Worked | Total Number of Students |
---|---|---|
\begin{align*}9^{th}\end{align*} grade | 0, 5, 4, 0, 0, 10, 5, 6, 0, 0, 2, 4, 0, 8, 0, 5, 7, 0 | 18 |
\begin{align*}10^{th}\end{align*} grade | 6, 10, 12, 0, 10, 15, 0, 0, 8, 5, 0, 7, 10, 12, 0, 0 | 16 |
\begin{align*}11^{th}\end{align*} grade | 0, 12, 15, 18, 10, 0, 0, 20, 8, 15, 10, 15, 0, 5 | 14 |
\begin{align*}12^{th}\end{align*} grade | 22, 15, 12, 15, 10, 0, 18, 20, 10, 0, 12, 16 | 12 |
(a) Construct a stem-and-leaf plot of the collected data.
(b) Construct a frequency table with bin size of 5.
(c) Draw a histogram of the data.
(d) Find the five-number summary of the data and draw a box-and-whisker plot.
- The following pie chart displays data from a survey asking students the type of sports they enjoyed playing most. Make five conclusions regarding the survey results.
- Melissa conducted a survey to answer the question: “What sport do high school students like to watch on TV the most?” She collected the following information on her data collection sheet.
Sport | Tally |
---|---|
Baseball | 32 |
Basketball | 28 |
Football | 24 |
Soccer | 18 |
Gymnastics | 19 |
Figure Skating | 8 |
Hockey | 18 |
Total | 147 |
(a) Make a pie chart of the results showing the percentage of people in each category.
(b) Make a bar graph of the results.
- Samuel conducted a survey to answer the following question: “What is the favorite kind of pie of the people living in my town?” By standing in front of his grocery store, he collected the following information on his data collection sheet:
Type of Pie | Tally |
---|---|
Apple | 37 |
Pumpkin | 13 |
Lemon Meringue | 7 |
Chocolate Mousse | 23 |
Cherry | 4 |
Chicken Pot Pie | 31 |
Other | 7 |
Total | 122 |
(a) Make a pie chart of the results showing the percentage of people in each category.
(b) Make a bar graph of the results.
- Myra conducted a survey of people at her school to see “In which month does a person’s birthday fall?” She collected the following information in her data collection sheet:
Month | Tally |
---|---|
January | 16 |
February | 13 |
March | 12 |
April | 11 |
May | 13 |
June | 12 |
July | 9 |
August | 7 |
September | 9 |
October | 8 |
November | 13 |
December | 13 |
Total | 136 |
(a) Make a pie chart of the results showing the percentage of people whose birthday falls in each month.
(b) Make a bar graph of the results.
- Nam-Ling conducted a survey that answers the question: “Which student would you vote for in your school’s elections?” She collected the following information:
Candidate | \begin{align*}9^{th}\end{align*} graders | \begin{align*}10^{th}\end{align*} graders | \begin{align*}11^{th}\end{align*} graders | \begin{align*}12^{th}\end{align*} graders | Total |
---|---|---|---|---|---|
Susan Cho | | | | | 19 |
Margarita Martinez | | | | | 31 |
Steve Coogan | | | | | 16 |
Solomon Duning | | | | | 26 |
Juan Rios | | | | | 28 |
Total | 36 | 30 | 30 | 24 | 120 |
(a) Make a pie chart of the results showing the percentage of people planning to vote for each candidate.
(b) Make a bar graph of the results.
- Graham conducted a survey to find how many hours of TV teenagers watch each week in the United States. He collaborated with three friends who lived in different parts of the U.S. and found the following information:
Part of the country | Number of hours of TV watched per week | Total number of teens |
---|---|---|
West Coast | 10, 12, 8, 20, 6, 0, 15, 18, 12, 22, 9, 5, 16, 12, 10, 18, 10, 20, 24, 8 | 20 |
Mid West | 20, 12, 24, 10, 8, 26, 34, 15, 18, 6, 22, 16, 10, 20, 15, 25, 32, 12, 18, 22 | 20 |
New England | 16, 9, 12, 0, 6, 10, 15, 24, 20, 30, 15, 10, 12, 8, 28, 32, 24, 12, 10, 10 | 20 |
South | 24, 22, 12, 32, 30, 20, 25, 15, 10, 14, 10, 12, 24, 28, 32, 38, 20, 25, 15, 12 | 20 |
(a) Make a stem-and-leaf plot of the data.
(b) Decide on an appropriate bin size and construct a frequency table.
(c) Make a histogram of the results.
(d) Find the five-number summary of the data and construct a box-and-whisker plot.
- “What do students in your high school like to spend their money on?”
- Which categories would you include on your data collection sheet?
- Design the data collection sheet that can be used to collect this information.
- Conduct the survey. This activity is best done as a group with each person contributing at least 20 results.
- Make a pie chart of the results showing the percentage of people in each category.
- Make a bar graph of the results.
- “What is the height of students in your class?”
- Design the data collection sheet that can be used to collect this information.
- Conduct the survey. This activity is best done as a group with each person contributing at least 20 results.
- Make a stem-and-leaf plot of the data.
- Decide on an appropriate bin size and construct a frequency table.
- Make a histogram of the results.
- Find the five-number summary of the data and construct a box-and-whisker plot.
- “How much allowance money do students in your school get per week?”
- Design the data collection sheet that can be used to collect this information.
- Conduct the survey. This activity is best done as a group with each person contributing at least 20 results.
- Make a stem-and-leaf plot of the data.
- Decide on an appropriate bin size and construct a frequency table.
- Make a histogram of the results.
- Find the five-number summary of the data and construct a box-and-whisker plot.
- Are the following survey methods biased? Explain your reasoning.
- You want to find out public opinion on whether teachers get paid a sufficient salary by interviewing the teachers in your school.
- You want to find out if your school needs to improve its communications with parents by sending home a survey written in English.
- “What time do students in your school get up in the morning during the school week?”
- Design the data collection sheet that can be used to collect this information.
- Conduct the survey. This activity is best done as a group with each person contributing at least 20 results.
- Make a stem-and-leaf plot of the data.
- Decide on an appropriate bin size and construct a frequency table.
- Make a histogram of the results.
- Find the five-number summary of the data and construct a box-and-whisker plot.
Mixed Review
- Write the equation of the line containing (8, 1) and (4, –6) in point-slope form.
- What is the equation for the line perpendicular to this line that contains (0, 0)?
- What is the equation for the line parallel to this line that contains (4, 0)?
- Classify \begin{align*}\sqrt[3]{64}\end{align*} according to the real number hierarchy.
- A ferry traveled to its destination, 22 miles across the harbor. On the first voyage, the ferry took 45 minutes. On the return trip, the ferry encountered a head wind and its trip took one hour, ten minutes. Find the speed of the ferry and the speed of the wind.
- Solve for \begin{align*}a: \frac{6a}{a-1}=\frac{7}{a+7}\end{align*}.
- Simplify \begin{align*}\frac{7}{18x^2} \cdot \frac{9x}{14}\end{align*}.
- Use long division to simplify: \begin{align*}\frac{3w^3-6w^2-27w+54}{2w^2-4w-30}\end{align*}.
- A hot air balloon rises 16 meters every second.
- Is this an example of a linear function, a quadratic function, or an exponential function? Explain.
- At four seconds the balloon is 68.5 meters from the ground. What was its beginning height?