- Differentiate between population and sample
- Understand the terminology of sampling methods
- Identify various sampling methods
- Recognize and name sources of bias or errors in sampling
Population vs. Sample
What is the approval rating of the President? If we really wanted to know the President's true approval rating, we would have to ask every single adult in the United States for his or her opinion. If a researcher wants to know the exact answer to some question about a population, the only way to get it is to conduct a census. In a census, every unit in the population being studied is measured or surveyed. In this example our population, the entire group of individuals that we are interested in, is every adult in the United States of America.
A census like this (asking the opinion of every single adult in the United States) would be impractical, if not impossible. First, it would be extremely expensive for the polling organization. They would need a large workforce to try to collect the opinions of every single adult in the United States. Once the data is collected, it would take many workers many hours to organize, interpret, and display this information. Other practical problems might arise: some adults may be difficult to locate, some may refuse to answer the questions or not answer truthfully, some people may turn 18 before the results are published, others may pass away before the results are published, or an event may happen that changes people's opinions drastically. Even if all of this could be done within several months, it is highly likely that people's opinions will have changed in the meantime. So by the time the results are published, they are already obsolete.
Another reason why a census is not always practical is because a census has the potential to be destructive to the population being studied. For example, it would not be a good idea for a biologist to find the number of fish in a lake by draining the lake and counting them all. Also, many manufacturing companies test their products for quality control. A padlock manufacturer, for example, might use a machine to see how much force it can apply to the lock before it breaks. If they did this with every lock, they would have none to sell. In both of these examples it would make much more sense to simply test or check a sample of the fish or locks. The researchers hope that the sample that they select represents the entire population of fish or locks.
This is why sampling is often used. Sampling refers to asking, testing, or checking a smaller sub-group of the population. A sample is a representative subset of the population, whereas the population is every single member of the group of interest. The purpose of a sample is to be able to generalize the findings to the entire population of interest. Samples are generally more practical than an entire census. They can be more convenient and efficient, and they cost less in money, labor and time.
A number that describes a sample is a statistic, while a number that describes an entire population is a parameter. Researchers are trying to approximate parameters based on statistics that they calculate from the data that they have collected from samples. However, results from samples cannot always be trusted.
A poll was done to determine how much time the students at SDHS spend getting ready for school each morning. One question asked, “Do you spend more or less than 20 minutes styling your hair for school each morning?” Of the 263 students surveyed, 61 said that they spend more than 20 minutes styling their hair before school. Identify the population, the parameter, the sample, and the statistic for this specific question.
a) population (of interest): all students at SDHS
b) parameter (of interest): what proportion of students spend more than 20 minutes styling their hair for school each morning
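c) sample: the 263 SDHS students who were surveyed
d) statistic: 61 of the 263 students surveyed (about 23.2%) said that they spend more than 20 minutes styling their hair for school each morning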
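The statistic in this example is a sample proportion, which can be computed directly from the two counts given in the problem. The short sketch below does only that; the variable names are chosen for illustration.

```python
# Counts taken from the SDHS hair-styling example.
surveyed = 263        # number of students in the sample
more_than_20 = 61     # students who said "more than 20 minutes"

# The statistic is the sample proportion. The parameter (the proportion
# among ALL SDHS students) is still unknown; the statistic only estimates it.
sample_proportion = more_than_20 / surveyed
print(f"Sample proportion: {sample_proportion:.3f}")  # 0.232
```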
One common problem in sampling is that the sample chosen may not represent the entire population. In such cases, the statistics found from these samples will not accurately approximate the parameters that the researchers are seeking. Samples that do not represent the population are biased. If someone were interested in the average height of all male students at his or her high school, but somehow the sample of students measured included the majority of the varsity basketball team, the results would certainly be biased. In other words, the statistics that were calculated would most certainly overestimate the average height of male students at the school. Samples should be selected randomly in order to limit bias. Also, if only three students' heights are measured, it is very possible that the average height of these three will not be close to the average height of all of the male students. The average of the heights of 40 randomly chosen male students would be more likely to land close to the average of the entire population than the average of just three students. Statistics from larger samples vary less from sample to sample, so small sample sizes should be avoided.
A random sample is one in which every member of the population has an equal chance of being selected. There are many ways to make such random selections. The way many raffles are done is that every ticket is put into a hat (or box), then they are shaken or stirred up, and finally someone reaches into the hat without looking and selects the winning ticket(s). Flipping a coin to decide which group someone belongs in is another way to choose randomly. Computers and calculators can be used to make random selections as well. The purpose of choosing randomly is to keep any personal bias from influencing the selection process. Randomization will limit bias by mixing up any other factors that might be present. Think of the heights of those male students. If we assigned every male at that school a number and then had a computer program select 40 numbers at random, it is most likely that we would end up with a mixture of students of various heights (rather than a bunch of basketball players). Also, no one would be doing something like just measuring his or her friends' heights, or the first 40 males he or she sees staying after school, or everyone in first lunch who is willing to come be measured. A computer program has no personal stake in the outcome and is not limited by its comfort level or laziness.
If the goal of our sample is to truly estimate the population parameter, then some planning should be done as to how the sample will be selected. First of all, the list of the population should actually include every member of the population. This list of the population is called the sampling frame. For example, if the population is supposed to be all adults in a given city and someone is working from the phone book to make selections, then everyone who is unlisted and everyone who does not have a landline telephone will not have any chance of being selected. Therefore, the phone book is not an accurate sampling frame.
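The claim about sample size can be checked with a small simulation. The sketch below is only an illustration: the population of 500 heights is invented (roughly bell-shaped around 69 inches), and the function name `spread_of_sample_means` is hypothetical. It draws many random samples of 3 and of 40 and compares how much the sample averages jump around.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: heights (in inches) of 500 male students.
population = [rng.gauss(69, 3) for _ in range(500)]

def spread_of_sample_means(n, trials=2000):
    """Standard deviation of the sample mean over many random samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(trials)]
    return statistics.stdev(means)

spread_3 = spread_of_sample_means(3)
spread_40 = spread_of_sample_means(40)
print(f"n = 3:  sample means vary by about {spread_3:.2f} inches")
print(f"n = 40: sample means vary by about {spread_40:.2f} inches")
```

The averages from samples of 40 should cluster much more tightly around the population average than the averages from samples of 3.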
Good Sampling Methods
Simple Random Sample
When the selection of which individuals to sample is made randomly from one big list, it is called a simple random sample (or SRS). An example of this would be if a teacher put every single student's name in a hat and then drew 5 names from the hat, without looking, to receive a piece of candy. In an SRS every single member of the population has an equal probability of being selected - every student has an equal chance of getting the candy. And, in an SRS every combination of individuals also has an equal chance of being selected - any group of 5 students might end up getting candy. It might be 5 girls, it might be the 5 students who sit in the back row, or it might even end up being the 5 students who misbehave the most. Anything is possible with an SRS!
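Drawing names from a hat can be imitated on a computer. The following is a minimal sketch using Python's standard `random.sample`; the class list of 30 placeholder names is an assumption made for the example.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical class list; the names are placeholders.
students = [f"Student {i}" for i in range(1, 31)]

# random.sample draws without replacement, so no one is picked twice
# and every group of 5 students is equally likely.
winners = rng.sample(students, 5)
print(winners)
```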
Stratified Random Sample
A simple random sample is not always the best choice, though. Suppose you were interested in students' opinions regarding the homecoming theme, and you wanted to make certain that you heard from students from all four grades. In such a case it would make more sense to have four separate lists (freshmen, sophomores, juniors and seniors), and then to randomly select 50 students from each list to give your survey to. A selection done in this way is called a stratified random sample. In a stratified random sample, the population is first divided into deliberate groups called strata, and then a separate SRS is selected from each of the strata. This is a great method when the researchers want to be sure to include data from specific groups. Divisions may be done by gender, age groups, races, geographic location, income levels, etc. With stratified random samples, every member of the population has a chance of being selected (an equal chance when the number chosen from each stratum is proportional to its size), but not every combination of individuals is possible.
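The homecoming survey could be sketched as follows. The grade sizes and ID labels are invented for illustration; the only part taken from the text is the plan of drawing a separate SRS of 50 from each of the four grade lists.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

# Hypothetical strata: one list of student IDs per grade (sizes are made up).
strata = {
    "freshmen":   [f"F{i}" for i in range(1, 701)],
    "sophomores": [f"So{i}" for i in range(1, 681)],
    "juniors":    [f"J{i}" for i in range(1, 651)],
    "seniors":    [f"Se{i}" for i in range(1, 611)],
}

# Take a separate simple random sample of 50 from each stratum.
stratified_sample = {grade: rng.sample(ids, 50) for grade, ids in strata.items()}

total_selected = sum(len(chosen) for chosen in stratified_sample.values())
print(total_selected)  # 200
```

Every grade is guaranteed exactly 50 students in the sample, which a single SRS of 200 from the whole school could not promise.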
Systematic Random Sample
Another way to choose a sample is systematically. A systematic random sample makes the first selection randomly and then uses some type of 'system' to make the remaining selections. A system could be: every 15th customer will be given a survey, or every 30 minutes a quality control test will be run. A systematic random sample might start with a single list like an SRS, randomly choose one person from the list, then every 25th person after that first person will also be selected. Systematic random samples still give every member of the population an equal chance of being chosen, but do not allow for all combinations of individuals. Some groups are impossible, such as a group including several people who are in order on the list.
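A systematic selection is easy to express with list slicing. The sketch below assumes a hypothetical list of 500 customers and uses one common variant of the method: the random starting point is chosen among the first 25 people, and then every 25th person after that is taken.

```python
import random

rng = random.Random(3)  # fixed seed so the run is repeatable

# Hypothetical list of 500 customers, in arrival order.
customers = list(range(1, 501))
k = 25  # the 'system': every 25th person

# Random start within the first k positions, then every k-th person after it.
start = rng.randrange(k)
systematic_sample = customers[start::k]
print(len(systematic_sample))  # 20
```

Because the selected people are always exactly 25 positions apart, two neighbors on the list can never both be chosen.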
Multi-Stage Random Sample
When seeking the opinions of a large population, such as all registered voters in the United States, a multi-stage random sample is often employed. A multi-stage random sample involves more than one stage of random selection and does not choose individuals until the last step. A pollster might start by randomly choosing 10 states from a list of the 50 states in the U.S.A. Then she might randomly choose 10 counties in each of those states. Finally, she can randomly choose 50 registered voters from each of those counties to interview over the telephone. When she is done, she will have 10 x 10 x 50 = 5000 individuals in her sample. This is another sampling method that gives individuals an equal chance of being chosen, but does not allow for all possible combinations of individuals. For example, there is no possible way that all 5000 of these voters will be from Texas.
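The pollster's three stages can be sketched as nested random selections. Everything about the sampling frame here is hypothetical: each state is given 60 counties and each county 1,000 registered voters, and the helper functions `counties_in` and `voters_in` simply generate placeholder labels.

```python
import random

rng = random.Random(4)  # fixed seed so the run is repeatable

# Hypothetical frame: 50 states, 60 counties per state, 1,000 voters per county.
def counties_in(state):
    return [f"{state}-county{c}" for c in range(1, 61)]

def voters_in(county):
    return [f"{county}-voter{v}" for v in range(1, 1001)]

states = [f"state{s}" for s in range(1, 51)]

sample = []
for state in rng.sample(states, 10):                       # stage 1: 10 states
    for county in rng.sample(counties_in(state), 10):      # stage 2: 10 counties each
        sample.extend(rng.sample(voters_in(county), 50))   # stage 3: 50 voters each

print(len(sample))  # 5000
```

Individuals are only chosen in the last stage, and the sample can never come from more than the 10 states picked in the first stage.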
Random Cluster Sample
Sometimes cluster samples are used to collect data. Splitting the population into representative clusters, and then randomly selecting some of those clusters, can be more practical than making only individual selections. In cluster sampling, a census is done on each cluster or group selected. When appropriately used, cluster sampling can be very useful and efficient. One needs to be careful that the clusters are in fact selected randomly and that this method is the best choice. When a study of teenagers across the country is to be done, a random cluster method can be the best choice. An SRS of all teens would be nearly impossible. Imagine that one big list of all teens! A multi-stage random sample might be theoretically ideal, but the practicality of surveying one teenager from a high school in Little Rock, and one from another high school in Duluth, and so on would be quite a nightmare. The best choice might be to randomly select 10 metropolitan areas, 10 suburban areas, and 10 urban areas from across the country, then to randomly select one high school in each of these areas, and finally to randomly select 4 second-hour classes from each of those high schools. The entire classes selected (the clusters) would then be surveyed. This would be a combination of multi-stage random selection and cluster sampling. Another use for random cluster sampling is quality control at a popcorn factory. Suppose that every hour a bucket of popcorn is scooped out. The entire bucket of popcorn can be checked for salt content, appearance, number of unpopped kernels, number of burnt kernels, etc. This is an example of a systematic random cluster sample, the system being that a sample is taken 'every hour', and the clusters being the buckets of popcorn.
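The last step of the teen study, choosing whole classes and surveying everyone in them, might look like the sketch below. The school with 40 second-hour classes of 25 students each is an invented example.

```python
import random

rng = random.Random(5)  # fixed seed so the run is repeatable

# Hypothetical school: 40 second-hour classes of 25 students each.
classes = {
    f"class{c}": [f"class{c}-student{s}" for s in range(1, 26)]
    for c in range(1, 41)
}

# Randomly choose 4 whole clusters, then survey EVERY student in them
# (a census of each selected cluster).
chosen_classes = rng.sample(sorted(classes), 4)
cluster_sample = [student for c in chosen_classes for student in classes[c]]
print(len(cluster_sample))  # 100
```

The random choice is made among classes, not among students, which is what distinguishes this from a stratified sample.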
Bad Sampling Methods
Voluntary Response Sample
Beware of call-in surveys and online surveys! Suppose that a radio host on KDWB says something like, “Do you think texting while driving should be illegal? Call in and have your opinion heard!” It is highly likely that many people will call in and vote “No!” However, the people who do take the time to call will not represent the entire population of the Twin Cities, and so the results cannot be trusted to match what all members of the population think. The 'statistic' that this 'survey' calculates will be biased. The only people who will take the time to call in are those who feel strongly that texting while driving should be legal (or illegal). Such a sampling method is called a voluntary response sample. In voluntary response samples, participants get to choose whether or not to participate in the survey. Online, text-in, call-in, and mail-in surveys, and surveys that are handed out to people with an announcement of where to turn them in when completed, are all examples of voluntary response surveys. Voluntary response samples are almost always biased because they result in no response whatsoever from most people who are invited to complete the survey. So, most opinions are never even heard, except for those of people who have really strong opinions for or against the topic in question. Also, those who have strong opinions can call or text multiple times. A new problem that comes with the Internet is that many companies are offering to pay people to complete surveys, which makes any results suspect. For these reasons, the results of voluntary response samples should always be viewed with suspicion.
Convenience Sample
Another commonly used, but dangerous, method for choosing a sample is to use a convenience sample. A convenience sample just asks those individuals who are easy to ask or are conveniently located - right by the pollster, for example. The big problem here is that the sample is unlikely to represent the entire population. The fact that this group was convenient implies that its members most likely have at least something in common. This will almost always lead to biased results. An interviewer at the mall only asks people who shop at the mall, and only at some given time of day, so many people in the community will never have the opportunity to be interviewed. When the population of interest is only mall-shoppers, this will be somewhat better than when the population of interest is community members. Even then, the interviewers choose whom to go up to and the interviewees can easily refuse to participate.
With both of the bad sampling methods, the word random is nowhere to be found. That lack of randomness should serve as a big hint that some type of bias will likely be present. The scary thing is that most of the results we see published in the media are the results of convenience samples and voluntary response samples. One should always ask questions about where and how the data was collected before believing the reported statistics.
Suppose that a survey is to be conducted at the new Twins stadium. A five-question survey is developed. Population of interest: All of the 31,045 fans present that day. Sample size: 2,500 randomly selected fans. Identify specifically the sampling method that is being proposed in each scenario. Also, comment on any potential problem or bias that will likely occur.
a) The first 2,500 fans to arrive are asked five questions.
b) Fifty sections are randomly selected. Then ten rows are randomly selected from each of those sections. Then five seats are randomly selected from each of those rows. The people in these seats are interviewed in person during the game.
c) A computer program selects 2,500 seat numbers randomly from a list of all seats occupied that day. The people in these seats are interviewed in person during the game.
d) 2,500 seats are randomly selected. The surveys are taped to those 2,500 seats with instructions as to where to return the completed surveys.
e) The number 8 was randomly selected earlier. The 8th person through any gate is asked five questions. Then, every 12th person after that is also asked the five questions.
f) The seats are divided into 25 sections based on price and view. A computer program randomly selects 100 seats from each of these sections. The people in these seats are interviewed in person during the game.
a) This is a convenience sample. It will not represent everyone present that day. This will suffer from bias because all of these people have at least one thing in common: they arrived early.
b) This is a multi-stage random sample. It will probably represent the entire population. As long as the people are in their seats and willing to answer the questions honestly, it could be a good plan.
c) This is a simple random sample. It will probably represent the entire population. As long as the people are in their seats and willing to answer the questions honestly, it could be a good plan.
d) This is a voluntary response sample. It is very likely that most of those surveys will end up on the ground or in the garbage. This will likely suffer from many people not responding. It is also probable that anyone who had an extremely negative experience will be more likely to complete their surveys.
e) This is a systematic random sample. It will probably represent the entire population. As long as the people are willing to answer the questions honestly, it could be a good plan.
f) This is a stratified random sample. It will probably represent the entire population. As long as the people are in their seats and willing to answer the questions honestly, it could be a good plan.
Errors in Sampling
Some errors have to do with the way in which the sample was chosen. The most obvious is that many reports result from a bad sampling method. Convenience samples and voluntary response samples are used often and the results are displayed in the media constantly. We have now seen that both of these methods for choosing a sample are prone to bias. Another potential problem is when results are based on too small a sample. If a statistic reports that 80% of doctors surveyed say something, but only five doctors were even surveyed, this does not give us a good idea of what all doctors would say.
Another common mistake in sampling is to leave an entire group (or groups) out of the sample. This is called undercoverage. Suppose a survey is to be conducted at your school to find out what types of music to play at the next school dance. The dance committee develops a quick questionnaire and distributes it to 12 randomly selected 5th period classes. However, what if they did this on a day when the football teams and cheerleaders had all left early to go to an out-of-town game? The results of the dance committee's survey will suffer from undercoverage, and will therefore not represent the entire population of your school.
There is also the fact that each sample, randomly selected or not, will result in a different group of individuals. Thus, each sample will end up with different statistics. This expected variation is called random sampling error and is usually only a slight difference. However, every now and then the sample selected can be a 'fluke' and just simply not represent the entire population. A randomly selected sample might accidentally end up with way too many males, for example. Or a survey to determine the average GPA of students at your school might accidentally include mostly honors students. There is no way to avoid random sampling error. This is one reason that many important surveys are repeated with a new sample. The odds of getting such a 'fluke' group more than once are very low.
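Random sampling error can be seen by repeating the same random selection many times. The sketch below assumes a hypothetical population of 2,000 students, exactly half of them male, and records the share of males in each of 1,000 simple random samples of 100.

```python
import random

rng = random.Random(6)  # fixed seed so the run is repeatable

# Hypothetical population of 2,000 students, exactly half of them male.
population = ["M"] * 1000 + ["F"] * 1000

# Draw 1,000 simple random samples of 100 and record the share of males in each.
shares = [rng.sample(population, 100).count("M") / 100 for _ in range(1000)]

print(f"smallest share: {min(shares):.2f}, largest share: {max(shares):.2f}")

# Fraction of samples whose share of males is within 10 points of the true 50%.
typical = sum(abs(s - 0.5) <= 0.10 for s in shares) / len(shares)
print(f"samples within 10 points of 50%: {typical:.0%}")
```

Most samples should land close to 50%, with an occasional 'fluke' farther out, even though every sample was chosen by the same fair method.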
Non Sampling Errors
One of the biggest problems in polling is that most people just don’t want to bother taking the time to respond to a poll of any kind. They hang up on a telephone survey, put a mail-in survey in the recycling bin, or walk quickly past an interviewer on the street. Even when the researchers take the time to use an appropriate and well-planned sampling method, many of the surveys are not completed. This is called non-response and is a source of bias. We just don’t know how much the beliefs and opinions of those who did complete the survey actually reflect those of the general population, and, therefore, almost all surveys could be prone to non-response bias. When determining how much merit to give to the results of a survey, it is important to look for the response rate.
The wording of the questions can also be a problem. The way a question is worded can influence the response of those people being asked. For example, asking a question with only two answer choices forces a person to choose one of them, even if neither choice describes his or her true belief. When you ask people to choose between two options, the order in which you list the choices may influence their response. Also, it is possible to ask questions in leading ways that influence the responses. A question can be asked in different ways which may appear to be asking the same thing, but actually lead individuals with the same basic opinions to respond differently.
Consider the following two questions about gun control.
“Do you believe that it is reasonable for the government to impose some limits on purchases of certain types of weapons in an effort to reduce gun violence in urban areas?”
“Do you believe that it is reasonable for the government to infringe on an individual’s constitutional right to bear arms?”
The first question will result in a higher rate of agreement because of the wording 'some limits' as opposed to 'infringe'. Also, 'an effort to reduce gun violence' rather than 'infringe on an individual's constitutional right' will bring more agreement. Thus, even though the questions are intended to research the same topic, the second question will render a higher rate of people saying that they disagree. Any person who has strong beliefs either for or against government regulation of gun ownership will most likely answer both questions the same way. However, individuals with a more tempered, middle position on the issue might believe in an individual’s right to own a gun under some circumstances, while still feeling that there is a need for regulation. These individuals would most likely answer these two questions differently.
You can see how easy it would be to manipulate the wording of a question to obtain a certain response to a poll question. This type of bias may be introduced intentionally in an effort to sway the results. But it is not necessarily always a deliberate action. Sometimes a question is poorly worded, confusing, or just plain hard to understand, and this will still lead to non-representative results. Another thing to look at when critiquing the results of a survey is the specific wording of the questions. It is also important to know who paid for or who is reporting the results. Do the sponsors of this survey have an agenda they are trying to push through?
A major problem with surveys is that you can never be sure that the person is actually responding truthfully. When an individual responds to a survey with an incorrect or untruthful answer, this is called response bias. This can occur when asking questions about extremely sensitive, controversial or personal issues. Some responses are actual lies, but it is also common for people just to not remember correctly. Also, sometimes someone who is completing a survey or answering interview questions will 'mess with the data' by lying or making up ridiculous answers.
Response bias is also common when asking people to remember what they watched on TV last week, or how often they ate at a restaurant last month, or anything from the past. Someone may have the best intentions as they complete the questionnaire, but it is very easy to forget what you did last week, last month, or even yesterday. Also, people are often hurrying through survey questions, which can lead to incorrect responses. So the results on questions regarding the past should be viewed with caution.
It is difficult to know whether or not response bias is present. We can look at how questions were worded, how they were asked, and who asked them. Person-to-person interviews on controversial topics, for example, carry a definite potential for response bias. It is sometimes helpful to see the actual questionnaire that the subjects were asked to complete.
There are sometimes mistakes in calculations or typos present in results. These are processing errors (or human errors). For example, it is not uncommon for someone to enter a number incorrectly when working with large amounts of data, or to misplace a decimal point. These types of mistakes happen frequently in life, and are not always caught by those responsible for editing. If a reported statistic just doesn't seem right, then it is a good idea to recheck calculations when possible. Also, if the numbers appear to be 'too good to be true', then they just might be!
The department of health often studies the use of tobacco among teens. The following is a description by the Minnesota Department of Health describing how they chose the sample for the 2008 Minnesota Youth Tobacco and Asthma Survey. In 2008, they had 2,267 high school students complete surveys and 2,322 middle school students complete surveys. Each student in the sample completed an extensive questionnaire consisting of many questions related to tobacco use. Answer the questions that follow. To see the entire report go to: http://www.health.state.mn.us/divs/hpcd/tpc/reports/
a) What type of sampling methods were used for this?
b) Identify the population, the parameter of interest, the sample, and the individuals for this study
c) When asking teens about tobacco use, what types or causes of bias will likely be present? What could be done to limit bias?
d) This graph shows how the percent of teens using tobacco has changed from 2000 to 2008. Identify the statistics that were found for this question in 2008.
a) This study used a complicated combination of sampling methods. They used a stratified, multi-stage, random cluster sample method to select individuals. It was a stratified random sample (by high school and middle school), it was a multi-stage random sample (first random schools were selected, second random classes were selected), and it was a cluster sample (every student in each class was given the survey).
b) population (of interest): all middle and high school students in Minnesota
parameter (of interest): teen tobacco use
sample: 2,267 high school students and 2,322 middle school students in Minnesota (from 48 public middle schools and 51 public high schools in Minnesota)
individuals: each student who completed a survey
c) Response bias: Tobacco use is not legal for people under 18, so teens will not want to tell the truth if they think they may get in trouble.
Non-response bias: Some people were absent the day the survey was given.
Undercoverage: Only public school students were included, so those who attend private schools were left out.
Wording of the questions: This could be a problem, but we do not know the exact wording so cannot be sure.
To avoid the response bias factor, surveys regarding controversial topics should all be anonymous. If you read further into this report, you will see that the students were assured all results would be anonymous (no names or ID numbers included).
To avoid the non-response bias factor, students who were absent could be given the survey when they return to school.
To avoid the undercoverage of private school students, private schools could be included in the sample.
d) In 2008, 27.0% of the high school students asked and 6.9% of the middle school students asked had used tobacco in the last 30 days.
Problem Set 4.2
Section 4.2 Exercises
1) For each of the following, determine whether the bold number is a parameter or a statistic. (hint: remember that a parameter is a number that represents an entire population and a statistic is a number that represents a sample)
a) The average height of all oak trees is 42.3 feet.
b) Ms. Anderson's class average on the final exam was 71.4%.
c) The average number of songs that the students surveyed have on their iPods was 791 songs.
d) iTunes reports that the average number of songs people have on their iPods is 503 songs.
e) The sticker on the Super Speedster Sport Sedan says 17.82 mpg.
f) Martin had to keep track of how much time he spent watching TV for a whole week. He found that last week he averaged 3.4 hours of TV per day.
2) Minnesota's Best High School found that last year they did not have enough seats or room for all of the family members who wished to attend the graduation ceremony. The administrators at MBHS need to decide where to hold the graduation ceremony this year, so they sent a questionnaire home with each of this year's 543 seniors early in September. They asked for the surveys to be completed and returned by September 27th. Of the 148 surveys returned, the average number of seats that will be needed is 6.2. To be safe, the administrators use 7 and determine that they will need a hall that can hold 3800 people (543 students X 7 seats = 3801 seats needed). Using this number they find an appropriately sized hall and reserve it. Identify each of the following as specifically as possible.
a) population (of interest)
b) parameter (of interest)
c) sample
d) statistic
e) sampling method that was used
f) What is the response rate (the percent of surveys returned)?
g) What is wrong with what these administrators have done? What type of bias or error is likely present?
h) Will the statistic most likely be too high or too low? What is a likely consequence of this biased result?
3) Suppose that a survey is to be conducted at Minnesota's Best High School. Population of interest: 2640 MBHS students. Sample size: 240 MBHS students. Identify specifically the sampling method that is being proposed in each scenario. Also, comment on any potential problem or bias that will likely occur.
a) Every freshman's name is put on a slip of paper and put into a giant bucket. Sixty names are pulled out of the bucket. This process is repeated for each grade level.
b) A list of all students is obtained from the counselors. Julie randomly selects a number between 1 and 2640 and then finds the student that matches this number on the list. She then selects every eleventh person on the list after that one (cycling back to the beginning of the list) until 240 names are chosen.
c) Surveys are handed out with lunches. Students are asked to complete them and turn them in on a table in the front of the cafeteria.
d) A computer randomly selects 240 names from the entire list of students in the school database.
e) Twelve teachers are randomly selected. Two of each of their classes are then randomly selected. Ten students from each of these classes are then selected.
f) Three teachers, Mr. Niceguy, Mr. Greatguy and Mr. Happyguy, each volunteers to survey the students in all of his classes.
4) Name, and briefly describe, the type of bias that would most likely be present in each of the following situations:
a) What is the name of the type of bias in the cartoon?
b) As the 2010 Census was being conducted, many people did not return their forms. What type of bias is this?
c) What type of bias would most likely be present if high school students are interviewed about their drinking and drug use habits? Would the statistics most likely over- or under-estimate the true parameters?
d) What is the one type of sampling error that we expect to happen, but cannot do anything to avoid, called?
e) When calculating the statistics from a survey, a typo is made. What type of error is this?
f) A radio talk show host asks, “Do you think that the driving age should be changed to 18?” What type of bias will most likely be present? Why is this?
g) If a survey is conducted by door-to-door interviews and the interviewers skip a few neighborhoods that 'make them nervous', what type of bias is this called?
h) If an interviewer asks each person, “Do you prefer Pizza Ickarooni, or the delicious fresh flavors of Pizza Delicioso?”, what type of bias is present?
5) One die is rolled. What is the chance that a number greater than four or an even number is showing?
6) One die is rolled. What is the chance that a number greater than four and an even number is showing?
7) Two dice are rolled. What is the probability that the sum of the number of dots showing is nine or greater?
8) If three dice are rolled, what is the probability of getting three of a kind (all 3 dice show the same number of dots)?