- Differentiate between a census and a survey or sample.
- Distinguish between sampling error and bias.
- Identify and name potential sources of bias from both real and hypothetical sampling situations.
The New York Times/CBS News Poll is a well-known regular poll whose results help clarify the opinions of Americans on current issues, such as election results, approval ratings of current leaders, or opinions about economic or foreign policy issues. In an article that explains some of the details of a recent poll, entitled “How the Poll Was Conducted,” the following statements appear:
“In theory, in 19 cases out of 20, overall results based on such samples will differ by no more than three percentage points in either direction from what would have been obtained by seeking to interview all American adults.”
“In addition to sampling error, the practical difficulties of conducting any survey of public opinion may introduce other sources of error into the poll. Variation in the wording and order of questions, for example, may lead to somewhat different results.”
These statements illustrate the two different potential problems with opinion polls, surveys, observational studies, and experiments. In Chapter 1, we identified some of the basic vocabulary of populations and sampling. In this lesson, we will review those ideas and investigate sampling in more detail.
Census vs. Sample
In Chapter 1 we identified a population as the entire group that is being studied. A sample is a small, representative subset of the population. If a statistician or other researcher really wants to know some information about a population, the only way to be truly sure is to conduct a census. In a census, every unit in the population being studied is measured or surveyed. In opinion polls like the New York Times poll mentioned above, a smaller sample is used to generalize from. If we really wanted to know the true approval rating of the president, for example, we would have to ask every single American adult their opinion. There are some obvious reasons that a census is impractical in this case, and in most situations.
First, it would be extremely expensive for the polling organization. They would need an extremely large workforce to try to collect the opinions of every American adult. How would you even be sure that you could find every American adult? It would take an army of such workers and many hours to organize, interpret, and display this information. Even if all those problems could be overcome, how long do you think it would take? Even with the very optimistic assumption that it could be done in several months, by the time the results were published it is very likely that recent events would have changed people’s opinions and the results would be obsolete.
Another reason to avoid a census is when it is destructive to the population. For example, many manufacturing companies test their products for quality control. A padlock manufacturer might use a machine to see how much force it can apply to the lock before it breaks. If they did this with every lock, they would have none to sell! It would not be a good idea for a biologist to find the number of fish in a lake by draining the lake and counting them all!
The U.S. Census is probably the largest and longest running census. The Constitution mandates a complete counting of the population. The first U.S. Census was taken in 1790 and was done by U.S. Marshals on horseback. Taken every ten years, a new Census is scheduled for 2010 and in a report by the Government Accountability Office in 1994, was estimated to cost . This cost has recently increased as computer problems have forced the forms to be completed by hand. You can find a great deal of information about the U.S. Census as well as data from past censuses on the Census Bureau’s website: http://www.census.gov/.
Due to all of the difficulties associated with a census, sampling is much more practical. However, it is important to understand that even the most carefully planned sample will be subject to random variation between the sample and population. As we learned in Chapter 1, these differences due to chance are called sampling error. We can use the laws of probability to predict the level of accuracy in our sample. Opinion polls, like the New York Times poll mentioned in the introduction, tend to refer to this as margin of error. In later chapters, you will learn the statistical theory behind these calculations. The second statement quoted from the New York Times article mentions the other problem with sampling. It is often difficult to obtain a sample that accurately reflects the total population. It is also possible to make mistakes in selecting the sample and collecting the information. These problems result in a non-representative sample, that is, one from which our conclusions differ from what they would have been if we had been able to conduct a census.
To help understand these ideas, let’s look at a more theoretical example. A coin is considered “fair” if the probability of the coin landing on heads is the same as the probability of landing on tails (0.5 each). The probability is defined as the proportion of each result obtained from flipping the coin infinitely. A census in this example would be an infinite number of coin flips, which again is quite impractical. So instead, we might try a sample of coin flips. Theoretically, you would expect the coin to land on heads half of the time. But it is very possible that, due to chance alone, we would experience results that differ from the actual probability. These differences are due to sampling error. As we will investigate in detail in later chapters, we can decrease the sampling error by increasing the sample size (or the number of coin flips in this case). It is also possible that the results we obtain could differ from those expected if we were not careful about the way we flipped the coin or allowed it to land on different surfaces. This would be an example of a non-representative sample.
At the following website you can see the results of a large number of coin flips - http://shazam.econ.ubc.ca/flip/. You can see the random variation among samples by asking for the site to flip coins five times. Our results for that experiment produced the following number of heads: and which seems quite strange, since the expected number is . How do your results compare?
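If the website is not available, the same idea can be explored with a short simulation. The following is only a sketch: the sample sizes (10, 100, and 10,000 flips), the seed, and the function name proportion_heads are our own choices for illustration, not values from the text.

```python
import random

def proportion_heads(num_flips, rng):
    """Flip a simulated fair coin num_flips times; return the share of heads."""
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

# Arbitrary seed (an assumption) so that the run can be repeated exactly.
rng = random.Random(2024)

for num_flips in (10, 100, 10_000):
    # Five independent samples of the same size show the sample-to-sample spread.
    results = [proportion_heads(num_flips, rng) for _ in range(5)]
    # Largest distance between any sample proportion and the true value 0.5.
    largest_miss = max(abs(p - 0.5) for p in results)
    print(num_flips, [round(p, 3) for p in results], round(largest_miss, 3))
```

Each row lists five sample proportions for one sample size. The proportions typically cluster more tightly around 0.5 as the number of flips grows, which is the decrease in sampling error described above.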
Bias in Samples and Surveys
The term most frequently applied to a non-representative sample is bias. Bias has many potential sources. It is important when selecting a sample or designing a survey that a statistician make every effort to eliminate potential sources of bias. In this section we will discuss some of the most common types of bias. While these concepts are universal, the terms used to define them here may be different than those used in other sources.
Sampling bias refers in general to the methods used in selecting the sample for a survey, observational study, or experiment. The sampling frame is the term we use to refer to the group or listing from which the sample is to be chosen. If we wanted to study the population of students in your school, you could obtain a list of all the students from the office and choose students from the list. This list would be the sampling frame. The following are some of the more common sources of potential sampling bias.
Incorrect Sampling Frame
If the list from which you choose your sample does not accurately reflect the characteristics of the population, this is called an incorrect sampling frame. A sampling frame error occurs when some group from the population does not have the opportunity to be represented in the sample. Surveys are often done over the telephone. You could use the telephone book as a sampling frame by choosing numbers from the phonebook. In addition to the many other potential problems with telephone polls, some phone numbers are not listed in the telephone book. Also, if your population includes all adults, it is possible that you are leaving out important groups of that population. For example, many younger adults tend to use only their cell phones or computer-based phone services and may not even have traditional phone service. The sampling frame does not need to be an actual list. Even if you picked phone numbers randomly, the sampling frame could be incorrect because there are also people, especially those who may be economically disadvantaged, who have no phone. There is absolutely no chance for these individuals to be represented in your sample. A term often used to describe the problem that arises when a group of the population is not represented in a survey is undercoverage. Undercoverage can result from any of the different types of sampling bias.
One of the most famous examples of sampling frame error occurred during the 1936 U.S. presidential election. The Literary Digest, a popular magazine at the time, conducted a poll and predicted that Alf Landon would win the election that, as it turned out, was won in a landslide by Franklin Delano Roosevelt. The magazine obtained a huge sample of ten million people, and from that pool replied. With these numbers, you would typically expect very accurate results. However, the magazine used their subscription list as their sampling frame. During the depression, these individuals would have been only the wealthiest Americans, who tended to vote Republican, and left the majority of typical voters undercovered.
Suppose your statistics teacher gave you an assignment to perform a survey of individuals. You would most likely tend to ask your friends and family to participate because it would be easy and quick. This is an example of convenience sampling or convenience bias. While it is not always true, your friends are usually people who share common values, interests, and opinions. This could cause those opinions to be over-represented in relation to the true population. Have you ever been approached by someone conducting a survey on the street or in a mall? If such a person were just to ask the first people they found, there is the potential that large groups representing various opinions would not be included, resulting in undercoverage.
Judgment sampling occurs when an individual or organization, usually considered an expert in the field being studied, chooses the individuals or group of individuals to be used in the sample. Because it is based on a subjective choice, even when made by someone considered an expert, it is very susceptible to bias. In some sense, this is what those responsible for the Literary Digest poll did. They incorrectly chose groups they believed would represent the population. If a person wants to do a survey on middle class Americans, how would they decide whom to include? It would be left to their own judgment to create the criteria for those considered middle-class. This individual’s judgment might result in a view of the middle class that includes wealthier individuals whom others would not consider part of the population. Related to judgment sampling, in quota sampling, an individual or organization attempts to include the proper proportions of individuals of different subgroups in their sample. While it might sound like a good idea, it is subject to an individual’s prejudice and is therefore prone to bias.
If one particular subgroup in a population is likely to be more or less represented due to its size, this is sometimes called size bias. If we chose a state at random from a map by closing our eyes and pointing to a particular place, larger states have a greater chance of being chosen than smaller ones. Suppose that we wanted to do a survey to find out the typical size of a student’s math class at this school. The chances are greater that you would choose someone from a larger class. To understand this, let’s use a very simplistic example. Say that you went to a very small school where there are only four math classes, one has students, and the other three have only students. If you simply choose a student at random, there are more students in the larger class, so it is more likely you will select students in your sample who will answer “”.
For example, people driving on an interstate highway tend to say things like, “Wow, I was going the speed limit and everyone was just flying by me.” The conclusion this person is making about the population of all drivers on this highway is that most of them are traveling faster than the speed limit. This may indeed most often be true! Let’s say though, that most people on the highway, along with our driver, really are abiding by the speed limit. In a sense, the driver is collecting a sample. It could in fact be true that most of the people on the road at that time are going the same exact speed as our driver. Those drivers never pass or get passed, so only the few who happen to be close to our driver will be included in the sample. Faster drivers, on the other hand, keep overtaking our driver, so they will be overrepresented in the sample. As you may already see, these definitions are not absolute, and in a practical example there are often many types of overlapping bias that could be present and contribute to over- or undercoverage. We could also cite incorrect sampling frame or convenience bias as potential problems in this example.
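The class-size example can be checked with a little arithmetic. The sketch below uses hypothetical class sizes (one class of 35 and three classes of 8) chosen only for illustration; they are assumptions, not the figures from the example.

```python
# Hypothetical class sizes (an assumption made for illustration only).
class_sizes = [35, 8, 8, 8]

# Averaging over classes: each class counts once.
mean_per_class = sum(class_sizes) / len(class_sizes)

# Averaging over students: each student reports the size of his or her own
# class, so a class of size s is reported s times.
total_students = sum(class_sizes)
mean_per_student = sum(s * s for s in class_sizes) / total_students

# Chance that one randomly chosen student comes from the largest class.
share_in_largest = max(class_sizes) / total_students

print(mean_per_class, round(mean_per_student, 2), round(share_in_largest, 2))
```

Because more than half of the students sit in the one large class, the average reported by randomly chosen students is well above the average taken over the four classes. That gap is the size bias.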
We will use the term response bias to refer in general terms to the types of problems that result from the ways in which the survey or poll is actually presented to the individuals in the sample.
Voluntary Response Bias
Television and radio stations often ask viewers/listeners to call in with opinions about a particular issue they are covering. The websites for these and other organizations also usually include some sort of online poll question of the day. Reality television shows and fan balloting in professional sports to choose “all star” players make use of these types of polls as well. All of these polls usually come with a disclaimer stating that, “This is not a scientific poll.” While perhaps entertaining, these types of polls are very susceptible to voluntary response bias. The people who respond to these types of surveys tend to feel very strongly one way or another about the issue in question and the results might not reflect the overall population. Those who still have an opinion, but may not feel quite so passionately about the issue, may not be motivated to respond to the poll. This is especially true for phone in or mail in surveys in which there is a cost to participate. The effort or cost required tends to weed out much of the population in favor of those who hold extremely polarized views. A news channel might show a report about a child killed in a drive by shooting and then ask for people to call in and answer a question about tougher criminal sentencing laws. They would most likely receive responses from people who were very moved by the emotional nature of the story and wanted anything to be done to improve the situation. An even bigger problem is present in those types of polls in which there is no control over how many times an individual may respond.
One of the biggest problems in polling is that most people just don’t want to be bothered taking the time to respond to a poll of any kind. They hang up on a telephone survey, put a mail-in survey in the recycling bin, or walk quickly past the interviewer on the street. We just don’t know how those individuals’ beliefs and opinions reflect those of the general population, and therefore almost all surveys could be prone to non-response bias.
Questionnaire bias occurs when the way in which the question is asked influences the response given by the individual. It is possible to ask the same question in two different ways that would lead individuals with the same basic opinions to respond differently. Consider the following two questions about gun control.
Do you believe that it is reasonable for the government to impose some limits on purchases of certain types of weapons in an effort to reduce gun violence in urban areas?
Do you believe that it is reasonable for the government to infringe on an individual’s constitutional right to bear arms?
A gun rights activist might feel very strongly that the government should never be in the position of limiting guns in any way and would answer no to both questions. Someone who is very strongly against gun ownership would similarly answer no to both questions. However, individuals with a more tempered, middle position on the issue might believe in an individual’s right to own a gun under some circumstances while still feeling that there is a need for regulation. These individuals would most likely answer these two questions differently.
You can see how easy it would be to manipulate the wording of a question to obtain a certain response to a poll question. Questionnaire bias is not necessarily always a deliberate action. If a question is poorly worded, confusing, or just plain hard to understand it could lead to non-representative results. When you ask people to choose between two options, it is even possible that the order in which you list the choices may influence their response!
Incorrect Response Bias
A major problem with surveys is that you can never be sure that the person is actually responding truthfully. When an individual intentionally responds to a survey with an untruthful answer, this is called incorrect response bias. This can occur when asking questions about extremely sensitive or personal issues. For example, a survey conducted about illegal drinking among teens might be prone to this type of bias. Even if guaranteed their responses are confidential, some teenagers may not want to admit to engaging in such behavior at all. Others may want to appear more rebellious than they really are, but in either case we cannot be sure of the truthfulness of the responses. Since the dangers of donated blood being tainted with diseases carrying a negative social stereotype developed in the 1990s, the Red Cross has dealt with this type of bias on a constant and especially urgent basis. Individuals who have engaged in behavior that puts them at risk for contracting AIDS or other diseases have the potential to pass them on through donated blood. Screening for these behaviors involves asking many personal questions that some find awkward or insulting and may result in knowingly false answers. The Red Cross has gone to great lengths to devise a system with several opportunities for individuals giving blood to anonymously report the potential danger of their donation.
In using this example, we don’t want to give the impression that the blood supply is unsafe. According to the Red Cross, “Like most medical procedures, blood transfusions have associated risk. In the more than fifteen years since March 1985, when the FDA first licensed a test to detect HIV antibodies in donated blood, the Centers for Disease Control and Prevention has reported only cases of AIDS caused by transfusion of blood that tested negative for the AIDS virus. During this time, more than blood components were transfused in the United States… The tests to detect HIV were designed specifically to screen blood donors. These tests have been regularly upgraded since they were introduced. Although the tests to detect HIV and other blood-borne diseases are extremely accurate, they cannot detect the presence of the virus in the "window period" of infection, the time before detectable antibodies or antigens are produced. That is why there is still a very slim chance of contracting HIV from blood that tests negative. Research continues to further reduce the very small risk.” Source: http://chapters.redcross.org/br/nypennregion/safety/mythsaid.htm
Reducing Bias: Randomization and other Techniques
The best technique for reducing bias in sampling is randomization. A simple random sample (commonly referred to as an SRS) is a technique in which all units in the population have an equal probability of being selected for the sample. For example, if your statistics teacher wants to choose a student at random for a special prize, they could simply place the names of all the students in the class in a hat, mix them up, and choose one. More scientifically, we could assign each student in the class a number from to say (assuming there are students in the class) and then use a computer or calculator to generate a random number to choose one student.
A note about “randomness”
Your graphing calculator has a random number generator. Press [MATH] and move over to [PRB], which stands for probability. (Note: instead of pressing the right arrow three times, you can just use the left once!) Choose rand for the random number generator and press [ENTER] twice to produce a random number between 0 and 1. Press [ENTER] a few more times to see more results.
It is important to understand that a calculator or computer does not produce true randomness. When you choose the rand function, the calculator returns a ten-digit decimal that, using a very complicated mathematical formula, simulates randomness. Each digit, in theory, is equally likely to occur in any of the individual decimal places. What this means in practice is that if you had the patience (and the time!) to generate a million of these on your calculator and keep track of the frequencies in a table, you would find an approximately equal number of each digit. One consequence is that two brand new calculators will give the exact same sequence of random numbers! This is because the function that simulates randomness has to start at some number, called a seed value. All the calculators are programmed from the factory (or when the memory is reset) to use a seed value of zero. If you want to be sure that your sequence of “random” digits is different from someone else’s, you need to seed your random number function using a number different from theirs. Type a unique sequence of digits on the homescreen, press [STO], enter the rand function, and press [ENTER]. As long as the number you chose to seed the function is different, you will get different results.
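The effect of a seed value is easy to see on a computer as well. The sketch below uses Python's standard random module, which relies on a different formula from the calculator's, so the numbers themselves will not match a calculator; the seed 8675309 and the helper name first_values are arbitrary choices for illustration.

```python
import random

def first_values(seed, count=3):
    """Return the first few pseudo-random numbers produced from a given seed."""
    rng = random.Random(seed)  # a generator started from this seed
    return [rng.random() for _ in range(count)]

same_a = first_values(0)          # like a brand new calculator
same_b = first_values(0)          # like a second brand new calculator
different = first_values(8675309) # a generator seeded with another number

print(same_a == same_b)     # the two identically seeded sequences are compared
print(same_a == different)  # a differently seeded sequence is compared
```

Two generators started from the same seed produce identical sequences, just like the two new calculators, while a different seed leads to a different sequence.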
Now, back to our example, if we want to choose a student, at random, between and , we need to generate a random integer between and . To do this, press [MATH], [PRB], and choose the random integer function.
The syntax for this command is as follows:
RandInt( starting value, ending value, number of random integers)
The default for the last field is 1, so if you only need a single random integer, you can enter:
In this example, the student chosen would be student #7. If we wanted to choose students at random, we could enter:
However, because the probabilities of any digit being chosen each time are independent, it is possible to choose the same student twice.
What we can do in this case is ignore any repeated digits. Student has already been chosen, so we will ignore the second . Press [ENTER] again to generate new random numbers and choose the first one that is not in your original set.
In this example, student was also already chosen, so we would select #14 as our fifth student.
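The same procedure of drawing random integers and ignoring repeats can be sketched in Python. This is only an illustration under stated assumptions: a class of 25 students, a sample of 5, and the seed 7 are our own choices, and random.randint stands in loosely for the calculator's random integer command.

```python
import random

def draw_ignoring_repeats(low, high, sample_size, rng):
    """Draw random integers from low to high, skipping any already chosen."""
    chosen = []
    while len(chosen) < sample_size:
        candidate = rng.randint(low, high)  # inclusive on both ends
        if candidate not in chosen:         # ignore a repeated student number
            chosen.append(candidate)
    return chosen

rng = random.Random(7)  # arbitrary seed (an assumption) for repeatability

# By hand: keep drawing until five different students have been chosen.
by_hand = draw_ignoring_repeats(1, 25, 5, rng)

# Built in: random.sample draws without replacement, so repeats cannot occur.
built_in = rng.sample(range(1, 26), 5)

print(by_hand, built_in)
```

Both approaches give every group of five students the same chance of being chosen; the second simply hides the bookkeeping of skipping repeats.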
There are other types of samples that are not simple random samples. In systematic sampling, after choosing a starting point at random, subjects are selected using a jump number chosen at the beginning. If you have ever chosen teams or groups in gym class by “counting off” by threes or fours, you were engaged in systematic sampling. The jump number is determined by dividing the population size by the desired sample size, to ensure that the sample combs through the entire population. If we had a list of everyone in your class of 25 students in alphabetical order, and we wanted to choose five of them, we would choose every fifth student. Generate a random number from 1 to 25.
In this case we would start with student #14 and then generate every fifth student until we had five in all, and when we came to the end of the list, we would continue the count at number . Our chosen students would be: . It is important to note that this is not a simple random sample as not every possible sample of students has an equal chance to be chosen. For example, it is impossible to have a sample consisting of students and .
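The counting procedure just described can be written out as a short function. The sketch assumes a class of 25 students (the jump number of five times the sample size of five) and the starting point 14 from the example; the function name systematic_sample is our own.

```python
import math

def systematic_sample(population_size, sample_size, start):
    """Select every jump-th unit from start, wrapping around the end of the list."""
    jump = population_size // sample_size  # the jump number
    return [((start - 1 + i * jump) % population_size) + 1
            for i in range(sample_size)]

print(systematic_sample(25, 5, 14))  # prints [14, 19, 24, 4, 9]

# Count how many different samples this method can ever produce,
# compared with the number of possible groups of 5 out of 25.
distinct = {frozenset(systematic_sample(25, 5, s)) for s in range(1, 26)}
print(len(distinct), math.comb(25, 5))  # prints 5 53130
```

Only 5 different samples can ever come out of this procedure, while a simple random sample could be any of the 53,130 possible groups of five. That is why systematic sampling is not a simple random sample.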
Cluster sampling is when a naturally occurring group is selected at random, and then either all of that group, or randomly selected individuals from that group, are used for the sample. If we select at random from within that group, or divide the cluster into smaller subgroups, this is referred to as multi-stage sampling. To survey student opinions or study their performance, we could choose schools at random from your state and then use an SRS (simple random sample) from each school. If we wanted a national survey of urban schools, we might first choose major urban areas from around the country at random, and then select schools at random from each of those cities. This would be both cluster and multi-stage sampling. Cluster sampling is often done by selecting a particular block or street at random from within a town or city. It is also used at large public gatherings or rallies. If officials take a picture of a small, representative area of the crowd and count the individuals in just that area, they can use that count to estimate the total crowd in attendance.
In stratified sampling, the population is divided into groups, called strata (the singular term is stratum), that have some meaningful relationship. Very often, the members of one group are similar to each other but respond to a survey differently from the members of other groups. In order to help reflect the population, we stratify to ensure that each group’s opinion is represented in the sample. For example, we often stratify by gender or race in order to make sure that the often divergent views of these different groups are represented. In a survey of high school students we might choose to stratify by school to be sure that the opinions of different communities are included. If each school has approximately equal numbers, then we could simply choose to take an SRS of the same size from each school. If the numbers in each stratum are different, then it would be more appropriate to choose a fixed overall sample size and take a number from each school proportionate to the total school size.
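Taking a number from each school in proportion to its size is a small calculation. The sketch below is one possible way to do it; the school names, the enrollments, and the overall sample size of 100 are all hypothetical, and the rule for handing out leftover places (largest fractional part first) is our own choice.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to each stratum's size."""
    population = sum(stratum_sizes.values())
    # Exact (possibly fractional) share for each stratum.
    exact = {name: total_sample * size / population
             for name, size in stratum_sizes.items()}
    # Start from the whole-number part of each share.
    allocation = {name: int(share) for name, share in exact.items()}
    # Hand any places lost to rounding to the largest fractional parts.
    leftover = total_sample - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation

# Hypothetical enrollments (assumptions made for illustration only).
enrollments = {"North High": 1200, "Central High": 800, "South High": 500}
print(proportional_allocation(enrollments, 100))
# prints {'North High': 48, 'Central High': 32, 'South High': 20}
```

Once the number for each school is known, an SRS of that size would be taken within each school, so that every stratum is represented in proportion to its share of the population.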
If you collect information from every unit in a population, it is called a census. Because censuses are so difficult to do, we instead take a representative subset of the population, called a sample, to try to draw conclusions about the entire population. The downside to sampling is that we can never be completely sure that we have captured the truth about the entire population, due to random variation in our sample that is called sampling error. The list of the population from which the sample is chosen is called the sampling frame. Poor technique in choosing or surveying a sample can also lead to incorrect conclusions about the population that are generally referred to as bias. Sampling bias (sometimes called selection bias) refers to choosing a sample that results in a subgroup that is not representative of the population. Incorrect sampling frame occurs when the group from which you choose your sample does not include everyone in the population or at least units that reflect the full diversity of the population. Incorrect sampling frame errors result in undercoverage. This is where a segment of the population containing an important characteristic did not have an opportunity to be chosen for the sample and will be marginalized, or even left out altogether.
Points to Consider
- How is the margin of error for a survey calculated?
- What are the effects of sample size on sampling error?
- Is the plural of census censuses, or censi?
- Brandy wanted to know which brand of soccer shoe high school soccer players prefer. She decided to ask the girls on her team which brand they liked.
- What is the population in this example?
- What are the units?
- If she asked ALL high school soccer players this question, what is the statistical term we would use to describe the situation?
- Which group(s) from the population is/are going to be underrepresented?
- What type of bias best describes the error in her sample? Why?
- Brandy got a list of all the soccer players in the colonial conference from her athletic director, Mr. Sprain. This list is called the:
- If she grouped the list by boys and girls, and chose boys at random, and girls at random, what type of sampling best describes her method?
- Your doorbell rings and you open the door to find a foot tall boa constrictor wearing a trench coat and holding a pen and a clip board. He says to you, “I am conducting a survey for a local clothing store. Do you own any boots, purses, or other items made from snake skin?” After recovering from the initial shock of a talking snake being at the door, you quickly and nervously answer, “Of course not,” as the wallet you bought on vacation last summer at Reptile World weighs heavily in your pocket. What type of bias best describes this ridiculous situation? Explain why.
In each of the next two examples, identify the type of sampling that is most evident and explain why you think it applies.
- In order to estimate the population of moose in a wilderness area, a biologist familiar with that area selects a particular marsh area and spends the month of September, during mating season, cataloging sightings of moose. What two types of sampling are evident in this example?
- The local sporting goods store has a promotion where every customer gets a gift card.
For questions 5-9, an amusement park wants to know if its new ride, The Pukeinator, is too scary. Explain the type(s) of bias most evident in each sampling technique and/or what sampling method is most evident. Be sure to justify your choice.
- The first riders on a particular day are asked their opinions of the ride.
- The name of a color is selected at random and only riders wearing that particular color are asked their opinion of the ride.
- A flier is passed out inviting interested riders to complete a survey about the ride at that evening.
- Every teenager exiting the ride is asked in front of his friends: “You didn’t think that ride was scary, did you?”
- Five riders are selected at random during each hour of the day, from until closing at .
- There are students taking statistics in your school and you want to choose of them for a survey about their impressions of the course. Use your calculator to select a SRS of students. (Seed your random number generator with the number before starting). Assuming the students are assigned numbers from to , which students are chosen for the sample?
- (a) All high school soccer players. (b) Each individual high school soccer player. (c) A census. (d) Boys; students from other areas of the country or of different socio-economic or cultural backgrounds; and, if she is on a varsity team, perhaps JV or freshman soccer players, who might have different preferences. (e) There are multiple answers, which is why the explanation is very important. The two most obvious sources are: Convenience bias: she asked the group that was most easily accessible to her, her own teammates. Incorrect sampling frame: boys and some of the other undercovered groups mentioned in (d) have no chance of being included in the sample. (f) The sampling frame. (g) Stratification.
- This is incorrect response bias. You are intentionally answering the question incorrectly so as to not antagonize the giant talking snake!
- The biologist is using her knowledge of moose behavior to choose an area and a time in which to estimate the population; this is judgment sampling. She has also selected one particular marsh to estimate the entire region, which could be considered a form of cluster sampling.
- Systematic sampling. The customer is selected based on a fixed interval.
- Convenience bias. The first riders are an easy group to access. Incorrect Sampling Frame. The first riders of the day are likely to be those who are most excited by high-thrill rides and may not have the same opinions as those who are less enthusiastic about riding.
- Cluster sampling. A group is chosen because of a natural relationship that does not necessarily have any similarity of response, i.e. we have no reason to believe that people wearing a certain color would respond similarly, or differently, from anyone else in the population.
- Voluntary response bias. Participants will self-select. Non-response bias. A large percentage of potential participants are not going to want to be bothered participating in a survey at the end of a long day at an amusement park.
- There are several potential answers. Incorrect Response Bias. The chosen participants might not want to admit to being scared in front of the young lady. Questionnaire bias. The question is definitely worded in a manner that would encourage participants to answer in a particular way. This is also systematic sampling and someone used their judgment that only boys should be surveyed. A case could also be made for incorrect sampling frame as no girls or other age groups have a chance of being represented. All of these examples also eliminate the opinions of those in the park who do not choose to ride.
- Stratification. It could be that people who ride at different times during the day have different opinions about thrill rides or are from different age groups. In this case, each hour is a stratum. For example, it could be that those riding early in the morning are more of the thrill seeker types, and the more hesitant folks might take some time to muster the courage to ride.
- To make it easier to keep track of repeated choices, we have generated numbers and stored them in . The chosen students are: In this example there were no repeated digits.