In this concept, you will learn how to decide whether to gather a sample for study or to use the entire population in question.
Deciding whether to conduct a study on a sample or on the entire population can be tricky. Suppose you were putting together a menu for a camping trip with a large group of friends and wanted to make sure nobody was allergic to peanuts before planning peanut-butter sandwiches for lunch. Would you need to question all 50+ friends individually? Would it make sense to poll a representative sample instead? What if you wanted to pick a few popular types of soda to bring along? Would that be a different situation?
At the end of the lesson, we’ll return to this question to apply what we have discussed.
http://youtu.be/vlbjxguaMrk EducatorVids2: Statistics – Populations vs Samples
Before you begin any study, you will need to decide whether you need data from the entire population in question or just from a representative sample of that population. For most studies, it makes much more sense to use a sample than to try to collect data on an entire population, but sometimes a sample is not enough and a census, a study of every member of the population, is required. The most famous census is the U.S. Population Census, conducted once every 10 years.
The Constitution requires that the population of the United States be enumerated by an actual count once every ten years; in the intervening years, population figures are estimated rather than counted. The incredible cost (the 2010 Census cost approximately 13 billion dollars!) and organizational effort required for the census make it clear why so few census studies are conducted on the entire U.S. population. However, smaller census studies are more common than you might think.
Some studies make no sense at all to conduct on an entire population. One entire class of study comes to mind: the destructive study, in which the items tested are ruined for their intended use by the study itself. Vehicle manufacturers test the durability of different models by crashing sample vehicles into simulated walls or other cars. Obviously, if such studies were conducted on the entire population, there would be no cars left to sell!
As a student you are most likely familiar with the most common census there is: the attendance count that takes place each morning. Each class is polled to identify any students who are not present, and the data is compiled in the administrative office.
What likely uses are there for this data, and why might it be collected as a census rather than using a representative sample?
Solution: The most apparent use of this data is to notify parents of students who did not make it to class, for safety and rule enforcement. Since the goal is to locate each and every possibly missing or late student, a statistical sample just isn’t acceptable.
When insurance companies set auto insurance rates, they adjust them according to statistically relevant demographic differences among drivers. The process of determining which groups of drivers are the most likely to be involved in expensive accidents is a statistical analysis using police reports and accident claims as data sources.
It is widely accepted that teenage boys are the most expensive demographic to insure. Would you expect this information to be based on the population of teenage boys, on the population of teenage drivers, or on a sample of the appropriate demographic(s)? Why?
Solution: The information is based on a sample of the teenage male driver demographic, compared to a sample of teen and adult drivers in general. It would be virtually impossible to conduct a true census of all accidents involving teen male drivers, as there are just too many and there is no real way to ensure that all accidents are correctly documented.
Suppose your biology teacher wanted to encourage the students in her class to work together on a large project, so she promised the class a pizza party if every single student completed the assigned homework by the deadline. With the deadline fast approaching, you decide to make sure that everyone is on track to get the assignment done on time.
Is this a situation where it would be appropriate to conduct a sample poll of the students, or should you do a full census of all 32 students in the class?
Solution: If you want to be sure that everyone is really on track, you’d better complete a full census. A well-chosen sample would give you an idea of how far along the class is in general, but it would not be effective at identifying all of the outliers, which are really the most important data points in this particular study.
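To see why a sample falls short here, the following Python sketch simulates the situation. The progress figures are invented for illustration: it assumes 30 of the 32 students are nearly done and 2 have barely started, and it assumes you poll 8 students chosen at random. It is only a rough model, not a description of any real class.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical class of 32: homework progress as a percentage.
# Assumption for illustration: 30 students are nearly done, 2 have barely started.
progress = [random.randint(80, 100) for _ in range(30)] + [10, 5]
class_average = mean(progress)
stragglers = {30, 31}  # list positions of the two students who are behind

trials = 10_000
sample_size = 8
found_all = 0
total_gap = 0.0

for _ in range(trials):
    # Simple random sample of student positions, without replacement.
    chosen = random.sample(range(len(progress)), sample_size)
    sample_average = mean(progress[i] for i in chosen)
    total_gap += abs(sample_average - class_average)
    if stragglers.issubset(chosen):
        found_all += 1

print(f"class average: {class_average:.1f}%")
print(f"average gap between sample and class average: {total_gap / trials:.1f} points")
print(f"samples that caught both stragglers: {found_all / trials:.1%}")
```

Under these assumptions, only a small fraction of the samples include both students who are behind, which is exactly the information the pizza party depends on. Asking all 32 students finds them every time.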
Concept Problem Revisited
Suppose you were putting together a menu for a camping trip with a large group of friends and wanted to make sure nobody was allergic to peanuts before planning peanut-butter sandwiches for lunch. Would you need to question all 50+ friends individually? Would it make sense to poll a representative sample instead? What if you wanted to pick a few popular types of soda to bring along? Would that be a different situation?
As inconvenient as it might be, you would certainly be well advised to actually ask each and every one of the friends planning to attend the trip about possible peanut allergies. Since even a single person having a severe allergic reaction would probably ruin the trip for everyone, the time saving of a sample poll instead of the complete census would just not be worth the risk.
The soda choice would indeed be a very different situation. Since it is very unlikely that anyone is going to be more than a little inconvenienced by a particular set of drink choices, a quickly generated list of suggestions from a half-dozen people or so would probably be just fine.
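The risk in the peanut question can be put into numbers. The Python sketch below assumes, purely for illustration, that exactly 1 of 50 friends has a peanut allergy and that you poll a simple random sample of the group. The function name `chance_of_missing` is made up for this example.

```python
from math import comb

def chance_of_missing(group_size, affected, sample_size):
    """Probability that a simple random sample contains none of the affected people."""
    return comb(group_size - affected, sample_size) / comb(group_size, sample_size)

# Assumption for illustration: exactly 1 of 50 friends has a peanut allergy.
for n in (6, 10, 25, 40, 50):
    p = chance_of_missing(50, 1, n)
    print(f"ask {n:2d} of 50 -> chance the allergy goes unnoticed: {p:.0%}")
```

Under this assumption, polling 10 friends leaves an 80% chance of never hearing about the allergy, and even polling 40 leaves a 20% chance. Only asking all 50 brings the risk to zero, which is why the census is worth the inconvenience here and not for the soda.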
A population is the complete set (every single member) of a group of possible items to be studied.
To poll the members of a group means to question them regarding a specific topic.
A sample or subset is a smaller group of members chosen to represent a larger group. Properly chosen, a sample should provide approximately the same results (on a smaller scale) as the population from which it was drawn.
A control group is a set of members deliberately kept apart from the treatment being studied, so that it provides a baseline showing how the members appear if unchanged.
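Bias refers to a systematic tendency for a study to favor a particular result, whether it comes from a desire to achieve that result regardless of the data or from the way the sample is chosen or the questions are worded.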
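The difference between a properly chosen sample and one chosen for convenience can be sketched in Python. The population below is invented for illustration: it assumes 1,000 students listed youngest grade first, with each grade of 250 averaging 5 cm taller than the one before. The numbers are assumptions, not real data.

```python
import random
from statistics import mean

random.seed(7)  # fixed seed so the run is repeatable

# Hypothetical population: heights (cm) of 1,000 students, youngest grade first.
# Assumption for illustration: each grade averages 5 cm taller than the previous one.
population = [
    random.gauss(150 + 5 * grade, 6)
    for grade in range(4)
    for _ in range(250)
]

random_sample = random.sample(population, 50)  # every student equally likely to be chosen
convenience_sample = population[:50]           # first 50 on the list: all from the youngest grade

print(f"population mean:         {mean(population):.1f} cm")
print(f"random sample mean:      {mean(random_sample):.1f} cm")
print(f"convenience sample mean: {mean(convenience_sample):.1f} cm")
```

Both samples contain 50 students, but only the random one can be expected to land close to the population mean. The convenience sample is drawn entirely from the youngest grade, so it underestimates the average height no matter how carefully the measurements are taken.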
- A study is to be conducted on the psychological effects of personally witnessing a jewelry store theft at a local mall. Police records suggest that there were a total of 23 witnesses. Does this situation suggest that the entire population should be included in the study? Why or why not?
- A new medicine has been developed that the developer claims will stimulate hair growth in balding men. Would you expect there to be safety tests conducted on the population of men before release?
- The Ford Explorer is a popular sport-utility vehicle sold in the U.S. originally equipped with Firestone tires. In May of 2000, Ford and Firestone were both accused of responsibility in hundreds of vehicle accidents caused by tire failure. Given that all vehicles sold in the U.S. undergo extensive safety testing, how could so many bad products have slipped through?
- You and your team are conducting a study on the differences in the ability of students in your school to focus at different times throughout the day. Each day your team randomly selects students as they walk in the door, and you study 112 students on Monday, 78 on Tuesday, and 109 on Wednesday. If there are 299 students in the school, is this a sample or a population?
- Why would it be virtually unarguable to state that a product claiming to be “Everyone’s Favorite Soda” has not been properly evaluated from a statistical standpoint?
Solutions:
- The relatively small population size in this example certainly suggests that a full census be taken. A shopping mall is likely to contain a rather broad range of demographics, and the 23 witnesses are therefore likely to have many differences in age, sex, background, profession, etc. Any representative sample taken would probably not be able to accurately represent the full range of possible factors affecting the results of the study.
- Read the question carefully! In statistics, “population” has a very specific meaning. It would be impossible to conduct safety tests on every man in the world, therefore any safety tests would have to be conducted on a representative sample, not on the population of male humans.
- There are many ways that the problem could have gone unnoticed. This is a situation where a census study of every Explorer produced is just not feasible; much of the testing simply has to be conducted on a representative sample. Perhaps the sample vehicles used for safety testing just happened to be ones with good tires, or perhaps the safety tests weren’t extensive enough, or the results were incorrectly evaluated.
- Even though your team collected a number of observations equal to the population of the school, it would still be a sample rather than a true census, since your random selection method almost certainly resulted in the observation of some students multiple times and missed others entirely.
- A population study on every single person in the world is impossible.
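The answer about the 299 school observations can be checked with a short simulation. This Python sketch assumes that each day's group is a simple random sample of the whole school, drawn independently of the other days. That is an assumption made for illustration, since the exact selection method is not specified.

```python
import random

random.seed(3)  # fixed seed so the run is repeatable

students = range(299)          # every student in the school, numbered 0-298
daily_counts = [112, 78, 109]  # students observed on Monday, Tuesday, Wednesday

seen = set()
for count in daily_counts:
    # Assumption: each day's group is a simple random sample of the whole school.
    seen.update(random.sample(students, count))

print(f"observations recorded:       {sum(daily_counts)}")
print(f"different students observed: {len(seen)}")
print(f"students never observed:     {len(students) - len(seen)}")
```

The total number of observations matches the school's enrollment, yet the set of different students observed is smaller because some students are picked on more than one day. Matching the population's size is not the same as covering the population.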
The local public library wants to know if it should increase its hours of operation.
1. How would you want to go about conducting your research? Would you collect a sample or take a census?
2. How would you collect your sample? What time of day would be best to collect the information? Why?
Some college students writing a research paper on whether people their age prefer vocal or instrumental music decide to gather their data by sampling 100 people at a concert.
3. What is their population?
4. What is their sample?
5. What is wrong with their sample, based on the identified population?
Identify the Population and the Sample
6. In a survey of 1500 American households, it was found that 20% of the households own a computer.
7. In a recent survey of 2578 high school students, it was found that 28% of them come from single-parent homes.
8. The average height of every person entering the movie theatre within a 3 hour period was .
Identify each scenario as either sampling or census, and identify it as either random or not random.
9. Only 12 tickets are available for over 30 candidates. All their names are thrown into a hat and 12 are pulled out.
10. A student wants to know how many students in school have ever worn a cast. Every student who comes to school that day is handed a short survey that they must turn in before they head to lunch.
11. You ask 30 people in a clothing store which clothing store is their favorite.
Identify the choice that best completes the statement or answers the question.
12. A local business owner wants to find out which benefits plan its employees would prefer. Which of the procedures listed below would be the best way to obtain a statistically unbiased sample?
a. Survey a random sample of employees from a list of all employees
b. Invite all employees to indicate their choices by email
c. Place suggestion boxes at random locations in the company’s plant and offices
d. Assemble a group with one member from each department and ask them their preference.
13. A simple random sample of 300 people is selected from the 1650 male students in a university business course to take part in a business analysis test. The population being considered is:
c. People taking part in the test
d. Male students enrolled in a university business course.
14. Which is the best example of an unbiased question?
a. Does the school board have the right to enforce a dress code?
b. Do you think the principal is doing a good job in spite of his questionable character?
c. Do you prefer a daytime or evening class schedule?
d. Do you think the government should be allowed to seize whatever property they want to build a new highway?
15. Which question is biased?
a. Do you prefer daytime or evening television programming?
b. Should there be a school dress code?
c. Do you prefer news or mindless sitcoms?
d. Do you think a new highway should be built?