Sampling distributions are frequently tough for students. Most of what has been studied up to this point is fairly concrete, but now there is a level of abstraction that can be hard to follow. The problem is that we are now talking about sampling a random variable that is itself a function of other random variables. The best way to handle this is to present as many different kinds of explanations as possible, using different language. One nice feature is that it is very possible to carry out an experiment in the classroom just like the example that is outlined in the text. Another good practice is to make absolutely sure you have been, and continue to be, consistent with the notation used, with μ and x̄ both representing means, but for the population and the sample respectively.
This is the beginning of some of the interesting facts about the normal distribution. There will be theorems later on that state why, but students should have their attention drawn to all of the times that the normal distribution appears.
The formulae for the mean and standard deviation of a sampling distribution (the standard error) are ones that should be added to the “memorize this” list.
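The classroom experiment and the “memorize this” formulas can both be checked with a short simulation. This is only a sketch; the die-roll population and the sample sizes are my own choices, not the text’s example.

```python
import random
import statistics

def sample_means(population, n, trials, seed=0):
    """Draw `trials` samples of size n (with replacement) and return each sample's mean."""
    rng = random.Random(seed)
    return [statistics.mean(rng.choices(population, k=n)) for _ in range(trials)]

# Population: faces of a fair die (mean 3.5, sd about 1.708).
population = [1, 2, 3, 4, 5, 6]
means = sample_means(population, n=25, trials=2000)

# The sampling distribution of x-bar centers on the population mean mu...
print(round(statistics.mean(means), 2))
# ...and its spread is close to sigma / sqrt(n) = 1.708 / 5 ~ 0.34,
# much smaller than the population sd.
print(round(statistics.stdev(means), 2))
```

Having each student compute one sample mean by hand and then pooling the class’s results reproduces exactly what this loop does.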
The z-score and the Central Limit Theorem
Using the term z-score is unfortunately necessary. It’s always preferable not to introduce new terms when they are not needed, and in this case the z-score is really just the number of standard deviations away from the mean, with a negative or positive sign indicating direction below or above respectively. It is a term in common use, however, and it will be referred to by that name on the AP examination as well as in later courses students may take, the only possible exception being a calculus-based statistics class for math majors. It’s always a good idea to cycle through some problems on reading z-score tables and standardizing values, even outside of the chapters that include the topic.
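The standardization itself is one line, which is a good way to demystify the term. A minimal sketch, with a hypothetical test-score setting (mean 500, sd 100) chosen purely for illustration:

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations from the mean; the sign gives direction."""
    return (x - mu) / sigma

print(z_score(650, 500, 100))   # 1.5 -> 1.5 sd above the mean
print(z_score(420, 500, 100))   # -0.8 -> 0.8 sd below the mean

# In place of a printed z-table, the standard normal cdf gives the same lookup.
print(round(NormalDist().cdf(1.5), 4))   # 0.9332
```

Comparing the `cdf` value against the row for 1.5 in a printed table is a quick way to convince students the table is nothing magical.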
The central limit theorem might be the most important single idea of a first-year class. It is critical to know the formulas for sample proportions and sample means, but knowledge of the mechanics behind the central limit theorem is not needed. The text does not present a formal proof, nor do most texts for a first-year statistics class. Because of their work on previous problems, students should be familiar with the idea that many natural occurrences are normal, so it is plausible to them that an aggregate of a large number of observations should also be normal.
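The striking part of the theorem — that the population need not be normal at all — can be demonstrated with a simulation. A sketch, using a strongly skewed exponential population of my own choosing:

```python
import random
import statistics

rng = random.Random(1)

# A strongly skewed population: exponential with mean 1 (and sd 1).
means = [statistics.mean(rng.expovariate(1.0) for _ in range(40))
         for _ in range(3000)]

# CLT: the sampling distribution is roughly normal, centered on mu = 1,
# with sd close to sigma / sqrt(n) = 1 / sqrt(40) ~ 0.158.
print(round(statistics.mean(means), 2))
print(round(statistics.stdev(means), 3))

# A crude normality check: about 68% of sample means fall within 1 sd of center.
center, spread = statistics.mean(means), statistics.stdev(means)
within = sum(abs(m - center) < spread for m in means) / len(means)
print(round(within, 2))
```

A histogram of `means` next to a histogram of the raw exponential draws makes the point even more vividly.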
Binomial Distribution and Binomial Experiments
This is the beginning of the set of lessons where a common distribution is discussed, its mean and standard deviation are derived, and the results are then practiced. It is common practice not to re-derive the mean each time, but rather to work from formulae once the distribution is identified. I give my students a sheet of the common distributions with their means and variances. On a timed test it’s a major advantage to be able to recall how to find each parameter and move on with the problem.
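For the binomial entry on that sheet, the formulas are mean = np and sd = √(np(1 − p)). A minimal sketch of the lookup, with a coin-flip example of my own choosing:

```python
import math

def binomial_params(n, p):
    """Mean and standard deviation of a Binomial(n, p) count of successes."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

# e.g., 100 fair coin flips: mean 50 successes, sd 5.
print(binomial_params(100, 0.5))   # (50.0, 5.0)
```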
Depending on what your technology resources, you may run a computer program that shows an animation of the binomial distribution sampling as n increases. There are instructions available for programs like geometers sketchpad and fathom, as well as some java applets, like [INSERT LINK]. This can give a nice conceptual sense for how, even if the probability of success is off to the side, the sampling approaches the normal distribution.
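If none of those tools is available, the same point can be made numerically. A sketch (the choices n = 200 and p = 0.1, with p deliberately “off to the side,” are mine):

```python
import random
from statistics import NormalDist

rng = random.Random(2)
n, p = 200, 0.1   # success probability off to the side

# Simulate many binomial counts directly from Bernoulli trials.
counts = [sum(rng.random() < p for _ in range(n)) for _ in range(4000)]

# Compare P(X <= 20) from simulation against the normal approximation
# N(np, sqrt(np(1 - p))), with a continuity correction.
sim = sum(c <= 20 for c in counts) / len(counts)
approx = NormalDist(n * p, (n * p * (1 - p)) ** 0.5).cdf(20.5)
print(round(sim, 2), round(approx, 2))
```

Rerunning with smaller n shows the approximation degrading, which is the conceptual point of the animation.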
Statistical inference is a large part of the examination. Confidence intervals are probably the key topic for the section on inference. They are also sometimes counterintuitive to students, so careful use of language is required.
I force students to use very precise language here. The correct language is “95% confidence level,” not “95% chance,” “95% probability,” or anything else similar. It can be considered picky, but there is no probability here, and that’s important. Especially as students are considering different distributions and samples from different distributions, clarity helps.
It’s worth having a discussion with students about what they consider to be an appropriate level of confidence for their studies. Obviously 99% is very strong, but students should be aware of how much more “expensive” that level of confidence is. A very low confidence level is easy to achieve, but doesn’t make you feel very, well, confident. The common levels of confidence tend to be 90% and 95%. Students should work some problems and get a sense for where the “sweet spot” is that keeps the sample size manageable but still gives a high enough confidence level.
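The “expense” of higher confidence shows up directly in the margin of error. A sketch, using a z interval for a mean with hypothetical values σ = 10 and n = 25:

```python
import math
from statistics import NormalDist

def margin_of_error(conf_level, sigma, n):
    """Half-width of a z confidence interval for a mean."""
    z_star = NormalDist().inv_cdf(0.5 + conf_level / 2)   # critical value z*
    return z_star * sigma / math.sqrt(n)

# How much more "expensive" higher confidence is, for sigma = 10, n = 25:
for level in (0.90, 0.95, 0.99):
    print(level, round(margin_of_error(level, 10, 25), 2))
```

Students can invert the same formula to see how much larger n must be to hold the margin fixed at 99% — that is the cost being discussed.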
Sums and Differences of Independent Random Variables
It can get confusing for students when trying to figure out when to add probabilities and when to multiply. There are a couple of tricks to help students out. First, I always tell them that choosing one and then another, without replacement, is logically identical to choosing two at the same time. In the case of the book’s example of miners, the question can be reformatted to choosing one miner, then another. This clearly implies multiplying the two individual probabilities to get the probability for the two together. However, if we then ask about the probability of at least one of the two having the illness, then adding the different probabilities is needed. The difference is that the multiplication happens on the “front end”: multiple people, multiple coins, multiple dice, and so on. The addition happens on the “back end,” where a condition is set such that multiple outcomes can meet the requirement.
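The front-end/back-end distinction can be made concrete with numbers. A sketch — the illness rate of 5% is hypothetical, not the book’s actual figure, and independence between the two miners is assumed:

```python
# Hypothetical illness rate among miners (the book's actual figure may differ).
p = 0.05

# "Front end" (multiple people chosen): both miners have the illness -> multiply.
p_both = p * p

# "Back end" (multiple outcomes satisfy "at least one"): add the disjoint cases,
# or equivalently take the complement of "neither".
p_at_least_one = 2 * p * (1 - p) + p * p   # exactly one + both
p_complement = 1 - (1 - p) ** 2            # same answer via the complement

print(round(p_both, 4))                                   # 0.0025
print(round(p_at_least_one, 4), round(p_complement, 4))   # 0.0975 0.0975
```

Seeing that the two “back end” routes agree is a useful check for students who distrust the complement trick.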
I tend to define expected value for my students as “probability times payout.” This isn’t always strictly true; in the television hook-up example, for instance, the number of TVs in the house is not literally a payout. Students seem to be able to make the connection fairly easily, though, and it applies most of the time, as expected values are connected to wagering and business more often than not. I would shy away from the “contribution to the mean” language.
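The “probability times payout” framing translates directly into a sum over outcomes. A sketch with a hypothetical $1 raffle of my own invention:

```python
# Expected value as "probability times payout," summed over the outcomes.
# A hypothetical $1 raffle ticket: a 1-in-500 chance at a $100 prize.
outcomes = [(100 - 1, 1 / 500),   # win: prize minus ticket cost
            (-1, 499 / 500)]      # lose: just the ticket cost
expected = sum(payout * prob for payout, prob in outcomes)
print(round(expected, 2))   # -0.8 -> lose 80 cents per ticket on average
```

The negative answer also opens the wagering discussion the text leans on: a fair game would have expected value zero.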
The linearity problem is the first time that students will see why the variance, rather than the standard deviation, is used so frequently in upper-level courses. Stress that you can’t add standard deviations in the same way, because the square root does not distribute over a sum — although I am sure you will come across at least one student who tries.
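A quick simulation lets that student see the failure directly. A sketch, using two independent die rolls (my own choice of variables):

```python
import random
import statistics

rng = random.Random(3)

# Two independent random variables: X and Y are each a fair die roll.
xs = [rng.randint(1, 6) for _ in range(20000)]
ys = [rng.randint(1, 6) for _ in range(20000)]
sums = [x + y for x, y in zip(xs, ys)]

var_x, var_y = statistics.pvariance(xs), statistics.pvariance(ys)

# Variances add for independent variables: Var(X + Y) ~ Var(X) + Var(Y)...
print(round(statistics.pvariance(sums), 2), round(var_x + var_y, 2))
# ...but standard deviations do not: sd(X + Y) != sd(X) + sd(Y).
print(round(statistics.pstdev(sums), 2),
      round(statistics.pstdev(xs) + statistics.pstdev(ys), 2))
```

The two numbers on the first line agree (up to simulation noise); the two on the second line clearly do not.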
Student’s t Distribution
The story of the Student’s t is kind of cool. The person responsible for the invention of the distribution is William Sealy Gosset. Gosset worked for Guinness, which was, at the time (the early 1900s), interested in scientifically boosting barley crop production. His work involved small sample sizes, so he had to find a distribution that would work in testing different plots against each other. Because a previous employee had published brewing trade secrets in an academic paper, all employees were barred from publishing, regardless of content. As a consequence, Gosset chose to publish under the name “Student.”
One thing that some statistics books will stress is that the null hypothesis can never be proven true. The book dances around this, but if you are aware of the rule you will notice the careful phrasing: “fail to reject the null hypothesis” or “there is no evidence against the null hypothesis,” rather than a flat “accept the null hypothesis.” Strictly speaking, rejecting the null hypothesis is a stronger conclusion than not rejecting one. It’s kind of like how a single counterexample will disprove a theorem definitively, but no amount of examples in support will prove that it is true. The supporting examples give you confidence that the theorem is probably true, and that is exactly how a non-rejection should be treated. For this reason, the null hypothesis is usually written so that rejection is the goal, since rejecting it establishes the opposite claim with a high degree of certainty. Therefore, students need not only the skill to compute the statistic but also the ability to write good null hypotheses.
Degrees of freedom is one of those things that sounds fancy but really isn’t. There are reasons for the name, but they are beyond the scope of the class and really aren’t necessary. The beginning and end of what students need to know is that it’s n − 1.
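The whole machinery — statistic, then df = n − 1, then a table lookup — fits in a few lines. A sketch with a hypothetical small sample (the data and H₀: μ = 5 are invented for illustration; 2.262 is the standard two-sided 5% critical value for df = 9):

```python
import math
import statistics

def t_statistic(sample, mu0):
    """One-sample t statistic, with degrees of freedom df = n - 1."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample sd (divides by n - 1)
    return (xbar - mu0) / (s / math.sqrt(n)), n - 1

# Hypothetical small sample; H0: mu = 5.
data = [5.1, 4.9, 5.3, 5.2, 4.8, 5.4, 5.0, 5.2, 5.1, 5.3]
t, df = t_statistic(data, 5.0)
print(df)            # 9
print(round(t, 2))
# Compare |t| to the t-table critical value for df = 9 (2.262 at the two-sided
# 5% level). If |t| is smaller, we "fail to reject" H0 -- we never prove it true.
```

For this invented data set the statistic falls just below 2.262, which makes it a nice borderline example of the fail-to-reject language from the previous lesson.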