- Understand the difference between the levels of measurement: nominal, ordinal, interval, and ratio.
- Identify the general elements that characterize a study.
- Understand the fundamentals of experimental design.
- Understand the basic concept of measures of center and variation and their uses for statistical analysis.
This lesson is an overview of the basic considerations involved in collecting and analyzing data. All of these concepts will be examined in greater detail in later chapters, but it is important to be familiar with the ideas before studying them in depth.
Levels of Measurement
In the first lesson, you learned about the different types of variables that statisticians use to describe the characteristics of a population. Some researchers and social scientists use a more detailed distinction, called the levels of measurement, when examining the information that is collected for a variable. This widely accepted (though not universally used) theory was first proposed by the American psychologist Stanley Smith Stevens in 1946 (see links at end of this section). According to Stevens’ theory, the four levels of measurement are nominal, ordinal, interval, and ratio.
Each of these four levels refers to the relationship between the values of the variable.
It is easiest to think of nominal measurement in terms of discrete, categorical variables. This is the type of measurement in which the values of the variable are names, and not numerical at all. The names of the different species of Galapagos tortoises would be a nominal measurement.
Comparing the Levels of Measurement
Using Stevens’ theory can help make distinctions in the type of data that the numerical/categorical classification could not. Let’s use an example from the previous section to help show how you could collect data at different levels of measurement from the same population. Assume your school wants to collect data about all the students in the school (which they frequently do):
Nominal: We could collect information about the students’ gender, the town or sub-division in which they live, race, or political opinions.
It is also helpful to think of the levels of measurement as building in complexity, from the most basic (nominal) to the most complex (ratio). Each higher level of measurement includes aspects of those before it. The diagram below is a useful way to visualize the different levels of measurement.
Small Ground Finch, Santa Cruz, Galapagos Islands.
Some of the other famous residents of the Galapagos that have provided scientists with a wealth of information and opportunities for study are the so-called Darwin’s finches. Each of the numerous species of finches has developed special adaptations that allow it to survive in a particular area. There are ground finches, tree finches, cactus finches, medium-billed, small-billed, and large-billed finches, just to name a few. One particular variety has even learned to use a stick as a tool to dig for bugs. To the untrained observer, it is almost impossible to tell them all apart, and on a visit to the islands you will see them everywhere!
Daphne Major, Galapagos Islands.
The other widely used method for conducting research is called an experiment. In an experiment, the researcher imposes a treatment on a group of subjects in an effort to determine a cause-and-effect relationship between variables. While observational studies could appear to show a relationship between diet and heart disease, for example, there could be another factor that is actually causing an individual’s heart condition. An experiment designed to investigate this relationship might take two groups of similar subjects, impose a different diet on each group, and then record any differences in the condition of their hearts. What makes this difficult, and in some instances impossible, is that the researcher would then need to make sure that anything else that might influence a subject’s heart health (e.g., exercise, genetics, stress level) is controlled, or exactly the same for each individual in the study. One of the ways that statisticians ensure this control is by randomly assigning subjects to treatments, thereby using the laws of probability to help guarantee the validity of the results. Designing experiments can be difficult and costly, but they are the only way to establish meaningful and reliable cause-and-effect relationships. We will study the elements of designing experiments in more detail in later chapters.
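The random assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomly_assign`, the subject labels, and the two diet names are all hypothetical, and a real experiment would involve many more design decisions. Note that random assignment balances outside factors across groups on average; it does not make the groups identical.

```python
import random


def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # a seed makes the shuffle repeatable
    shuffled = list(subjects)  # copy so the caller's list is not changed
    rng.shuffle(shuffled)

    groups = {treatment: [] for treatment in treatments}
    for position, subject in enumerate(shuffled):
        # Deal subjects out like cards so group sizes differ by at most one.
        treatment = treatments[position % len(treatments)]
        groups[treatment].append(subject)
    return groups


# Twenty made-up subjects and two made-up diets.
subjects = [f"subject_{n}" for n in range(1, 21)]
groups = randomly_assign(subjects, ["diet_A", "diet_B"], seed=42)

for treatment, members in groups.items():
    print(treatment, len(members), members)
```

Because chance alone decides who receives which diet, factors such as exercise or genetics have no systematic way to line up with one treatment.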
Measures of Center and Spread
Let us assume that you have collected some data at one of the levels of measurement (nominal, ordinal, interval, or ratio) using a statistically valid procedure (observational study or experiment). How do you summarize this information? One of the most important tools for summarizing data is to display it visually, and the various methods for doing so will be covered in later chapters. If we want to use one number or value to summarize the data, we can look at where the data is centered. Data measured at different levels can be characterized by different summaries. Look back at the Tortoise data. This data was collected through an observational study. The variable “Climate Type” is a categorical variable that has been measured at the nominal level. The easiest way to summarize this variable is to identify the most common value (mode), which is “humid.” For variables that are measured at the ratio level, like “population density,” we might find the average (mean) or the middle number (median) of the data to summarize it.
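As a rough sketch of how the level of measurement guides the choice of summary, the Python snippet below uses the standard `statistics` module. The two small lists are made-up values for illustration only; they are not the Tortoise data from the earlier section.

```python
from statistics import mean, median, mode

# Hypothetical nominal data: the values are names, so only the mode makes sense.
climate_types = ["humid", "arid", "humid", "humid", "arid"]

# Hypothetical ratio data: numbers with a true zero, so mean and median apply.
densities = [2.0, 3.5, 1.0, 8.0, 0.5]

most_common_climate = mode(climate_types)
average_density = mean(densities)
middle_density = median(densities)

print("mode of climate type:", most_common_climate)
print("mean density:", average_density)
print("median density:", middle_density)
```

In this made-up sample the mean is larger than the median because the single large value (8.0) pulls the average upward, which is one reason the two measures are often reported together.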
Data can be measured at different levels depending on the type of variable and amount of detail that is collected. A widely used method for categorizing the different types of measurement breaks them down into four groups. Nominal data is measured by classification or categories. Ordinal data uses numerical categories that convey a meaningful order. Interval measurements show order, and the spaces between the values also have significant meaning. In ratio measurement, the ratio between any two values has meaning because the data includes an absolute zero value.
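Temperature is a common way to see the difference between interval and ratio measurement. The short Python sketch below uses two arbitrary, made-up readings (10 and 20 degrees Celsius) and a hypothetical helper, `celsius_to_kelvin`. It shows that differences agree on both scales, but a ratio is only meaningful on the Kelvin scale, which has an absolute zero.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to the Kelvin scale."""
    return celsius + 273.15


# Two arbitrary example readings in degrees Celsius.
cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)

# Differences have the same meaning on both scales (interval property).
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Ratios disagree: only the Kelvin ratio compares against a true zero.
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k

print("difference:", diff_c, "C vs", round(diff_k, 2), "K")
print("ratio in Celsius:", ratio_c)
print("ratio in Kelvin:", round(ratio_k, 3))
```

The Celsius ratio suggests the warmer reading is "twice as hot," but that number depends entirely on where the zero was placed. The Kelvin ratio, only slightly above 1, is the physically meaningful comparison.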
Statisticians and researchers use two main techniques to form important conclusions about the relationships between variables. An observational study is when a researcher observes the subjects in the real world without manipulating them. An experiment is the way to establish true cause-and-effect relationships. It involves the researcher imposing some randomly assigned treatment(s) on the subjects in an effort to isolate the effect of a single variable.
In order to summarize a set of data, we often look to a single quantity to describe where it is centered. There are various measures that are used for this summary, including the mean, median, and mode. These will be covered in detail in later sections, but they are generally referred to as measures of center. Similarly, for information about how the data is spread out, we investigate measures of spread that include the range, interquartile range, and standard deviation.
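The three measures of spread named above can be computed with Python's standard `statistics` module, as in the sketch below. The data values are made up for illustration. Textbooks and software use slightly different conventions for quartiles, so the interquartile range from `statistics.quantiles` (available in Python 3.8 and later) may differ a little from a hand calculation.

```python
from statistics import quantiles, stdev

# Hypothetical ratio-level measurements.
data = [2, 4, 4, 5, 7, 9, 10, 12]

# Range: distance between the largest and smallest values.
data_range = max(data) - min(data)

# Interquartile range: distance between the third and first quartiles.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Sample standard deviation: typical distance of a value from the mean.
spread_sd = stdev(data)

print("range:", data_range)
print("interquartile range:", iqr)
print("standard deviation:", round(spread_sd, 2))
```

The range depends only on the two most extreme values, while the interquartile range describes the middle half of the data and is less affected by unusually large or small observations.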
Points to Consider
- How do we summarize, display, and compare data measured at different levels?
- What are the differences between an observational study and an experiment?
- What are the advantages/disadvantages of observational studies and experiments?
- How do you determine which measure of center or spread best describes a particular data set?
- In each of the following situations, identify the level(s) at which each of these measurements has been collected.
- Lois surveys her classmates about their eating preferences by asking them to rank a list of foods from least favorite to most favorite.
- Lois collects similar data, but asks each student what their favorite thing to eat is.
- In math class, Noam collects data on the Celsius temperature of his cup of coffee over a period of several minutes.
- Noam collects the same data, only this time using the Kelvin scale.
- Which of the following statements is not true?
- All ordinal measurements are also nominal.
- All interval measurements are also ordinal.
- All ratio measurements are also interval.
- Stevens’ theory of levels of measurement is the one theory of measurement that all researchers agree on.
- Look at Table 3 in Section 1. What is the highest level of measurement that could be correctly applied to the variable “Population Density”?
Lonesome George, the Last Pinta tortoise, Charles Darwin Research Station, Santa Cruz, Galapagos Islands.
- In each of the following situations, identify if it is an observational study or an experiment.
- In an attempt to determine if students prefer bottled water to tap water, you set up a table in the cafeteria at lunchtime and have students sample some of each and ask them which they prefer.
- Researchers collect data over 15 years about 100 sets of identical twins to see how their personalities develop similar or different characteristics.
- Cloned mice are put into different colored cage environments to see if there is an effect on their temperaments.
- Researchers find that babies who were exposed to lead paint have a high risk of brain damage.
- Interval. Even though the Celsius scale has a “0,” that zero is an arbitrary choice marking the freezing point of water, not the “absence” of temperature.
- Ratio. The Kelvin scale is based on an absolute zero, the theoretical temperature at which molecules stop moving.
- The levels of measurement theory is a useful tool to help categorize data, but like much of statistics, it is not an absolute “rule” that applies easily to every situation, and several statisticians have pointed out some of the difficulties with the theory. See: http://en.wikipedia.org/wiki/Level_of_measurement
- Population densities are certainly measured up to the interval level, as there is meaning to the values and to the distance between two observations. To decide if the variable is measured at the ratio level, we need to establish a meaning for absolute zero. In this case, it would be 0 individuals per km². This is possible and indeed represents the extinct populations.
- This is an experiment as each subject is drinking both waters (the imposed treatment). However, it will have to be designed properly. Students should not know which water is bottled and which is tap (this is called a “blind” experiment) and they should be randomly assigned the order in which they drink the water. Other conditions such as the appearance, amount, and temperature would also need to be tightly controlled.
- Observational study.
- Experiment. The researcher is imposing a treatment (different colored cages) on the mice.
- Observational study. It would be unacceptable to intentionally expose a baby to potentially harmful substances. The dangers of lead paint were discovered through years of careful observational studies.