# 2.1: An Introduction to Analyzing Statistical Data

This Probability and Statistics Enrichment FlexBook is one of seven Teacher's Edition FlexBooks that accompany the CK-12 Foundation's Probability and Statistics Student Edition.

To receive information regarding upcoming FlexBooks or to receive the available Assessment and Solution Key FlexBooks for this program, please write to us at teacher-requests@ck12.org.

## Definitions of Statistical Terminology

**Activity: The Effect of Units on Continuous Measurements**

In this activity students will explore the effect units have on continuous variables. A continuous variable must be measured using some instrument and some unit. The distance between two cities could be measured in feet, meters, miles, or kilometers. The instrument could be the odometer of a car or a bike, a pedometer, satellite technology, or the distance could be calculated using a map. The method and units of measurement affect the data that is gathered.

**Materials:** a class set of rulers that have both inches and centimeters

**Procedure:**

- Each student should measure the length of their right ring finger in both centimeters and inches.
- The class data should be compiled. After the students have made the measurements they can write them on the board. One side of the board can be used for measurements made in inches and the other for those taken in centimeters. The instructor could also write the data on the board as the students read off their information, or a paper could be passed around the room and the data transferred to the board.
- Make a dot plot for each set of data. This will be a number line with measurements along the bottom. One graph will use inches, and the other, centimeters. Dots, one for each measurement, can then be placed above the number line.
- Analyze the data in a class discussion with the following questions.

- Theoretically, is the length of a finger discrete or continuous?
- Does the data displayed on the board look discrete or continuous?
- How does rounding affect the data?
- How do the units of measurement affect the data?

As the discussion progresses, students should realize that even though the length of a finger is a continuous variable, the recorded data is effectively discrete. A ruler offers only limited precision, so measurements must be rounded, and the value a measurement is rounded to depends on the units used. A finger recorded at one rounded length in inches may be recorded at a centimeter value that is not its exact equivalent. The practicalities of measurement make a continuous variable an abstraction, but the smaller the units and the greater the precision, the closer the recorded variable comes to being truly continuous.
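For instructors who want to preview the effect before class, the following is a minimal sketch of how rounding interacts with units. The finger lengths are made-up values, and the ruler resolutions (nearest quarter inch, nearest millimeter) are assumptions chosen for illustration; the helper `round_to_step` is hypothetical.

```python
def round_to_step(value, step):
    """Round value to the nearest multiple of step (the ruler's smallest mark)."""
    return round(value / step) * step

CM_PER_INCH = 2.54  # exact by definition of the inch

# Hypothetical "true" finger lengths in centimeters (made-up values).
true_lengths_cm = [7.03, 7.18, 7.31, 7.46, 7.59]

# Assumption: the inch scale is read to the nearest 1/4 inch,
# and the centimeter scale is read to the nearest millimeter.
recorded_in = [round_to_step(x / CM_PER_INCH, 0.25) for x in true_lengths_cm]
recorded_cm = [round_to_step(x, 0.1) for x in true_lengths_cm]

for true_cm, r_in, r_cm in zip(true_lengths_cm, recorded_in, recorded_cm):
    print(f"true {true_cm:.2f} cm -> recorded {r_in:.2f} in, {r_cm:.1f} cm")

print("distinct values in inches:     ", len(set(recorded_in)))
print("distinct values in centimeters:", len(set(recorded_cm)))
```

Under these assumptions the five different fingers collapse to only two recorded values in inches but keep five distinct values in centimeters, which is the kind of difference the class dot plots should make visible.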

## An Overview of Data

**Research and Discuss: Experimental Ethics**

Many fields depend on experimental data. Pharmaceutical companies use experiments when developing new drugs, as do cosmetic companies when creating new beauty products. Psychologists are famous for conducting behavioral experiments. The subjects of these experiments are often animals, including people. Opinions vary widely on what is, or is not, ethical treatment of the subjects of these experiments, and on when the knowledge gained by the experiment justifies the discomfort, trauma, or death of the participant.

**Research Topics:**

- Find examples of experiments done on people that are considered unethical by the current standards of our society. Include examples from the fields of psychology and medicine, and examples from different time periods in history. How did humanity benefit from these experiments?
- Find examples of experiments done on animals other than humans that may be considered unethical. How did humanity benefit from these experiments?
- Different organizations have developed guidelines for what constitutes an ethical experiment. Find examples from a variety of groups including psychologists, medical doctors, pharmaceutical companies, animal rights groups, governments, and any other relevant group.

**Discussion Topics:**

- Compare and contrast the different philosophies on ethical experiments. How have these opinions evolved over time?
- When is the cost to the individual subjects of the experiments justified by the benefit of the results to society?

**Procedure:**

Assign individuals or groups of students to different research topics. Have them present what they found in class to stimulate discussion.

## Measures of Center

**Assignment: The Mean, Median, and the Data**

The measures of center are important tools used to summarize and describe sets of data. Calculating the mean or median of a set of data will come easily for students at this level. The skill that needs to be developed at this stage is the ability to interpret what the mean and median convey about a specific data set. Students need to be able to get information about the data set by comparing these two measures of center, and be able to decide which is a better description of the data in a specific situation. They need experience with data sets that are familiar and of interest to them.

**Guidelines:**

- Find a data set with between and elements. You can collect this data yourself or get it from a reliable source. Cite the source of your data or describe your collection method.
- Make a dot-plot of the data set. Label and title the plot.
- Calculate the mean and median of the data set.
- Mark the mean and median on the plot.
- Write an analysis of the work you have done that addresses the following topics.

- Does the set have outlier(s)? Is the shape of the graph symmetric?
- Are the mean and the median close in value? Why or why not?
- Which measure of center is closer to the outlier(s)?
- Which measure of center best describes a typical value in the data set?

Have students present their work to the class. Display the dot-plots on the walls of the classroom. The exposure to these sets, along with their measures of center, will help students develop an understanding and intuition for what the mean and median can tell them about a set of data.
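As a model of the comparison the guidelines ask for, here is a minimal sketch using Python's standard `statistics` module. The data set (weekly hours of video games for eleven students) is invented for illustration, and `text_dot_plot` is a hypothetical helper that draws a sideways dot plot in plain text.

```python
from statistics import mean, median

# Hypothetical data: weekly hours of video games reported by 11 students.
hours = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 32]

def text_dot_plot(values):
    """Return a sideways dot plot: one row per distinct value, one 'o' per occurrence."""
    rows = []
    for v in sorted(set(values)):
        rows.append(f"{v:>3} | " + "o" * values.count(v))
    return "\n".join(rows)

print(text_dot_plot(hours))
print("mean with outlier:     ", mean(hours))
print("median with outlier:   ", median(hours))

# Drop the single large value to see how each measure of center reacts.
without_outlier = [h for h in hours if h != 32]
print("mean without outlier:  ", mean(without_outlier))
print("median without outlier:", median(without_outlier))
```

In this invented set the single outlier moves the mean from 4.5 to 7, while the median only moves from 4.5 to 5, so the median is the better description of a typical student here.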

## Measures of Spread

**Integrating Technology: Spreadsheets**

Knowing how to use a spreadsheet is a valuable skill. Many college science classes require that data analysis for lab work be done on a spreadsheet. Students at the college level are expected to have basic knowledge of, and the ability to use, this tool. A quick perusal of the requirements given in job descriptions for work in the fields of accounting and finance, as well as many other fields, would convince anyone that learning to use a spreadsheet is a worthwhile pursuit. Students are adept at picking up new technology, and before long will be showing you useful features of this program.

**Objective:** Calculate the standard deviation of a large set of data by making the usual table on a spreadsheet.

**Procedure:**

- Provide the students with a large set of data, one with at least a hundred elements. Another option is to have the students provide their own data set so that it will be of more interest to them. They can use sports statistics, measurements that indicate climate change, or anything else that interests them. For grading purposes it will be easier to have everyone using the same set of data, especially for a first attempt.
- The table should have three columns titled like those shown in the text of the lesson.
- In the first column the value of the variable must be entered. If everyone in the class is using the same set of data, you can provide the spreadsheet to the students with the first column already filled.
- The second row of the second column will contain a formula for the deviation of that row's value from the mean. The students can be taught how to reference other cells and to fill down.
- The second row of the third column will contain a formula that squares the value in the second column. This formula is also filled down.
- Now students can get the sum of the third column and calculate the standard deviation.
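The three-column table described above can be sketched outside a spreadsheet as well, which may help when checking student work. This is a minimal illustration, not the lesson's own worksheet: the five-value data set is made up (the activity calls for at least a hundred values), and dividing by n - 1 assumes the sample standard deviation is the one intended.

```python
from math import sqrt
from statistics import stdev

# Hypothetical small data set; the activity itself uses 100 or more values.
values = [4, 7, 7, 9, 13]

mean_value = sum(values) / len(values)

# Column 1: the value; column 2: deviation from the mean; column 3: squared deviation.
table = [(x, x - mean_value, (x - mean_value) ** 2) for x in values]

print(f"{'x':>6} {'x - mean':>10} {'(x - mean)^2':>14}")
for x, deviation, squared in table:
    print(f"{x:>6} {deviation:>10.2f} {squared:>14.2f}")

# Sum the third column, then finish the calculation.
sum_of_squares = sum(squared for _, _, squared in table)

# Assumption: sample standard deviation, so divide by n - 1.
sample_sd = sqrt(sum_of_squares / (len(values) - 1))

print("sum of squared deviations:", sum_of_squares)
print("standard deviation:       ", round(sample_sd, 3))
print("library check (stdev):    ", round(stdev(values), 3))
```

Each tuple in `table` corresponds to one spreadsheet row, and the list comprehension plays the role of filling the formulas down the column.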

**Notes:**

- This would be done more efficiently with two columns, but it would give the students less practice. A brief discussion of error magnification and an explanation of the three column requirement are appropriate.
- Excel is the most widely used spreadsheet program, but use whatever is available to you.
- Students can find detailed explanations of how to use these programs with a quick internet search.

**Assignment: Picturing the Standard Deviation**

The standard deviation provides vital information about a set of data. It is a key component of many of the calculations that are done in statistics. The mean of a data set is not particularly useful unless it is paired with the standard deviation. In the past students have had multiple exposures to the mean, but this is most likely the first time they have encountered the standard deviation. They will have a difficult time seeing where it is and what it measures. Experience with the standard deviation is the key to their understanding. This assignment will provide students with an opportunity to work with the standard deviation of data sets that are of interest to them.

**Guidelines:**

- Find two data sets each with at least elements. Choose one data set with numbers that are fairly spread out and one where the values are all relatively close to the mean. You can collect this data yourself or get it from a reliable source. Cite the source of your data or describe your collection method.
- Make a dot-plot of each data set. Label and title the plots.
- Calculate the standard deviation of each data set using a table. Check your answer with your calculator.
- On your dot-plot, highlight all the values that are within one standard deviation of the mean in yellow, those between one and two standard deviations from the mean in pink, and those between two and three standard deviations from the mean in green.
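To check which highlighting band each value belongs to, the distance from the mean can be expressed in units of the standard deviation. The sketch below assumes the sample standard deviation and uses a made-up data set; `sd_band` is a hypothetical helper name.

```python
from math import floor
from statistics import mean, stdev

def sd_band(value, center, sd):
    """Return k such that value lies between k and k + 1 standard deviations from center."""
    return floor(abs(value - center) / sd)

# Hypothetical data set for illustration.
data = [4, 7, 7, 9, 13]
center = mean(data)
sd = stdev(data)  # assumption: sample standard deviation

bands = [sd_band(x, center, sd) for x in data]
for x, band in zip(data, bands):
    print(f"{x:>3}: between {band} and {band + 1} standard deviations from the mean")
```

A band of 0 means the value is within one standard deviation of the mean, a band of 1 means it is between one and two, and so on, so each band number can be matched to a highlighter color.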

Have students present their work to the class. Display the dot-plots on the walls of the classroom.

Discuss if and how the standard deviation would change if the measurements were made in different units.
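A quick numerical check can anchor this discussion. The sketch below converts a made-up set of lengths from inches to centimeters and compares the two standard deviations; the data values are hypothetical.

```python
from statistics import mean, stdev

CM_PER_INCH = 2.54  # exact by definition of the inch

# Hypothetical finger lengths in inches.
lengths_in = [2.75, 3.0, 3.0, 3.25, 3.5]
lengths_cm = [x * CM_PER_INCH for x in lengths_in]

sd_in = stdev(lengths_in)
sd_cm = stdev(lengths_cm)

print("mean in inches:     ", round(mean(lengths_in), 3))
print("mean in centimeters:", round(mean(lengths_cm), 3))
print("sd in inches:       ", round(sd_in, 3))
print("sd in centimeters:  ", round(sd_cm, 3))
print("ratio sd_cm / sd_in:", round(sd_cm / sd_in, 3))
```

Multiplying every measurement by a conversion factor multiplies both the mean and the standard deviation by that same factor, so the standard deviation is expressed in the units of the data.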

The process of finding data sets with large and small standard deviations will make the students think about what the standard deviation tells them about the data.