Timothy S. Vaughan
University of Wisconsin - Eau Claire
Journal of Statistics Education Volume 11, Number 1 (2003), jse.amstat.org/v11n1/vaughan.html
Copyright © 2003 by Timothy S. Vaughan, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.
Key Words: Sampling distribution; Sampling variability; Student simulation.
The advent of electronic communication between students and teachers facilitates a number of new techniques in the teaching of statistics. This article presents the author’s experiences with providing each student in a large, multi-section class with a unique dataset for homework and in-class exercises throughout the semester. Each student’s sample is pseudo-randomly generated from the same underlying distribution (in the case of hypothesis tests and confidence intervals involving $\mu$), or the same underlying linear relationship (in the case of simple linear regression). This approach initially leads students to identify with their individual summary statistics, test results, and fitted models, as “the answer” they would have come up with in an applied setting, while subsequently forcing them to recognize their answers as representing a single observation from some larger sampling distribution.
There are certain fundamental statistical concepts that are notoriously difficult for students to truly comprehend at an intuitive level, and instructors are continuously exploring innovative teaching practices in the interest of rectifying this situation. In particular, a number of authors have reported their experiences in engaging students in simulation exercises, designed to convey the concepts of sampling distributions and sampling variability.
These exercises generally fall into one of two categories. The first category represents those exercises in which students are physically engaged in the process of drawing random samples, using sampling bowls, bags of candy, and slips of “yes” or “no” votes (see Schwarz and Sutherland 1997; Dyck and Gee 1998; Rossman and Chance 1999; Gourgey 2000). This approach has the benefit of creating a “real” experiment, and actively involves the student in the sampling process itself. The drawbacks to this approach are that the exercise is generally limited to binomial or multinomial sampling from finite populations, while the sample size and number of samples are inherently constrained by the time available. Variations on this idea include having students prepare subjective confidence interval estimates (Anderson-Cook 1999), while Zerbolio (1989) has suggested exercises in which students “imagine” a physical sampling experiment rather than actually conducting one.
The second category represents those exercises in which computer software such as Fathom is used to generate pseudo-random observations, thus allowing students to see the resulting sampling distribution of various summary or test statistics (Schwarz and Sutherland 1997; Anderson-Cook 1999; delMas, Garfield, and Chance 1999; Rossman and Chance 1999). This obviously allows any number of samples of any size to be “drawn” from a much broader collection of underlying distributions. The downside here is that the student is often placed in the role of passive observer, basically “watching” the demonstration as they experiment with different sample sizes and alternative source populations.
While both types of exercises help to instill a distinction between the distribution of the data as opposed to the distribution of the sample statistic, students frequently fail to make the connection between these demonstrations and subsequent course topics (Gourgey 2000). Indeed, we instructors frequently defeat our own best efforts, first delivering a compelling demonstration of sampling distributions, then proceeding through the remainder of the course with a succession of homework and case projects aimed at students coming to the “right answer” for the data at hand. This reinforces the students’ tendency to think of the results of their homework and in-class exercises as “the right answer,” rather than as “an answer,” representing a single observation from the sampling distribution in question. If the implications of sampling variability are both fundamental to an understanding of statistics and difficult for students to understand, common sense pedagogy would suggest that this concept be continually reinforced, through integration into all topics to which the concept applies.
The author has recently experimented with providing each student in a large, mass lecture or multi-section class with a unique dataset for selected homework and in-class exercises throughout the semester. Within the context of a hypothetical study or research question, random observations are generated from a distribution with given parameters. After covering the material in question with a separate example, students are required to perform the appropriate analysis on their unique dataset, and to return their answers to the instructor via e-mail. Students thus recognize their answers as the conclusions they would have drawn, had they performed the study with their individual data. Subsequent in-class examination of all students’ answers clearly demonstrates the variability resulting from the simple fact that they all worked with different random samples drawn from the same “population.”
This approach is facilitated by an Excel macro written in Microsoft Visual Basic. The macro was initially developed by the author to allow e-mail distribution of confidential student grade reports in a large mass lecture or multi-section course. The macro references the instructor’s spreadsheet “gradebook,” and translates each row of data (one row per student) into a formatted report that is sent directly to each student’s e-mail address.
The author subsequently realized that this same macro can facilitate a unique development in the teaching of basic statistical concepts, easily generating and disseminating unique random samples to a large number of students throughout the semester. The data are sent via e-mail as text file attachments, which the students subsequently copy and paste into Microsoft Excel, or another spreadsheet or statistical package of their choice. The remainder of this paper will describe a number of assignments and exercises exploiting this capability. These procedures have been used in a second-semester statistics course, both to review concepts from the introductory course, as well as to reinforce the concept of sampling distributions within the context of more advanced material.
The scenario for the first ongoing example is a study of January 20XX heating bills. Obviously, any scenario of interest could be developed. For each student, I generate a unique random sample of n = 25 observations from a normal distribution with mean $\mu = 135$ and standard deviation $\sigma = 20$. In class, the students are told they are attempting to estimate $\mu$ and $\sigma$ for all January 20XX heating bills statewide. I make it clear that each student has received randomly drawn observations, corresponding to the idea that in practice they would have randomly chosen n = 25 houses from which to collect data. The first assignment is to compute the sample mean $\bar{x}$, sample variance $s^2$, and sample standard deviation $s$ using Excel, and to return their answers to the instructor via e-mail.
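As a rough illustration of this first assignment, the data generation and summary statistics might be sketched in Python as follows. This is not the author's Excel macro. The function names and the seed are assumptions made for the example; only the normal distribution with mean 135 and standard deviation 20, the sample size of 25, and the class size of 144 come from the text.

```python
import random
import statistics

MU, SIGMA = 135.0, 20.0       # population parameters stated in the text
N_OBS, N_STUDENTS = 25, 144   # sample size and class size from the text

def make_datasets(n_students=N_STUDENTS, n_obs=N_OBS, seed=2003):
    """Return one independent normal sample per student.

    The seed is an assumption, used only so the sketch is reproducible.
    """
    rng = random.Random(seed)
    return [[rng.gauss(MU, SIGMA) for _ in range(n_obs)]
            for _ in range(n_students)]

def summarize(sample):
    """Sample mean, sample variance (n - 1 divisor), and sample standard deviation."""
    return (statistics.mean(sample),
            statistics.variance(sample),
            statistics.stdev(sample))

datasets = make_datasets()
summaries = [summarize(sample) for sample in datasets]

# Each student would see only his or her own row of this table.
for student_id, (x_bar, s2, s) in enumerate(summaries[:3], start=1):
    print(f"student {student_id}: mean = {x_bar:.2f}, s^2 = {s2:.2f}, s = {s:.2f}")
```

Each entry of `summaries` plays the role of one student's e-mailed answer, so the full list can be summarized before the datasets are ever sent out.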
In the following class period, I reveal the “true” $\mu = 135$ and $\sigma = 20$, emphatically repeating that in practice these values would not be known. In a class of 144 students across all sections, I have generated and sent a total of 144 x 25 = 3600 observations. This provides an excellent opportunity to review some basic normal distribution concepts, demonstrating that approximately 90% of the observations have fallen within $\mu \pm 1.645\sigma$ and approximately 99% have fallen within $\mu \pm 2.576\sigma$.
We then observe a histogram of all the $\bar{x}$ values the students have computed, and students are encouraged to identify where “their” $\bar{x}$ has fallen relative to the rest of the class. The compelling lesson from this display is, of course, that $\bar{X}$ is a random variable. Each student is then forced to realize that “their answer” for the homework is just one observation from the “population” of all possible $\bar{x}$’s. I have previously computed the grand mean of all the students’ $\bar{x}$’s, which motivates the discussion that $E(\bar{X}) = \mu$. The slight variation between the grand mean and $\mu$ also presents the opportunity to discuss sample size considerations.
I have also previously computed the sample variance of the students’ $\bar{x}$ values, which of course launches discussion of the fact that $\sigma_{\bar{X}}^2 = \sigma^2/n$. This idea is reinforced by demonstrating that approximately 90% of all the students’ $\bar{x}$ values have fallen within $\mu \pm 1.645\sigma/\sqrt{n}$ and approximately 99% of the values have fallen within $\mu \pm 2.576\sigma/\sqrt{n}$. At this point, a show of hands identifying which students’ $\bar{x}$’s fell outside the respective ranges is especially compelling. Lest the lesson impact only those students, it is important to emphasize that “it could have been any one of you; the fact that it happened to be these particular students is purely due to which 25 houses they randomly selected for their sample.” It is also useful to compare this analysis, side-by-side, to the earlier analysis of the 90% and 99% ranges for the 3600 individual observations.
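The side-by-side comparison can be tried out with a short simulation. The sketch below is an illustration under stated assumptions, not the author's procedure: the seed and the helper `share_within` are hypothetical, and the multipliers 1.645 and 2.576 are the usual standard normal values for central 90% and 99% ranges. It reports the share of the 3600 individual observations falling within $\mu \pm z\sigma$ next to the share of the 144 sample means falling within $\mu \pm z\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

MU, SIGMA = 135.0, 20.0       # population parameters stated in the text
N_OBS, N_STUDENTS = 25, 144   # sample size and class size from the text

def share_within(values, center, half_width):
    """Fraction of values lying within center +/- half_width."""
    return sum(abs(v - center) <= half_width for v in values) / len(values)

rng = random.Random(11)  # arbitrary seed (an assumption), for reproducibility only
datasets = [[rng.gauss(MU, SIGMA) for _ in range(N_OBS)]
            for _ in range(N_STUDENTS)]

observations = [x for sample in datasets for x in sample]   # all 3600 values
means = [statistics.mean(sample) for sample in datasets]    # one mean per student
std_error = SIGMA / math.sqrt(N_OBS)                         # sigma / sqrt(n) = 4.0

grand_mean = statistics.mean(means)
var_of_means = statistics.variance(means)

for label, z in (("90%", 1.645), ("99%", 2.576)):
    obs_share = share_within(observations, MU, z * SIGMA)
    mean_share = share_within(means, MU, z * std_error)
    print(f"{label} range: observations {obs_share:.3f}, sample means {mean_share:.3f}")

print(f"grand mean = {grand_mean:.2f} (mu = {MU})")
print(f"variance of means = {var_of_means:.2f} (sigma^2 / n = {SIGMA**2 / N_OBS})")
```

With only 144 sample means, the observed shares and the variance of the means will wander around their theoretical values from one run to the next, which is itself a useful point for class discussion.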
The analysis of the 90% and 99% ranges for $\bar{X}$ obviously requires discussion of the Central Limit Theorem as well. In this case, normality of $\bar{X}$ is supported by the fact that the heating bills themselves are normally distributed. (It would of course be possible to use this approach with non-normal data, demonstrating the degree of normality for $\bar{X}$ under alternative sample sizes. This issue, I believe, is best dealt with using one of the software-based demonstrations discussed earlier. The individual heating bill observations were drawn from the normal distribution in order to provide compelling demonstrations of subsequent analyses based on the t-distribution.)
Finally, we observe a histogram of the students’ $s^2$ values, similarly demonstrating the important idea that $S^2$ is a random variable, and $E(S^2) = \sigma^2$. We also observe the corresponding histogram of the students’ sample standard deviations, but I try to avoid getting into a discussion of the fact that $E(S) \neq \sigma$.
The second assignment based on this data is to have each student compute and e-mail “their” values for $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ (that is, a z-statistic for their $\bar{x}$ value) and $t = (\bar{x} - \mu)/(s/\sqrt{n})$ (that is, a t-statistic for their $\bar{x}$ and $s$ values). When making the assignment, I emphasize that in the first calculation they are all dividing their $\bar{x} - \mu$ by the same constant $\sigma/\sqrt{n}$, while in the latter calculation, each student has their observation of the random variable $S$ in the denominator. Students are encouraged to anticipate which values will demonstrate greater variability across the class.
In the following class period, we review histograms of the students’ z and t values. The first point made is that the histogram of “z-scores” is identical to the histogram of $\bar{x}$ values examined in the earlier class period, except for a change of location and scale on the horizontal axis. Superimposed on the histogram is an appropriately scaled diagram of the standard normal density function. This point is reinforced by a show of hands from all students whose $\bar{x}$ values fell outside the range $\mu \pm 1.645\sigma/\sqrt{n}$, followed by a show of hands from all students whose values for $z$ fell outside the range $\pm 1.645$. It is, of course, the exact same set of students.
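A small simulation can make the "same set of students" point concrete. The sketch below assumes the usual definitions $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$ and $t = (\bar{x} - \mu)/(s/\sqrt{n})$ with $\mu = 135$, $\sigma = 20$, and n = 25. The seed and variable names are hypothetical, and the code stands in for, rather than reproduces, the author's spreadsheet.

```python
import math
import random
import statistics

MU, SIGMA = 135.0, 20.0       # population parameters stated in the text
N_OBS, N_STUDENTS = 25, 144   # sample size and class size from the text

rng = random.Random(7)  # arbitrary seed (an assumption), for reproducibility only

means, z_values, t_values = [], [], []
for _ in range(N_STUDENTS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N_OBS)]
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)
    means.append(x_bar)
    # z divides by the same known constant for every student ...
    z_values.append((x_bar - MU) / (SIGMA / math.sqrt(N_OBS)))
    # ... while t divides by each student's own estimate of that constant.
    t_values.append((x_bar - MU) / (s / math.sqrt(N_OBS)))

half_width = 1.645 * SIGMA / math.sqrt(N_OBS)
outside_by_mean = {i for i, m in enumerate(means) if abs(m - MU) > half_width}
outside_by_z = {i for i, z in enumerate(z_values) if abs(z) > 1.645}

print("same students flagged:", outside_by_mean == outside_by_z)
print(f"std dev of z values: {statistics.stdev(z_values):.3f}")
print(f"std dev of t values: {statistics.stdev(t_values):.3f}")
```

Because z is only a shifted and rescaled copy of the sample mean, the two sets of flagged students coincide by construction. The two printed standard deviations give a quick numerical check on which statistic varies more across the class.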
A simultaneous display of students’ z and t values demonstrates that the t-values are indeed more variable. A histogram of the students’ t values is displayed in Figure 1. After introducing the t-table and discussing degrees of freedom, I ask for a show of hands for all students whose value of $t$ falls outside the range $\pm 1.711$ (the appropriate t-value for
Figure 1. Histogram of 144 students’ test statistics $t = (\bar{x} - \mu)/(s/\sqrt{n})$, where $\bar{x}$ is the mean of n = 25 observations, pseudo-randomly generated from the normal distribution with mean
After working through a separate example introducing the hypothesis testing framework, the next assignment requires students to compute and return the values of and associated with their respective samples. The first statistic is appropriate for testing the null hypothesis versus either a one-sided or two-sided alternative, while the second statistic is appropriate for testing the null hypothesis , versus either a one-sided or two-sided alternative. At this point, I relate that we typically wouldn’t draw a single sample for the purpose of testing a variety of null hypotheses. For convenience and clarity, we are first going to pretend we drew the sample for the purpose of testing , and then separately pretend we drew the sample for the purpose of testing .
As by now the class is well aware that the true mean is
In order to generate an observable number of Type I errors, we first test at significance level
Repeating the one-tailed test at
We then turn to the alternate scenario, pretending we drew the sample for the purpose of “proving”
Figure 2. Histogram of 144 students’ test statistics , where $\bar{x}$ is the mean of n = 25 observations, pseudo-randomly generated from the normal distribution with mean
Repeating this one-tailed test with
At this point, I have the students speculate as to the distribution of test statistics had they tested . Returning to the histogram of $t$ values in Figure 1, we note that we would see a virtually identical picture had we calculated . As such, approximately 90% of the students would have “committed” a Type II error using
A similar approach is used to demonstrate the meaning of a confidence interval, having each student compute the lower and upper limits of 90% and 95% confidence intervals based on their data. A histogram of lower and upper confidence limits generated, as well as a show of hands during class, demonstrates that approximately 90% (or 95%) of the students’ datasets result in a confidence interval that covers the true mean, while 10% (or 5%) of the students generate a confidence interval with upper limit
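The coverage idea can be checked with a brief simulation. This sketch assumes t-based intervals of the form $\bar{x} \pm t \cdot s/\sqrt{n}$ with 24 degrees of freedom. The critical value 1.711 appears in the text, while 2.064 is the standard table value for a 95% interval and is not quoted by the author. The seed and function name are hypothetical.

```python
import math
import random
import statistics

MU, SIGMA = 135.0, 20.0       # population parameters stated in the text
N_OBS, N_STUDENTS = 25, 144   # sample size and class size from the text

# Two-sided t critical values for 24 degrees of freedom.
# 1.711 is quoted in the text; 2.064 is the standard table value (an addition).
T_CRIT = {0.90: 1.711, 0.95: 2.064}

def confidence_interval(sample, level):
    """Return (lower, upper) limits of a t-based confidence interval for the mean."""
    x_bar = statistics.mean(sample)
    margin = T_CRIT[level] * statistics.stdev(sample) / math.sqrt(len(sample))
    return x_bar - margin, x_bar + margin

rng = random.Random(42)  # arbitrary seed (an assumption), for reproducibility only
datasets = [[rng.gauss(MU, SIGMA) for _ in range(N_OBS)]
            for _ in range(N_STUDENTS)]

coverage = {}
for level in T_CRIT:
    intervals = [confidence_interval(sample, level) for sample in datasets]
    coverage[level] = sum(lo <= MU <= hi for lo, hi in intervals) / N_STUDENTS
    print(f"{level:.0%} intervals covering mu = {MU}: {coverage[level]:.3f}")
```

The observed coverage in a single class of 144 will generally sit near, but not exactly at, the nominal level. Every interval that misses the true mean corresponds to a student who would raise a hand in class.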
This approach has recently been extended to the case of simple linear regression. Here, I generate a random sample of n = 15 x observations from a discrete Uniform(25000, 50000) distribution, with the interpretation that x represents the number of miles on the odometer of a recently traded vehicle. For each x, the spreadsheet computes
As before, each student receives his or her own unique dataset via e-mail. After working through a separate example introducing the concept of least squares estimation, students are assigned to fit the model to their data, using the Excel Data Analysis add-in. Students are also told to use their model to make a prediction ($\hat{y}$) of the trade-in value for a vehicle with 40,000 miles on the odometer, and to return the entire analysis to the instructor by attaching their Excel file to a return e-mail.
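The regression exercise can be mimicked with the sketch below. The "true" intercept, slope, and error standard deviation used here are hypothetical placeholders chosen only for illustration, since the values used in class are not given in this passage. Normally distributed errors are likewise an assumption. Only the sample size of 15, the discrete Uniform(25000, 50000) mileage distribution, and the 40,000-mile prediction come from the text.

```python
import random
import statistics

# Hypothetical "true" parameters (assumptions, not the author's values).
BETA0, BETA1, ERROR_SD = 20000.0, -0.25, 1000.0
N_OBS, N_STUDENTS = 15, 144   # sample size from the text; class size as before

def make_dataset(rng):
    """One student's (mileage, trade-in value) data from an assumed linear model."""
    xs = [rng.randint(25000, 50000) for _ in range(N_OBS)]  # discrete uniform mileage
    ys = [BETA0 + BETA1 * x + rng.gauss(0.0, ERROR_SD) for x in xs]
    return xs, ys

def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

rng = random.Random(15)  # arbitrary seed (an assumption), for reproducibility only
fits = [least_squares(*make_dataset(rng)) for _ in range(N_STUDENTS)]
predictions = [b0 + b1 * 40000 for b0, b1 in fits]

mean_slope = statistics.mean(b1 for _, b1 in fits)
mean_prediction = statistics.mean(predictions)
print(f"mean fitted slope = {mean_slope:.4f} (assumed true slope {BETA1})")
print(f"mean prediction at 40,000 miles = {mean_prediction:.1f}")
print(f"std dev of predictions = {statistics.stdev(predictions):.1f}")
```

Each pair in `fits` corresponds to one student's fitted line, so histograms of the intercepts, slopes, and predictions can be built directly from these lists.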
As before, subsequent in-class review begins with revelation of the “true” parameters
As with the earlier material, histograms of all the students’ and values drive home the point that any statistics computed from random data are themselves random variables. (The values are displayed in Figure 3.) The central tendency of the histograms again demonstrates the concept of unbiased estimation, e.g. and . (It is important that students not identify too closely with the overall mean of their collective and estimates. I have to continually reinforce that in practice, they would be looking at their single value for and , i.e. one random observation from the respective distributions of all such values.) A histogram of the students’ predicted values at
Figure 3. Histogram of students’ fitted parameters, when each student fit the model to
Figure 4. Histogram of students’ predicted values
A histogram of all students’ values for serves as a backdrop for the discussion that the statistic follows a t distribution with
The effectiveness of the technique described above lies in getting the students to identify with “their” answers as the results they would have come up with in practice, based on an analysis of “their” data. It is important, then, to maintain the illusion that I am directly using the students’ homework submissions as we review the various sampling distribution properties. In actuality, I have prepared the various summary statistics and displays prior to sending out the student-specific datasets. When reviewing the various results I tell them I have supplied the correct answer for any student who has done the homework incorrectly.
Handling the large volume of e-mail homework submissions is simplified by creating separate “inbox folders” for each assignment, and applying rules that direct any message with a certain key phrase in the subject to the appropriate folder. I have also found that I am able to quickly check the emails as they trickle in, responding to questions or incorrect answers in a more timely manner than traditional paper-based assignment collection.
In summary, the author has explored the idea of engaging students in demonstrations of the sampling distributions pertinent to various topics throughout the course. This is accomplished by providing each student with a unique dataset for homework problems and in-class exercises. In addition to demonstrating the characteristics of the sampling distribution in question, this approach forces students to recognize “their results” as being a single observation from that distribution. As such, every student directly experiences the implications of sampling variability as it applies to each new topic. Although no objective measurement of the effectiveness of this approach under controlled conditions has been attempted, student response has generally been positive.
This technique could obviously be applied to any topical coverage in which the random nature of data is a concern. The author intends to next extend the approach to coverage of various nonparametric statistical tests, which easily degenerates into a “cookbook” approach without an understanding of the behavior of the relevant test statistics.
Anderson-Cook, C. M. (1999), “An In-Class Demonstration to Help Students Understand Confidence Intervals,” Journal of Statistics Education [Online], 7(3). (jse.amstat.org/secure/v7n3/anderson-cook.cfm)
delMas, R. C., Garfield, J., and Chance, B. L. (1999), “A Model of Classroom Research in Action: Developing Simulation Activities to Improve Students’ Statistical Reasoning,” Journal of Statistics Education [Online], 7(3). (jse.amstat.org/secure/v7n3/delmas.cfm)
Gourgey, A. F. (2000), “A Classroom Simulation Based on Political Polling To Help Students Understand Sampling Distributions,” Journal of Statistics Education [Online], 8(3). (jse.amstat.org/secure/v8n3/gourgey.cfm)
Schwarz, C. J., and Sutherland, J. (1997), “An On-Line Workshop Using a Simple Capture-Recapture Experiment to Illustrate the Concepts of a Sampling Distribution,” Journal of Statistics Education [Online], 5(1). (jse.amstat.org/v5n1/schwarz.html)
Timothy S. Vaughan
Department of Management and Marketing
University of Wisconsin - Eau Claire
Eau Claire, WI 54702-4004
USA
vaughats@uwec.edu