Mustafa R. Yilmaz
Journal of Statistics Education v.4, n.1 (1996)
Copyright (c) 1996 by Mustafa R. Yilmaz, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.
Key Words: Teaching objectives; Constraints; Integrated course design; Teaching software.
Traditional methods of teaching introductory statistics are often viewed as being ineffective because they fail to establish a clear link between statistics and its uses in the real world. To be more effective, it is essential that teaching objectives be clearly defined at the outset and that issues of content and methodology be addressed accordingly. This paper proposes that the relevant objectives should aim to develop the following competencies: (1) ability to link statistics and real-world situations, (2) knowledge of basic statistical concepts, (3) ability to synthesize the components of a statistical study and to communicate the results in a clear manner. Towards these objectives, we propose a revamp of the traditional course design together with the creation of a new kind of teaching software that is not currently available.
1 It is widely believed that customary methods of teaching statistics to students majoring in other fields (non-specialists) are not very effective. Desired outcomes are often not attained, and real improvements have been few (Cobb 1993; Hogg 1991, 1992; Mosteller 1988; Snee 1993). This paper will touch upon the underlying reasons, and propose a more effective, integrated approach. Although the views expressed here are rooted in the author's experience in business education, most of the discussion is relevant to other fields as well. The present discussion is concerned primarily with undergraduate education, but much of it also applies to non-specialist graduate education.
2 There is near-universal consensus that statistical literacy and an appreciation of statistics are important components of undergraduate and graduate education in all fields involving the gathering, interpretation, or presentation of data. Many fields of study fit this description, but non-specialist curricula generally allow very limited time for teaching statistics. Consequently, it is necessary to define the teaching objectives as clearly as possible, and to use the objectives in determining content and methodology.
3 Although this discussion does not provide resolutions of all relevant issues, it suggests that a more effective approach can be developed. The key word here is develop because the approach involves the creation of interactive computer software specifically designed for teaching statistics. An integrated approach incorporating such a tool is proposed. The discussion elaborates on the design of the software as well as the overall teaching approach.
4 Especially for non-specialists, an effective statistics education must strive towards the use of statistics in the real world. For this purpose, it is necessary to develop a clear sense of the relevance of statistics in real situations. Some of these situations consist of applications specific to the student's field of study, but others may involve situations of general interest or experiences in daily life. In any case, linkage with the real world is necessary for an understanding of the kinds of questions where statistics can help. This in turn promotes the importance of data, issues concerning the sources and measurement of data, the concept of variability, and errors inherent in data collection. Technical tools for dealing with these would be of interest to the student only when the need for these tools has been established.
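5 The overall ability to use statistics in the real world requires three specific competencies: (1) the ability to link statistics and real-world situations, (2) knowledge of basic statistical concepts, and (3) the ability to synthesize the components of a statistical study and to communicate the results in a clear manner.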
6 In our view, the objectives of statistics education for non-specialists must include the development of all three competencies. We believe that specialists and non-specialists differ only in the extent to which these competencies are developed, not in the choice of competencies. Undoubtedly, a major constraint in this regard is the shortage of allotted time. Typically, non-specialist curricula include very few statistics courses, and at best, limited coverage in other courses. There are, however, additional reasons why statistics is a difficult subject for non-specialists, not just from the viewpoint of the student but from that of the teacher as well.
7 First, statistics involves essential concepts that are abstract and complex in nature, such as randomness, distributions of sample statistics, and the probabilistic nature of statistical conclusions. Randomness is a vexing notion for most people because what one's intuition may suggest as being random may not be random at all. Sampling distributions are abstract because a given sample yields a single computed value for a sample statistic, not an entire distribution. The distinction between a statistical conclusion and the truth of a hypothesis is difficult because a hypothesis is never really proved in statistics. The fact that, using the same data, one can fail to reject a hypothesis as well as the opposite hypothesis adds to the confusion. Second, statistics requires analytical skills involving problem formulation, variable identification, and model building that are intrinsically difficult to teach. The most effective means of teaching these skills is by practicing them, but this is difficult to achieve under time constraints. Third, learning the technical tools of statistics requires some basic mathematical skills, but with the exception of science and engineering students, most non-specialists lack a strong background in mathematics. Finally, effective use of statistics requires the ability to synthesize various components and analyses into a coherent whole, and to communicate the results clearly through memoranda or reports. It is safe to say that many students lack these skills.
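The abstractness of a sampling distribution can be reduced by simulating one. The following is a minimal sketch for illustration only: the population (exponential with mean 1 and standard deviation 1), the sample sizes, and the function name `simulate_sample_means` are assumptions introduced here, not part of the course design described in this paper.

```python
import random
import statistics


def simulate_sample_means(sample_size, num_samples, seed=0):
    """Draw many samples from a skewed population and return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        # Assumed population: exponential with mean 1 and standard deviation 1.
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


if __name__ == "__main__":
    for n in (5, 30):
        means = simulate_sample_means(sample_size=n, num_samples=2000)
        print(
            f"n={n:2d}  "
            f"mean of sample means={statistics.fmean(means):.3f}  "
            f"sd of sample means={statistics.stdev(means):.3f}  "
            f"theoretical sd={1 / n ** 0.5:.3f}"
        )
```

Each simulated sample still yields a single mean, but the collection of means shows the distribution that a single sample cannot reveal, and its spread can be compared with the theoretical value of the population standard deviation divided by the square root of the sample size.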
8 In view of these difficulties, the development of technical expertise is an unattainable goal in introductory courses for non-specialists. The kind of technical sophistication a non-specialist must attain involves an awareness of what statistics can or cannot tell us. For example, upon hearing a statement like "the margin of error in this survey was plus or minus three percentage points," the student must recognize that the statement leaves some important things unsaid, that this is really an imprecise statement based on certain assumptions and a chosen level of confidence, and that there is some chance that the error may well exceed three percentage points. This kind of statistical literacy is essential, but a thorough knowledge of the details in arriving at the statement is not. It is sufficient to know how such a statement is arrived at in the simplest cases, such as for a population mean or proportion. One can then use analogy for other population characteristics without specific knowledge about the details.
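For the simplest case of a population proportion, the margin-of-error statement can be reproduced in a few lines. This is a sketch under stated assumptions: it uses the usual large-sample normal approximation with simple random sampling, and the survey size of 1067 and the sample proportion of 0.5 are hypothetical values chosen so that the 95% margin comes out at roughly three percentage points.

```python
import math
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Large-sample margin of error for a sample proportion (normal approximation)."""
    # Two-sided critical value for the chosen confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


if __name__ == "__main__":
    n = 1067  # hypothetical survey size
    for confidence in (0.90, 0.95, 0.99):
        moe = margin_of_error(0.5, n, confidence)
        print(f"{confidence:.0%} confidence: +/- {100 * moe:.1f} percentage points")
```

Running the same data through different confidence levels shows what the quoted statement leaves unsaid: the "three points" figure depends on the confidence level chosen, and at any level short of certainty some samples will miss by more than the stated margin.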
9 The technical material we consider essential in an introductory course includes the distinction between populations and samples, types of random variables and measurement scales for them, observational studies versus controlled experiments, basic measures of location and variability, the concept of a probability distribution (the normal distribution in particular), the concept of sampling distribution and the central limit theorem, confidence interval estimation, significance testing using confidence intervals, and the notion of a model for a random variable in terms of other variables. We do not consider the following topics essential, although they are commonly included in introductory courses: standard treatment of probability calculus, sampling distributions of many statistics (except when the central limit theorem is applicable), and the traditional approach to hypothesis testing using acceptance and rejection regions, types of errors, power of tests, and operating characteristic curves. More importantly, and regardless of the particular choices made, the need for essential technical material must first be established via applications. For example, rather than introducing standard deviation directly as a measure of variability, it should be introduced by a question like "How should we measure the variability in this data set?" or "Which of these two data sets exhibits greater variability?" It is not reasonable to assume that the student establishes this kind of linkage on his or her own.
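The idea of significance testing by way of confidence intervals, listed among the essential topics above, can be sketched as follows. The code is illustrative rather than prescriptive: it assumes a large-sample normal approximation (a t-based interval would be preferable for small samples), the data are made-up measurements, and the function names `mean_confidence_interval` and `is_plausible` are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(data, confidence=0.95):
    """Large-sample (normal-approximation) confidence interval for a population mean."""
    n = len(data)
    center = statistics.fmean(data)
    std_error = statistics.stdev(data) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * std_error, center + z * std_error


def is_plausible(hypothesized_mean, data, confidence=0.95):
    """Return True if the hypothesized mean lies inside the confidence interval."""
    low, high = mean_confidence_interval(data, confidence)
    return low <= hypothesized_mean <= high


if __name__ == "__main__":
    # Made-up data: 44 measurements spread evenly between 45 and 55.
    data = [50 + ((i * 7) % 11) - 5 for i in range(44)]
    low, high = mean_confidence_interval(data)
    print(f"95% interval for the mean: ({low:.2f}, {high:.2f})")
    for value in (50, 52):
        verdict = "not rejected" if is_plausible(value, data) else "rejected"
        print(f"Hypothesized mean {value}: {verdict}")
```

A hypothesized value outside the interval is rejected at the corresponding significance level; a value inside it is merely not rejected, which is not the same as being shown true.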
10 In the proposed course design, applications and case analyses are the main vehicles for developing the first and third competencies mentioned earlier, and interactive software is the main vehicle for the second. Following a discussion of these, a sample course outline for a business curriculum is included at the end of this section.
11 There is a growing body of literature providing suggestions and discussions of teaching issues concerning statistics. A sample of this literature is included in the references (e.g., Bentley 1992; Cobb 1992, 1993; Hogg 1990; Landwehr 1993; Rossman 1992; Snee 1993; Wardrop 1992; Willett and Singer 1992). A common theme in this literature is the need to focus on the use of statistics via cases and experiments, even if this comes at the expense of reduced emphasis on technical material. The effect of this theme is becoming more visible in statistics textbooks and courses aimed at non-specialists.
12 To be more specific, classroom discussions would primarily focus on substantive application issues, and technical material would be used only in supporting the discussions. Cases and experiments would include applications relevant to the students' field of study as well as applications involving generic activities like the gathering of data (e.g., questionnaire design) and hands-on experiments for teaching randomness and variability. These applications and hands-on experiments are important means of making statistics real for the student (Chatterjee and Hawkes 1995; Cobb 1993; Halvorsen and Moore 1991; McKenzie 1992; Roberts 1992; Sylwester and Mee 1992). They can also facilitate the learning of technical tools which are linked to the applications under discussion.
13 Suppose we are interested in the question "Is gender a significant factor in determining salaries?" At the outset, the discussion would establish that a proper investigation of this question involves a sequence of other questions. It is necessary to define the scope of inquiry by deciding whether it is to be investigated in a particular organization or a larger group of organizations. Next, we must determine the specific tasks required. This brings up the need for data and identification of its source(s). Other questions follow naturally: If we can gather our own data, should we collect it from a sample or an entire population? If we are to sample, how should the sample be selected and how large a sample do we need? What are the potential problems in designing and carrying out a sampling plan? If a survey is needed, how will it be conducted and how should the questionnaire be designed? Would it be proper to look at data for one year or should we look at a longer horizon? These discussions would emphasize that collection of good data is a difficult task that must be done carefully. Just because we have access to some data does not mean we can draw meaningful conclusions from them.
14 Assuming that we are able to gather representative salary data, discussions can turn to analyses. A good way to begin is to look at various summaries and comparisons using descriptive statistics like averages, ranges, percentiles, standard deviations, and histograms. If we have salary data collected over time, we should also look at it using time series plots. At a later point, we may ask if variability in salaries is attributable to gender, after taking into account other factors such as experience, education, and type of job. Without considering these lurking (confounding) factors, it is possible to reach incorrect conclusions, and this possibility should be demonstrated. We may then ask how we can conceptualize the relationship between salaries and other relevant variables, for example via regression modeling, and discuss what needs to be done if we want to establish a causal relationship rather than mere correlation. This leads to the need for carefully designed experiments in which the effects of lurking variables are controlled before causation can be claimed.
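One way to demonstrate how a lurking factor can produce an incorrect conclusion is with synthetic data in which the answer is known in advance. In the sketch below, every number is made up for illustration: salaries are generated from experience alone, the two groups (labeled "A" and "B") differ only in their typical years of experience, and the function names are hypothetical. The adjustment used here is a simple comparison within experience levels, standing in for the regression modeling mentioned above.

```python
import random
import statistics
from collections import defaultdict


def simulate_salaries(n_per_group=2000, seed=0):
    """Synthetic records: salary depends on experience only, never on group."""
    rng = random.Random(seed)
    records = []
    # Assumed experience ranges (in years) for the two groups.
    for group, (low, high) in {"A": (5, 15), "B": (0, 10)}.items():
        for _ in range(n_per_group):
            years = rng.randint(low, high)
            salary = 30000 + 2000 * years + rng.gauss(0, 3000)
            records.append((group, years, salary))
    return records


def raw_gap(records):
    """Mean salary of group A minus group B, ignoring experience."""
    a = [salary for group, _, salary in records if group == "A"]
    b = [salary for group, _, salary in records if group == "B"]
    return statistics.fmean(a) - statistics.fmean(b)


def adjusted_gap(records):
    """Average A-minus-B gap within experience levels that both groups share."""
    by_level = defaultdict(lambda: {"A": [], "B": []})
    for group, years, salary in records:
        by_level[years][group].append(salary)
    gaps = [
        statistics.fmean(level["A"]) - statistics.fmean(level["B"])
        for level in by_level.values()
        if level["A"] and level["B"]
    ]
    return statistics.fmean(gaps)


if __name__ == "__main__":
    records = simulate_salaries()
    print(f"Raw gap (A - B):      {raw_gap(records):10.0f}")
    print(f"Adjusted gap (A - B): {adjusted_gap(records):10.0f}")
```

The raw comparison suggests a large group effect even though none was built into the data; comparing like with like within experience levels makes most of the apparent gap disappear.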
15 The new design would emphasize communication skills by requiring students to write reports and to participate in class discussions on a regular basis. One kind of report could be weekly write-ups that summarize what was learned during the week's discussions. These summaries would be short (e.g., a one-page memo), but they provide opportunities for practicing concise writing and making sense of the discussions. Students would also be required to write some longer reports to present an assigned case analysis or experiment. Course materials would include guidelines, instructions, and examples for these writing assignments. Written work would be graded on the clarity of writing and its content.
16 Course materials in the new design would consist of a course manual and teaching software. The manual would contain four kinds of materials: (1) cases or experiments to be discussed in the classroom and to be used for student projects, (2) brief explanations of technical material appended to each case, (3) instructions on using the teaching software, and (4) guidelines for and examples of report writing for effective communication.
17 In the new design, the development of basic technical skills is envisioned to take place significantly, albeit not entirely, outside the classroom through the use of integrated teaching software. To complement the instructions in the course manual, the course outline would contain target dates and deadlines for passing certain technical skill tests that would be administered and graded using the software. Students would have multiple opportunities to pass any given test. Students would thus be able to work at their own pace, within limits, to learn the material and pass the tests. In terms of relative weight, written work and class participation would receive at least two-thirds of the total weight of all grading instruments, and computerized tests of technical skills would receive the remaining weight.
18 Although the use of computer software is common today, typical statistics software is meant for computation and not for teaching. The kind of software we envision would be a more complete teaching tool by which basic technical skills can be learned at each student's own pace, and significantly outside the classroom. Capabilities of the software would include helping the student learn technical material, practice computational skills, conduct data analyses, and take tests in specific topics. This kind of software is presently unavailable, although electronic textbooks are becoming more common. Incorporating this technology into the curriculum can increase learning productivity and free up an important portion of the classroom time that is currently spent on technical details.
19 The software would have three main modules for (1) learning specific topics, (2) administering tests, and (3) data analysis. Ease-of-use through a consistent graphical interface would be a primary requirement. The user would be able to switch between modules at any point without extensive navigation or having to close one module in order to open another. Each module would have context-dependent help with hypertext links to related subjects and to other modules.
20 The learning module would be arranged by topics in a way similar to current textbooks: data summarization, the normal distribution, sampling distribution of the mean, confidence intervals, and so on. It would make judicious but not excessive use of multimedia facilities for effective presentation as in Ferris and Hardaway (1994). The test module would be linked to the learning module so that the student can easily move from one to the other in any selected topic. The student would also be able to repeat a test if so desired, but the test would be modified automatically each time it is administered. When the student demonstrates an acceptable level of proficiency, test results could be made available to the professor electronically.
21 Capabilities of the data analysis module would be similar to the student editions of currently available statistics software such as Minitab, SPSS, Systat, and StataQuest. It would contain all of the standard tools that may be used in introductory courses, but advanced features like factor analysis and discriminant analysis could be optional. It would, however, include the capability to conduct simulations so that the user can investigate the effects of different assumptions on the results of analyses.
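The kind of simulation envisioned for this module can be sketched as follows. The example is an assumption-laden illustration, not a specification of the software: it estimates how often a nominal 95% large-sample confidence interval for the mean actually contains the true mean when the sample is small, first for a normal population and then for a skewed (exponential) one. The function names and all parameter values are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value


def coverage(draw, true_mean, sample_size=10, reps=4000, seed=0):
    """Fraction of nominal 95% intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(sample_size)]
        center = statistics.fmean(sample)
        half_width = Z_95 * statistics.stdev(sample) / math.sqrt(sample_size)
        if abs(center - true_mean) <= half_width:
            hits += 1
    return hits / reps


def draw_normal(rng):
    """One observation from a normal population with mean 1 and sd 1."""
    return rng.gauss(1.0, 1.0)


def draw_skewed(rng):
    """One observation from an exponential population with mean 1."""
    return rng.expovariate(1.0)


if __name__ == "__main__":
    print(f"Normal population: {coverage(draw_normal, 1.0):.3f}")
    print(f"Skewed population: {coverage(draw_skewed, 1.0):.3f}")
```

Changing `sample_size` or swapping the population lets the student see directly how far the actual coverage can drift from the advertised 95% when the assumptions behind the interval are not met.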
22 It should be emphasized that this software is not simply an electronically stored version of a textbook embellished with multimedia bells and whistles. This kind of software is already available in the marketplace for many technical subjects including basic mathematics, algebra, calculus, physics, chemistry, accounting, and even statistics. In these multimedia textbooks, information is presented in a more or less fixed sequence, and the student must learn by absorbing whatever information is presented on the computer screen. A typical test in this kind of software consists of questions for which each answer is either correct or incorrect, and a wrong answer merely produces a message such as "Incorrect. Try again." In the interactive software we envision, the sequence of events would be controlled by the student. For example, the student could decide what material to study on a topic, how long to study, or how many examples to study. It would be possible to jump from one module to another at any point. A test in this software would not simply classify each answer as right or wrong, but it would offer options like "Here is why" and "Show me," which could be activated by the student.
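To make the contrast concrete, the following sketch shows one possible shape for a test item that is regenerated on each attempt and that can explain itself on request. Since the software is yet to be developed, everything here is hypothetical: the `Question` class, the functions `make_mean_question` and `check`, and the choice of a sample-mean exercise are assumptions made purely for illustration.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class Question:
    prompt: str
    data: list
    answer: float


def make_mean_question(attempt, student_id="demo"):
    """Build a 'compute the sample mean' item; the attempt number seeds the data."""
    rng = random.Random(f"{student_id}-{attempt}")
    data = [rng.randint(1, 20) for _ in range(5)]
    return Question(
        prompt=f"Find the mean of {data}.",
        data=data,
        answer=statistics.fmean(data),
    )


def check(question, response, tolerance=0.01):
    """Return (is_correct, explanation); the explanation serves as 'Here is why'."""
    is_correct = abs(response - question.answer) <= tolerance
    total = sum(question.data)
    terms = " + ".join(map(str, question.data))
    why = (
        f"Add the values ({terms} = {total}) and divide by "
        f"{len(question.data)} to get {question.answer:g}."
    )
    return is_correct, why


if __name__ == "__main__":
    for attempt in (1, 2):
        question = make_mean_question(attempt)
        is_correct, why = check(question, response=10.0)
        print(question.prompt)
        print("Correct." if is_correct else f"Incorrect. Here is why: {why}")
```

Seeding the generator with the student and attempt number means a repeated test will typically present different numbers, while any particular attempt can still be reproduced for grading or review.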
23 Our belief that the interactive software outlined above would be a useful teaching tool is based on two basic findings in cognitive science (see Stillings et al. 1995 for a good introduction). First, people differ in the way they are able to learn complex skills; what fits one person well may not be suitable for another person. For example, one student may learn best by being given general directions on how to do a task, whereas learning by examples may be more suitable for another student. By their nature, classroom lectures cannot be individualized to the needs of each student, and traditional textbooks are even more static sources of instruction. Well-designed software can substantially eliminate these constraints.
24 Second, complex tasks generally require longer to learn than simple tasks. According to Schank (1986), the level of understanding developed through learning falls somewhere on a scale from "making sense" at the lower end, to cognitive understanding in the middle, and "complete empathy" at the other end. The process of moving towards cognitive understanding of complex tasks requires more intense and more sustained attention and more instances of reminding with examples. For many non-specialist students, technical material in statistics can be abstract and complex. In learning this material, students may need readily available explanations and examples. They also need to use what they have learned through practice problems and to be given immediate corrections and explanations when they make mistakes. The traditional lecture-textbook combination is not conducive to this process because time is limited, and the information given by the instructor or the textbook is not always what each student needs at the time. It is possible, for example, that an instructor who answers one student's question may be wasting the time of many other students. Well-designed interactive software can provide more timely access to information, and thus allow students to learn more productively. It also allows a more flexible scheduling of the learning activity.
25 The design, development, and testing of the teaching software is a substantial undertaking which requires collaboration among teachers, students, publishers, and software developers. In the past, textbook publishing and software development were separate, independent activities. Although the gap between them has not yet been substantially narrowed, the need to do so is becoming more urgent. When textbook publishers and software developers are able to collaborate more closely, better interactive software teaching tools will become available.
26 For purposes of illustration, an integrated sample course outline for an undergraduate business curriculum is given below. Clearly, the outline would be somewhat different in other fields of study. The given outline is for a two-quarter course sequence with 11-week quarters. Two class meetings per week are assumed, each session lasting approximately 90 minutes. Class sessions in certain weeks consist almost entirely of discussions of case problems, and this is indicated in parentheses. In other sessions, shorter cases or examples, hands-on experiments, and computer simulations would be used.
Part I. Data Collection and Preliminary Analyses
(Approximately 5 weeks)
Week 1. The need for data and statistical analyses in answering real-world questions. Examples and discussion.
Week 2. Data collection process: What data to collect and how to collect it. Surveys, experiments, and sampling.
Week 3. Designing a questionnaire: Examples.
Week 4. Data summarization and preliminary analyses (case analysis). Target date for completing computerized tests of descriptive statistics.
Week 5. Data summarization and preliminary analyses (case analysis).
Part II. Distributions of Data
(Approximately 3 weeks)
Week 6. Information provided by a probability distribution. The normal distribution.
Week 7. Population, sample, and sampling distributions (simulations). Target date for completing computerized tests of probability distributions.
Week 8. Distributions of sample statistics and statements about errors due to chance.
Part III. Drawing Conclusions from Sample Data
(Approximately 3 weeks)
Week 9. Confidence interval estimation and the margin of error.
Week 10. Using confidence intervals for testing hypotheses (case analysis). Target date for completing computerized tests of estimation and hypothesis testing.
Week 11. Confidence intervals in quality control: Control charts (simulations).
Part IV. Comparing Populations
(Approximately 4 weeks)
Week 1. Detecting a significant difference between two populations (case analysis).
Week 2. Establishing if a proposed treatment would make a difference: Controlled experiments.
Week 3. The need for randomization and experimental design in controlled experiments. Target date for completing computerized tests of analysis of variance.
Week 4. Contingency table analyses.
Part V. Regression Modeling and Forecasting
(Approximately 7 weeks)
Week 5. Discerning and modeling relationships among variables (case analysis).
Week 6. What is a linear model and what information does it really provide?
Week 7. Estimating a model and interpreting the estimates. Using the model for prediction (case analysis).
Week 8. Factors relevant in model specification (simulations). Target date for completing computerized tests of regression analysis.
Week 9. Regression models for time series data (case analysis).
Week 10. Forecasting models for time series (case analysis).
Week 11. Problems and pitfalls in modeling (case analysis). Deadline for completing computerized tests of time series models.
27 It is not realistic to suggest that the same course outline could be compressed into a one-semester course. Rather than attempting to cover the same topics at a more rapid pace, we believe that some topics would have to be deleted from this outline. Certainly, what should be deleted is subject to some discretion as well as controversy. In an undergraduate business curriculum, our choices for deletion would all be among the topics covered in the second quarter. Specifically, we would suggest deleting the material of weeks 1 through 4, 8, 10 and 11 in a single 15-week course.
28 Many of the recommendations here are yet to be developed, tested, and implemented. Undoubtedly, some recommendations would be modified in the light of real experience, and new ideas leading to further improvement would be discovered in the process. We also have not discussed assessment issues that must be addressed to ensure the quality of teaching (Garfield 1994; Wiggins 1992; Zahn 1991). The suggested approach would require more effort from the teacher and the student, but given the right materials and tools, the increased effort can be more productive as well as more interesting.
29 Since the proposed software tool is yet to be developed, we are not able to report data on real experience with the suggested course design. On the other hand, emphasis on real-world applications is becoming more prevalent in non-specialist statistics courses at many colleges and universities, including our courses at Northeastern University. Consequently, certain features of the new design, such as a heightened focus on data (as opposed to technical tools) and emphasis on writing skills, are already being implemented. Although evaluations of the changes by students and faculty appear to be positive, we believe that incorporation of new technology will generate even greater improvements.
Bentley, D. L. (1992), "Investigational Statistics: A Data Driven Course," in Proceedings of the Section on Statistical Education, American Statistical Association, pp. 119-122.
Chatterjee, S., and Hawkes, J. S. (1995), "Statistics and Intuition for the Classroom," Teaching Statistics, forthcoming.
Cobb, G. W. (1992), "Teaching Statistics," in Heeding the Call for Change, ed. Lynn Steen, MAA Notes No. 22, Washington: Mathematical Association of America, pp. 3-23.
----- (1993), "Reconsidering Statistics Education: A National Science Foundation Conference," Journal of Statistics Education [Online], 1(1). (http://jse.amstat.org/v1n1/cobb.html)
Ferris, M., and Hardaway, D. (1994), "Teacher 2000: A New Tool for Multimedia Teaching of Introductory Business Statistics," Journal of Statistics Education [Online], 2(1). (http://jse.amstat.org/v2n1/ferris.html)
Garfield, J. B. (1994), "Beyond Testing and Grading: Using Assessment to Improve Student Learning," Journal of Statistics Education [Online], 2(1). (http://jse.amstat.org/v2n1/garfield.html)
Halvorsen, K. T., and Moore, T. L. (1991), "Motivating, Monitoring, and Evaluating Student Projects," in Proceedings of the Section on Statistical Education, American Statistical Association, pp. 20-25.
Hogg, R. V. (1990), "Statisticians Gather to Discuss Statistical Education," Amstat News, No. 169, 19-20.
----- (1991), "Statistical Education: Improvements Are Badly Needed," The American Statistician, 45, 342-343.
----- (1992), "Report of Workshop on Statistical Education," in Heeding the Call for Change, ed. Lynn Steen, MAA Notes No. 22, Washington: Mathematical Association of America, pp. 34-43.
Landwehr, J. M. (1993), "Project STATS: Statistical Thinking and Teaching Statistics," Amstat News, No. 196, 25.
McKenzie, J. D., Jr. (1992), "The Use of Projects in Applied Statistics Courses," in Proceedings of the Section on Statistical Education, American Statistical Association, pp. 142-146.
Mosteller, F. (1988), "Broadening the Scope of Statistics and Statistical Education," The American Statistician, 42, 93-99.
Roberts, H. V. (1992), "Student-Conducted Projects in Introductory Statistics Courses," in Statistics for the Twenty-First Century, eds. Florence Gordon and Sheldon Gordon, MAA Notes No. 26, Washington: Mathematical Association of America, pp. 109-121.
Rossman, A. J. (1992), "Introductory Statistics: The 'Workshop' Approach," in Proceedings of the Section on Statistical Education, American Statistical Association, pp. 352-357.
Schank, R. C. (1986), Explanation Patterns: Understanding Mechanically and Creatively, Hillsdale, NJ: Lawrence Erlbaum Associates.
Snee, R. D. (1993), "What's Missing in Statistical Education?" The American Statistician, 47, 149-154.
Stillings, N. A., Weisler, S. E., Chase, C. H., Feinstein, M. H., Garfield, J. L., and Rissland, E. L. (1995), Cognitive Science (2nd ed.), Cambridge, MA: MIT Press.
Sylwester, D. L., and Mee, R. W. (1992), "Student Projects: An Important Element in the Beginning Statistics Course," in Proceedings of the Section on Statistical Education, American Statistical Association, pp. 137-141.
Wardrop, R. L. (1992), "A Radically Different Approach to Introductory Statistics," Technical Report No. 889, University of Wisconsin, Department of Statistics.
Wiggins, G. (1992), "Toward Assessment Worthy of the Liberal Arts," in Heeding the Call for Change, ed. Lynn Steen, MAA Notes No. 22, Washington: Mathematical Association of America, pp. 150-162.
Willett, J. B., and Singer, J. (1992), "Providing a Statistical 'Model': Teaching Applied Statistics Using Real-World Data," in Statistics for the Twenty-First Century, eds. Florence Gordon and Sheldon Gordon, MAA Notes No. 26, Washington: Mathematical Association of America, pp. 83-98.
Zahn, D. A. (1991), "Getting Started on Quality Improvement in Statistics Education," in Proceedings of the Section on Statistical Education, American Statistical Association, pp. 135-140.
Mustafa R. Yilmaz
College of Business Administration
Boston, MA 02115