Teaching Statistics: Making It Memorable

Eric R. Sowey
University of New South Wales

Journal of Statistics Education v.3, n.2 (1995)

Copyright (c) 1995 by Eric R. Sowey, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.

Key Words: Deep learning; Long-term learning.


An overriding goal of teaching is to stimulate learning that lasts. A way to achieve this is, surely, to make teaching memorable. By asking "what makes teaching memorable?", this paper identifies a number of fundamental characteristics of statistics teaching that will assist students in long-term retention of ideas. It structures these attributes of memorable statistics teaching and then shows, with examples, how they can be realised. The author's reflection on his extensive teaching experience underpins this paper.

1. Introduction

1 Very often teaching and learning statistics at university seem to be activities directed primarily to students' passing assessments and gaining paper credentials. This tends to be the case especially in service courses, where, indeed, the vast majority of statistics students are found. Once such an orientation becomes entrenched, longer term objectives may be regarded by teachers or students (or both) as unattainable.

2 If, however, we wish to promote enhanced quantitative thinking, analysis, and practice in statistics-using professions, then an overriding goal of all teaching ought to be to stimulate learning that lasts.

3 In this paper I draw on over 20 years' experience in university teaching at every level, and in classes ranging in size from 5 to 500, to offer suggestions on ways of teaching statistics so that ideas will be long remembered (see background comment in endnote 1). I take it for granted that this requires a certain level of student understanding of those ideas, for understanding is essential to long-term learning.

4 Teaching so that ideas will be long remembered is what I shall here call making teaching memorable.

2. What Makes Teaching Memorable?

5 I take as my inspiration a striking aphorism attributed to the eminent psychologist, B.F. Skinner: Education is what remains when the facts one has learned have been forgotten. (See background comment in endnote 2.)

6 This leads me to ask: what does remain by way of statistical knowledge when the facts one has learned have been forgotten?

7 From conversations on education I have had over the past 20 years with former students, both soon after and long after they studied statistics or econometrics subjects with me, two points have crystallised. What remains with them, when the facts they have learned have largely been forgotten, is a sense of the structure of the subject and a sense of the worthwhileness of the subject (see background comment in endnote 3). Synthesising student comments, I interpret these two characteristics in the following way.

8 Structure is a reflection of the coherence of the subject and its presentation to students, and is often seen best in a perspective view, that is, from a vantage point above the fine detail of the subject matter. Worthwhileness is an amalgam of intellectual excitement, resilience to challenging questioning, and demonstrated practical usefulness. To make statistics memorable, these attributes of its structure and worthwhileness need to be consciously conveyed to students.

9 Before exploring how this may be done, I note that other attributes of good teaching are commonly mentioned. Such attributes as establishing rapport with students and providing clear explanations in class are, in the main, bound up with personal qualities of the teacher. Such qualities often make the teacher memorable. But I am focusing here on the principal factors that will make the subject memorable. These factors are not, by and large, personal qualities of the teacher, but rather aspects of the teacher's own conception of and approach to the discipline. My reading of the statistical education literature suggests that such factors are much less commonly discussed.

10 Reverting now to my theme: five important attributes of the discipline need to be brought out in teaching statistics, to help ensure that students will firmly grasp its structure and be convinced of its worthwhileness. Here, before I go into details, is a schematic view of these attributes and their inter-relation.

STRUCTURE is the vital cognitive (`rational') aspect of the discipline.
(i) Coherence in exposition can reveal structure in three different ways: through theme coherence, pattern coherence, and knowledge coherence.
(ii) Perspective in presentation can reveal the merits of a coherent exposition.

WORTHWHILENESS is the vital affective (`emotional') aspect of the discipline.
(iii) Intellectual excitement stimulates the student. It is evoked by: seeing scope for advancing the subject; observing the teacher's interest in the subject; and the student's own discovery of the subject (especially when findings are surprising).
(iv) The discipline's resilience to challenging questioning reassures the student. Reassurance comes from a clear picture of both the strengths and weaknesses of the discipline and an appreciation of how the former outweigh the latter.
(v) Demonstrating practical usefulness implies career prospects that can fulfill the student.

11 Given that there are strong linkages among these attributes, how can memorable teaching make the most of those linkages? This, too, is a question I shall presently address.

3. Structure

12 A hallmark of memorable teaching is to make students aware of structure in the details of a subject. It is structure that is the bridge between knowing things and understanding them, and it derives, as I have said, from all the elements of coherence in the subject. Structure is best grasped from a perspective view.

3.1 Coherence

13 To teach statistics coherently means to identify related elements in the discipline and to explain the elements in a way that highlights the relationships. In Sowey (1991) I have enlarged on this theme, identifying three dimensions of coherence which I call theme, pattern, and knowledge coherence.

14 Theme coherence is a smooth traverse in explaining (a) the theoretical basis of statistical technique (going from assumptions to conclusions, and from simple cases to more complex ones), and (b) the transition from theory to practice (e.g., from explaining the theory of stratified sampling to arranging for students to conduct and analyse a survey in the field). The key concept is continuity of logic within each theme.

15 Pattern coherence arises from the existence of common characteristics, or patterns, in the structure of statistical theory across different areas of the discipline. The key concept is underlying unity in disparate procedures.

16 Knowledge coherence comes from a view of statistics as woven into the tapestry of all human knowledge. A good approach to this kind of coherence is contained in the papers collected in Tanur et al. (1989). The key concept is integration of statistics with its source and auxiliary disciplines (logic, probability, computing), its cognate disciplines (biometrics, psychometrics, econometrics, cliometrics, etc.) and its myriad disciplines of application in the sciences, social sciences and humanities.

17 Most textbooks aim at theme coherence in their exposition, but it is not always done with success, especially in the transition from theory to practice. On the latter count, Chatfield (1988) is a notable exception. By contrast, pattern coherence is highlighted almost exclusively in advanced texts. This is unfortunate, for there are ample instances that intermediate texts could point to (e.g., the commonality of form in simple and multiple regression estimators, made apparent via the use of matrix notation; and the commonality of structure in regression, experimental design, and variance component models within the general linear model). Knowledge coherence, intrinsic to making sense of the world from the statistician's standpoint, is hardly mentioned in the textbooks at all. To the statistics student, the mosaic of publications, each covering a particular interdisciplinary area of statistical application, is no substitute for a single broadly integrated picture.
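The first of these instances can be written out briefly. What follows is a sketch in standard textbook notation; the symbols X, y, and the hatted beta are assumptions for illustration, not notation taken from this paper.

```latex
% Multiple regression: X is the n x k regressor matrix, y the n x 1 response.
\hat{\beta} = (X'X)^{-1} X'y

% Simple regression is the special case X = [\,\iota \;\; x\,],
% where \iota is a column of ones. The same formula then reduces to
\hat{\beta}_1 = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i}(x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```

Seen this way, the familiar slope and intercept formulas are one instance of a single matrix expression, not a separate result to be memorised.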

18 Because textbooks are generally unsupportive of teaching that pays regard to all aspects of coherence in statistics, the teacher needs to assume greater responsibility for the task. To this end, skill in stimulating students' intuition, a capacity for lateral thinking about the scholarly literature, and a broad general knowledge of the place of the discipline in the world of learning will stand the teacher in good stead.

3.2 Perspective

19 A perspective view woven into the exposition from time to time brings students at least three benefits: (a) it helps them chart their progress through the syllabus; (b) it promotes understanding of the coherence of the subject; (c) it makes clear what parts of the area under study are not currently being treated in detail. In other words, it lets students know what it is that they don't yet know.

20 Constructive though this sounds and is, many teachers do without regular pauses for perspective in their exposition. The effect is to make student learning rather like cartography in the days before aircraft -- slow, tedious, disjointed and, ultimately, very imperfect!

4. Worthwhileness

21 Conveying a feeling for the worthwhileness of statistics is in every sense a key element in memorable teaching, for it is both a key motivator of students to learn well in class and the key to effective self-directed learning in the future. Three principal qualities of the discipline need to be brought out so as to engender this sense of worthwhileness.

4.1 Intellectual Excitement

22 David Attenborough, Stephen Jay Gould, Julius Sumner Miller, David Suzuki, Naomi Wolf: these names are readily recognised. What have these people in common? Each has the gift of talking about his/her discipline in an intellectually exciting way. I have asked myself how they generate that excitement and have carried over my conclusions to statistics.

23 Intellectual excitement about statistics, I can confirm from practice, will grow from teaching where

(a) students see the discipline as one of central importance, but one in which not everything is yet settled. Intriguing questions that are still largely unanswered (e.g., how to define `probability' in a way that unifies the disparate approaches of objectivists, subjectivists, and axiomatists; judging the quality in small samples of asymptotically optimal inferential procedures; and deciding the optimality of standard estimators when used after model selection via a data-directed specification search) should certainly be raised when the moment is apt, whether or not a solution is being offered.
(b) the teacher's enthusiasm for and commitment to the discipline are evident. The impact on students of these qualities cannot be overstated, though by themselves they may do no more than ignite students' interest. To keep the fire burning, the teacher needs to show others why the subject merits their enthusiasm and commitment, as well. The review journal Statistical Science can provide much material for this purpose.
(c) some striking demonstrations are introduced that will arouse students' curiosity and provoke reflection. Three well-proven teaching devices can serve here: to present an arresting example (especially one that relates to students' own life experiences), to challenge students to resolve a paradox, and to guide them to an unexpected discovery. Here are some instances.

24 Early in my introductory statistics course I want students to appreciate something rather counter-intuitive: that, from bits of information acquired quite randomly (random sample elements), something that is both structured and appealing (an inference about a population) can be derived. I make this notion come alive by means of a playful musical metaphor: I demonstrate a sampling experiment using Mozart's (1793) Musical Dice Game. This musical curiosity presents a "treasury" of 176 detached bars of piano music. Sixteen bars are selected randomly from a table of bar numbers by sampling according to the sum of the faces of two rolled dice. Though the bars are seen to have been drawn quite at random, when they are strung together the "composition" that results is, to the students' surprise, not only musically well-structured but also, indeed, quite harmonious and appealing! This demonstration, I have found, puts my point across in a striking way.
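The sampling mechanism of this demonstration can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 11-by-16 layout of the lookup table is inferred from the figures above (11 possible dice sums, 16 bars, 176 bar numbers), and the table contents are a sequential placeholder, not the arrangement in the published score. All function names are hypothetical.

```python
import random

BARS_PER_PIECE = 16              # positions in the finished "composition"
DICE_SUMS = list(range(2, 13))   # the 11 possible sums of two dice


def build_table():
    """Return a placeholder lookup table: dice sum -> 16 bar numbers.

    Assumption: bar numbers are assigned in sequence, 1..176.
    The published score arranges them differently.
    """
    table = {}
    next_bar = 1
    for total in DICE_SUMS:
        table[total] = list(range(next_bar, next_bar + BARS_PER_PIECE))
        next_bar += BARS_PER_PIECE
    return table


def roll_two_dice(rng):
    """Sum of the faces of two fair six-sided dice."""
    return rng.randint(1, 6) + rng.randint(1, 6)


def compose(rng):
    """Pick one bar number for each of the 16 positions by rolling the dice."""
    table = build_table()
    return [table[roll_two_dice(rng)][position]
            for position in range(BARS_PER_PIECE)]


if __name__ == "__main__":
    # Each run prints a different random sequence of 16 bar numbers.
    print(compose(random.Random()))
```

Because the sum of two dice is not uniformly distributed, bars in the rows for middling sums (around 7) are drawn far more often than those for 2 or 12, a point that can itself prompt class discussion.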

25 Counterexamples to statistical propositions also make arresting examples in statistics courses at every level (see, for example, Romano and Siegel (1986)), as do some of the more bizarre misuses of statistics in the daily press. Another lively idea comes from Mansfield (1989). Many further possibilities can be drawn from articles in The American Statistician, the Journal of Statistics Education, and Teaching Statistics.

26 Paradoxes abound in statistics and, if aptly posed, can stimulate lively and memorable discussion. Here is an example. A regression model containing an exact linear dependence among k regressors cannot be estimated by ordinary least squares unless one regressor is deleted. However, by transforming to (k-1) principal components and then unscrambling the OLS regression on principal components, one has an estimated model that contains (surprise!) all k original regressors. Where is the catch? More subtle paradoxes are found in Szekely (1986). Among these are two that I find particularly effective with second year undergraduates: Bernstein's paradox (p. 13) and Simpson's paradox (p. 58).
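For teachers who want to let students probe the regression paradox numerically, here is a minimal sketch for the smallest case, k = 2, using only the Python standard library. The data are hypothetical, and the helper names are invented for illustration. The comments at the end give one possible reading of the catch; the paper itself leaves the question open for discussion.

```python
import math


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def leading_eigenvector(matrix, iterations=50):
    """Power iteration for the dominant eigenvector of a small symmetric matrix."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        v = [value / norm for value in w]
    return v


def principal_component_fit(x1, x2, y):
    """Regress y on the single non-degenerate principal component of (x1, x2),
    then map the coefficient back to the two original regressors.

    Assumes centred data and an exact dependence between x1 and x2,
    so that X'X has rank 1 and only k - 1 = 1 component carries variation.
    """
    xtx = [[dot(x1, x1), dot(x1, x2)],
           [dot(x2, x1), dot(x2, x2)]]
    v = leading_eigenvector(xtx)
    z = [v[0] * a + v[1] * b for a, b in zip(x1, x2)]
    gamma = dot(z, y) / dot(z, z)          # OLS slope of y on the component
    return [gamma * v[0], gamma * v[1]]    # "unscrambled" coefficients


if __name__ == "__main__":
    # Hypothetical centred data with the exact dependence x2 = 2 * x1.
    x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
    x2 = [2.0 * value for value in x1]
    y = [-3.9, -2.1, 0.2, 1.8, 4.0]

    beta = principal_component_fit(x1, x2, y)
    b_single = dot(x1, y) / dot(x1, x1)    # OLS with x2 deleted

    print("coefficients on x1 and x2 via the component:", beta)
    print("coefficient on x1 alone:", b_single)
    # One reading of the catch: only the combination beta[0] + 2 * beta[1]
    # is determined by the data. The component route silently adds the
    # restriction beta[1] = 2 * beta[0] to pick one of infinitely many fits.
    print("beta[0] + 2 * beta[1] =", beta[0] + 2.0 * beta[1])
```

Both routes give identical fitted values; they differ only in how the one estimable quantity is shared out between the two regressors.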

27 Unexpected discoveries that students make themselves have the strongest impact on learning and remembering. The remarkable meaning and generality of the Central Limit Theorem, for instance, are rarely so fully appreciated as by students who have performed computer sampling simulations summarised in real-time graphics, with samples of ever-increasing size drawn from a variety of symmetric and asymmetric unimodal populations, and even some bimodal populations.
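A text-only version of such a simulation can be sketched as follows. It is an illustration, not the software used in the author's classes: the three populations, the sample sizes, and the use of skewness and excess kurtosis as numerical stand-ins for the real-time graphics are all assumptions made here.

```python
import random


def draw_exponential(rng):
    """Asymmetric unimodal population (mean 1)."""
    return rng.expovariate(1.0)


def draw_triangular(rng):
    """Symmetric unimodal population on [0, 1] with mode 0.5."""
    return rng.triangular(0.0, 1.0, 0.5)


def draw_bimodal(rng):
    """Equal mixture of two normal populations centred at -2 and +2."""
    centre = -2.0 if rng.random() < 0.5 else 2.0
    return rng.gauss(centre, 0.5)


def sample_means(draw, n, replications, rng):
    """Means of `replications` independent samples, each of size n."""
    return [sum(draw(rng) for _ in range(n)) / n for _ in range(replications)]


def skewness(values):
    """Moment coefficient of skewness; 0 for a symmetric distribution."""
    m = sum(values) / len(values)
    m2 = sum((v - m) ** 2 for v in values) / len(values)
    m3 = sum((v - m) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Moment coefficient of kurtosis minus 3; 0 for a normal distribution."""
    m = sum(values) / len(values)
    m2 = sum((v - m) ** 2 for v in values) / len(values)
    m4 = sum((v - m) ** 4 for v in values) / len(values)
    return m4 / m2 ** 2 - 3.0


if __name__ == "__main__":
    rng = random.Random(1995)
    populations = [("exponential", draw_exponential),
                   ("triangular", draw_triangular),
                   ("bimodal", draw_bimodal)]
    for name, draw in populations:
        for n in (1, 2, 10, 50):
            means = sample_means(draw, n, 2000, rng)
            print(f"{name:12s} n={n:3d}  skewness={skewness(means):+.3f}  "
                  f"excess kurtosis={excess_kurtosis(means):+.3f}")
```

As n grows, both shape measures of the sample means should drift towards zero for every population, which is the numerical counterpart of watching the histogram settle into a bell shape.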

28 A similar impact is found when students investigate the geometry of the standard normal curve. They are generally astonished to discover that, if the curve were to be drawn so that the ordinate at z = 6 is 1 mm high, then the ordinate at the mode would be 65.7 km (yes, kilometers) high. This particular fact is interesting, in my view, because it does several things, depending on what kind of student is involved. It helps nonmathematical students understand intuitively how a curve open at both tails can nevertheless contain a finite area. It convinces students of modelling (as opposed to simply informing them) that the normal distribution is a poor probability model unless virtually all observations are found within three standard deviations from the mean. It sharpens the contrast that can be theoretically shown with the t-distribution and other so-called "fat-tailed" distributions, which are considered more relevant for modelling certain kinds of data. Finally, it underlines the fact that conventional textbook sketches of density functions sometimes fail to convey important truths.
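The arithmetic behind the 65.7 km figure is easy for students to check for themselves. The sketch below uses only the Python standard library, and the function names are hypothetical. The ratio of the ordinate at the mode to the ordinate at z is exp(z^2 / 2), which for z = 6 is exp(18).

```python
import math


def normal_density(z):
    """Ordinate of the standard normal curve at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def mode_height_km(z, height_at_z_mm=1.0):
    """Height of the curve at the mode, in km, if the ordinate at z
    is drawn `height_at_z_mm` millimetres high."""
    ratio = normal_density(0.0) / normal_density(z)   # equals exp(z**2 / 2)
    return ratio * height_at_z_mm / 1_000_000         # 1 km = 1,000,000 mm


def two_tail_probability(z):
    """P(|Z| > z) for a standard normal variable Z."""
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    print(f"height at mode: {mode_height_km(6.0):.1f} km")
    print(f"P(|Z| > 3): {two_tail_probability(3.0):.4f}")
```

The second function shows how little probability lies beyond three standard deviations, which supports the modelling point made above.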

4.2 Resilience to Challenging Questioning

29 I think of resilience under challenge as a defining characteristic of a university discipline. Displaying that resilience should, moreover, be part of the teaching of all such disciplines. Ideally, the challenging questions will come from students, but if not, the teacher ought to ask those questions.

30 The purpose is to elicit critique of disciplinary foundations and theoretical conventions, commentary on limitations of techniques and controversial interpretations, and scrutiny of the worth of claimed achievements. These are all, in the philosophical sense, methodological issues. The result is that a realistic three-dimensional picture of the discipline emerges. Some examples here are the legitimacy of seeking to establish the direction of causation between two variables by statistical means (see Jacobs et al. (1979) and Conway and Swamy (1984)); and the equivocal attitude of some statistical practitioners to systematic sampling, to data mining, and to persisting with elaborate analyses of data known to be flawed.

31 My experience confirms that offering students (especially senior students) a methodological view of statistics develops their skill in critical thinking and, by involving them, acts as a powerful catalyst to long-term learning.

32 Notwithstanding the benefits, it is, regrettably, uncommon to find support for methodological discussion in academic syllabi and textbooks. What might account for this? I have heard a number of explanations including "It is more important to cover substantive techniques that give students a marketable skill," and "Methodology is a waste of time -- it never settles anything." Such explanations too easily dismiss the subtlety of motivation to long-term learning.

33 The merits, for long-term learning, of a methodological approach to statistics carry over in large measure also to teaching which acknowledges the history of ideas in the discipline. By retracing just a few of the often bumpy research paths of statistical pioneers, students can gain a sense of the pains and pleasures of creative thinking. What's more, they can come to appreciate the resilience, or otherwise, of the discipline to the challenging questions of an earlier generation of thinkers. Peters' (1987) text is the first to move in this direction.

4.3 Demonstrated Practical Usefulness

34 Students need more than simply an assurance that statistics is important in the real world to be motivated to learn and retain statistical ideas. It is the demonstration of practical usefulness that convinces.

35 How can one go about such a demonstration? Illustrative practical examples, presented in the course of teaching theory, are the most obvious device. But these are hardly sufficient because they are contrived for a different primary purpose and so are usually unrealistic. More broadly constructive are assignments, based on real-world problems, in which students play the active role in formulating a verbally-stated problem in statistical terms, and must struggle with ill-conditioned data, decide for themselves the most appropriate technique to apply, cope with unanticipated analytical obstacles, and finally write a professional report on their investigations. Thoughtful prompts from the teacher can nudge students along if they find themselves unsure of how to proceed at any point.

36 This kind of problem-based learning by discovery is a powerful way to cement understanding, as the literature on using case-studies in teaching confirms. Another way to achieve some of the same outcomes with senior students is to give them a role as rostered statistical consultants to less-trained users of statistics on campus.

37 Most strikingly effective is to bring students into contact with statistical practitioners in government, business, and industry, through one or more of these initiatives: field trips into the workplace; "sandwich" programs that slot a substantial involvement in work experience into the middle of a degree course; a visiting speaker seminar that is melded into the teaching program. Hahn and Schmee (1987) make an interesting proposal to take further the idea of a visitor from industry.

5. Linkages

38 I have been looking separately at five attributes of memorable teaching. Each has a place in every kind of statistics course, but to a varying degree. For example, an introductory service course might emphasise theme coherence and practical usefulness; an intermediate service course theme and knowledge coherence, intellectual excitement, and practical usefulness; an intermediate specialist course theme and pattern coherence and intellectual excitement; and an advanced specialist course all the dimensions of coherence and disciplinary resilience.

39 Synergy, it is important to recognise, can develop from jointly emphasising these various elements. Students who respond to the discipline's intellectual excitement are, on that account, likely to be more motivated to seek out pattern coherence in statistical theory and to challenge disciplinary conventions. The teacher should always be alert to possibilities of fostering and forging such vital links.

40 All the examples I have just given have a feature in common. Each unites one attribute from the "structure" group with at least one attribute from the "worthwhileness" group. The former group represents the wholly rational inputs into learning and remembering, while the latter encompasses emotional influences. It is in this way that linkage can make its most valuable contribution to the teaching of statistical ideas.

41 If it is done well, students will experience statistics as both logical and pleasurable. When their minds and feelings have both been engaged, the ideas they have learned will prove indelible.


This paper is a revised version of a paper (Sowey 1994) presented at the First Scientific Meeting of the International Association for Statistical Education in Perugia, Italy, in August 1993. I am grateful for the comments of three anonymous referees.


Endnotes

1. Academic tradition expects that I will provide independent corroborative evidence for my views, most desirably from formal experimental studies of what promotes student understanding and long-term learning in statistics. I have not found any detailed studies on this theme. Two very interesting recent papers (Jolliffe 1991, Nitko and Lane 1991) confirm that experimental research in this area is in its infancy.
2. The form of words which I have used in the text and which I first heard many years ago turns out to be a pithier version of what, I have recently discovered, Skinner actually wrote (in the New Scientist, 21 May 1964, p. 484): "It has often been remarked that an educated man has probably forgotten most of the facts he acquired in school and university. Education is what survives when what has been learned has been forgotten."
3. This empirically-founded insight is quite parallel to Ericksen's (1985, p. 30) dictum "Successful lecturing promotes the two basic conditions for learning and retention: motivation and meaning." My "structure" parallels meaning, and "worthwhileness" parallels motivation.


References

Chatfield, C. (1988), Problem Solving: A Statistician's Guide, London: Chapman and Hall.

Conway, R. K., and Swamy, P. A. V. (1984), "The Impossibility of Causality Testing," Agricultural Economics Research, 36, 1-19.

Ericksen, S. C. (1985), The Essence of Good Teaching, San Francisco: Jossey-Bass.

Hahn, G. J., and Schmee, J. (1987), "Practitioner and Academician Co-Teaching: An Idea to Consider," in Proceedings of the Statistical Education Section, American Statistical Association, pp. 199-203.

Jacobs, R. L., Leamer, E. E., and Ward, M. P. (1979), "Difficulties With Testing for Causation," Economic Inquiry, 17, 401-413.

Jolliffe, F. R. (1991), "Assessment of the Understanding of Statistical Concepts," in Proceedings of the Third International Conference on Teaching Statistics, Vol. 1, ed. D. Vere-Jones, The Hague: I.S.I., pp. 461-466.

Mansfield, E. R. (1989), "Teaching Elementary Probability Like Magic," in Proceedings of the Statistical Education Section, American Statistical Association, pp. 87-88.

Mozart, W. A. (1793), Musikalisches Würfelspiel [Musical Dice Game] (Köchel catalogue Appendix No. 294d), published posthumously, Schott Edition No. 4474, issued 1989.

Nitko, A. J., and Lane, S. (1991), "Solving Problems Is Not Enough: Assessing and Diagnosing the Ways in Which Students Organise Statistical Concepts," in Proceedings of the Third International Conference on Teaching Statistics, Vol. 1, ed. D. Vere-Jones, The Hague: I.S.I., pp. 467-474.

Peters, W. S. (1987), Counting for Something: Statistical Principles and Personalities, New York: Springer.

Romano, J. P., and Siegel, A. F. (1986), Counterexamples in Probability and Statistics, Belmont: Wadsworth.

Sowey, E. R. (1991), "Teaching Econometrics as Though Coherence Matters," in Proceedings of the Third International Conference on Teaching Statistics, Vol. 2, ed. D. Vere-Jones, The Hague: I.S.I., pp. 321-331.

Sowey, E. R. (1994), "Teaching Statistics: Making It Memorable," in Proceedings of the First Scientific Meeting of the International Association for Statistical Education, eds. L. Brunelli and G. Cicchitelli, Perugia, Italy: University of Perugia, pp. 401-408.

Szekely, G. J. (1986), Paradoxes in Probability Theory and Mathematical Statistics, Dordrecht: Reidel.

Tanur, J. M., Mosteller, F., Kruskal, W. H., Lehmann, E. L., Link, R. F., Pieters, R. S., and Rising, G. R. (eds.) (1989), Statistics: A Guide to the Unknown (3rd ed.), Belmont: Wadsworth.

Eric R. Sowey
Department of Econometrics
University of N.S.W.
Sydney, N.S.W. 2052
