Copyright (c) 1993 by David S. Moore, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.
Frederick Mosteller is a member of the Department of Statistics in the Harvard Faculty of Arts and Sciences, Director of the Technology Assessment Program in the Harvard School of Public Health, and Roger I. Lee Professor of Mathematical Statistics, emeritus. This interview was conducted at the Harvard Department of Statistics on December 18, 1992.
1 Moore: You got your Princeton Ph.D. in 1946.
2 Mosteller: Yes, I started in (I think) '39, and then the war came.
3 Moore: You were a student of Sam Wilks, with a large assist from John Tukey, I gather.
4 Mosteller: That's right. Right at the end of the war, Sam was very busy and Tukey spent a lot of time with me, just when he was going from topology into statistics.
5 Moore: Very shortly after that, in 1948, Sam Wilks published his book Elementary Statistical Analysis. In the preface he thanks you, along with Al Tucker and John Tukey, for helpful discussions. What was your involvement with that book?
6 Mosteller: Well, my recollection is that I taught that course once or twice, and Sam had another course that had a slightly higher level of mathematics required. I taught both those courses and I helped him with the manuscripts of his books because as we taught from the books of course we found difficulties, as in any text. I think my biggest contribution to this little blue book called Elementary Statistical Analysis was to think of the thumbtack as a device for giving a fixed but unknown probability.
7 Moore: That's a famous innovation. I didn't realize it was one of yours.
8 Mosteller: I was proud of that, though perhaps others had thought of it earlier. I created some of the homework problems, so we sat and talked a long time about how to get something like this. I solved at one time or another all the problems in the book and, as I say, taught the course several times.
9 Moore: That book seems to me to be perhaps the progenitor of the next generation of elementary statistics texts. It stands in strong contrast to the research methods books by Fisher and by Snedecor (later Snedecor and Cochran), which were important texts at the time, although their origins go back before the war. Those books had essentially no probability at all in them, although they introduced distributions as needed. But Wilks' book, out of 280 pages, spends 100 pages on probability. It has a very careful probability-based approach, with a lot of emphasis on making sure the student understands the distinction between sample and population. I'd like to hear your reflections on why that seemed to be the right approach to statistics -- not just to probability, but to statistics -- in those days.
10 Mosteller: Statistics was being taught by mathematics departments. I think the kind of course -- practical course -- that we often teach in a statistics department, or in a psychology or social science department, didn't fit, intellectually. That is, the mathematics department felt the need that there be a mathematical basis underneath what was taught. And so, without the probabilistic and algebraic (or calculus) base, I think that the mathematics department felt that the course wouldn't be appropriate. I'm sure Wilks felt that way. Wilks was enormously interested in applications, was willing to work on them, and often did. Still, he felt that the mathematics was the real foundation to statistics.
11 Moore: You then became involved, especially through Continental Classroom, in the writing of books at a similar level. You helped write several books, of which the longest lasting was Probability with Statistical Applications, which you wrote jointly with Rourke and Thomas. It's a fine book, I might say. In fact, we still use it for an elementary probability course at Purdue, even though it's long out of print. So presumably you shared Wilks' opinion on the importance of a probabilistic foundation for teaching statistics.
12 Mosteller: Yes. Again, I think a little more history will help. Wilks was on a committee, called the Commission on Mathematics, for the College Entrance Examination Board. Al Tucker, a mathematician -- a topologist -- was the chair of that commission, and I was a member of it, as were George Thomas and Robert Rourke. What was significant about it was that it had members from every level of mathematics teaching, all the way from schools of education to elementary and secondary schools, and college and university teachers as well. Nearly every committee or commission in the field of mathematics since the early 1900s has recommended much the same thing, namely, that there be statistics taught in the elementary and secondary schools. But it hadn't happened. One of the things we on the commission did was to produce a book called Introductory Probability and Statistical Inference: An Experimental Course. Many of us wrote it together. This course -- this book -- was written with the cooperation of elementary and secondary school teachers, as well as teachers from colleges and schools of education. They all had in common an interest in mathematics teaching and so that slant remained.
13 Moore: Mr. Rourke was a secondary teacher.
14 Mosteller: Mr. Rourke was a secondary teacher, and he was one of three or four people who cooperated very heavily in the writing of that text. I remember it very well: I had a nice chapter on rank correlation methods. It was a beautiful chapter and every time I rewrote it, the secondary school teachers would work it over and tell me it wasn't satisfactory yet. And after revising it twelve times, we threw it away. It does not appear in this introductory book. The book sold very well. Indeed, it sold so well that the College Entrance Examination Board decided to stop selling it because it was going to get them into all kinds of problems about taxation. They weren't supposed to be in the textbook business, and here they had a book that was selling like hotcakes. So, after the commission finished its work, Rourke and Thomas and I decided that we would prepare a text. The next two or three years we worked on that text and just as we finished it up, the idea of my teaching Continental Classroom came up. We were able to adapt the book that we were writing, Probability with Statistical Applications, to the Continental Classroom course by just chopping it down a little bit and entitling it Probability and Statistics.
15 So the Continental Classroom course really grew out of the work on the Commission on Mathematics and my first experience with these many very fine elementary and secondary school teachers. The course itself relied on heavy interaction between the administrators of the course and school systems. Each school system had its own courses and its own teachers. They held regular section meetings, they administered standard tests that were given to them if they wished, and credit for the courses was given by the secondary school and also in college. Gottfried Noether at Harvard extension school ran the course for us, more than once I believe. (At that time Gottfried was a professor at Boston University. He also helped to develop the lesson plans for the TV course.) There were hundreds of thousands of people watching the course and, among those, many thousands who were taking the course for credit, either in secondary school or in colleges and universities.
16 Moore: Continental Classroom in retrospect was quite remarkable. Here was a course that assumed that people would wake up at 6AM regularly -- this was in the days before video recorders, so you couldn't simply set your recorder to wake up at 6AM -- turn on their television sets, and watch someone teach them statistics. No computer graphics, no location shooting, though it's true that there was a master teacher. It seems that the heirs of that are enterprises like the National Technological University, which is aimed at graduate engineers who have a strong professional interest and often are seeking a credential which is valuable to them. It's quite remarkable in retrospect that a large number of people around 1960 would arise at that hour of the morning to watch educational television.
17 Mosteller: Yes, and I suppose some of those people were in hospitals and didn't have anything much to do (laughter). Phil Hauser told me that he was. It's true -- they did get up early. For two years they had to get up quite early. Later it was reshown in the afternoon, at 2:00 or 4:00 in the afternoon in New York City, for example. Finally there were tapes and some universities used the tapes to teach courses at the university, rather than in connection with an actual television broadcast.
18 It's always seemed to me a shame that we didn't have better visuals. I tried as hard as I could to have visual aids, but one of the reasons that they wanted to have a mathematics course was that it was going to be inexpensive (laughter). We wouldn't have to have all the extra things that cost money, like a chemistry course or a physics course might have. We tried as best we could to make little visual aids, toys, railroad cars and so on. But we didn't have any serious photography from outside, everything was from inside the system.
19 Continental Classroom should have continued. There wasn't any reason at the time why there shouldn't have been a national college of the air. But it didn't happen. What happened instead was that Sunrise Semester, which was on one network, and Continental Classroom, which was on another network, each tried to more or less corner this market and then because of disagreements among different groups of people, it all fell apart, as occasionally happens. And so I think that the companies decided to give it up.
20 Moore: There are studies of the success of the Open University in England which claim that one major reason for that success was that England had for a long time a relatively closed university system. There was a substantial demand for post-secondary education which was not being met, and that opened the way for the Open University. In the U.S., almost everyone can find a post-secondary institution which will be happy to accept them, so perhaps the demand for distance learning is somewhat different.
21 Mosteller: Yes, it may be. Our book was used in the Open University. That was Probability with Statistical Applications, perhaps a slightly shortened version. So they also gave a course based partly on it.
22 Moore: I'd like to ask you to reflect a little bit on the experience of teaching on television and perhaps also on what seems in retrospect to be the optimistic assessment of what television could accomplish for education that characterized that era. In the early and mid 1960s, television was the great technological hope. Here is a quote from Time Magazine: ``Not only is a taped professor as informative as a live one, but he seldom turns sour and never grows weary of talking.'' There was actually a feeling that taped teaching by master teachers would replace live teachers on campus as well as taking advantage of the reach of broadcast television. Do you have any reflection from your own experience about that era and the hopes that were attached to television?
23 Mosteller: Yes. I think that the description you've just given certainly reflected some views of the times. What they never took into account, however, was the problem of the professors back at the university. They wanted to teach too, and they didn't want to play second fiddle to somebody on a screen. It's very likely that a course taught on television, because of the careful preparation, will be better organized lecture by lecture than the usual lecture in class, but it does have a lack of flexibility. We were, however, quite capable of having lectures Monday, Wednesday, and Friday and sessions about the problems on Tuesday and Thursday. Paul Clifford presented these, and he had better visuals than I had. And of course the students also had sessions with live teachers at their university or at their school. But still, the idea that the professor was going to be satisfied back at the university to just follow someone else -- that I think was unrealistic.
24 On the other hand, the idea that certain materials can be expressed better in a TV session seemed to me to be right, and still can be right. I think that the expanded ability to produce material that has more visual content than anything we were able to put together adds a lot more interest to the course. I think too that with all that added visual interest -- as was the case with many of the programs in your course, David -- people who are not taking the course could get a lot more out of it than they could get out of a course like mine, where we stuck very closely to the work and didn't reach outside the classroom much. [The course referred to is the Annenberg/Corporation for Public Broadcasting telecourse Against All Odds: Inside Statistics.]
25 Moore: It's also true that the audience is different in the sense that potential students now have grown up with television to an extent that still wasn't true in the early 1960s. So they're accustomed to a higher level of production values and they may not be as patient in simply sitting and listening as was the case then.
26 Mosteller: I'm sure you're right. The level of production is so much better now than anything we had then. Even then what we were doing didn't match the kind of production that was being put on in situation comedies, for example, or even in the daily news.
27 Moore: In talking about the first 15 years or so of your career, we've set the stage for looking at the changes in statistical education that have taken place in the subsequent 30 years. We've introduced technology by talking about television; we've talked about a probability-based approach to statistics; we've talked about teaching in math departments as opposed to statistics departments. All these are things that have changed a lot since, say, 1965. Not too long after this you became involved with data analysis -- not only the techniques, but the philosophy of exploratory data analysis originated by John Tukey. You collaborated with Tukey on the ``green book'' on regression, and you have been editing with Tukey and David Hoaglin a series of books on data analysis. I think three are now out, is that correct?
28 Mosteller: Yes, three.
29 Moore: And your most recent introductory text, Beginning Statistics, with Fienberg and Rourke, is quite a strong contrast to your earlier texts. It is data-oriented, and formal probability is treated in an optional chapter. I'd like to hear you reflect on ``How I changed my mind,'' how you moved from the formal probability approach to teaching statistics to the exploratory and data-oriented approach that seems to characterize your more recent elementary writing.
30 Mosteller: The history of that is, of course, a little complicated. I came to Harvard University in the Department of Social Relations. And there I met many graduate students in anthropology, clinical and social psychology, and sociology and came up against dozens and dozens of practical problems. At the same time, I was also working with Dr. Henry Beecher, an anesthesiologist at the Massachusetts General Hospital, on problems of anesthetics and analgesics. Consequently, I was heavily data-oriented, both in the social and medical sciences at that time, and I had had a great deal of experience in data analysis during the war. I found that the students were really not very interested in the logic of statistical analysis. They were eager to get their hands on tools that would help them solve the problems that they had with their doctoral or undergraduate dissertations. Most of them had concrete problems, most of them had data problems, and they were eager to use statistics to solve their own problems. They were not very interested in knowing how somebody invented a statistical device, nor did they care to have it proved to them personally that the method was appropriate or worked. They were rather eager to get on with doing the analysis and trying to interpret it, and then bringing it back to patients or theory of social science, or anthropology, or whatever it might be.
31 An outsider may not realize that in many practical situations a very few standard statistical techniques are used over and over again in much the same way. Consequently these techniques become well understood by the practitioner, even if they seem difficult at first.
32 So, in the classroom as opposed to in the writing of the textbook you speak of, I found that the students I was teaching didn't have much interest in the underpinnings that we in math departments often found very valuable. So I got to thinking that there might be a way to teach a course in statistics using primarily data. My dream in 1955 was to take the great sets of data from the history of social, biological, and physical science and use that collection as a basis for teaching statistics. And so, when I had a sabbatical and went to the University of Chicago for a year, that was the plan I had -- to write a book that would do that. Indeed, my sabbatical was being paid for by that branch of the Ford Foundation that supported teaching. I tried to do it, and I failed. I worked very hard at it, but I just couldn't write this textbook with the statistical equipment that I had at the time. The kinds of data that I had available were quite varied, and were not always amenable to the kinds of statistics that I thought that I was going to teach the students. Though I didn't realize it at the time, what we didn't have was exploratory data analytic methods which could be used on a great variety of kinds of data. Now, it seems to me, I probably could produce a book like this using the same examples that I failed on originally. I was pretty embarrassed about that. I did do a lot of other work at the time, and I think we finished a book, but on another topic. But I didn't ever succeed in that particular book. Later, I'll tell you how I tried to make up for it.
33 Moore: Have you changed your mind again, or do you still believe that exploratory investigation of data is, for most students at the beginning, the proper way to approach statistics?
34 Mosteller: Yes, I do still believe that. I believe that students are very interested in findings from the data and are willing to work hard on it, and so I think data-oriented statistical teaching is a good idea. I have written a book -- Biostatistics in Clinical Medicine -- with colleagues on statistics for physicians, and it tries to orient itself toward teaching the course from the point of view of the problems that physicians have -- problems of diagnosis, problems of treatment, problems of different dosage levels, problems of tests and the conflicts between tests that are carried out. Also how to read statistical literature and know what it's talking about. So that course is oriented in a different way from our usual statistics course which tends to teach about statistical topics such as means and variances and regression and analysis of variance. It's more oriented toward the way the practical people in the field think about the subject matter that they're working with.
35 Now, for the general situation where the students may not be so far along and may not have a specialty like the physician, it seems to me you still can try to interest them by picking topics that have in themselves a lot of serious interest -- problems like the effectiveness of capital punishment, or its ineffectiveness as the case may be, whether integrating the school systems led to improvements in performance of the pupils, and on and on. Problems that have popular appeal.
36 Moore: And before you know it, you're talking about the statistics of the Kinsey report, and the National Halothane Study, and all the other things you've done.
37 Mosteller: Yes. All those things feed in. The National Halothane Study led to Bishop et al. Discrete Multivariate Analysis. So you get a chance to show students lots of problems, and they then take an interest in the devices that can be used to drag information out of the statistics, and they hope to use the devices. Currently, as you know, there's a movement into that approach to teaching statistics: Just pick one problem and use that problem area, say capital punishment or the effectiveness of capital punishment as a deterrent to crime. Have the students look up the data, read the papers, try to understand the analyses and the devices used for them, and through that mechanism learn about the original questions, then the design of the studies, then the gathering of data, the analysis of the studies and interpretation, and finally one step further, which is how to apply the information found to policy. This last is very difficult and very different from what most people think. Usually people think (and many statisticians tend to think) that once good data are available, then the answer to the policy question is at hand. But that usually is not true, because policy implies politics, and politics implies controversy, and the same data that some people use to support a policy are used by others to oppose it. So it's very difficult to handle policy questions, but nevertheless data does help the debate.
38 Moore: It's also true on issues of social policy that collecting good data is very time consuming. If a study is designed to answer this year's policy questions, for example, about the effects of reformed welfare programs, five years from now when the data are finally available the questions have changed and the data don't directly address current questions anymore.
39 Mosteller: That's true. On the other hand, I do feel that good data are likely to be valuable sometime again. It's rather surprising that some questions never really die, and they don't fade away either. I'm still finding current references to the National Halothane Study, even to its data. They come back again and again and again. I think a good set of data often nails down a corner of a system and does a lot of good. Even though it may seem that the data didn't answer all of the questions, that's because no data ever answers all of the questions, and especially they don't settle the policies. But they contribute a lot to the discussion and they often force the reformulation of the whole question.
40 For example, it seems to me that the first Coleman report on equality of educational opportunities did this. The study was geared to find out whether, under a system that was intended to be equal in opportunity but separate, equality of opportunity had been made available. It seemed as if it had, in the sense of equipment and training of teachers for the minority groups and for the majority group. But it turned out from examining the children that the minority groups still were falling behind the majority group in their personal achievement. More or less at that moment, with those data, the country changed its mind. It had had an idea of equal but separate and it found out that the achievement wasn't anywhere near equal. And it decided that it wasn't so interested in the concept of equality of opportunity as it was in the concept of equality of achievement. So instead of being pleased with itself for having achieved something like equality of opportunity, or, perhaps more realistically, of facilities, it changed its mind and decided it wanted equality of outcome. And since then, that's what we've been struggling for. I think there's an example where a good and interesting set of data, which didn't tell us how to fix education at all, did change the question. It settled the point that we had equality in facilities about as near as one could hope, and also showed that we didn't have anything like equality in the outcomes. And the nation decided it wasn't satisfied. It changed what it was trying to achieve.
41 Moore: The Coleman report, by the usual statistician's habit, leads me to think about multiple regression, which leads me to think of the ``green book.'' Could you tell me how you came to write Data Analysis and Regression with John Tukey?
42 Mosteller: It had a long history, starting with Robert Bush. Bob Bush and I wrote for the first edition of the Handbook of Social Psychology a long piece on statistics for psychologists. In that piece we explained a lot of things, such as how to combine data from several sources, and we introduced many ideas of nonparametric methods and popularized these methods (I think) for that group of scientists. When it came time to revise the handbook, Gardner Lindzey, the editor, invited John Tukey and me to write the new chapter to replace the old Mosteller and Bush chapter that had appeared in the first edition. We wrote quite furiously, and after a while we had hundreds of pages of material, but Lindzey didn't seem to want to devote a whole volume to our chapter (laughter). Consequently, we had a great deal left over. And so we looked at this pile of material and decided we should go on and make a book. In the course of that, we introduced things like the jackknife and certain exploratory data-analytic methods, and then we decided to turn to regression because that seemed to be an area that was growing in popularity and interest for data analysts. We had many concerns about it because there seemed to be so many ways to have misunderstandings about analyses based on regression. Many people hope that they can derive from regression equations the equivalent of proof of causality. Sometimes they probably can. Both John Tukey and I are very worried about that aspect of regression and we think there are a lot of difficulties in regression. We tried to explain what those difficulties were.
43 Moore: You used wonderful phrases like ``the woes of regression coefficients'' to get that across.
44 Mosteller: I hope those were objective pieces of writing (laughter).
45 Moore: It is my impression, which you may wish to correct, that the green book, although widely known and widely respected, is not widely used as a text.
46 Mosteller: Oh, I'm sure you're right about that. I think of it really as a monograph, with a pretty good exposition of the jackknife methods, which were quite new at the time, and then a strong discussion of regression and an introduction to robust methods and to use of transformations. It's not a very systematic development of the areas that you might want to teach in a second course on statistics. I taught it a couple of times, and graduate students in statistics were very, very interested in it. I'm not so sure how well it goes over for students who have not had quite a bit more statistics than just a one year non-mathematical course. And I think that's part of the problem. The book tries not to use much mathematics, yet somehow without it showing there is a lot of mathematical underpinning.
47 Moore: It's also true that statistical concepts are intellectually sophisticated in their own right. Students don't find them easy even when they are not presented mathematically. One can't judge how approachable a statistics book is by simply looking at its level of mathematics.
48 Mosteller: That's a very good point, David, and you do well to make it.
49 Moore: The theme of our conversation right now is ``What has changed since 1965?'' We've talked about content -- about the coming of exploratory data analysis and the return to an emphasis on data in teaching statistics. I'd like to raise the issue of what other changes in content have come about in basic instruction, or perhaps haven't come about but that you think might be appropriate. For example, at the professional level in statistics Bayesian approaches are much more common now than they were in 1965. Now, I personally feel there are good reasons not to use the Bayesian approach in teaching statistics to beginners, but don't let that influence you (laughter).
50 Mosteller: Well, you know David Wallace and I got interested in Bayesian methods and puzzled why it was that they weren't more widely used. So we launched on the study of the authorship of the disputed Federalist papers with an intent to use Bayesian methods to try to get an answer to the historical problem of who wrote the disputed papers, Madison or Hamilton, or else sort the disputed papers between the separate authors. We were going to use Bayesian methods to find out how to do it. What we found out was that Bayesian methods weren't ready. We had to invent Bayesian methods every step of the way. We were not in a position to do routine statistical things; instead, we had to do something clever (if I can use such a word in honor of our work) or something innovative in every corner.
51 Although you can, like Jimmie Savage, feel that every new statistical problem has to be solved with new statistics, I think the reverse must really be true. Someone like Whitehead said something like ``New ideas, like cavalry charges, should be saved for great occasions'' (laughter). One wants routine methods that can be applied over and over, and we have them in the t tests, in the analysis of variance, and in regression, in spite of what we said earlier about regression. Consequently, when Bayesian methods are tried on new problems one is distressed to find that the equipment is not ready. It doesn't mean that the Bayesian methods aren't a good thing to use, but it's hard to use the techniques unless they've been worked out in enough detail that you feel comfortable and confident about them. So I think one reason Bayesian methods have not been used so much is because they have not been adequately routinized. On the other hand, now that we have heavier computational facilities it does seem that we are in a better position to use Bayesian methods in routine ways for fairly standard problems. So it may be possible now, as it was not then, to lay out in many instances solutions through a Bayesian approach to routine problems.
52 One of the things we found in the Federalist work was that, although everybody always worries a lot about the priors, in fact in the data analysis the data distributions rather than the prior distributions mattered enormously to us. We were wanting to use a model in which something simple like the Poisson distribution would represent the data. We found out that it did not fit the data. We needed to use something like a negative binomial, and it made an enormous difference in the odds that we got out. When we used the negative binomial, the odds were something like the square root of the odds we had when we used the Poisson. It may be news to some people that the data distribution is terribly important in Bayesian inference. Prior distributions are always obviously important, but still, maybe much more attention needs to be paid to the data distribution.
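[Editorial illustration, not part of the interview. The point about the data distribution can be made concrete with a small sketch. It is not the Mosteller and Wallace analysis: the word counts, the two authors' mean rates, and the dispersion parameter `size` are all hypothetical values chosen only for illustration. The sketch computes the log odds between two candidate authors under a Poisson model and again under a negative binomial model with the same means.]

```python
import math


def poisson_logpmf(x, mu):
    """Log probability of count x under a Poisson model with mean mu."""
    return x * math.log(mu) - mu - math.lgamma(x + 1)


def negbin_logpmf(x, mu, size):
    """Log probability of count x under a negative binomial with mean mu.

    `size` is the dispersion parameter (hypothetical here); smaller values
    give heavier tails, and large values approach the Poisson.
    """
    return (
        math.lgamma(x + size) - math.lgamma(size) - math.lgamma(x + 1)
        + size * math.log(size / (size + mu))
        + x * math.log(mu / (size + mu))
    )


def log_odds(counts, mu_a, mu_b, logpmf):
    """Sum of log likelihood ratios (author A versus author B) over the counts."""
    return sum(logpmf(x, mu_a) - logpmf(x, mu_b) for x in counts)


# Hypothetical counts of one marker word in four blocks of disputed text.
counts = [4, 2, 5, 3]
# Hypothetical mean rates per block for the two candidate authors.
mu_a, mu_b = 3.0, 1.0

poisson_result = log_odds(counts, mu_a, mu_b, poisson_logpmf)
negbin_result = log_odds(
    counts, mu_a, mu_b, lambda x, mu: negbin_logpmf(x, mu, size=2.0)
)

print("Poisson log odds:          ", round(poisson_result, 3))
print("Negative binomial log odds:", round(negbin_result, 3))
```

[In this toy example the negative binomial log odds come out smaller than the Poisson log odds, because the heavier-tailed model treats each count as weaker evidence. The priors are identical in both runs; only the assumed data distribution differs.]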
53 Moore: In talking about changes, computing has to be the most striking thing that has happened to statistics in the last thirty years. It has completely changed the practice of statistics, and made it in many ways much more accessible to users. Computing has changed the teaching of statistics to some extent, in that (if we're wise) we now automate routine calculations. Yet it strikes me that just as television was the great technological hope of the early 60s and turned out not to have a great influence on education, computing also has not yet fulfilled its promise as a tool for teaching statistics rather than simply speeding calculations. How would you react to that? Also, what do you think are appropriate uses of computing technology in teaching statistics, especially at an elementary level?
54 Mosteller: In the elementary courses I taught here, we routinely did have statistical packages available, and we taught using a few of them with enough depth so that the student could handle homework problems and perhaps thesis problems by using the packages. The big need always is to help the students understand what is going on. I think, in a way, that's always been a difficult problem in statistics, for the student to figure out what the question is, formulate the question, and then manage to pick out a satisfactory method for answering the question. The complaint of the students really was the same as the complaint of the students in the old mathematics courses: that they'd learned a lot of techniques and if anyone gave them a problem they were ready -- except they didn't know which bag of tricks to pick up and use. Well, that same complaint still exists in a time when computers are available for solving the problem. What we may need now is something I don't have, an interactive method to help the student find his way through the many different kinds of devices there are for solving problems and help pick out the appropriate one for the kind of question the student wants to answer. That may not be so hard once someone has decided that this is the kind of help that they intend to give the student. I don't know examples of that, but in principle there is no reason why we shouldn't have it. (It also is not much needed in situations where the same statistical methods recur repeatedly.)
55 Moore: You say ``in principle.'' One principle is that it's always the next generation of technology that is really going to have an impact on education. The current version of that abiding truth is that when what is called ``multimedia'' comes fully into play -- when video, audio, and computing are completely integrated so that one can have a fully interactive system that poses problems and offers some sort of expert guidance, with which the student interacts via keyboard and mouse -- then we'll be able to do much more. I have to express a little skepticism, having seen that first television and then computing have not had the impact on education that we hoped they would.
56 Mosteller: Sure enough. Still, when I look at what's happened in medicine over the last 40 years, I see that very fancy statistics are used widely in the analysis of data from experimental and observational studies. I see lots of techniques used that I myself really don't know about and have to look up in order to decide whether something appropriate has been done when I'm refereeing papers for a medical journal. But the quality of the statistical analyses being carried out in those papers is so far beyond what was being done 40 years ago that it's incredible. The material that appears in many of these papers is almost the equivalent of a doctoral dissertation when I was going to graduate school. This is a consequence of the many changes in statistics -- the computing of course plays a big role, but also the manufacture of a lot more statisticians than there were in those days. Good, practical statisticians are really much more available today than they were at that time. Many of the professions now have a lot more equipment and are able to take advantage of statistics in ways that they couldn't before.
57 Moore: Yes, and computing has clearly played a very large role in that. If I can share a reminiscence, I spent a summer while I was at Princeton working for a distinguished astronomer, using a desk calculator to calculate double star orbits. (At the beginning of the summer, he asked himself whether it was worthwhile to have me learn Fortran so I could use the new IBM 650 computer on campus, but he decided that no, it wasn't.) I computed for weeks and plotted points on a graph. These points, which of course had lots of observational error in them, were supposed to form a parabola. So then I would sharpen a pencil and draw a parabola through the points freehand. The distinguished astronomer was very happy. He said he could tell that I was a mathematics student because the parabolas that I drew freehand actually looked like parabolas. When I came to Purdue in 1967, I related this experience to an older colleague. He said, well, when he had come to Purdue 10 years earlier, he had gone over to consult with some engineers who also had a theory that said that points should form a parabola. They were fitting a parabola by plotting the points on a large piece of graph paper, putting it up on the wall, and hanging a chain through it (laughter).
58 These anecdotes reinforce your comment that the sophistication of scientists, and technologists in general, in computing and statistics and data analysis has advanced almost unimaginably in the last generation. Will this make statisticians to some extent redundant?
59 Mosteller: It doesn't seem to. It seems to make them more needed. Perhaps people are beginning to appreciate enough about statistics that they're beginning to understand when statisticians should be called in. With John Bailar I edited a recent book of a very different kind -- Medical Uses of Statistics -- that was motivated by Arnold Relman, who was the very distinguished editor of the New England Journal of Medicine . He felt that the New England Journal ought to have a role in the education of physicians, and so he wanted a book to be written for them. But his idea of the book that was needed for the physician was very different from the kind of text that we ordinarily write.
60 As I said earlier, we statisticians tend to write on how to do estimation, regression, analysis of variance, and so on. Relman's idea was that that's not what the physician ought to know. He felt that the physician ought to know the state of statistics in the area that he or she works in, what are the big ideas, and why should those ideas be ideas that the physician should have in mind. He was especially insistent that we not tell the physicians how to carry out the statistical analyses. Rather, they were to understand what the purpose of the statistics was, how effective they could be, and what would go wrong if you didn't do the appropriate thing. So he helped us prepare a book, which deliberately does not try to tell the physicians how to do the statistics but does try to tell them something about the state of the art, what can be learned, what kind of troubles there are. His thought is that physicians ordinarily aren't going to carry out the analyses.
61 Much medical research is interdisciplinary work, and physicians are very used to the idea of interdisciplinary work -- they have one person for hearts, and another person for some other activity, and so on. They know they should bring in a statistician at the appropriate moment. Relman wants the physician to understand why the statistician ought to be there and when it would be especially appropriate to be sure the statistician is present and active in the analysis. We tell them that the early participation of the statistician in the design of the investigation will help a lot, and that it's very difficult to rescue a failed experiment through analysis, though once in a while it can be done. And so on. So this book tries to tell, in a ``What's it all about?'' kind of way, the story of statistics for physicians.
62 Mosteller: Now that's a very different idea from the idea that led to Statistics: A Guide to the Unknown . Statistics: A Guide to the Unknown was prepared under the editorship of Judith Tanur and a committee of the American Statistical Association and the National Council of Teachers of Mathematics. When I was president of the American Statistical Association, I was invited by the teachers -- because of my earlier work on the Commission on Mathematics -- to come to Reno and give them a lecture on probability and statistics. I did, and it seemed to me that the time had come for us to join forces. So I suggested we form a committee which would try to do more on statistics and probability for schools and community colleges. They were quite happy to join, and a committee was formed and that committee then met and decided what it ought to do.
63 One of the committee's activities was to try to develop some essays that described ``whole uses'' of statistics. As I've said several times earlier, we teach statistics piecemeal because we teach it by statistical topic. It seemed to me that a difficulty in all of statistical teaching had been that we never show how statistics is used in solving a whole problem; we only use statistics in illustrating some aspect of a problem. So what the committee did was to get a lot of experts in different fields to write little essays on how statistics was used in solving whole problems. For example, one of the problems had to do with formulating a gelatin dessert, deciding just how sweet it should be and just how effective it would be from a nutritional point of view, and so on. The statistics of that were all contained in about a 10-page story. We also had a 10-page story on the Salk vaccine experiments that finally prevented almost all of paralytic polio. In all, we had about 40 or 45 essays that used statistics to solve problems, and that was one attempt to answer the question of how you get around this piecemeal teaching of statistics.
64 Though we wrote the book primarily for high schools, in fact it's largely been used in colleges (laughter). It has been very successful and it's now in its third edition.
65 Moore: About that same time, the ASA/NCTM committee, with perhaps similar motivations, undertook the production of the books called Statistics by Example .
66 Mosteller: Yes, that same group of people added to its membership some additional secondary and elementary school teachers, and together we gathered examples of all kinds and their solutions to form four little books. Those books were actually used in elementary and secondary schools and were quite successful. One of the most interesting problems was written by Ralph D'Agostino, who was then a graduate student here in the Department of Statistics. He wrote his Ph.D. dissertation with me, as a matter of fact. His son, Ralph D'Agostino Jr., is a graduate student in this department right now. The problem Ralph wrote about was: How much does a 40-lb box of bananas weigh? That seems like a strange question. The point of it is that bananas are boxed up in Central America and then they're shipped to the U.S. In the course of shipping, they ripen and change their composition, and consequently they don't weigh the same at the end of the trip as they did at the beginning. The problem is to try to pack them so that they weigh the right amount when you get them to market, and thus certain statistical principles having to do with forecasting or prediction are brought into play.
67 These two projects (Statistics: A Guide to the Unknown and Statistics by Example) seemed to me to be successful alternatives to my original failed idea of teaching statistics through historically important data sets.
68 Moore: You were instrumental in getting ASA involved in education through its cooperation with NCTM, which of course continued and later gave birth to the Quantitative Literacy Project and many other good things.
69 Mosteller: It's been most exciting for me that this effort has continued beautifully, I think without a break, ever since it was originally set up. The ASA has maintained a strong interest in the area all along.
70 Moore: The ASA went through a transformation from being a scholarly society, you might say, to being a professional association. One important part of that transformation was taking a much broader view of its responsibilities, particularly in education, even in education that didn't necessarily pertain to future professional statisticians. You are now president of the International Statistical Institute, and the ISI is also becoming more interested in education. I'd like to hear some reflection on the role and importance of professional society involvement in education. I also wonder, speaking of ISI, if you have any words of wisdom about the benefits or promise of international cooperation in education.
71 Mosteller: We're having quite a set of innovations with respect to education in the ISI. We first formed a committee on statistical education, and more recently we have set up a section within the society called the International Association for Statistical Education. It's what we call a section of the ISI. The committee has been very successful in having international meetings and producing from those meetings books which describe current practices and current crises in teaching statistics.
72 Moore: These are the ICOTS conferences?
73 Mosteller: That's right, these are the ICOTS conferences. I think the next ICOTS conference is to be held in Morocco.
74 Moore: That's true. July of 1994.
75 Mosteller: These conferences have been so well attended and so productive that the Institute wanted there to be a whole section -- a whole society -- devoted to statistical education. People from all over the world are coming to these meetings, and carrying the message back home. This is important for the less developed countries, because they're beginning to get the flavor of the kind of statistics that we're using for data analysis. The tendency in their education systems, like my own original education, has been to bear down on the mathematical aspects of statistics in order to get, shall I say, respectability and sophistication into the statistics. But what they often need are relatively simple statistics so as to analyze the kind of data that they're able to put together. There's a great interest in simplifying methods, and there's a great deal of interest in trying to standardize methods. But we may not be able to do that for countries that are not as computerized as other countries.
76 At a more advanced professional level, one of the kinds of things ISI is trying to do is to make sure that new chiefs of statistics, such as heads of censuses, in various countries are well acquainted with the international organizations that they must deal with. When a new person, inexperienced from the point of being chief of statistics, comes into office, he's invited to visit ISI at The Hague. There he's given quite a lot of information about the international organizations that deal with statistics and he learns how to get through the maze of international statistical agencies. There are many, many ways to help a country with its statistics, but if you don't know whom to ask, you can spend an awful lot of time at the wrong door. This program is being done with the cooperation of the Dutch statistical office. They've been very encouraging and cooperative in that educational area.
77 Moore: In talking about changes in the last 25 or 30 years, we spoke of content and we spoke of technology. It's also striking that in the last few years a general movement for the reform of mathematics education has sprung up, with quite ample funding from NSF and other sources. This movement seems to have as one of its main strands a strong feeling that the style of mathematics education, and perhaps as a corollary of statistics education, ought to change. Students need to be more actively engaged in their own learning; we need to find ways to get the students to be less passive, to be more active in their learning. And so video, to come back to an earlier topic, would now be looked at as having the essential drawback that in its current technological incarnation it's largely passive. I'd like to hear your reaction to this, and to ask more directly if you have thought about changing your own classroom style -- but perhaps your classroom style has always been interactive.
78 Mosteller: I tried to be very interactive with my class, but I did use the lecture system. One thing I found in the last year that I taught was that there was a way to get the students to interact a little better with me. I attended a seminar here that Harvard president Derek Bok had put on for educationists in New England. The purpose was to figure out how to evaluate college education and how to improve college education through evaluation. I learned that certain professors used a device which was called the ``minute paper.'' The word ``minute'' expresses the idea that something is done in a very short time. The idea of the professor who used it first was that he just wanted to find out what the students wanted to know and what they were having trouble with. So I thought that I would try it out. In the last 2 or 3 minutes of the class, I would ask the students to write down what was the most important thing in the lesson and what they'd like to know more about.
79 Moore: They did this anonymously?
80 Mosteller: Yes, unless they wanted to write their name down. So they just wrote down the answer to that question. We did it for a few days, and I prepared responses to questions that seemed to emerge and handed them out, but it all seemed rather bland to me. Of course, I'm an older professor and the students are somewhat polite, so there wasn't much criticism. There weren't any mean questions, or even many remarks. Then I said to the class ``Now we've done this for a week, and it seems pretty bland to me. What is the matter with it?'' One student said it was very simple: I wasn't getting any feedback because I was asking them what was the most important point of the class and I'd already written on the board at the beginning of class the four main points in the lecture. I was not likely to get anything much different from what I wrote down, and what the students wanted to know more about was bound to be more about how to do whatever it was that we'd just done. I wasn't learning much from the minute papers, so I was impressed by this remark. This kid had seen through the whole process.
81 So I said maybe we should have some other kind of question, like ``What is the muddiest point in the lecture?'' The students erupted with applause, and from then on that was the question at the end of the hour. We still also ask them what they want to know more about. That struck a chord with them, and they wrote quite briskly and handed the stuff in and I would write up the handout for the next time and then take up two or three points to begin the next hour. This was very effective, and pretty soon the students were wanting to know why we hadn't been doing this all along. (Because we didn't begin it until nearly the middle of the course.) I had to confess that I had never heard of this idea before, and that I just got it at the seminar that President Bok had created for this very purpose of improving teaching. I thought it might not be very good in a more mathematical course because it seemed to me that one of the things that makes statistics hard is that examples come from all over the place and so you might feel that if you had self-contained mathematical kinds of ideas with a lot of continuity that this trick wouldn't be of much help. But since then I've heard from some people who have tried it out in calculus courses, and they find it works just fine. So I'm pleased with this idea.
82 Moore: It's an excellent idea. I want to ask one other question related to the current reform movement in math education. Going along with this has been a widespread feeling that in research universities, and by osmosis in other types of institutions, the faculty don't consider teaching their primary job. Many of the distinguished bodies -- for example the National Research Council's Committee on the Mathematical Sciences in the Year 2000 -- that have urged the reform of teaching also mention the need to change faculty incentives. Recently the higher administration in places like Cornell and Stanford has to some extent echoed this. You're certainly an example of someone who has not neglected either side of the responsibility to do original scientific work and to pass it on. So I'd be very interested to hear your comments on the somewhat hot issue about what faculty incentives are and whether there's a need for a change.
83 Mosteller: I haven't really felt that way about our teachers in statistics, nor about Harvard generally. Indeed, I started to answer a questionnaire some time ago about how I felt about teaching, and the students, and so on. I finally quit answering the questionnaire because there never seemed to be a place for me to say that I thought very well of the students and admired many of them and was pleased to be associated with them in the work we were doing. I think there may be an idea that students are a nuisance, but I haven't found it so. I found students to be very bright and very helpful -- not always interested in what I'm interested in, but why should they be?
84 Moore: After all, it would be terrible if you had to be interested in everything they're interested in, so that's a fair exchange.
85 Mosteller: Yes. What happens to me is that I often find myself working with a student on some problem or other outside class. Also, I tried to make sure that in our classes there was a time period that was devoted to the student actually engaging in some kind of research, even in the most elementary course. It usually turned out that the student either had research that he or she was doing, or that they were happy to go and pick up a topic and work on it. This has always represented a way for students even in the elementary courses to participate in the idea of research.
86 In the more advanced courses, we have better opportunities because we're all doing quite a bit of research. Dr. Rubin and Dr. Dempster both have relationships with the Census Bureau and with other statistical organizations, and I sometimes have a task to carry out for the Institute of Medicine or some other organization of that kind. Very often these relationships breed problems that give students opportunities to participate. The graduate students almost uniformly have opportunities to participate in some kind of project research. In this department, and I'm sure in many other departments around the country, this seems to lead to a much broader appreciation of statistics on the part of the students. We get a much more rounded student body, when they complete their work with us, than we'd have otherwise.
87 Moore: That brings us to the subject of graduate education, that is, the professional training of future statisticians. A few years ago in Statistical Science there appeared a conversation between you and John Tukey, moderated by Frank Anscombe. One of the things that struck me in reading that was the contrast between what graduate training was like in statistics in Princeton in the mid-1940s and what it is typically like now. You remarked in that interview that there was only a single graduate course in statistics at Princeton, which Sam Wilks taught. You were very heavily involved in project work, perhaps especially during the war, and there was a constant stream of visitors and seminars. It sounds, at least at Harvard, as though graduate study is still much like that, though there are now more courses. Students learn in part by being involved in ongoing research projects. [The article referred to is ``Frederick Mosteller and John W. Tukey: A conversation,'' Statistical Science , 3 (1988), 136-144.]
88 Mosteller: I think that's so. And it's so both here in the Department of Statistics in the Faculty of Arts and Sciences and also in the Department of Biostatistics in the School of Public Health. There the students are heavily involved in clinical trials, in the analysis of experiments being carried out by both biological scientists and physicians. So it is commonplace to have plenty of practical tasks along with course work here. We also have occasional seminars taught jointly with the Department of Psychology, where Robert Rosenthal has always had a great interest in statistics. He's been very close to Don Rubin, who's currently the chairman of this department, and they have a joint seminar on statistical problems in social science that they carry on year after year. Carl Morris has a joint appointment in the medical school and Bernard Rosner's primary appointment is there.
89 We've had occasional big seminars that have contributed something to statistics. The first Coleman report led us to have what we call a faculty seminar. It was a very big seminar with perhaps 100 participants, and it led to a good deal of research in education over a period of time, brought on by the problems the Coleman report addressed. There were difficulties in the analyses, and I think many misinterpretations in the original analysis that was carried out. We had to deal with the interpretation of regression when you take out more and more variables and try to explain how important a particular factor such as teachers or parents might be for explaining school achievement.
90 The most recent such seminar was held at the School of Public Health, and we developed many products. Students from the statistics department joined in that seminar. It produced several books and papers. One was the book by Sharon Anderson et al. on analysis of observational studies. Another was a set of papers that helped to organize the field of assessment of technologies, called Costs, Risks, and Benefits of Surgery . That book was written by a collection of people, some of whom were students in statistics and some of whom were physicians or biostatisticians in the School of Public Health. Again that was an opportunity. We set up that seminar because it seemed to me that our department, the Department of Statistics, was a little bare in practical applications for a certain brief period. The seminar met a need, because it was a time when we began to analyze health services in more detail, when they were setting up a center for the analysis of health practices at the School of Public Health. That made a matrix for study of this question of costs and benefits of surgery.
91 Moore: The style of graduate study that you describe is very different from the style of study in a mathematics department, for example, where there's much less involvement with other disciplines, much less involvement in project work, and where a thesis is more an individual exploration of a theoretical issue. It's well perhaps for statisticians, in thinking of graduate programs in their own field, to keep in mind that contrast of styles.
92 Mosteller: Yes, that's quite right. We do have people who write theses that have a strong subject matter slant, and of course theoretical statistics is basic to theses in this department. But for many theses the analysis of data that emerges from investigations plays a very important role.
93 I might say that in the School of Public Health, they have a different concept of the thesis than the concept we still hold here in the Faculty of Arts and Sciences. There's no rule against us doing what they do over on the other side of the river -- it's just that we don't ordinarily do it. Here, essentially the student picks a problem and writes a little monograph on a topic that leads to new findings or new methods or new theorems in statistics. Students in the School of Public Health write a thesis that typically consists of three papers rather than a single long monograph. The papers are presumably somewhat related. They may all be related to the same disease, or they may all be related to some aspect of statistics, but they are not necessarily interconnected like a book on a single topic. Moreover, the papers are not necessarily written individually by the student. Sometimes they're written with others. The argument made at the School of Public Health is that the student has got to learn to be an interdisciplinary worker, and that short papers are what the student has got to learn to publish. The student will rarely have an opportunity to publish material like the kind of thesis that we write here in Arts and Sciences -- and of course it is typical that our Arts and Sciences student may have to break the thesis into several parts if he or she wants to publish the whole thing.
94 It does seem to me interesting that this other concept of a thesis has emerged on the other side of the river. We talked about it here, and we all agree there's nothing stopping the student from doing that very same thing, except for writing it jointly with other people. We are committed still to the idea that a student writes his or her own thesis, and not as a joint effort. I think the School of Public Health plan is an impressive idea and I've been more and more persuaded about its practicality. One thing is, it keeps them moving. Often students are hung up for years when they write the monographic thesis. By and large, the shorter papers that are being written on the other side of the river are being done in a much more timely fashion. It keeps the student involved in teams that are trying to accomplish something specific and finish it. So I think this idea needs a little more attention on the part of Arts and Sciences instructors.
95 Moore: Statistics is certainly a good place to start an idea like that. As a methodological discipline, it has in a sense inherent ties to other disciplines and to group work.
96 Mosteller: Yes. Almost all statisticians I know do a certain amount of interdisciplinary work.
97 Moore: You are now an emeritus professor at Harvard. What does the future hold for you as far as work with impact on education?
98 Mosteller: I'm the director of a technology assessment group. Some students in statistics or in health policy management come to our group and write their thesis or some part of it with us and learn how to carry out technology assessments. In particular, we do a great deal of research on what's called meta-analysis, that is, ways of putting together data from somewhat disparate sources so as to find out what the common ground is among them. We're developing methods to improve that process and make it more effective, and students seem to be eager to learn to do that. We've had a lot of educational success. Again, it's an example of the student participating with ongoing research.
99 John Bailar and I will be giving some short courses related to the book that I spoke of, Medical Uses of Statistics . That will be another opportunity to help in statistical education for physicians. We -- Hoaglin, Tukey, and I -- continue to write books on statistics. We have put out one book on exploratory analysis of variance. We have material for a second volume in that series, so I hope that will produce something of value to students along with the texts we've done before.
100 Moore: I greatly appreciate your taking the time to talk with me. If I can be a bit unfair, let me put one concluding question. If you had one short word of advice to offer to teachers of statistics less experienced than yourself, what would it be?
101 Mosteller: Be sure to keep plenty of concrete examples in your teaching. And if you can hear a pin drop in your classroom, nobody's following the lecture (laughter).
S. S. Wilks, Elementary Statistical Analysis . Princeton, NJ: Princeton University Press, 1948.
Frederick Mosteller with a group of the Commission on Mathematics of the College Entrance Examination Board, Introductory Probability and Statistical Inference for Secondary Schools: An Experimental Course , preliminary edition. New York: College Entrance Examination Board, 1957.
Frederick Mosteller, R. E. K. Rourke, and G. B. Thomas, Jr., three books; the first is the basic reference, the other two are derived from it by abridgements and slight rewriting: Probability with Statistical Applications . Reading, MA: Addison-Wesley Publishing Company, 1961. 2nd edition, Addison-Wesley, 1970. Probability and Statistics , official textbook for Continental Classroom. Reading, MA: Addison-Wesley Publishing Company, 1961. Probability: A First Course . Reading, MA: Addison-Wesley Publishing Company, 1961.
Frederick Mosteller and David L. Wallace, Inference and Disputed Authorship: The Federalist . Reading, MA: Addison-Wesley Publishing Company, 1964. Now reissued with an additional chapter as: Applied Bayesian and Classical Inference: The Case of the Federalist Papers . New York: Springer-Verlag, 1984.
Edited by Judith M. Tanur; and by Frederick Mosteller, Chairman, William H. Kruskal, Richard F. Link, Richard S. Pieters, and Gerald R. Rising, The Joint Committee on the Curriculum in Statistics and Probability of the American Statistical Association and the National Council of Teachers of Mathematics, Statistics: A Guide to the Unknown . San Francisco: Holden-Day, 1972. 2nd edition, San Francisco: Holden-Day, 1978. 3rd edition, Pacific Grove, CA: Wadsworth & Brooks/Cole, 1989.
Frederick Mosteller, Chairman, William H. Kruskal, Richard S. Pieters, and Gerald R. Rising, The Joint Committee on the Curriculum in Statistics and Probability of the American Statistical Association and the National Council of Teachers of Mathematics, Statistics by Example: Exploring Data , Statistics by Example: Weighing Chances , Statistics by Example: Detecting Patterns , and Statistics by Example: Finding Models . Menlo Park, CA: Addison-Wesley Publishing Company, 1973.
Yvonne M.M. Bishop, Stephen E. Fienberg, and Paul W. Holland, with the collaboration of Richard J. Light and Frederick Mosteller, Discrete Multivariate Analysis: Theory and Practice . Cambridge, MA: MIT Press, 1975. First paper edition, MIT Press, 1977.
J.P. Bunker, B.A. Barnes, F. Mosteller, eds., Costs, Risks, and Benefits of Surgery . New York: Oxford University Press, 1977.
Frederick Mosteller and John W. Tukey, Data Analysis and Regression: A Second Course in Statistics . Reading, MA: Addison-Wesley Publishing Company, 1977.
Sharon Anderson, Ariane Auquier, Walter W. Hauck, David Oakes, Walter Vandaele, and Herbert I. Weisberg, Statistical Methods for Comparative Studies . New York: John Wiley & Sons, 1980.
David C. Hoaglin, Frederick Mosteller, and John W. Tukey, eds., Understanding Robust and Exploratory Data Analysis . New York: John Wiley & Sons, Inc., 1983.
David C. Hoaglin, Frederick Mosteller, and John W. Tukey, eds., Exploring Data: Tables, Trends, and Shapes . New York: John Wiley & Sons, 1985.
Frederick Mosteller, Stephen E. Fienberg, and Robert E.K. Rourke, Beginning Statistics with Data Analysis . Reading, MA: Addison-Wesley Publishing Company, 1983.
John C. Bailar III and Frederick Mosteller, eds., Medical Uses of Statistics . Waltham, MA: New England Journal of Medicine Books, 1986. 2nd edition, 1992.
Joseph A. Ingelfinger, Frederick Mosteller, Laurence A. Thibodeau, and James H. Ware, Biostatistics in Clinical Medicine , 2nd edition. New York: Macmillan, 1987.
David C. Hoaglin, Frederick Mosteller, and John W. Tukey, eds., Fundamentals of Exploratory Analysis of Variance . New York: John Wiley & Sons, 1991.