A Capstone Course for Undergraduate Statistics Majors

John D. Spurrier
University of South Carolina

Journal of Statistics Education Volume 9, Number 1 (2001)

Copyright © 2001 by John D. Spurrier, all rights reserved.
This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.


Key Words: Active learning; Oral communication; Written communication.

Abstract

This article discusses a capstone course for undergraduate statistics majors at the University of South Carolina. The course synthesizes lessons learned throughout the curriculum and develops students' nonstatistical skills to the level expected of professional statisticians. Student teams participate in a series of inexpensive laboratory experiments that emphasize ideas and techniques of applied and mathematical statistics, mathematics, and computing. They also study modules on important nonstatistical skills. Students prepare written and oral reports. If a report is not of professional quality, the student receives feedback and repeats the report. All students leave the course with a better understanding of how the pieces of their education fit together and with a firm understanding of the communication skills required of a professional statistician.

1. Introduction

Undergraduate statistics majors receive broad training in mathematics, applied and theoretical statistics, computing, communication, and general education. However, these students seldom combine the various skills acquired throughout the curriculum with the modest exceptions of using calculus in mathematical statistics and using a statistical computing package in applied statistics. They also have little experience performing experiments or working in teams. These students graduate with a vast array of knowledge but are often very inexperienced in putting the pieces together. They are not ready to make their greatest possible impact in the workplace. We can and must do better!

I have developed a senior-level capstone course in statistics at the University of South Carolina to give undergraduate statistics majors experience in combining their skills and in participating in projects from start to completion. Capstone courses serve as the crowning point of the undergraduate education in several disciplines. These courses synthesize knowledge acquired throughout the curriculum. Capstone courses also offer a natural setting for an exit exam, if one is required, or for an informal assessment of what the students know and don't know near the end of their undergraduate studies. Gardner, Van der Veer, and Associates (1998) contains a series of essays on the use of capstone courses in other disciplines.

The goals and format of the Capstone Course in Statistics are described in Section 2. Course logistics are described in Section 3. Details of the capstone experiences are described in Section 4. The nonstatistical skills modules are described in Section 5. Student reaction to the course is described in Section 6. Lessons learned by the instructor are summarized in Section 7. Section 8 contains concluding remarks.

2. Goals and Format

A goal of the course is to present students with a variety of experiences similar to what they might encounter early in their careers. Some of the experiences are relatively simple, while others are more complex. Each experience requires the students to combine several skills. Most experiences require data collection and teamwork. The experiences convince students that they have the necessary skills to solve important problems.

Another goal is to receive professional-quality performance from the students. Just doing enough to pass is not tolerated. The nonstatistical skills modules on written and oral reports in the text, Spurrier (2000), clearly state the expectations for professional-quality written and oral reports. These expectations are reinforced by requiring students to redo work that does not meet them.

The instructor assumes the role of coach-facilitator in the capstone experiences. The instructor presents the setting and the necessary background information. Students work in teams to choose factors, develop operational definitions, perform experiments, and collect the data. The instructor goes from team to team observing and asking questions to ensure that the teams understand what they are to do, that all students are participating, and that the teams are making the necessary decisions and collecting the data in a timely fashion. The students individually analyze the data and prepare reports. The text guides students through each experience and discusses the nonstatistical skills of technical writing, oral presentations, producing computer-generated visual aids, seeking employment, and providing statistical consulting.

Students receive considerable feedback on all assignments. They can repeat the formal written and oral reports after receiving feedback. This reflects the underlying interest in having the students leave the course with the ability to make professional quality oral and written presentations.

3. Logistics

This one semester hour course is required for the B.S. in Statistics degree. The meeting site is a seminar table in a computer lab. The table is used for class discussions and as workspace for the experiments. Students use computers only briefly during class but extensively outside of class. Micrometers, rulers, and tape measures are available for data collection.

The one semester hour format allows for five or six of the text’s eleven capstone experiences, four of the five nonstatistical skills modules, a formal written report on one of the capstone experiences, a formal 5-minute oral presentation on another capstone experience, and several smaller assignments on the other capstone experiences. The smaller assignments require a mixture of data analysis, mathematical development, computing, and writing. A three semester hour format could use all eleven capstone experiences, additional formal reports, additional assignments, the statistical consulting module, some actual and role play consulting experiences, mock employment interviews, guest lectures from industrial and government statisticians, and additional discussion. A small amount of additional measuring equipment would be necessary for the additional experiences. I would prefer a three semester hour format, but it was not possible on our campus due to an already crowded curriculum.

We begin the course with the nonstatistical skills module on seeking employment to encourage students to start their employment search as early as possible. We then do three capstone experiences. These are followed by the nonstatistical skills modules on written presentation, oral presentation, and producing visual aids. The students make their reports and complete other capstone experiences during the second half of the semester.

The formal written report and formal oral report each count for 25% of the course grade. The other assignments count for 50% of the grade. There are no exams. The formal reports are graded on several components. Example grading sheets are given in the Appendix. The text's written report and oral report checklists contain points of emphasis for grading each component. A formal rubric for assigning component scores has not yet been developed.
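The weighting above amounts to a simple weighted average. The following is a minimal sketch, assuming each component has already been scored on a 0 to 100 scale; the function name and the example scores are illustrative only and do not come from the course.

```python
def course_grade(written_report, oral_report, other_assignments):
    """Combine component scores (each on a 0-100 scale) into a course grade.

    Weights follow the stated scheme: 25% written report, 25% oral report,
    50% other assignments. There are no exams.
    """
    weights = {"written": 0.25, "oral": 0.25, "other": 0.50}
    return (weights["written"] * written_report
            + weights["oral"] * oral_report
            + weights["other"] * other_assignments)


# Illustrative scores only (not actual student results).
print(course_grade(written_report=88, oral_report=92, other_assignments=80))
```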

4. The Capstone Experiences

This section describes the eleven capstone experiences, with four being described in detail.

4.1 Data Preparation Experience

In the first experience, the student is a junior statistician for the On Time Statistics consulting company. Sayah Medical Clinic has hired the company to perform a customer satisfaction survey. The tasks are to develop a written coding protocol, data entry format, and data editing plan for a survey instrument developed by another statistician. Under the scenario, a coding clerk will code the completed surveys and enter the data using the protocol and format.

While our students have previous experience in entering small data sets, most have no experience writing specific instructions to allow someone else to perform coding tasks and little experience in checking entered data for correctness. The experience is an eye-opener regarding the need for accuracy and precision.
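One way to make a data editing plan concrete is to express it as explicit checks on each coded record. The sketch below is only an illustration of that idea: the field names, valid codes, and ranges in the codebook are hypothetical and are not taken from the survey instrument used in the course.

```python
# Hypothetical codebook: field name -> set of valid codes, or (low, high) range.
CODEBOOK = {
    "satisfaction": {1, 2, 3, 4, 5, 9},  # 9 is an assumed missing-value code
    "visit_type": {1, 2, 3},
    "age": (18, 110),
}


def edit_record(record):
    """Return a list of (field, value, problem) tuples for one coded record."""
    problems = []
    for field, rule in CODEBOOK.items():
        if field not in record:
            problems.append((field, None, "missing field"))
            continue
        value = record[field]
        if isinstance(rule, set):
            if value not in rule:
                problems.append((field, value, "invalid code"))
        else:
            low, high = rule
            if not (low <= value <= high):
                problems.append((field, value, "out of range"))
    return problems


# Made-up records showing a clean entry, an invalid code, and a keying error.
records = [
    {"id": 1, "satisfaction": 4, "visit_type": 2, "age": 37},
    {"id": 2, "satisfaction": 7, "visit_type": 1, "age": 52},
    {"id": 3, "satisfaction": 5, "visit_type": 3, "age": 520},
]
for rec in records:
    for field, value, problem in edit_record(rec):
        print(f"record {rec['id']}: {field}={value!r} ({problem})")
```

Checks of this kind catch only values that are impossible under the protocol. A plausible but wrong entry still requires comparison against the completed survey form.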

4.2 Classification Experience

In a second experience, the student works in a college’s consulting unit and is helping a biology major classify plant leaves. Students are given the images of 16 leaves from each of two species of fruit trees and a metric ruler. Their task is to develop a rule for classifying leaves according to their species. Students develop operational definitions for leaf length and width and then measure the leaves. They enter their data and check for data entry errors and outliers.

As our students have not studied discriminant analysis, developing a classification rule appears to be a very challenging task. They don’t realize that they have all the skills necessary to solve this problem.

To get started, they assume that leaf length and width follow a bivariate normal distribution for each species. They assess this assumption later. We discuss the approach of classifying a leaf as coming from the first species if the estimated joint density for species one is greater than the estimated joint density for species two at the leaf’s (length, width) value. They agree that this is a plausible approach.

The students estimate the parameters and find the region in the length-width plane such that a leaf is classified as coming from species one. Finding and displaying the region requires the students to use matrix algebra, inequality, and plotting skills. After the students develop a classification rule, they learn how to do the analysis in SAS®. By the end of the experience, students see they have the skills to solve a "difficult" problem and are impressed that their calculations match those from SAS®.
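The density-comparison rule described above can be sketched in a few lines. This is an illustration in Python with NumPy rather than the SAS® analysis used in the course, and the leaf measurements are simulated from assumed means and covariances, not the students' data. Each species gets its own estimated mean vector and covariance matrix, and a leaf is assigned to species one when its estimated log density is larger.

```python
import numpy as np


def fit_bivariate_normal(x):
    """Estimate the mean vector and covariance matrix from an (n, 2) array."""
    return x.mean(axis=0), np.cov(x, rowvar=False)


def log_density(point, mean, cov):
    """Log of the bivariate normal density at a (length, width) point."""
    diff = np.asarray(point, dtype=float) - mean
    maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)
    return -np.log(2 * np.pi) - 0.5 * logdet - 0.5 * maha


def classify(point, params_one, params_two):
    """Return 1 if the estimated density for species one is larger, else 2."""
    one = log_density(point, *params_one)
    two = log_density(point, *params_two)
    return 1 if one > two else 2


# Simulated (length, width) measurements in mm, 16 leaves per species.
# The means and covariances below are assumptions made for illustration.
rng = np.random.default_rng(0)
species_one = rng.multivariate_normal([60.0, 30.0], [[25.0, 8.0], [8.0, 9.0]], size=16)
species_two = rng.multivariate_normal([85.0, 28.0], [[36.0, 10.0], [10.0, 9.0]], size=16)

params_one = fit_bivariate_normal(species_one)
params_two = fit_bivariate_normal(species_two)
print(classify((62.0, 31.0), params_one, params_two))
print(classify((88.0, 27.0), params_one, params_two))
```

Because the two covariance matrices are estimated separately, the boundary between the two regions is a quadratic curve in the length-width plane. Evaluating `classify` over a grid of points is one way to display the region.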

4.3 Variance Component Experience

In a third experience, the student is a statistician working in the Quality Department of the Sharp Point Tack Company. Sharp Point is experiencing considerable variation in the measurements of the lengths of their nominal one-half inch carpet tacks. The text indicates that possible sources of variation are tack-to-tack differences, operator-to-operator differences, micrometer-to-micrometer differences, and lack of repeatability of the measurement process (error). The tasks are to perform a replicated 4 by 3 by 3 random effects factorial experiment, to estimate the variance components, and to recommend an action to management to reduce measurement variability.

Students learn there is more to performing a replicated three-factor experiment than is described in most texts. First, they learn the logistical problems of taking the 72 measurements using the tack-operator-micrometer combinations specified by the randomization. Getting the specified tacks, micrometers, and operators together in the correct order can be difficult. Second, they learn that collecting measurements requires considerable care. An outlier has a special meaning when it becomes obvious to everyone that you made a major measurement error. Third, they are reminded that entering data correctly is extremely important. Errors can cause major changes in the variance component estimates and in the recommendation to management.
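The run order behind those 72 measurements can be generated with a short script. The sketch below assumes four tacks, three operators, three micrometers, and two replicates of each combination (4 × 3 × 3 × 2 = 72), with the whole set completely randomized. The article does not say which factor has four levels or how the randomization was carried out, so both choices are assumptions.

```python
import itertools
import random


def randomized_run_order(n_tacks=4, n_operators=3, n_micrometers=3,
                         n_reps=2, seed=None):
    """Return every tack-operator-micrometer run, replicated, in random order."""
    combos = itertools.product(range(1, n_tacks + 1),
                               range(1, n_operators + 1),
                               range(1, n_micrometers + 1))
    # Repeat each combination n_reps times, then shuffle the full list.
    runs = [combo for combo in combos for _ in range(n_reps)]
    random.Random(seed).shuffle(runs)
    return runs


runs = randomized_run_order(seed=2001)
print(len(runs))
for run_number, (tack, operator, micrometer) in enumerate(runs[:5], start=1):
    print(f"run {run_number}: tack {tack}, operator {operator}, micrometer {micrometer}")
```

Printing the full list as a run sheet gives the team a single document to follow, which helps with the logistical problem of getting the right tack, operator, and micrometer together in the specified order.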

One group of students found that the lack of repeatability (error) variance component dwarfed all others. Having little prior training, the student operators were not good at measuring tack lengths. As the tack lengths are variable to the eye, the much larger lack of repeatability variance component made a strong impression on the students. They then discussed strategies for reducing the lack of repeatability variability. Their options were to recommend additional operator training or to recommend replacing the micrometers with easier-to-use instruments.

4.4 Response Surface Experience

In a fourth experience, the student works as a statistician for Flying B Aerospace, a producer of balsa wood airplanes. The text explains that an engineer wants assistance in finding the wing position and nose weighting that maximize the model X234J airplane’s flight distance. Budget constraints allow for at most 36 test flights. The task is to design and perform a series of experiments to find the wing position and nose weight settings that maximize flight distance. Students are to fit a response surface model, test the fit of the model, and recommend a setting. The text introduces students, who are familiar with multiple regression, to the response surface model, to the use of factorial and central composite designs, and to tests of fit of a response surface model.

The students must first choose the design, including the number of flights to use in their first experiment. They are reluctant to make a decision. After some prodding, students usually choose a design using about half of the 36 total flights. The first stage experiment for one group of students showed that too much weight on the nose reduced the flight distance and that there was much variation between the replicates, particularly for the treatment combinations that produced larger means. Their fitted quadratic response surface did not have a maximum near the settings used in the experiment, and the test of fit suggested the model was not reasonable. There was no clear pattern regarding the effect of wing position.

In a discussion of the results of the first stage, the students decided to perform a second experiment consisting of 12 additional flights involving three wing positions and two nose weightings. These nose weightings were at and below the original lowest setting. They also decided to model the log of flight distance to stabilize the variance.

The second stage experiment showed that too little weight on the nose also reduced the flight distance. Again, there was no clear effect of wing position. One flight using no nose weight resulted in a zero flight distance. The students decided to model log(flight distance + 1) as a quadratic response surface with wing position and log(nose weight +1) as the predictors. The model fit well. A plot of the response surface indicated a strong effect due to nose weight and a lesser effect due to wing position.

The students used the last six flights to establish the recommended settings. The peak in the final fitted surface suggested that the plane design should place the wings 3 mm closer to the nose and add 0.03 oz of nose weight to maximize flight distance.
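Fitting a quadratic response surface and locating its peak can be sketched as follows. This is an illustration in Python with NumPy, not the software or the data used in the course. The predictors and response are synthetic values generated from an assumed surface, standing in for coded wing position, log(nose weight + 1), and log(flight distance + 1). With the fitted model written as b0 + x'b + x'Bx, the stationary point is x* = -(1/2) B⁻¹ b, and it is a maximum when both eigenvalues of B are negative.

```python
import numpy as np


def quadratic_design(x1, x2):
    """Design matrix for a full second-order model in two predictors."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])


def fit_response_surface(x1, x2, y):
    """Least-squares coefficients in the order (b0, b1, b2, b11, b22, b12)."""
    coef, *_ = np.linalg.lstsq(quadratic_design(x1, x2), y, rcond=None)
    return coef


def stationary_point(coef):
    """Return the stationary point of the surface and whether it is a maximum."""
    _, b1, b2, b11, b22, b12 = coef
    b = np.array([b1, b2])
    B = np.array([[b11, b12 / 2.0], [b12 / 2.0, b22]])
    point = -0.5 * np.linalg.solve(B, b)
    is_maximum = bool(np.all(np.linalg.eigvalsh(B) < 0))
    return point, is_maximum


# Synthetic 3 x 3 design. With real data one would use
# x2 = np.log1p(nose_weight) and y = np.log1p(flight_distance).
wing, nose = np.meshgrid([-1.0, 0.0, 1.0], [0.0, 0.5, 1.0])
x1, x2 = wing.ravel(), nose.ravel()
rng = np.random.default_rng(1)
y = 2.46 + 0.4 * x1 + 2.0 * x2 - x1**2 - 2.0 * x2**2 + rng.normal(0.0, 0.05, x1.size)

coef = fit_response_surface(x1, x2, y)
point, is_maximum = stationary_point(coef)
print(point, is_maximum)
```

A stationary point that falls outside the region covered by the design, or that is not a maximum, is a signal to run a further experiment in a new region, as the students did in their second stage.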

4.5 Other Capstone Experiences

The other seven experiences deal with the writing of a script for a telephone survey, the selection of the sample sizes in a one sample problem, the design of an experiment to compare two automobile horn activation buttons, the development of a regression model to predict the weight of small rocks, the development of logistic regression to predict the probability that a lead weight dropped from a fixed height will penetrate a facial tissue stretched on an embroidery hoop, the development of a stratified sampling plan for estimating the proportion of voters who plan to vote for a particular candidate in a city with three ethnic groups, and comparing Bayes and frequentist estimators of the probability that a particular baseball player gets a hit.
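As a small illustration of the last of these experiences, the two kinds of estimators can be compared directly. The sketch below assumes a binomial model for hits and a conjugate Beta prior. The prior parameters and the hit counts are made up for the example and are not taken from the text.

```python
def frequentist_estimate(hits, at_bats):
    """Sample proportion of hits."""
    return hits / at_bats


def bayes_estimate(hits, at_bats, prior_a, prior_b):
    """Posterior mean under a Beta(prior_a, prior_b) prior with binomial data."""
    return (prior_a + hits) / (prior_a + prior_b + at_bats)


# Made-up data: 4 hits in 10 at bats, with an assumed Beta(27, 73) prior
# whose mean is 0.27.
hits, at_bats = 4, 10
print(frequentist_estimate(hits, at_bats))
print(bayes_estimate(hits, at_bats, prior_a=27, prior_b=73))
```

With few at bats, the Bayes estimate stays close to the prior mean. As the number of at bats grows, the two estimates move closer together.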

5. Nonstatistical Skills Modules

The text includes nonstatistical skills modules on important skills needed by professional statisticians. These topics are not traditionally part of the statistics curriculum. Some statisticians have learned these skills through a helpful mentor; others have not been so lucky. These modules were written after extensive reading and discussions with experts in the fields of technical writing, speech, human resources, and applied statistics.

Students learn to prepare professional quality written reports in the technical writing module. The goal of technical writing, to inform the reader, differs from the goals of some other types of writing. We discuss the importance of using precise, readable, and concise language to communicate to busy readers. I explain the role of each major section of a technical report. Students also learn how to produce precise and focused tables and figures. We talk about strategies for writing the first draft and the importance of rewriting to improve precision and readability. Finally, we discuss strategies for developing as writers throughout their careers. The text contains a technical writing checklist for self-assessment.

The oral presentation module consists of two parts. The first part deals with strategies for effective oral presentations. In the second part, students gain experience producing visual aids with PowerPoint®.

We begin the discussion of oral presentations with the presentation goal. It is generally to inform a particular audience of the results of an investigation and, perhaps, to advocate an action. Then we learn the importance of understanding the presentation ground rules such as the amount of time, the type of room, availability of equipment, etc. We review the major sections of the oral presentation from the greeting through the question and answer period. We stress the importance of developing an outline with time allocations for the major sections. We also discuss presentation style including the use of speech notes, the speaker’s appearance, and the delivery. We discuss strategies for overcoming the fear of making a presentation.

We learn about the effective use of visual aids and handouts and stress the importance of prior practice with the equipment. We also discuss strategies for responding to questions. We review the importance of having several rehearsals. Finally, we talk about strategies for developing as speakers throughout their careers. The text contains an oral presentation checklist for self-assessment.

The module on PowerPoint® gives students a brief introduction to the software package and then gives hands-on experience in producing professional quality overhead transparencies and projected computer images.

The module on seeking employment begins by emphasizing that the students will be conducting a marketing effort: marketing their skills to potential employers. We discuss examples of employers of statisticians from government, industry, and academia and the nonstatistical skills that these employers look for. Students learn strategies for creating resumes that emphasize these important skills. We discuss a variety of sources to learn of job openings. Students should make full use of the campus career center but should not use it as their only source. We also discuss the interview process and give examples of questions to expect and questions to ask. We talk about the expected features of a job offer and the job applicant’s options on receiving an offer. Finally, we discuss what to do if they don’t have a job when they graduate. The text contains a job search checklist to remind the students of the key steps in marketing their skills.

The statistical consulting module, which we don’t include in our course due to a lack of time, outlines the process of statistical consulting from greeting the new client through reviewing the final written report. There is also a discussion of the ethical expectations for professional statisticians. The students learn that individuals or groups may have much to gain or lose depending on the outcome of their analyses and that it is crucial for the statistician to make an objective decision without regard to whom the outcome favors. They also learn that it is their responsibility to maintain confidentiality about data they collect and analyze.

6. Student Reaction

The course has been taught four times to a total of 22 students. Students indicate in course evaluations that the course is extremely valuable. Based on their strong recommendation, the course is now required for all statistics majors. Some students feel that the course requires too much work for one semester hour.

Students actively participate in the experiences. This is more than role play. They identify with the data they collect and analyze. Their written and oral reports give the clear impression that they are statisticians working for the hypothetical employers. Solving the clients’ problems is important to them. Students are often strong advocates for their positions.

Students enter the course realizing that they have considerable knowledge but lack confidence in their abilities to solve important problems. They are initially reluctant to make decisions. They also enter with little confidence in their abilities to write and speak about statistical investigations. The thought of making an oral presentation with visual aids is intimidating.

Most students struggle initially. Their primary weaknesses are in the areas of making decisions, writing precisely, and editing data sets. Some initial analyses are wrong due to undetected data entry errors.

The quality of work improves dramatically as the semester progresses. Some of their work is far above my expectations. In the end, almost all students do excellent work. The exceptions continue to make careless errors.

While the initial formal written presentations vary greatly in quality, the revisions are of professional quality. The oral reports are quite good. Students excel in their use of visual aids. The initial oral presentations are generally of professional quality. The exceptions suffer from incorrect statistical analyses and lack of rehearsal. These students improve considerably in their second attempt.

7. Lessons Learned by the Instructor

The role of coach-facilitator differs from that of lecturer. Rather than presenting a large amount of information, the instructor spends most of the time observing and making meaningful comments as needed. It takes a while to get used to talking much less.

It is extremely useful to follow up a capstone experience with a class discussion. These discussions allow the class to focus on lessons learned in the experience and to compare the work of various students. Students who struggle with precision are often amazed at the quality of work that some of their classmates have done.

Grading of formal written and oral reports is quite different from grading traditional homework. The grading sheets and the text's checklists help maintain grading consistency and force the instructor to consider many aspects of the presentation. It is difficult to assess all aspects of an oral presentation in real time. I record the oral presentations on video and assign the grade while watching the recording. One can watch as many times as necessary to assess each aspect of the presentation.

There have been few software and hardware problems. These advanced undergraduates have used statistical computing packages in several courses prior to enrolling in the capstone course. At times, some students have forgotten some basic computing facts and need a refresher. The same is true for some basic mathematics and statistics tools. These lapses give feedback in assessing our undergraduate curriculum.

8. Conclusions

The capstone course students and I believe that the capstone course at the University of South Carolina has helped the students integrate the various pieces of their undergraduate education. The students leave the course with a better understanding of what is expected of a professional statistician, and most students are able to meet these expectations.

It is also possible to distribute the capstone materials through various courses in the curriculum. This has the advantage of starting the integration of topics earlier in the undergraduate curriculum. It may be more difficult to maintain a consistent message regarding what is expected of the student if the material is used in many courses.

The eleven capstone experiences in the text are not an exhaustive set. Creative teachers may want to supplement these to show other problems that statistics graduates might see early in their careers.


Acknowledgments

This article is based upon work supported by the National Science Foundation under Grant No. DUE-9455292. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation. The author thanks the referees and an Associate Editor for helpful comments. PowerPoint® is a registered trademark of Microsoft Corporation. SAS® is a registered trademark of SAS Institute, Inc.


Appendix: Grading Sheets

Written Report Grading Sheet

Name _______________________________ Chapter _____________

Item                                      Max. Points   Your Points   Comments

Title                                           5
Abstract                                       10
Introduction                                   10
Materials and Methods                          10
Results (including tables and figures)         25
Discussion                                     20
Conclusion                                     10
Writing Style                                   5
Spelling, grammar                               5

Total report grade                            100

Oral Presentation Grading Sheet

Name _______________________________ Chapter _____________

Item                                      Max. Points   Your Points   Comments

Greeting and Introduction                       5
Preview                                         4
Background                                      8
Description of Data Collection                  8
Results of Data Analysis                       20
Conclusions and Recommendations                10
Questions and Answers                           5
Physical Delivery                              10
Oral Delivery                                  10
Transitions and Flow                            5
Visual Aids                                    10
Use of Time                                     5

Total report grade                            100


References

Gardner, J.N., Van der Veer, G., and Associates (1998), The Senior Year Experience, San Francisco: Jossey-Bass.

Spurrier, John D. (2000), The Practice of Statistics: Putting The Pieces Together, Belmont, CA: Duxbury Press.


John D. Spurrier
Department of Statistics
University of South Carolina
Columbia, SC 29208
USA

spurrier@stat.sc.edu

