Assessment and the Process of Learning Statistics

Ruth Hubbard
Queensland University of Technology

Journal of Statistics Education v.5, n.1 (1997)

Copyright (c) 1997 by Ruth Hubbard, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the author and advance notification of the editor.

Key Words: Assessment goals; Memorising procedures; Assessing understanding.

Abstract

Because assessment drives student learning, it can be used as a powerful tool to encourage students to adopt deep rather than surface learning strategies. Many standard assessment questions tend to reinforce the memorisation of procedures rather than the understanding of concepts. To counteract this trend, some techniques for constructing questions that test understanding of concepts and that address specific goals of statistical education are described and illustrated with examples.

1. Introduction

1 It is well known that assessment drives student learning. Cobb's (1993) quote from Resnick, "We get what we assess, and if we don't assess it, we won't get it" tells the whole story very succinctly. However it is important to recognise that assessment determines not only what students learn but how they go about learning it. Assessment drives the whole learning process. As a simple example, if the only assessment is at the end of the course, then students tend to defer the learning process until examination time draws near. By doing this they may waste many hours listening to lectures and class discussions on topics about which they know very little. If challenged about the value of this approach to learning, they will respond by saying that if they studied earlier in the semester they would probably forget what they had learned by examination time, so it is more efficient to learn it at the end.

2 As Gal and Ginsburg (1994) explained, for many students statistics is of no intrinsic interest but merely a hurdle to be overcome on the way to obtaining a degree. Students who view their statistics course in this way, and there are many of them, want to pass with a minimum of effort. They therefore focus on the assessment even more determinedly than they might in a subject in which they were genuinely interested. The most extreme approach taken by such students is to study only past examination papers instead of working on the exercises and activities prepared by the instructor. This suggests that however motivating we make the instruction, some students will fail to be motivated to take a real interest in our discipline unless we also make changes to assessment methods.

2. Standard Assessment

3 I make the assumption that the exercises in most of the current texts for elementary statistics courses reflect the type of assessment the users of those texts prescribe. Any responsible instructor should assess what they have been asking their students to learn and in the way they envisage the students have learned it. I will discuss several problems with using only the standard types of questions and suggest methods for constructing alternative question styles to avoid these problems. In addition the questions will address much more specific goals than the more general goals which are frequently quoted, for example in Gal and Ginsburg (1994).

4 To illustrate how students respond to standard textbook questions, here is a typical example of such a question.

A geologist collects hand-specimen sized pieces of limestone from a particular area. A qualitative assessment of both texture and colour is made with the following results.

                      Colour
          Texture   Light  Medium  Dark
          Fine         4      20     8
          Medium       5      23    12
          Coarse      21      23     4

Is there evidence of association between colour and texture?

5 To answer this question the student must be able to abstract the correct structure from the given context and then to carry out the procedure appropriate for that structure. For many students these are non-trivial tasks, so they make up rules for themselves such as, "if it's a 2-way table do a chi-square." Of course this is an over-simplification that can lead to an incorrect identification. Unfortunately, even students who are capable of much deeper approaches to learning than this will take the minimalist approach if they believe that this is sufficient to obtain good grades. And their belief relies on the fact that the exercises they have been told to practise are all very similar. In Hubbard (1995) the different types of questions in standard mathematics and statistics texts are classified and the number of different question types is quite small. The smaller the number of question types, the more tempting it is for students to memorise procedures for responding to them. The restricted number of question types positively reinforces rote learning, and if we want to encourage other kinds of learning, we have to devise different kinds of questions.
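For readers who want to see what the memorised procedure amounts to for the limestone table, the following is a minimal sketch in Python of the usual chi-square test of independence. It is an illustration only, not part of the original examination. The function names are invented for this sketch, and the tail-probability formula is the closed form that holds for 4 degrees of freedom only.

```python
import math

# Observed counts from the limestone table (rows: texture, columns: colour).
observed = [
    [4, 20, 8],    # Fine
    [5, 23, 12],   # Medium
    [21, 23, 4],   # Coarse
]

def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a 2-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

def chi_square_upper_tail_df4(x):
    """Upper-tail probability of a chi-square variable with 4 df (df = 4 only)."""
    return math.exp(-x / 2) * (1 + x / 2)

statistic, df = chi_square_independence(observed)
p_value = chi_square_upper_tail_df4(statistic)
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.4f}")
```

The statistic is about 17.7 on 4 degrees of freedom, well above the 5% critical value of 9.49. The computation is entirely mechanical, which is the point: a student can carry it out without understanding why the test applies.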

6 Now let us look at this surface learning approach to assessment from the point of view of the instructor. Because the questions have become so stylised, it is very easy for the instructor to prepare examinations. The above example was from an examination for geologists but only a few words have to be changed and it will be suitable for engineers or accountants or psychologists. In addition, the standard questions are easy to grade. There is usually a unique numerical answer and a formal conclusion such as "chi-square(calc) > chi-square(crit) so reject the null hypothesis." The instructor can take a surface approach to assessing student learning in the same way that the student prepares for the assessment. And just as the students can obtain good grades with a minimum of effort, so the instructor, again with a minimum of effort, can claim that the course has been successful because the average grade is at a respectable level.

7 But the instructor using stylised questions has another problem. It is not possible to distinguish a correctly memorised response from a response that arises from an understanding of statistical theory and procedures. Using such questions it is difficult to reward serious study of the discipline. Errors are easily found, but evidence of understanding and the ability to apply statistical knowledge to less clearly defined problems goes undetected. So learning and assessment become part of a vicious circle. The more convinced students become that memorising standard question types is successful, the more pressure there is on the instructor to rely on standard questions in order to show that the course is a success.

8 Some courses now contain projects in which students collect, summarise and analyse their own data, and laboratory classes in which students use statistical packages to investigate large data sets or to carry out simulations. Even in such courses, if a major part of the assessment at the end of the course consists of the standard questions, students will prepare for the examination by memorisation because they think it is more reliable than depending on the understanding they have gained while doing the projects and laboratory classes. Just as students can hold multiple and often contradictory beliefs about a particular situation (Konold 1995), they can also adopt different learning styles depending on the task at hand (Laurillard 1979). Students in elementary statistics classes have usually studied mathematics at school for many years. Gordon (1995) has described the overwhelming preference for surface learning of mathematics in a group of students entering an elementary statistics course. However exciting the instruction may be, old habits are difficult to change. When students ask you to repeat what you said about the analysis on their screen and attempt to write your remarks down verbatim, they are preparing a memorised response for the examination. If the instructor said it, it must be the "best" answer, so why should they bother to think about the problem?

9 There is also a sameness about the questions in some of the newer texts that concentrate on laboratory work. They frequently ask open-ended questions but of a very standard form so that if students practise with one set of data in the laboratory they can make very similar responses in a test. The following sequence of laboratory questions is from Pelosi and Sandifer (1995).

Create a scatter plot for Speed and Delay.
Does there appear to be a relationship between Speed and Delay? Describe the relationship.
Find the correlation coefficient for Speed and Delay. Does its value agree with your description of the relationship?
Find the regression model for Delay and Speed. Write the regression equation.
Is the regression significant?
Plot the regression line along with the data. Does the plot indicate that the model does a good job of predicting Delay?

10 My own experience with setting sequences of questions of this kind as part of assessment is that students score extremely well. Because the questions are so predictable, they learn the sequence of responses by rote. The fact that they are reproducing memorised responses becomes quite obvious when they start discussing the variables used in the class investigation instead of the variables in the test question.

3. What Is Wrong With Memorised Responses?

11 When students adopt a surface approach to learning, they learn to respond to key words or data structures with memorised procedures as in the chi-square example quoted earlier. If they do not also understand the purpose of the analysis, the kinds of situations to which it should be applied, the type of data that must be collected, the algorithm involved in the test and the meaning of the conclusions, then they can only give a correct answer if the question is posed in exactly the form in which they have mastered it. In particular they will have difficulty applying their knowledge to real problems outside the statistics class which are never stated in textbook form. Furthermore, students quickly forget procedures that they have learned but not understood, as the following example shows. When graduate students from a variety of disciplines approach me for help with analysing or interpreting their data, I always begin by asking if they understand the statistical term "standard deviation." Their standard response is, "It is something that you calculate from a formula." In most cases no amount of probing succeeds in clarifying either the formula or what it measures.

4. A New Look at Assessment

12 Projects in which students create or collect data, present, analyse, and discuss it are a powerful tool for developing understanding (Mackisack 1994). However instructors often require some confirmation that students have contributed to group projects and have understood fundamental concepts. This requires the construction of test items that specifically address the concepts and to which there can be no prepared responses. Steinhorst and Keeler (1995) have described the construction of such questions as follows:

With practice we can find exercises that get at what the student understands about statistics rather than what they know how to calculate. A good conceptual question will have just the right amount of ambiguity. The students must think through various possible responses.

13 It is important that the questions do probe the understanding of concepts and are not just tricks designed to confuse students. There is also the problem of continually making up questions that are not replicas of ones that have previously appeared in class or in homework assignments. If an instructor produces a non-standard question and keeps on repeating it, then it becomes a standard question and students will learn a standard response. It will then lose some of its power to test the understanding of concepts. With practice, I have learned to become inventive, including at least one non-standard question in each piece of assessment. The students in their turn have learned that they are required to think as well as to reproduce knowledge. As students become aware that there are many different types of questions, they realise that memorising responses for particular types is no longer profitable. It is not necessary to keep on generating new questions forever, but it is important to increase the variety of questions. It is also important to create questions that require students to think about what they have learned and to encourage them to apply their knowledge in new ways, not just in different contexts. In this section I will discuss some questions that test specific aspects of statistics and illustrate them with examples. Some techniques that I have found very useful for creating new questions are

(a) Asking the students to make up the question, a reversal of the standard approach,
(b) Suggesting that some aspect of a standard situation changes and asking students to explain how the change affects the solution, and
(c) Linking graphical and symbolic representations of a concept.

These techniques are illustrated in the following examples.

4.1. Choosing a Model

14 A device that can be used in many situations is to require students to reverse their thinking; one application of this is to ask students to make up the question.

A few weeks ago you studied simple regression. Describe a problem that could occur in one of the other subjects you are studying or in the context of one of your hobbies that could be solved using simple regression. Pretend you have collected some data to solve your problem. Write the data down indicating the independent and dependent variables, but do not do any calculations.

15 This question tests whether students can recognise situations where simple regression is an appropriate model. In textbooks all the questions about regression occur at the end of the chapter on that topic so students may never have an opportunity of making such a decision. Yet this is the major question faced by anyone who wishes to apply their statistical knowledge in the future. To prevent students from preparing their responses in advance, the topic could be made more specific, for example, a problem about basketball or the weather or anything else that all the students understand.

4.2. Understanding a Model

16 Instead of distinguishing between models, "make up the question" can be used to find out whether students understand a particular model, any necessary assumptions, the form of the hypothesis, etc. The information can be presented by means of the input and output of a statistical package but the question is much more challenging than the direct regression question quoted above. In order to respond correctly the student must attend to all the detailed information contained in the printout without specific prompts.

           MTB> ZTEST 25 3.4 C1;
           SUBC>ALTERNATIVE -1.

                N    MEAN  STDEV   SE MEAN      Z   PVALUE
           C1  40   23.63   3.95     0.538  -2.55   0.0054

Make up a problem that someone could solve using the above printout. Make sure that your problem contains sufficient detail so that the person solving it would have enough information to type the above commands. Do not make up the data. Just assume the data were collected and produced the given statistics.

4.3. Using Statistical Language

17 Another important factor in applying statistics to new situations is the ability to describe a problem using correct statistical terminology. Instead of asking students for the definitions of technical terms which are easily memorised, ask them to describe a given problem using the appropriate technical words.

You have been asked to conduct an experiment to find out whether T-shirts made of different fabrics and colours offer different amounts of protection against sunburn. Explain how you would design such an experiment. In particular:

What are the experimental units?

What are the treatments?

What is the response variable?

What are the blocks (if used)?

How would you use randomisation?

4.4. Graphical Representation

18 Alternatively students can be asked to show their understanding of statistical terms by drawing diagrams. This requires a more thorough understanding than asking them to recognise features of diagrams drawn by someone else.

Sketch histograms for frequency distributions in which the mean is

Greater than the median,

About equal to the median,

Less than the median.
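As a rough illustration of the three cases, the sketch below prints a crude text histogram together with the mean and median for three small data sets. The data values are made up purely for this illustration and do not come from any study. The names `samples` and `summary` are likewise invented here.

```python
import statistics
from collections import Counter

# Made-up data sets, for illustration only.
samples = {
    "skewed right": [1, 2, 2, 3, 3, 3, 4, 4, 5, 9, 14],
    "symmetric": [1, 2, 3, 3, 4, 4, 4, 5, 5, 6, 7],
    "skewed left": [1, 6, 10, 11, 11, 12, 12, 12, 13, 13, 14],
}

# Store (mean, median) for each data set.
summary = {}
for name, values in samples.items():
    mean = statistics.mean(values)
    median = statistics.median(values)
    summary[name] = (mean, median)

    print(f"{name}: mean = {mean:.2f}, median = {median}")
    counts = Counter(values)
    # One row per integer value; the bar length is the frequency.
    for value in range(min(values), max(values) + 1):
        print(f"  {value:2d} | " + "*" * counts[value])
```

A long tail to the right pulls the mean above the median, a long tail to the left pulls it below, and a symmetric shape leaves the two about equal.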

4.5. Searching Questions About Printouts

19 Hypothesis testing is an important goal of introductory courses but standard hypothesis testing questions fail to distinguish between students who have chosen the correct test by rote, or for the wrong reason, and students who really understand the assumptions on which the various tests are based and the implications of those assumptions. The next question probes these ideas and in addition shows that questions about printouts need not be less rigorous than questions that require the student to set up the hypotheses and carry out the computations.

In order to test a hypothesis about a population mean, the following Minitab printout was obtained.

           MTB >ZTEST 25 3.4 C1;
           SUBC>ALTERNATIVE -1.

           TEST OF MU = 25.0 VS MU L.T. 25.0
           THE ASSUMED SIGMA = 3.4

            N    MEAN   STDEV    SEMEAN      Z    PVALUE
           40   23.63    3.95      0.54  -2.55    0.0054

If you had used a TTEST instead of a ZTEST on the same data, which numbers in the last row of the printout would remain the same and which would change? Do you have enough information to decide whether each of the numbers that would change would increase or decrease? Justify your answers.
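The comparison this question asks for can be checked numerically from the summary statistics alone. The sketch below is a minimal illustration in Python, not Minitab output. It assumes, as the printed standard error suggests, that the z-test standard error is computed from the assumed sigma of 3.4, while a one-sample t-test would use the sample standard deviation of 3.95 with 39 degrees of freedom. The helper functions are written for this sketch, and the t probability is obtained by simple numerical integration rather than from a statistical library.

```python
import math

# Summary statistics taken from the printout above.
n, sample_mean, sample_sd = 40, 23.63, 3.95
mu0, assumed_sigma = 25.0, 3.4

def normal_cdf(z):
    """Standard normal lower-tail probability."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """Lower-tail probability by Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# z-test: standard error based on the assumed population sigma.
se_z = assumed_sigma / math.sqrt(n)
z = (sample_mean - mu0) / se_z
p_z = normal_cdf(z)          # lower-tailed alternative (ALTERNATIVE -1)

# t-test: standard error based on the sample standard deviation.
se_t = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu0) / se_t
p_t = t_cdf(t_stat, n - 1)   # same lower-tailed alternative, 39 df

print(f"z-test: SE = {se_z:.3f}, Z = {z:.2f}, p = {p_z:.4f}")
print(f"t-test: SE = {se_t:.3f}, T = {t_stat:.2f}, p = {p_t:.4f}")
```

Under these assumptions N, MEAN and STDEV are unchanged, because they describe the data. The standard error increases because 3.95 exceeds 3.4, so the test statistic moves closer to zero and the p-value increases.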

5. Experiences in the Classroom

20 When I began to include questions like those in the above examples in class exercises and examinations, many of the students were bewildered. Students who believe that learning consists only of reproducing knowledge have to change their learning strategy to cope with the new style of question. They would regularly ask me to give them more questions of the same kind so that they could "practise." New questions from my past examination papers were regularly incorporated into the tutorial questions for the following year, so that gradually the variety of the questions in the tutorial exercises increased. The examinations continued to include standard questions, in which students had to recognise structures in different contexts, because this is an important skill, but it is not the ONLY skill I wanted students to acquire. On the other hand, very capable students quickly realised that I was providing them with an opportunity to excel. Gradually as I became more confident, I included more non-standard questions, and the appearance of some non-standard questions in past examination papers convinced most students to attempt different learning styles. Nevertheless in a one semester statistics course it is difficult to change long-established learning habits. My most successful semester was with a group of students whom I had previously taught in a reformed calculus class in which I had used similar questioning strategies.

6. Conclusion

21 The above examples have been used to illustrate that it is possible to construct test questions that

Focus explicitly on the important goals of a course,
Distinguish between deep and surface learning, i.e., the responses require the students to think about what they have learned, not merely to reproduce it,
Show students that a reproducing orientation to learning is not being encouraged, and
Test understanding of concepts from computer printouts in non-trivial ways.

All of the above question types are adaptable to other statistical topics and to more advanced levels.

22 It may appear at first that the construction of such questions is more difficult and time-consuming for the instructor. However, if an instructor already has clear goals and a desire to improve the learning process, then with a little practice, the questions can be produced without a great deal of effort. In due course, if such questions should become fashionable, they may even begin to appear in the texts, and instructors may be able to select them just as they now select standard questions.

Acknowledgments

Part of this paper was presented at the Statistical Education Workshop associated with SISC'96, Sydney. The author's research on questions was supported by a Queensland University of Technology Teaching and Learning Grant.

References

Cobb, G. (1993), "Reconsidering Statistics Education: A National Science Foundation Conference," Journal of Statistics Education [Online], 1(1). (http://jse.amstat.org/v1n1/cobb.html)

Gal, I., and Ginsburg, L. (1994), "The Role of Beliefs and Attitudes in Learning Statistics: Towards an Assessment Framework," Journal of Statistics Education [Online], 2(2). (http://jse.amstat.org/v2n2/gal.html)

Gordon, S. (1995), "What Counts for Students Studying Statistics?," Higher Education Research and Development, 14(2), 167-184.

Hubbard, R. (1995), "53 Ways to Ask Questions in Mathematics and Statistics," Technical and Educational Services, Bristol.

Konold, C. (1995), "Issues in Assessing Conceptual Understanding in Probability and Statistics," Journal of Statistics Education [Online], 3(1). (http://jse.amstat.org/v3n1/konold.html)

Laurillard, D. (1979), "The Processes of Student Learning," Higher Education, 8, 395-409.

Mackisack, M. (1994), "What Is the Use of Experiments Conducted by Statistics Students?," Journal of Statistics Education [Online], 2(1). (http://jse.amstat.org/v2n1/mackisack.html)

Pelosi, M. K., and Sandifer, T. M. (1995), Doing Statistics with Minitab for Windows, New York: Wiley.

Steinhorst, R. K., and Keeler, C. M. (1995), "Developing Material for Introductory Statistics Courses from a Conceptual, Active Learning Viewpoint," Journal of Statistics Education [Online], 3(3). (http://jse.amstat.org/v3n3/steinhorst.html)

Ruth Hubbard
School of Mathematics
Queensland University of Technology
GPO Box 2434
Brisbane 4001
Australia

r.hubbard@fsc.qut.edu.au
