James V. Pinto, Pin Ng, and David S. Allen
Northern Arizona University
Journal of Statistics Education Volume 11, Number 1 (2003), jse.amstat.org/v11n1/pinto.html
Copyright © 2003 by James V. Pinto, Pin Ng, and David S. Allen, all rights reserved. This text may be freely shared among individuals, but it may not be republished in any medium without express written consent from the authors and advance notification of the editor.
Key Words: Power function; Simulations; Type II error.
There is a potential misuse of the power function under the logical extreme when the null hypothesis is true. The power function is defined to measure the probability of rejecting the null given any value of the parameter being tested. It can be used to obtain the power and the $\beta$ values only under the alternative hypothesis. When the null is true, the power function can be used to obtain the size of the test. In that case the power and the probability of committing a Type II error are, however, undefined and, hence, the power function should not be used to obtain these values.
Calculating the power of a test is not an easy task for beginning statistics students. Until computer algorithms and simulations were designed to help in this process (see, for example, Doane, Mathieson, and Tracy 2001), the best one could do was to calculate a single point on the power function representing $1 - \beta$ (as in Kvanli, Ravur, and Guynes 2000). Yu and Behrens (1994) illustrate the usefulness of simulations in determining the power of a test. It is possible to calculate every point on the power curve, but in-class demonstrations are usually limited to single points due to computational constraints and the repetitive nature of the process. Logical extremes, as detailed below, are usually ignored. The purpose of this note is to demonstrate that the logical extremes are nontrivial and that one of the extremes results in calculations of $\beta$ and the power of the test that are inconsistent with the common pedagogy. A detailed history of the nature of hypothesis testing, including a section on the power of the test in light of the Fisher approach versus the Neyman-Pearson approach, can be found in Lehmann (1993). He concluded that combining the best features of both could unify the two approaches. This historical debate will not be addressed in this paper.
We will use the familiar test on the population mean of a normal distribution with a known variance to set the stage for the logical extremes. We assume that the data generating process is $X \sim f(x; \mu, \sigma^2)$, where $f$ is the probability density function of a normal distribution with an unknown mean $\mu$ and a known variance $\sigma^2$.
We first consider testing the simple null hypothesis $H_0: \mu = \mu_0$ against the two-sided alternative $H_1: \mu \neq \mu_0$.
Figure 1. Beta and the Power of the Test for Two-Tailed Test.
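The power function, defined as the probability of rejecting the null given a specific value of $\mu$, for the z test is given by

$$\pi(\mu) = 1 - \Phi\left(z_{\alpha/2} - \frac{\mu - \mu_0}{\sigma/\sqrt{n}}\right) + \Phi\left(-z_{\alpha/2} - \frac{\mu - \mu_0}{\sigma/\sqrt{n}}\right),$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution, $z_{\alpha/2}$ is its upper $\alpha/2$ critical value, and $n$ is the sample size. This yields $1 - \beta = 0.16$ on the power function at this single point of $\mu$.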
Figure 2. The Power Function for a Two-Tailed Test.
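To make the shape of a two-tailed power function concrete, the following is a minimal sketch in Python that uses only the standard library. The function name `two_tailed_power` and the values of `MU0`, `SIGMA`, `N`, and `ALPHA` are hypothetical choices made for illustration; they are not the values behind the figures in this note.

```python
from statistics import NormalDist


def two_tailed_power(mu, mu0, sigma, n, alpha):
    """Probability that the two-tailed z test rejects H0: mu = mu0
    when the true population mean is mu."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
    shift = (mu - mu0) / (sigma / n ** 0.5)      # true mean in standard-error units
    # Rejection happens in either tail of the sampling distribution.
    return 1 - std_normal.cdf(z_crit - shift) + std_normal.cdf(-z_crit - shift)


# Hypothetical illustration values (assumptions, not taken from the article).
MU0, SIGMA, N, ALPHA = 100.0, 15.0, 25, 0.10

# Trace the curve on a coarse grid of true means around MU0.
curve = [(mu, two_tailed_power(mu, MU0, SIGMA, N, ALPHA)) for mu in range(90, 111, 2)]
for mu, prob in curve:
    print(f"mu = {mu:5.1f}  P(reject H0) = {prob:.3f}")
```

Under these assumed values the curve is symmetric about `MU0`, bottoms out at `ALPHA` when the true mean equals `MU0`, and rises toward 1 as the true mean moves away in either direction.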
For the one-tailed test, we consider the composite null hypothesis $H_0: \mu \le \mu_0$ against the alternative $H_1: \mu > \mu_0$.
Figure 3. Beta and the Power of the Test for One-Tail Test.
Figure 4. The Power Function for a One-Tail Test.
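A one-tailed power function can be sketched in the same way. The example below assumes a right-tailed z test of $H_0: \mu \le \mu_0$ with known $\sigma$; the function name `right_tailed_power` and all numeric values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def right_tailed_power(mu, mu0, sigma, n, alpha):
    """Probability that the right-tailed z test rejects H0: mu <= mu0
    when the true population mean is mu."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)       # upper alpha critical value
    shift = (mu - mu0) / (sigma / n ** 0.5)      # true mean in standard-error units
    # Rejection happens only in the right tail.
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical illustration values (assumptions, not taken from the article).
MU0, SIGMA, N, ALPHA = 100.0, 15.0, 25, 0.10

for mu in (94.0, 97.0, 100.0, 103.0, 106.0, 109.0):
    prob = right_tailed_power(mu, MU0, SIGMA, N, ALPHA)
    print(f"mu = {mu:5.1f}  P(reject H0) = {prob:.3f}")
```

In this sketch the rejection probability increases monotonically in the true mean: it stays below `ALPHA` for means below `MU0`, equals `ALPHA` at `MU0`, and approaches 1 for means far above `MU0`.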
The first logical extreme occurs when the two sampling distributions in both Figure 1 and Figure 3 pull apart to a point where there is very little overlap between them. In this case the power of the test approaches 1 (as the two distributions in Figure 1 and Figure 3 separate), and $\beta$ approaches zero. This can easily be seen in Figure 2 when $\mu$ moves away from $\mu_0$.
The second logical extreme occurs when $\mu$ approaches $\mu_0$. This is where the logic breaks down. In this case, it is incorrect to say that “the power of the test is 10% or the probability of committing a Type II error is 90%” since the power is defined as “the probability of rejecting a false null hypothesis” and a Type II error is defined to be “the failure to reject the false null hypothesis.” But there is no false null hypothesis when the null hypothesis is true at $\mu = \mu_0$.
Figure 5. Logical Extreme: Beta and the Power of the Test for Two-Tailed Test.
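A small simulation can illustrate what the rejection probability means at this extreme. The sketch below draws repeated samples with the true mean set equal to the hypothesized mean and counts how often a two-tailed z test rejects. The function name `rejection_rate`, the seed, and all numeric values are hypothetical; the significance level of 0.10 is an assumption chosen to match the 10% quoted above.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate(mu, mu0, sigma, n, alpha, reps, seed=0):
    """Monte Carlo estimate of how often the two-tailed z test rejects
    H0: mu = mu0 when samples of size n come from N(mu, sigma^2)."""
    rng = random.Random(seed)                    # fixed seed for reproducibility
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    std_err = sigma / n ** 0.5
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / std_err
        if abs(z) > z_crit:
            rejections += 1
    return rejections / reps


# Hypothetical illustration values (assumptions, not taken from the article).
MU0, SIGMA, N, ALPHA, REPS = 100.0, 15.0, 25, 0.10, 10_000

# The true mean equals MU0, so the null hypothesis is true in every replication.
rate_under_null = rejection_rate(MU0, MU0, SIGMA, N, ALPHA, REPS)
print(f"Rejection rate with a true null: {rate_under_null:.3f} (alpha = {ALPHA})")
```

Every rejection counted here is a Type I error, because the null hypothesis is true in every replication. The estimated rate therefore approximates the size of the test, not its power, and no Type II error can occur in this setting.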
Similarly, the sampling distribution and the power function for the one-tailed test are presented in Figures 6 and 7.
Figure 6. Logical Extreme: Beta and the Power of the Test for One-Tail Test (Right).
Figure 7. Logical Extreme: The Power Function for a One-Tail Test (Right).
The power function in Figure 7 provides the probability of rejecting the null hypothesis over the whole range of values of $\mu$. It can be used to obtain the power of the test only when $\mu > \mu_0$, that is, when the null hypothesis is false.
There is a potential misuse of the power function under the logical extreme when the null hypothesis is true. The power function is defined to measure the probability of rejecting the null given any value of the parameter being tested. It can be used to obtain the power and the $\beta$ values only under the alternative hypothesis. When the null is true, the power function can be used to obtain the size of the test. In that case the power and the probability of committing a Type II error are, however, undefined and, hence, the power function should not be used to obtain these values.
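One way to encode this distinction in teaching software is sketched below for the two-tailed z test of a simple null hypothesis. The function name `interpret_rejection_probability`, the dictionary keys, and the numeric values are hypothetical and used only for illustration. The exact comparison `mu == mu0` is a simplification that suits a classroom example.

```python
from statistics import NormalDist


def interpret_rejection_probability(mu, mu0, sigma, n, alpha):
    """Evaluate the two-tailed power function at mu and label the result.

    When mu == mu0 the null is true: the value is the size of the test,
    and power and beta are left undefined (None). Otherwise the value is
    the power, and beta is its complement.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu - mu0) / (sigma / n ** 0.5)
    reject_prob = 1 - std_normal.cdf(z_crit - shift) + std_normal.cdf(-z_crit - shift)
    if mu == mu0:
        # Null hypothesis is true: only the size is meaningful.
        return {"reject_prob": reject_prob, "size": reject_prob, "power": None, "beta": None}
    # Null hypothesis is false: power and Type II error probability are defined.
    return {"reject_prob": reject_prob, "size": None, "power": reject_prob, "beta": 1 - reject_prob}


# Hypothetical illustration values (assumptions, not taken from the article).
MU0, SIGMA, N, ALPHA = 100.0, 15.0, 25, 0.10

at_null = interpret_rejection_probability(MU0, MU0, SIGMA, N, ALPHA)
at_alternative = interpret_rejection_probability(103.0, MU0, SIGMA, N, ALPHA)
print(at_null)
print(at_alternative)
```

Returning `None` for the power and for `beta` when the null is true keeps a caller from reporting these quantities in a case where they are undefined.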
The authors wish to thank two anonymous referees for their detailed and useful comments. All remaining errors are the responsibility of the authors.
James V. Pinto
College of Business Administration
Northern Arizona University
Flagstaff, AZ 86011
James.Pinto@nau.edu
Pin Ng
College of Business Administration
Northern Arizona University
Flagstaff, AZ 86011
Pin.Ng@nau.edu
David S. Allen
College of Business Administration
Northern Arizona University
Flagstaff, AZ 86011
David.Allen@nau.edu