NAME: Results from a Deceptive Water Taste Test
TYPE: Designed experiment
SIZE: 104 observations, 9 variables

DESCRIPTIVE ABSTRACT:
Data were collected to answer the research question, “Does brand name affect Longwood community members’ preferences for various types of bottled water?” The data are from a “deceptive” water taste test whereby subjects would taste water samples poured from brand-name containers (Fiji, Aquafina, and Sam’s Choice) that actually held the same generic store brand water (Wal-Mart Drinking Water, sold in 1-gal jugs). Subjects were asked to rank their choices from most to least preferred. For each subject their water preference was recorded along with their gender, age, class ranking, what type of water the subject usually drank (i.e., tap, bottled, or filtered), and the subject’s favorite brand of bottled water, if any.

SOURCES:
The deceptive water taste test was conducted at Longwood’s annual Oktoberfest celebration in the fall of 2008. There were 104 volunteer subjects, each of whom signed an informed consent form prior to participating in the taste test (a procedure approved in advance by Longwood’s Human and Animal Subjects Research Review Committee).

VARIABLE DESCRIPTIONS:
A header line that contains the variable names is included in the data file. Each case has one line in the data file. The variables are tab delimited. Do not use the space delimited option to read the data file. The variables in the order they appear are:

Gender - Gender of subject recorded as M or F for Male or Female, respectively.
Age - Age of subject recorded in integer values.
Class - Academic rank of subject recorded as F, SO, J, SR, O for freshman, sophomore, junior, senior, and other, respectively.
UsuallyDrink - Type of water the subject usually consumes recorded as B, F, T, NA for bottled, filtered, tap, and not applicable, respectively. See the Special Notes below regarding the NA values.
FavBotWatBrand - Favorite bottled water brand of the subject.
Bottled water brands are spelled with all capital letters (e.g., AQUAFINA, DEER PARK, NONE). Note that some brand names contain blanks, so do not delimit the data file using blanks.
Preference - List of ordered preferences from the double-blind taste test recorded as three-letter strings containing the letters A, F, and S (and also the string NONE), where A, S, and F represent Aquafina, Sam’s Choice, and Fiji brand water, respectively. For example, if Preference = SAF, then the subject chose Sam’s Choice as her first preference, Aquafina as her second, and Fiji as her third, or least preferred, water type.
First - The water type chosen as the first preference recorded as A, S, F, or blank. Note that if Preference = SAF then First = S. If Preference = NONE then First is blank.
Second - The water type chosen as the second preference recorded as A, S, F, or blank. Note that if Preference = SAF then Second = A. If Preference = NONE then Second is blank.
Third - The water type chosen as the third preference, or least preferred, recorded as A, S, F, or blank. Note that if Preference = SAF then Third = F. If Preference = NONE then Third is blank.

SPECIAL NOTES:
As with any real-life data collection, we encountered problems with how data were recorded on the data sheets in the field and subsequently needed to make decisions regarding how these data would be recorded electronically. Below we detail the problems and our resolutions to the problems. Most of these changes were made to make the data usable for pedagogical purposes. We would be happy to provide copies of the raw data sheets to any interested parties.

In the Class column, some of our student experimenters recorded a generic S instead of SR or SO for the academic class of the subject. By examining the Age variable we believe we were able to determine with reasonable accuracy to which academic class (either SR or SO) those subjects belonged. We also had four instances of age being recorded as 21+.
In these cases we recorded the age as 21. Lastly, there was one age that was recorded on a data sheet as 2, which we were certain was incorrect. However, since most of our analyses do not involve age, we left that number as it was recorded. If using the variable Age in any analysis, one might want to consider eliminating that particular line of data.

For subjects who gave more than one answer to the question of water usually consumed, we used the first type cited by the subject. For instance, on our data sheets from the field we have a few entries recorded as B/F, B & T, T/F, T/B, etc. These entries were coded as B, B, T, and T, respectively, in the data files associated with this paper. There were only six of these types of entries on the field data sheets for the deceptive test. On the deceptive test we also had one subject who apparently did not answer this question, so we entered NA for the type of water she usually consumed.

In the deceptive test, one subject cited AQUAFINA or SAM’S for her favorite brand of bottled water. Following the convention above, since the subject said AQUAFINA first, we listed that as his/her favorite (last line of data set). Also, on all sheets, if a blank or N/A was recorded for the variable Favorite Brand of Water, that record was entered into our Excel spreadsheet as “NONE.” We also endeavored to type in all brand names with all capital letters.

For the subjects who could not decide on a preference for either test, we listed NONE in the Preference field and left the remaining fields (First, Second, and Third) blank. In the deceptive test we did have one subject who could only give her most preferred water. For this subject we recorded F in the Preference field, F in the First field, and left the remaining fields blank.
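
The coding rules above can be sketched in Python. This is only an illustrative sketch, not part of the original data preparation: the helper names split_preference and read_taste_test are hypothetical, and the three sample rows are made up to exercise the rules (they are not lines from the data file). It reads tab-delimited text so that brand names containing blanks stay intact, and checks that First, Second, and Third agree with Preference, including the NONE case and the single-letter case.

```python
import csv
import io


def split_preference(pref):
    """Return (First, Second, Third) implied by a Preference string.

    'NONE' gives three blanks; a partial ranking such as 'F' is padded
    with blanks for the positions the subject did not rank.
    """
    pref = pref.strip()
    if pref == "NONE":
        return ("", "", "")
    letters = list(pref)[:3]
    letters += [""] * (3 - len(letters))
    return tuple(letters)


def read_taste_test(handle):
    """Read tab-delimited rows into dicts keyed by the header names."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Illustrative rows only; these are NOT taken from the real data file.
sample = (
    "Gender\tAge\tClass\tUsuallyDrink\tFavBotWatBrand\tPreference\tFirst\tSecond\tThird\n"
    "F\t19\tSO\tB\tDEER PARK\tSAF\tS\tA\tF\n"
    "M\t21\tSR\tT\tNONE\tNONE\t\t\t\n"
    "F\t20\tJ\tNA\tAQUAFINA\tF\tF\t\t\n"
)

rows = read_taste_test(io.StringIO(sample))
for row in rows:
    # Each row's First/Second/Third should match what Preference implies.
    expected = split_preference(row["Preference"])
    assert expected == (row["First"], row["Second"], row["Third"])
```

With the csv module the code NA is kept as the literal string "NA". Some other readers (for example pandas.read_csv with its default settings) convert NA to a missing value, so it is worth checking how a given tool treats that code.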

STORY BEHIND THE DATA:
These data were collected as part of a joint honors project between our Statistical Decision Making (MATH171) and Exploring Science in Our World (GNED261) courses, each of which is a part of Longwood’s General Education curriculum. We built the link between these courses by involving our students in research related to our campus’ two-year sustainability theme, thereby challenging them to develop basic research skills within a civically and socially relevant context. Specifically, we asked them to focus on the bottled water phenomenon by considering their peers’ preference of water types and also researching the costs (economic and environmental) of bottled water consumption. The deceptive test was conducted to answer the research question “Does brand name affect Longwood community members’ preferences for various types of bottled water?” Further description can be found in the "Datasets and Stories" article, Water Taste Test Data, submitted to the Journal of Statistics Education.

PEDAGOGICAL NOTES:
These data are appropriate for performing exploratory data analyses and inference using the test for a single proportion and chi-square tests, including the goodness-of-fit test and the test for independence. The data also provide opportunities for students to consider the cell-count assumptions for chi-square tests, as some tests meet the assumptions and others do not. In addition to the raw data files, we also give some of the data in summarized form in the "Datasets and Stories" article, Water Taste Test Data, submitted to the Journal of Statistics Education. The results from the taste test raise further questions about sustainability issues related to consumer preference, bottled water consumption, and its environmental impacts.

REFERENCES:
Fink, A. D., and M. L. Lunsford. 2009. “Bridging the Divides: Using a Collaborative Honors Research Experience to Link Academic Learning to Civic Issues,” Honors in Practice, National Collegiate Honors Council, Volume 5, 97-113.

SUBMITTED BY:
M. Leigh Lunsford
Department of Mathematics and Computer Science
Longwood University
Farmville, VA 23909
lunsfordml@longwood.edu

Alix D. Dowling Fink
Department of Biological and Environmental Sciences
Longwood University
Farmville, VA 23909
finkad@longwood.edu
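
As a rough illustration of the goodness-of-fit test mentioned in the Pedagogical Notes, the sketch below tests first preferences against equal proportions for the three labels. The equal-proportions null hypothesis is an assumption made here (suggested by the fact that all three containers held the same water), the function name gof_equal_proportions is hypothetical, and the counts are made up for illustration; they are NOT the study's results. The p-value uses the closed form exp(-x/2), which is valid only for a chi-square statistic with 2 degrees of freedom (three categories).

```python
import math
from collections import Counter


def gof_equal_proportions(first_choices, labels=("A", "F", "S")):
    """Chi-square goodness-of-fit test against equal proportions.

    Blank entries (subjects with no stated first preference) are ignored.
    Returns (observed counts, expected count per cell, statistic, p-value).
    """
    if len(labels) != 3:
        raise ValueError("closed-form p-value assumes 3 categories (df = 2)")
    counts = Counter(c for c in first_choices if c in labels)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no usable first preferences")
    expected = n / len(labels)
    stat = sum((counts[label] - expected) ** 2 / expected for label in labels)
    # Survival function of a chi-square variable with 2 df is exp(-x/2).
    p_value = math.exp(-stat / 2)
    return counts, expected, stat, p_value


# Made-up first preferences for illustration; NOT the study's results.
hypothetical_first = ["A"] * 10 + ["F"] * 20 + ["S"] * 15 + [""] * 2
counts, expected, stat, p_value = gof_equal_proportions(hypothetical_first)

# Common rule of thumb for the cell-count assumption: expected counts >= 5.
cell_counts_ok = expected >= 5
print(dict(counts), expected, round(stat, 3), round(p_value, 3), cell_counts_ok)
```

For tests with more categories, or for the test of independence, a statistics library would be the more natural choice, since the closed-form p-value above does not carry over.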