NAME: Evaluating Aptness of a Regression Model
TYPE: Full population of data (all software projects completed by the AT&T data center from 1986 through 1991).
SIZE: 104 observations, 5 variables

DESCRIPTIVE ABSTRACT:
Values for function point count, actual work hours, operating system, database management system, and programming language are recorded for 104 software projects from AT&T. Function point counts are often used to help predict the number of work hours that will be required to complete a proposed software project. These data can be used to demonstrate the development and evaluation of a linear regression model. In particular, this is an excellent data set for demonstrating violations of the standard assumptions of linear regression and how to address those violations.

SOURCE:
The data were generously provided by Linda Hughes and Mary Dale of AT&T.

VARIABLE DESCRIPTIONS:
Columns   Variable
1-4       Function Point Count
9-13      Work Hours
17        Operating System: (0) Unix (1) MVS
25        Database Management System: (1) IDMS (2) IMS (3) INFORMIX (4) INGRESS (5) Other
33        Language: (1) COBOL (2) PLI (3) C (4) Other

STORY BEHIND THE DATA:
Function points are a standard metric used for estimating the size of software development projects (International Function Point Users Group, 2005). As the number of function points for a proposed software project increases, so does the estimated development effort required to produce the software. Regression models based on function points are an important tool used in the management of software development projects.

PEDAGOGICAL NOTES:
The data set is particularly interesting in that it violates most of the assumptions required of a linear regression model. When the data are transformed using a natural log transformation, the violations are corrected. Prediction intervals can also be used productively to provide insight into the practical limitations of the predictive accuracy of the final model.
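
The workflow described in the pedagogical notes can be sketched in Python as follows. This is a minimal illustration, not the submitters' own analysis. It assumes that each record in the data file follows the fixed-width layout listed under VARIABLE DESCRIPTIONS, that the natural log is applied to both function points and work hours (the notes do not say which variables are transformed), and that a normal critical value is an acceptable stand-in for the t quantile at this sample size. The function names parse_record, fit_log_log, and prediction_interval are hypothetical.

```python
import math
from statistics import NormalDist


def parse_record(line):
    """Slice one fixed-width record using the 1-based columns in the layout."""
    return {
        "function_points": float(line[0:4]),   # columns 1-4
        "work_hours": float(line[8:13]),       # columns 9-13
        "operating_system": int(line[16]),     # column 17
        "dbms": int(line[24]),                 # column 25
        "language": int(line[32]),             # column 33
    }


def fit_log_log(fp, hours):
    """Least-squares fit of ln(hours) = b0 + b1 * ln(fp).

    Assumption: both variables are log-transformed.
    """
    x = [math.log(v) for v in fp]
    y = [math.log(v) for v in hours]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error on the log scale
    return {"b0": b0, "b1": b1, "s": s, "n": n, "x_bar": x_bar, "sxx": sxx}


def prediction_interval(fit, fp_new, level=0.95):
    """Approximate prediction interval for work hours at fp_new.

    Assumption: a normal critical value replaces the t quantile,
    which is reasonable only when n is large.
    """
    x0 = math.log(fp_new)
    center = fit["b0"] + fit["b1"] * x0
    # Standard error for a single new observation, not for the mean response.
    se = fit["s"] * math.sqrt(
        1 + 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # The interval is built on the log scale, then mapped back to hours.
    return math.exp(center - z * se), math.exp(center + z * se)
```

To use the sketch, apply parse_record to each data line, pass the resulting function point and work hour lists to fit_log_log, and call prediction_interval for a proposed project size. Because the interval is symmetric on the log scale, it is asymmetric in hours and widens multiplicatively, which is one way to see the practical limits on predictive accuracy mentioned in the notes.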

REFERENCES:
International Function Point Users Group (2005). “About Function Point Analysis”, http://www.ifpug.org/about/about.htm.

SUBMITTED BY:
Jack E. Matson and Brian R. Huguenard
Department of Decision Sciences and Management
Tennessee Technological University
Johnson Hall 306
1105 North Peachtree Street
Cookeville, TN 38505-0001
Phone: (931) 372-3793
FAX: (931) 372-6249
JEMatson@tnech.edu