Why We Wrote This Book

This book is about using graphs to explore and model continuous multivariate data. Such data are often modelled using the multivariate normal distribution and, indeed, there is a literature of weighty statistical tomes presenting the mathematical theory of this activity. Our book is very different. Although we use the methods described in these books, we focus on ways of exploring whether the data do indeed have a normal distribution. We emphasize outlier detection, transformations to normality and the detection of clusters and unsuspected influential subsets. We then quantify the effect of these departures from normality on procedures such as discrimination and cluster analysis.

The normal distribution is central to our book because, subject to our exploration of departures, it provides useful models for many sets of data. However, the standard estimates of the parameters, especially the covariance matrix of the observations, are highly sensitive to the presence of outliers. This is both a blessing and a curse. It is a blessing because, if we estimate the parameters with the outliers excluded, their effect is appreciable and apparent if we then include them for estimation. It is however a curse because it can be hard to detect which observations are outliers. We use the forward search for this purpose.
The forward search provides a method of revealing the structure of data through a mixture of model fitting and informative plots. The continuous multivariate data that are the subject of this book are often analyzed as if they come from one or more normal distributions. Such analyses, including the need for transformation, may be distorted by the presence of unidentified subsets and outliers, both individual and clustered. These important features are disguised by the standard procedures of multivariate analysis. The book introduces methods that reveal the effect of each observation on fitted models and inferences.
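To make the idea concrete, here is a minimal sketch of a forward search on simulated data. It is written in Python with NumPy rather than the S-Plus programs mentioned in this text, and every detail is an assumption made for illustration: the function names `mahalanobis_sq` and `forward_search`, the rule for choosing the starting subset (the units closest to the coordinate-wise median), and the monitored quantity (the smallest Mahalanobis distance among units not yet in the subset). It should not be read as the authors' implementation.

```python
import numpy as np

def mahalanobis_sq(X, mean, cov):
    """Squared Mahalanobis distance of each row of X from (mean, cov)."""
    diff = X - mean
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

def forward_search(X, m0):
    """Simplified forward search (illustrative assumptions throughout).

    Returns the fitted subset at each size m0, ..., n-1 and the minimum
    Mahalanobis distance among the units outside each subset.
    """
    n = X.shape[0]
    # Assumed starting rule: the m0 units closest to the coordinate-wise
    # median, with each variable scaled by its median absolute deviation.
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    subset = np.argsort(np.sum(((X - med) / mad) ** 2, axis=1))[:m0]
    subsets, min_outside = [], []
    for m in range(m0, n):
        subsets.append(subset)
        # Fit the normal model to the current subset only.
        mean = X[subset].mean(axis=0)
        cov = np.cov(X[subset], rowvar=False)
        # Distances of all units from the subset-based fit.
        d2 = mahalanobis_sq(X, mean, cov)
        outside = np.setdiff1d(np.arange(n), subset)
        min_outside.append(np.sqrt(d2[outside].min()))
        # The next subset is the m + 1 units with the smallest distances.
        subset = np.argsort(d2)[: m + 1]
    return subsets, np.array(min_outside)

# Simulated example: 60 bivariate normal units plus 5 clustered outliers
# (rows 60 to 64). The data are invented purely for illustration.
rng = np.random.default_rng(0)
clean = rng.standard_normal((60, 2))
outliers = np.array([10.0, 10.0]) + 0.3 * rng.standard_normal((5, 2))
X = np.vstack([clean, outliers])

subsets, min_outside = forward_search(X, m0=6)
step = 60 - 6  # index of the subset of size 60
last_in = sorted(set(range(65)) - set(subsets[step].tolist()))
print("units still outside at m = 60:", last_in)
print("min distance outside at m = 59, 60:", min_outside[step - 1], min_outside[step])
```

Plotting `min_outside` against subset size gives one simple version of the informative plots referred to above: a sharp rise marks the point at which only outlying units remain to be added.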
These powerful methods of data analysis will be of importance to scientists and statisticians. Although the emphasis is on the analysis of data, theoretical developments make the book suitable for a graduate statistical course on multivariate analysis. Topics covered include principal components analysis, discriminant analysis, cluster analysis and the analysis of spatial data. S-Plus programs for the forward search are available on a web site.
This book is a companion to Atkinson and Riani's Robust Diagnostic Regression Analysis, of which the reviewer for The Journal of the Royal Statistical Society wrote "I read this book, compulsive reading such as it was, in three sittings."
Anthony Atkinson is Emeritus Professor of Statistics at the London School of Economics. He is also the author of Plots, Transformations, and Regression and coauthor of Optimum Experimental Designs. Professor Atkinson has served as Editor of The Journal of the Royal Statistical Society, Series B.