This book is for anyone who has biomedical data and needs to identify variables that predict a two-group outcome such as tumor/not-tumor, survival/death, or response to treatment. Statistical learning machines are ideally suited to these types of prediction problems, especially if the variables being studied may not meet the assumptions of traditional techniques.
Learning machines come from the world of probability and computer science but are not yet widely used in biomedical research. This introduction brings learning machine techniques to the biomedical world in an accessible way, explaining the underlying principles in nontechnical language and using extensive examples and figures. The authors connect these new methods to familiar techniques by showing how to use the learning machine models to generate smaller, more easily interpretable, traditional models.
Coverage includes single decision trees, multiple-tree techniques such as Random Forests™, neural nets, support vector machines, nearest neighbors, and boosting.
James D. Malley is a Research Mathematical Statistician in the Mathematical and Statistical Computing Laboratory, Division of Computational Bioscience, Center for Information Technology, at the National Institutes of Health.
Karen G. Malley is President of Malley Research Programming, Inc. in Rockville, Maryland, providing biostatistical programming services to the pharmaceutical industry and to the National Institutes of Health.
Sinisa Pajevic is a Staff Scientist in the Mathematical and Statistical Computing Laboratory, Division of Computational Bioscience, Center for Information Technology, at the National Institutes of Health.