Applied Predictive Modeling

About this book

Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting and foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. The text illustrates all parts of the modeling process through many hands-on, real-life examples, and every chapter contains extensive R code for each step of the process.

This multi-purpose text can be used as an introduction to predictive models and the overall modeling process, a practitioner’s reference handbook, or a text for advanced undergraduate or graduate-level predictive modeling courses. To that end, each chapter contains problem sets to help solidify the covered concepts and uses data available in the book’s R package.

This text is intended for a broad audience as both an introduction to predictive models and a guide to applying them. Non-mathematical readers will appreciate the intuitive explanations of the techniques, while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics.

Table of contents (20 chapters)

Front Matter

Introduction

General Strategies

A Short Tour of the Predictive Modeling Process

Data Pre-processing

Over-Fitting and Model Tuning

Regression Models

Measuring Performance in Regression Models

Linear Regression and Its Cousins

Nonlinear Regression Models

Regression Trees and Rule-Based Models

A Summary of Solubility Models

Case Study: Compressive Strength of Concrete Mixtures

Classification Models

Measuring Performance in Classification Models

Discriminant Analysis and Other Linear Classification Models

Nonlinear Classification Models

Classification Trees and Rule-Based Models

A Summary of Grant Application Models

Remedies for Severe Class Imbalance

Reviews

“…In teaching a data science course…I use a range of different resources because I need to cover working with data, model evaluation, and machine learning methods. The next time I teach this course, I will use only this book because it covers all of these aspects of the field.” (Louis Luangkesorn, lugerpitt.blogspot.com, June 2015)

“There are a wide variety of books available on predictive analytics and data modeling around the web…we’ve carefully selected the following 10 books, based on relevance, popularity, online ratings, and their ability to add value to your business. 1. Applied Predictive Modeling.” (Timothy King, Business Intelligence Solutions Review, solutions-review.com, June 2015)

“Applied Predictive Modeling aims to expose many of these techniques in a very readable and self-contained book. This is a very applied and hands-on book. It guides the reader through many examples that serve to illustrate main points, and it raises possible issues and considerations that are oftentimes overlooked or not sufficiently reflected upon. Highly recommended.” (Bojan Tunguz, tunguzreview.com, June 2015)

“This monograph presents a very friendly practical course on prediction techniques for regression and classification models… It is a well-written book very useful to students and practitioners who need an immediate and helpful way to apply complex statistical techniques.” (Stan Lipovetsky, Technometrics, Vol. 56 (3), August 2014)

“In my judgment, Applied Predictive Modeling by Max Kuhn and Kjell Johnson (Springer 2013) ought to be at the very top of the reading list …They come across like coaches who really, really want you to be able to do this…Applied Predictive Modeling is a remarkable text…it is the succinct distillation of years of experience of two expert modelers…” (Joseph Rickert, blog.revolutionanalytics.com, June 2014)

“This strong, technical, hands-on treatment clearly spells out the concepts, and illustrates its themes tangibly with the language R, the most popular open source analytics solution.” (Eric Siegel, Ph.D. Founder, Predictive Analytics World, Author, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die)

Authors and Affiliations

Max Kuhn: Division of Nonclinical Statistics, Pfizer Global Research and Development, Groton, USA

Kjell Johnson: Arbor Analytics, Saline, USA

About the authors

Dr. Kuhn is a Director of Non-Clinical Statistics at Pfizer Global R&D in Groton, Connecticut. He has been applying predictive models in the pharmaceutical and diagnostic industries for over 15 years and is the author of a number of R packages.

Dr. Johnson has more than a decade of statistical consulting and predictive modeling experience in pharmaceutical research and development. He is a co-founder of Arbor Analytics, a firm specializing in predictive modeling, and is a former Director of Statistics at Pfizer Global R&D. His scholarly work centers on the application and development of statistical methodology and learning algorithms.
