Introduction to Statistical Learning

About This Course

This is an introductory-level course in supervised learning, with a focus on regression and classification methods. The syllabus includes: linear and polynomial regression, logistic regression and linear discriminant analysis; cross-validation and the bootstrap, model selection and regularization methods (ridge and lasso); nonlinear models, splines and generalized additive models; tree-based methods, random forests and boosting; support-vector machines. Some unsupervised learning methods are discussed: principal components and clustering (k-means and hierarchical).

This is not a math-heavy class, so we try to describe the methods without heavy reliance on formulas and complex mathematics. We focus on what we consider to be the important elements of modern data analysis. Computing is done in R. There are lectures devoted to R, giving tutorials from the ground up and progressing to more detailed sessions that implement the techniques in each chapter.

The lectures cover all the material in An Introduction to Statistical Learning, with Applications in R by James, Witten, Hastie and Tibshirani (Springer, 2013). The PDF of this book is available for free on the book website.

You will find a detailed course syllabus for Stats 216 at syllabus.stanford.edu.

Prerequisites

First courses in statistics, linear algebra, and computing.

Instructor

Lester Mackey

Lester Mackey is an Assistant Professor in the Department of Statistics and, by courtesy, of Computer Science at Stanford University.

Teaching Assistants

Stephen Bates, Alex Chin, Jackson Gorham, Matteo Sesia, Andy Tsao

Course Production Team

Will Fithian and Sam Gross produced and formatted the quiz questions and review questions. Daniela Witten helped present some of the material in Chapter 5. Wes Choy managed the video production. Greg Maximov filmed and edited most of the course videos, as well as the interviews and group recordings. Greg Bruhns, Monica Diaz and Marc Sanders assisted with Open edX.

Frequently Asked Questions

Do I need to buy a textbook?

No. A free online version of An Introduction to Statistical Learning, with Applications in R by James, Witten, Hastie and Tibshirani (Springer, 2013) is available from the book website. Springer has agreed to this, so there is no need to worry about copyright. Of course, you may not distribute printed versions of this PDF file.

  1. Course Number: STATS216
  2. Classes Start
  3. Classes End
  4. Estimated Effort: 5 hours per week