How to select the best model?

Jan 22, 2022

We want to choose the method that gives the lowest test MSE, as opposed to the lowest training MSE.

Compute the average squared prediction error over the test observations (the test MSE), and select the model for which this quantity is as small as possible.
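As a minimal sketch of this selection step, the snippet below scores a few candidate models on held-out test observations and picks the one with the lowest test MSE. The candidate models, the data values, and the helper names `mse` and `select_by_test_mse` are all hypothetical, made up for illustration only.

```python
def mse(y_true, y_pred):
    """Average squared prediction error: mean of (y - y_hat)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def select_by_test_mse(models, x_test, y_test):
    """Return the name of the lowest-test-MSE model and all test MSEs."""
    scores = {
        name: mse(y_test, [predict(x) for x in x_test])
        for name, predict in models.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical, already-fitted candidate models (assumed for illustration).
models = {
    "constant": lambda x: 2.0,
    "linear": lambda x: 2.0 * x,
    "quadratic": lambda x: x ** 2,
}

# Hypothetical test observations that were not used for fitting.
x_test = [0.0, 1.0, 2.0, 3.0]
y_test = [0.1, 2.1, 3.9, 6.2]

best, scores = select_by_test_mse(models, x_test, y_test)
print(best, scores)
```

On this toy data the linear candidate has the smallest test MSE, so it is the one selected.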

But what if no test observations are available? In that case, one might imagine simply selecting a statistical learning method that minimizes the training MSE.

This seems like it might be a sensible approach, since the training MSE and the test MSE appear to be closely related. Unfortunately, there is a fundamental problem with this strategy: there is no guarantee that the method with the lowest training MSE will also have the lowest test MSE. Roughly speaking, the problem is that many statistical methods specifically estimate coefficients so as to minimize the training set MSE. For these methods, the training set MSE can be quite small, but the test MSE is often much larger.
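The gap between training and test MSE can be seen in a small simulation. The sketch below is not from the original text: it uses k-nearest-neighbors averaging only because it is easy to write without libraries, and the true function, noise level, and sample sizes are assumptions. Smaller `k` means a more flexible fit.

```python
import random


def knn_predict(x_train, y_train, x, k):
    """Average the responses of the k training points nearest to x."""
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x))[:k]
    return sum(y_train[i] for i in nearest) / k


def mse(y_true, y_pred):
    """Average squared prediction error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def sample(n):
    """Draw n noisy observations from an assumed true function f(x) = x^2."""
    xs = [random.uniform(-2, 2) for _ in range(n)]
    ys = [x ** 2 + random.gauss(0, 1) for x in xs]
    return xs, ys


random.seed(0)
x_train, y_train = sample(50)
x_test, y_test = sample(200)

results = {}
for k in (25, 5, 1):  # decreasing k = increasing flexibility
    train_mse = mse(y_train, [knn_predict(x_train, y_train, x, k) for x in x_train])
    test_mse = mse(y_test, [knn_predict(x_train, y_train, x, k) for x in x_test])
    results[k] = (train_mse, test_mse)
    print(f"k={k:2d}  training MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

With `k = 1` every training point is its own nearest neighbor, so the training MSE is exactly zero, yet the test MSE stays well above zero because the fit has memorized the noise.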

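It is clear that as the level of flexibility increases, the fitted curves (smoothing splines, for example) follow the observed data more closely, so the training MSE typically keeps falling even when the test MSE does not.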

One important method is cross-validation (Chapter 5), which estimates the test MSE using only the training data.
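A rough sketch of k-fold cross-validation, written with the standard library only, might look like the following. The k-nearest-neighbors model, the simulated data, and the names `knn_predict` and `cv_mse` are assumptions chosen for illustration; in practice a library routine would usually be used instead.

```python
import random


def knn_predict(x_train, y_train, x, k):
    """Average the responses of the k training points nearest to x."""
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x))[:k]
    return sum(y_train[i] for i in nearest) / k


def cv_mse(xs, ys, k_neighbors, n_folds=5):
    """Estimate test MSE by n_folds-fold cross-validation on the training data."""
    n = len(xs)
    indices = list(range(n))
    # Assign every n_folds-th observation to the same fold.
    # This assumes the data are already in random order; shuffle first otherwise.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    total_sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        # Fit on everything except the held-out fold.
        x_fit = [xs[i] for i in indices if i not in held]
        y_fit = [ys[i] for i in indices if i not in held]
        # Score on the held-out fold only.
        for i in held_out:
            pred = knn_predict(x_fit, y_fit, xs[i], k_neighbors)
            total_sq_err += (ys[i] - pred) ** 2
    return total_sq_err / n


random.seed(0)
# Simulated training data from an assumed true function f(x) = x^2 plus noise.
xs = [random.uniform(-2, 2) for _ in range(60)]
ys = [x ** 2 + random.gauss(0, 1) for x in xs]

cv_scores = {k: cv_mse(xs, ys, k) for k in (1, 5, 15)}
best_k = min(cv_scores, key=cv_scores.get)
print(cv_scores, best_k)
```

Each observation is predicted by a model that never saw it during fitting, so the averaged error behaves like a test MSE even though no separate test set exists.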


Written by Preet Mehta