Category: Machine Learning
Summary: Measuring the test-error spike near the interpolation threshold and its subsequent decline as the model becomes more overparameterized.
The double-descent phenomenon challenges the older intuition that larger models should monotonically overfit more. This experiment asks how test error changes as a single-hidden-layer network crosses the interpolation threshold where the number of parameters becomes comparable to the number of training samples.
The setup uses synthetic multi-class classification with fixed training and test sets while sweeping the hidden-layer width of an MLP. This yields a direct comparison between the underparameterized, near-threshold, and strongly overparameterized regimes on the same task.
The goal is to map the full shape of the test-error curve, not just to find the best-performing width. The expected signature is a peak in test error near the interpolation threshold followed by a second descent as model size continues to grow.
Method: Single-hidden-layer MLP width sweep on synthetic classification data to compare generalization below, near, and above the interpolation threshold.
What is measured: Test error across width, location of the interpolation threshold, and comparative behavior in underparameterized versus overparameterized regimes.
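The sweep described above can be sketched as follows. This is a minimal illustration using scikit-learn; the task dimensions, widths, and optimizer settings are assumptions for demonstration, not the experiment's actual configuration.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

warnings.simplefilter("ignore", ConvergenceWarning)

# Fixed synthetic multi-class task; the same train/test split is reused
# for every width so curves are directly comparable. (Hypothetical sizes.)
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, n_classes=4, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

n_features, n_classes, n_train = 20, 4, len(X_tr)
widths = [2, 4, 8, 16, 32, 64, 128, 256]  # spans the three regimes

test_errors = []
for w in widths:
    clf = MLPClassifier(hidden_layer_sizes=(w,), max_iter=1000, random_state=0)
    clf.fit(X_tr, y_tr)
    test_errors.append(1.0 - clf.score(X_te, y_te))

    # Parameter count for a single hidden layer: input weights + hidden
    # biases + output weights + output biases.
    n_params = n_features * w + w + w * n_classes + n_classes
    regime = "over" if n_params > n_train else "under/near-threshold"
    print(f"width={w:4d}  params={n_params:6d}  ({regime}parameterized)")
```

The interpolation threshold is located by finding the width at which the parameter count first exceeds the number of training samples; the double-descent signature, if present, appears as a local maximum of `test_errors` near that width.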
