Experiment: Random Label Memorization



Category: Machine Learning

Summary: Replicating the claim that deep networks can memorize random labels while comparing the speed, norms, and generalization costs of memorizing real versus corrupted targets.


A central result in modern deep learning is that large networks can fit random labels even when those labels contain no meaningful structure. This experiment revisits that finding on a controlled Gaussian-cluster task, asking how memorization dynamics change as labels move from real to partly corrupted to fully random.

The experiment trains the same network under several label conditions and records how many epochs it takes to reach perfect training accuracy, how large the weight norms grow, and how much test accuracy survives. Together, those measurements separate raw memorization capacity from useful generalization.
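The label conditions can be produced by interpolating between real and random targets. As a minimal sketch (this `corrupt_labels` helper is hypothetical, not the project's actual code; the experiment's corruption scheme may differ in detail), a chosen fraction of labels is replaced with classes drawn uniformly at random:

```python
import numpy as np

def corrupt_labels(y, frac, num_classes, rng):
    """Replace a fraction of labels with uniform random classes.

    frac=0.0 keeps the real labels; frac=1.0 yields fully random targets.
    Hypothetical helper illustrating one common corruption scheme.
    """
    y = y.copy()
    n = len(y)
    # Pick distinct positions to corrupt, then overwrite with random classes.
    idx = rng.choice(n, size=int(round(frac * n)), replace=False)
    y[idx] = rng.integers(0, num_classes, size=len(idx))
    return y
```

Note that a randomly drawn class can coincide with the original label, so the effective corruption rate is slightly below `frac`.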

The value of the experiment is not in proving that memorization is possible, which is already known, but in quantifying how the burden of memorization shows up in optimization speed and parameter growth. That helps connect the phenomenon to broader questions about implicit bias and generalization.

Method: Repeated MLP training on Gaussian-cluster classification under real, partially corrupted, and fully random labels, with trajectory logging of accuracy, loss, and norms.
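A Gaussian-cluster task of this kind can be generated as follows. This is a sketch under assumptions (the function name, the unit-Gaussian class means, and the `spread` parameter are illustrative choices, not the experiment's published configuration):

```python
import numpy as np

def make_gaussian_clusters(n_per_class, num_classes, dim, spread, rng):
    """Sample a synthetic classification task: one Gaussian blob per class.

    Class means are drawn once from a standard normal; each point is its
    class mean plus isotropic Gaussian noise scaled by `spread`.
    """
    means = rng.normal(0.0, 1.0, size=(num_classes, dim))
    X = np.concatenate([
        means[c] + spread * rng.normal(size=(n_per_class, dim))
        for c in range(num_classes)
    ])
    y = np.repeat(np.arange(num_classes), n_per_class)
    return X, y
```

A small `spread` makes the real-label task easy to generalize, which is what makes the contrast with corrupted labels informative.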

What is measured: Epochs to perfect training accuracy, final train and test loss, final weight norm, weight-norm ratio across label conditions, memorization-speed ratio, and generalization gap.
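Given logged trajectories, these quantities reduce to simple comparisons between the real-label and random-label runs. A minimal sketch, assuming each run logs per-epoch `train_acc`, `test_acc`, and `weight_norm` lists (the dictionary layout and function names are hypothetical):

```python
def epochs_to_perfect(train_acc):
    """First 1-indexed epoch at which training accuracy reaches 1.0, or None."""
    for t, acc in enumerate(train_acc, start=1):
        if acc >= 1.0:
            return t
    return None

def summarize(real, random_run):
    """Compare a real-label run against a random-label run.

    Each argument is a dict of per-epoch trajectories with keys
    'train_acc', 'test_acc', and 'weight_norm'.
    """
    return {
        # How many times longer full memorization takes on random labels.
        "speed_ratio": epochs_to_perfect(random_run["train_acc"])
                       / epochs_to_perfect(real["train_acc"]),
        # How much larger the final weights grow under random labels.
        "norm_ratio": random_run["weight_norm"][-1] / real["weight_norm"][-1],
        # Train-minus-test accuracy at the end of each run.
        "gen_gap_real": real["train_acc"][-1] - real["test_acc"][-1],
        "gen_gap_random": random_run["train_acc"][-1] - random_run["test_acc"][-1],
    }
```

Under fully random labels the test accuracy should sit near chance, so the generalization gap approaches the final training accuracy itself.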


© 2026 Axiom Project