Category: Machine Learning
Summary: Studying gradient starvation, where easy high-signal features suppress the learning of harder low-signal features during neural-network training.
Not all input features are equally easy to learn. This experiment asks how strongly easy, high-SNR features dominate gradient updates and delay or suppress the learning of weaker, low-SNR features, a phenomenon known as gradient starvation.
The forward pass, backpropagation, and stochastic gradient descent are implemented directly in NumPy so the competition dynamics can be measured transparently. The central comparison is how training allocates representational capacity between easy high-SNR features and harder low-SNR ones over time.
This makes the experiment a mechanistic study of learning order rather than a benchmark of final accuracy: the goal is to observe when dominant features crowd out weaker structure and how that shapes the resulting representation.
Method: From-scratch NumPy neural-network training that tracks how gradients and learning progress are distributed between easy high-SNR and hard low-SNR features.
What is measured: Relative learning progress of high-SNR versus low-SNR features, evidence for gradient starvation, and resulting representation allocation during training.
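The dynamics described above can be illustrated with a minimal NumPy sketch (not the experiment's actual implementation): a logistic-regression toy with one high-SNR and one low-SNR feature, tracking per-feature gradient magnitude over SGD steps. The signal scales (1.0 vs. 0.2) and noise levels (0.1 vs. 1.0) are illustrative assumptions chosen to make the SNR gap large.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, n) * 2 - 1  # labels in {-1, +1}

# Feature 0: high SNR (strong class signal, little noise).
# Feature 1: low SNR (weak class signal, heavy noise).
X = np.stack([
    y * 1.0 + rng.normal(0.0, 0.1, n),
    y * 0.2 + rng.normal(0.0, 1.0, n),
], axis=1)

w = np.zeros(2)
lr = 0.1
grad_norms = []  # per-feature |gradient| at each step

for step in range(200):
    margins = y * (X @ w)
    # Logistic-loss gradient: -y * x * sigmoid(-margin), batch-averaged.
    weights = 1.0 / (1.0 + np.exp(margins))
    g = -(X * (y * weights)[:, None]).mean(axis=0)
    grad_norms.append(np.abs(g))
    w -= lr * g

grad_norms = np.array(grad_norms)  # shape (steps, 2)
```

In this toy, the high-SNR feature receives the larger gradient from the first step and quickly drives the margins up; because both features share the per-example loss weight `sigmoid(-margin)`, the easy feature's success shrinks the gradient available to the hard feature, which still carries real signal. Comparing `grad_norms[:, 0]` against `grad_norms[:, 1]` over training makes this starvation visible.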
