Experiment: Representation Alignment Dynamics



Category: Machine Learning

Summary: Measuring whether neural networks trained from different random seeds converge toward similar hidden representations, and how that depends on width, depth, and training time.


Neural networks can reach similar accuracy while relying on internally different features, so it is an open question how reproducible learned representations really are. This experiment asks whether independent runs on the same task align in representation space, whether wider networks align more strongly, and which layers converge first during training.

The script trains families of simple networks on a synthetic classification task and compares hidden activations across random seeds using centered kernel alignment (CKA). Checkpoints taken throughout training turn the problem into a dynamical one: not just whether alignment appears, but when it emerges and how architecture shapes the trajectory.
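The core comparison described above is centered kernel alignment between activation matrices from two independently trained networks. A minimal sketch of the linear variant of CKA (function name and shapes are illustrative, not taken from the project's script):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two activation matrices.

    X, Y: (n_samples, n_features) hidden activations from two networks
    evaluated on the same probe inputs. Feature dimensions may differ.
    Returns a similarity score in [0, 1]; 1 means the representations
    agree up to rotation and isotropic scaling.
    """
    # Center each feature across the sample dimension.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Linear CKA: ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F).
    cross = np.linalg.norm(X.T @ Y) ** 2
    norm_x = np.linalg.norm(X.T @ X)
    norm_y = np.linalg.norm(Y.T @ Y)
    return cross / (norm_x * norm_y)
```

Because linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, it can compare layers of different widths, which is what makes the width-dependence question well-posed.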

That matters because alignment can distinguish stable shared feature learning from idiosyncratic memorization. The result helps connect width, lazy-learning ideas, and representation geometry in a controlled small-model setting.

Method: Repeated neural-network training across seeds with checkpointed centered kernel alignment measurements of hidden representations.

What is measured: CKA similarity across seeds, width and depth dependence of alignment, layer-wise convergence timing, and training-time evolution of representation similarity.
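The cross-seed quantities above reduce to averaging CKA over all pairs of seeds at each layer and checkpoint. A hedged sketch of that aggregation step, assuming activations have already been collected on a shared probe set (the function and data layout are hypothetical, not from the project's code):

```python
import numpy as np
from itertools import combinations

def mean_pairwise_cka(acts_by_seed):
    """Average linear CKA over all seed pairs for one layer/checkpoint.

    acts_by_seed: list of (n_samples, n_features) activation matrices,
    one per random seed, all evaluated on the same probe inputs.
    """
    def cka(X, Y):
        X = X - X.mean(axis=0)
        Y = Y - Y.mean(axis=0)
        return np.linalg.norm(X.T @ Y) ** 2 / (
            np.linalg.norm(X.T @ X) * np.linalg.norm(Y.T @ Y))

    # One score per unordered pair of seeds; the mean summarizes
    # how aligned this layer is at this point in training.
    scores = [cka(a, b) for a, b in combinations(acts_by_seed, 2)]
    return float(np.mean(scores))
```

Sweeping this statistic over layers and checkpoints yields the layer-wise convergence timing and training-time evolution curves the experiment reports.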


Powered by BOINC
© 2026 Axiom Project