Category: Machine Learning
Summary: Testing whether wider networks lose compositional generalization because they learn more overlapping internal subspaces for different feature groups.
Compositional generalization may require different feature factors to remain cleanly separated inside a network. This experiment asks whether wider models instead learn overlapping subspaces for distinct feature groups, making them less able to recombine features correctly on unseen combinations.
The experiment trains networks on a compositional task, measures principal-angle overlap between learned subspaces, and tracks effective rank and generalization through training. It also compares how these quantities change with width and with weight decay, a regularizer that may reduce representational overlap.
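The two core diagnostics can be sketched directly. The snippet below is a minimal NumPy illustration, not the project's actual code: it estimates subspace overlap as the mean cosine of principal angles between the top-k activation subspaces of two feature groups, and computes the entropy-based effective rank of an activation matrix. The matrix shapes (samples × hidden_dim) and the choice of k are assumptions.

```python
import numpy as np

def principal_angle_overlap(A, B, k=5):
    """Mean cosine of principal angles between the top-k subspaces
    spanned by two activation matrices of shape (samples, hidden_dim).
    1.0 means the subspaces coincide; 0.0 means they are orthogonal."""
    # Orthonormal bases for each feature group's activation subspace
    Ua = np.linalg.svd(A.T, full_matrices=False)[0][:, :k]
    Ub = np.linalg.svd(B.T, full_matrices=False)[0][:, :k]
    # Singular values of Ua^T Ub are the cosines of the principal angles
    cosines = np.linalg.svd(Ua.T @ Ub, compute_uv=False)
    return float(np.mean(np.clip(cosines, 0.0, 1.0)))

def effective_rank(H):
    """Entropy-based effective rank: exp of the Shannon entropy of the
    normalized singular-value spectrum of an activation matrix."""
    s = np.linalg.svd(H, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))
```

For example, two feature groups whose activations live in disjoint coordinate blocks give overlap near 0, while a group compared with itself gives overlap 1; a matrix with r equal singular values has effective rank exactly r.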
That makes the project a mechanism test rather than just another width sweep. The aim is to connect a geometric property of hidden representations to the practical failure of out-of-distribution compositional reasoning.
Method: Controlled neural-network training with principal-angle subspace analysis and checkpointed rank diagnostics across widths and regularization settings.
What is measured: Subspace overlap between feature groups, effective rank, in-distribution and out-of-distribution accuracy, compositional gap, and correlation between overlap and generalization.
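The last two quantities combine as follows; the numbers below are hypothetical placeholders for per-run diagnostics, shown only to make the definitions concrete.

```python
import numpy as np

# Hypothetical diagnostics, one entry per trained network (illustrative only)
overlaps = np.array([0.12, 0.35, 0.48, 0.71, 0.83])  # mean subspace overlap
id_acc   = np.array([0.97, 0.96, 0.97, 0.95, 0.96])  # in-distribution accuracy
ood_acc  = np.array([0.88, 0.74, 0.63, 0.51, 0.42])  # unseen feature combinations

# Compositional gap: how much accuracy drops on novel combinations
comp_gap = id_acc - ood_acc

# Pearson correlation between representational overlap and the gap
r = np.corrcoef(overlaps, comp_gap)[0, 1]
```

A strongly positive correlation across widths and regularization settings would support the hypothesis that overlapping subspaces drive the compositional failure.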
