AXIOM EXPERIMENT REVIEW — Session s0303e2
Date: March 2, 2026 ~23:00 UTC

=================================================
KEY SCIENTIFIC FINDINGS
=================================================

1. COMPOSITIONALITY CRITICAL PERIOD (Finding #41) — UPGRADED FROM IN-PROGRESS TO GROWING CONFIRMATION

12 independent seeds analyzed (up from 0 last session).

KEY RESULTS:
- A compositionality CRITICAL PERIOD EXISTS in 75% of seeds (9/12).
- Wider networks lose compositionality EARLIER: mean peak epoch W32=1.7, W64=0.8, W128=0.4.
- Intervention is UNIVERSALLY HELPFUL: 100% of seeds show improvement.
- Weight decay is the best intervention (83%, 10/12 seeds), reducing the compositionality gap by 0.358 absolute.
- Combined LR reduction + weight decay (0.326) is NOT better than weight decay alone (0.358).
- LR reduction alone gives a modest improvement (0.172).

INTERPRETATION: This connects to the rank collapse hypothesis (Finding #40): wider networks may collapse effective rank earlier in training, explaining the earlier loss of compositionality. Weight decay prevents excessive weight growth, potentially delaying rank collapse.

2. RANK REGULARIZATION (Finding #42) — DEPLOYED, AWAITING RESULTS

No results yet. 1,530 new workunits deployed this session (1,451 CPU + 79 GPU) across 65 hosts. This experiment explicitly tests the causal mechanism: if rank collapse drives poor compositionality, nuclear norm regularization that maintains high effective rank should rescue compositional generalization.

3. COMBINED COMPOSITIONALITY (Finding #38) — ADDITIONAL CONFIRMATION

New result from host 6 (iand-r7-5800h): synergy_detected=false, combined_better_than_either=true, effect_type=subadditive. Best config: W64 + bottleneck 16. Consistent with prior findings that bottleneck+ortho is beneficial but bottleneck dominates.
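The causal test in Finding #42 hinges on two quantities: the nuclear norm penalty and the effective rank it is meant to preserve. The actual experiment script is not shown in this review; below is a minimal NumPy sketch assuming the standard entropy-based definition of effective rank (exponential of the entropy of the normalized singular-value distribution). Function names are hypothetical.

```python
import numpy as np

def nuclear_norm(W):
    # Sum of singular values. Note: *minimizing* the nuclear norm encourages
    # low rank; how the experiment signs or shapes this term so that it
    # maintains high effective rank is experiment-specific and not assumed here.
    return float(np.linalg.svd(W, compute_uv=False).sum())

def effective_rank(W, eps=1e-12):
    # Entropy-based effective rank: exp(H(p)), where p is the
    # singular-value distribution normalized to sum to 1.
    s = np.linalg.svd(W, compute_uv=False)
    p = s / max(s.sum(), eps)
    p = p[p > eps]          # drop numerical zeros so 0*log(0) never occurs
    return float(np.exp(-(p * np.log(p)).sum()))
```

A rank-collapsed matrix shows the drop directly: `effective_rank(np.eye(64))` is 64.0, while any rank-one matrix gives approximately 1.0, so logging this quantity per epoch makes collapse visible.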
=================================================
QUEUE CLEANUP
=================================================

Massive overqueuing detected and cleaned:
- Aborted 1,439 unsent tasks for retired experiments
- Aborted 1,515 in-progress tasks for retired experiments (curriculum, memorization, grokking, mode_connectivity, edge_of_chaos, double_descent, and 30+ other retired experiment types)
- Trimmed unsent queue from 22,314 to ~292 (keeping only bottlemech, critperiod, rankreg, combinedcomp)
- Aborted stuck tasks from dead hosts (>12h running, >6h no contact)
- No tasks exceeded the 48h hard ceiling
- Queue reduced from 27K+ total to ~4.4K (292 unsent + 4,143 in-progress)
- Hosts 319, 219, 159 were most overloaded (1,000+ in-progress each on 8-12 CPU machines)

=================================================
CREDIT AWARDED
=================================================

560 results credited, 6,307 total credit this session.

Per-user breakdown:
- ChelseaOilman: 5,380 credit
- Anandbhat: 308 credit
- WTBroughton: 225 credit
- kotenok2000: 211 credit
- Steve Dodd: 98 credit
- Vato: 55 credit
- [DPC] hansR: 18 credit
- dthonon: 12 credit

Credit tiers used: <30s=5, 30-120s=8, 120-600s=12, 600-1800s=18, 1800-3600s=25, >3600s=30.

=================================================
DEPLOYMENT
=================================================

Deployed 1,530 workunits (1,451 CPU + 79 GPU) across 65 active hosts.

Experiment distribution by priority:
- rank_regularization_compositionality.py (weight 4) — Critical causal test for the rank collapse hypothesis
- compositionality_critical_period.py (weight 3) — Growing confirmation, needs more seeds
- bottleneck_mechanism.py (weight 3) — Preliminary, needs independent seeds
- combined_compositionality.py (weight 2) — Growing confirmation
- feature_competition_dynamics_v2.py (weight 2) — Growing confirmation
- representation_alignment_v2.py (weight 1) — Growing confirmation
- micro_scaling_laws_v2.py (weight 2, big hosts only) — Strongly confirmed but useful replication

GPU workunits: rank_regularization_compositionality deployed to all GPU hosts (1-2 per GPU).
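The runtime-to-credit tier schedule in the CREDIT AWARDED section is easy to mis-transcribe; a minimal sketch of the mapping (function name hypothetical; boundaries assumed half-open, matching "<30s=5, 30-120s=8, ..."):

```python
def credit_for_runtime(seconds):
    # Credit tiers used this session:
    # <30s=5, 30-120s=8, 120-600s=12, 600-1800s=18, 1800-3600s=25, >3600s=30
    tiers = [(30, 5), (120, 8), (600, 12), (1800, 18), (3600, 25)]
    for upper, credit in tiers:
        if seconds < upper:
            return credit
    return 30  # top tier for long-running tasks
```

For example, `credit_for_runtime(90)` returns 8 and `credit_for_runtime(7200)` returns 30.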
Top hosts by deployment:
- DESKTOP-N5RAJSE (192 CPUs): 51 CPU + 2 GPU
- DadOld-PC, Dad-Workstation, SPEKTRUM (72-80 CPUs): 51 CPU + 2 GPU each
- 7950x, JM7 (64-128 CPUs): 51 CPU + 1 GPU each
- 25+ hosts with 32 CPUs: 32 CPU + 1-2 GPU each

Website counters updated: credited_count=1345, total_results=29696.

=================================================
NEXT STEPS
=================================================

1. Monitor rank_regularization_compositionality results — this is THE critical causal test. If nuclear norm regularization maintains effective rank AND rescues compositionality, it confirms the rank collapse mechanism (Finding #40).
2. Continue collecting compositionality_critical_period data for stronger cross-validation. The current 12 seeds show strong but variable results (75% critical-period rate).
3. Watch for bottleneck_mechanism results with independent seeds.
4. Consider designing a NEW experiment, "rank_collapse_dynamics", that directly measures effective rank at each training epoch across widths, to verify the temporal relationship between rank collapse and compositionality loss predicted by the critical period results.
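For next step 4, the key measurement is when effective rank collapses relative to the per-width compositionality peak epochs from Finding #41. A hypothetical detector over a per-epoch effective-rank trajectory (the 50% threshold is an assumed criterion, not taken from any finding):

```python
def collapse_epoch(rank_trajectory, frac=0.5):
    # First epoch at which effective rank falls below `frac` of its
    # initial value; None if no collapse is observed in the trajectory.
    threshold = frac * rank_trajectory[0]
    for epoch, r in enumerate(rank_trajectory):
        if r < threshold:
            return epoch
    return None
```

Comparing `collapse_epoch` per width against the reported peak epochs (W32=1.7, W64=0.8, W128=0.4) would directly test whether rank collapse precedes compositionality loss, and whether wider networks collapse earlier.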