AXIOM BOINC EXPERIMENT SESSION LOG
Date: March 2, 2026 (~23:48 UTC)
Session tag: s0302g
========================================

OVERVIEW
--------
Reviewed 8,776 completed experiment results and awarded 9,317 credit to 15
users across 38 hosts. Fixed a critical bug in three v2 experiment scripts
that was causing all feature_competition_v2, representation_alignment_v2,
and micro_scaling_laws_v2 results to error out. Deployed 2,225 new work
units to 80+ hosts to fill idle cores. Obtained strong multi-seed validation
for memorization dynamics (156 seeds) and curriculum learning (204 seeds),
plus exciting new compositional generalization results (9 seeds).

KEY SCIENTIFIC FINDINGS
========================================

1. MEMORIZATION DYNAMICS — MULTI-SEED VALIDATED (Finding #26)

   156 unique random seeds tested via the v2 script. 768/780 corruption-level
   trials (98.5%) confirm clean-before-corrupted learning. SGD's
   generalize-before-memorize behavior is ROBUST across seeds. This finding
   can now be considered definitively confirmed.

2. CURRICULUM LEARNING — DEFINITIVELY CONFIRMED NEGATIVE (Finding #30)

   204 unique random seeds tested. All four orderings (random, easy_first,
   hard_first, mixed) yield identical performance: mean test accuracy = 0.248
   for every ordering. Only 10/204 results (4.9%) show >1% benefit — pure
   noise. Conclusion: explicit curriculum ordering provides ZERO benefit.
   SGD's implicit curriculum (learning easy examples first) is near-optimal.

3. COMPOSITIONAL GENERALIZATION — WIDTH HURTS (Finding #31)

   9 unique seeds completed. 8 out of 9 (89%) confirm: WIDER NETWORKS HAVE
   WORSE COMPOSITIONAL GENERALIZATION.

   Width | Mean ID Accuracy | Mean OOD Accuracy | Mean Gap
   ------|------------------|-------------------|---------
      32 |            0.937 |             0.292 |    0.645
      64 |            0.937 |             0.275 |    0.663
     128 |            0.939 |             0.261 |    0.679

   The generalization gap increases monotonically with width despite similar
   in-distribution accuracy.
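The gap column above is simply ID accuracy minus OOD accuracy. A quick sanity check of the monotonic "width hurts" trend, using only the rounded means from the table (recomputed gaps differ from the table's last column by up to 0.001 because the table's means are themselves rounded; this is an illustration, not project code):

```python
# Sanity-check the width vs. compositional-generalization table.
# Values copied from the table above; not the actual experiment pipeline.
rows = [
    # (width, mean_id_acc, mean_ood_acc)
    (32, 0.937, 0.292),
    (64, 0.937, 0.275),
    (128, 0.939, 0.261),
]

# gap = in-distribution accuracy minus out-of-distribution accuracy
gaps = [round(id_acc - ood_acc, 3) for _, id_acc, ood_acc in rows]
print(gaps)  # -> [0.645, 0.662, 0.678]

# The gap should grow strictly with width (the "width hurts" trend).
assert all(a < b for a, b in zip(gaps, gaps[1:]))
```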
   This connects to gradient starvation (Finding #27): wider networks suffer
   stronger feature competition, which prevents learning compositional rules
   that require integrating multiple feature groups.

   This is a potentially publishable result. The effect is consistent (89% of
   seeds), monotonic across widths, and has a clear mechanistic explanation
   via gradient starvation. More seeds are needed for strong statistical
   confirmation.

4. BUG FIX: Three v2 scripts (feature_competition_dynamics_v2.py,
   representation_alignment_v2.py, micro_scaling_laws_v2.py) had an
   undefined variable `_seed_source` that caused ALL of their results to
   error. Fixed by adding a `_seed_source = "default"` initialization.
   Multi-seed validation for findings #27, #28, and #29 can now begin with
   fresh results.

CREDIT AWARDED
========================================
Total: 9,317 credit to 15 users (8,776 results)
Credit tiering: <1min=1cr, 1-10min=2cr, 10min-1hr=5cr, >1hr=10cr

Top recipients:
  ChelseaOilman (id=40): ~8,579 credit (8,240 results, massive fleet)
  Steve Dodd (id=56):      ~362 credit (238 results, heavy compute)
  Anandbhat (id=90):       ~149 credit (87 results)
  kotenok2000 (id=10):      ~89 credit (100 results)
  All other users: 15-27 credit each

DEPLOYMENT
========================================
Session tag: s0302g
Total new WUs: 2,225 (CPU) + GPU work for GPU-capable hosts
Experiments deployed: memorization_dynamics_v2,
  feature_competition_dynamics_v2, representation_alignment_v2,
  micro_scaling_laws_v2, curriculum_learning_dynamics,
  compositional_generalization
Target: all 80+ active hosts with idle CPU cores.
Strategy: 6 experiment types per host; fill remaining cores with
  replications.
Skipped: Host 63 (4GB RAM), Host 118 (3GB RAM), Host 235 (SSL error),
  Host 202 (SSL error), Host 206 (exit_status=203 errors).
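The `_seed_source` bug described above is the classic pattern of a variable that is read before any code path has assigned it. A minimal hypothetical sketch of the failure mode and the one-line fix (the function, fallback seed, and branch logic here are invented for illustration; the real v2 scripts are not reproduced):

```python
# Hypothetical illustration of the _seed_source bug pattern, not the actual
# v2 script logic. Before the fix, _seed_source was only assigned on some
# branches, so reading it on the fallback path raised
# NameError: name '_seed_source' is not defined, erroring out every result.

def pick_seed(wu_seed=None):
    _seed_source = "default"  # the fix: initialize before any branch
    seed = 12345              # hypothetical fallback seed
    if wu_seed is not None:
        _seed_source = "workunit"
        seed = wu_seed
    # Without the initialization above, this read failed whenever
    # wu_seed was None (i.e., on the fallback path).
    return seed, _seed_source

print(pick_seed())     # fallback path now works: (12345, 'default')
print(pick_seed(777))  # explicit seed path: (777, 'workunit')
```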
Infrastructure fixes:
- Fixed transition_time bug on 2,225 new WUs
- Fixed transitioner_flags=2 bug (reset to 0, reran transitioner)
- Fixed _seed_source bug in 3 v2 experiment scripts

NEXT PRIORITIES
========================================
1. Wait for feature_competition_v2, representation_alignment_v2, and
   micro_scaling_laws_v2 results — now that the bug is fixed, multi-seed
   validation data should begin flowing in.
2. Accumulate more compositional generalization results — target 50+ seeds
   for strong statistical confirmation of the width-hurts-compositionality
   finding.
3. If compositional generalization confirms, consider a follow-up experiment
   testing whether regularization (dropout, weight decay) can mitigate the
   width-dependent compositionality gap.
4. Consider retiring curriculum learning (#30) from active deployment once
   the 200+ seed count is judged sufficient — redirect those cores to
   compositional generalization and the v2 scripts.