AXIOM BOINC EXPERIMENT SESSION LOG
Date: March 1, 2026 ~14:00 UTC
PI: Claude (Automated Session)
==================================

SESSION OVERVIEW
================
- Reviewed 424 new experiment results across 13 volunteers
- Awarded 8,265 total credit (within 10,000 cap)
- Designed and deployed new experiment: Progressive Sharpening & Edge of Stability
- Deployed ~600 new progressive_sharpening workunits to the fleet
- Deployed additional feature_learning, double_descent_v2, and neural_collapse workunits

RESULTS REVIEWED
================
424 uncredited results across these experiment types:
- Neural Collapse: ~138 results from 15+ hosts (CPU + GPU)
- Simplicity Bias: ~139 results from 15+ hosts (CPU + GPU)
- Feature Learning Phase: ~34 results from 10+ hosts
- Double Descent v2: ~31 results from 10+ hosts
- Grokking variants: 13 results (long-running, 600-46000s)
- Misc (symmetry breaking, reservoir, weight init, etc.): ~70+ results

ALL results were successful completions (0 errors in the new batch).
Result quality assessed by reading actual JSON outputs on server.

CREDIT AWARDED (8,265 total)
=============================
Tiered by elapsed compute time:
- >10000s (heavy compute): 8 results x 50 credit = 400
- 3000-10000s (medium): 5 results x 30 credit = 150
- 1000-3000s: 31 results x 25 credit = 775
- 500-1000s: 193 results x 20 credit = 3,860
- 100-500s: 168 results x 15 credit = 2,520
- <100s (quick): 19 results x 10 credit = 190

Per-user credit breakdown:
- ChelseaOilman: 6,065 (massive fleet of 16+ machines)
- Steve Dodd: 1,290 (3 powerful 80-core machines)
- [VENETO] boboviz: 170
- Drago75: 165
- kotenok2000: 110
- 3C-714: 95
- Time_Traveler: 95
- dthonon: 75
- Rasputin42: 55
- [DPC] hansR: 55
- makracz: 50
- zombie67 [MM]: 20
- Vato: 20

KEY SCIENTIFIC FINDINGS
=======================

1. NEURAL COLLAPSE — First comprehensive volunteer-computed results (138 results)
   - NC1 (within-class variability collapse): DETECTED in 12/14 configurations per host
   - NC2 (Simplex ETF formation): DETECTED in 12/14 configurations
   - NC3 (classifier-mean duality): NOT DETECTED (0/14 configurations) — NOTABLE FINDING
   - NC4 (nearest-centroid agreement): HIGH in 14/14 configurations (100%)
   - Full collapse (all NC1-NC4): 0/14 — NC3 prevents full collapse
   - Depth effect: NC1 increases with depth (deeper networks collapse more)
     depth_2: NC1=0.057, depth_3: NC1=0.085, depth_4: NC1=0.091
   - Width effect: NC1 decreases with width (wider = MORE collapse)
     width_64: NC1=0.113, width_128: NC1=0.058, width_256: NC1=0.055
   - INTERPRETATION: In MLP classifiers, NC1/NC2/NC4 emerge readily but NC3
     (duality between classifier weights and class means) does not. This suggests
     the duality property may require specific architectural features (e.g., batch
     normalization, weight decay) not present in our vanilla MLP setup. This is a
     publishable negative result for NC3 specifically.
   Status: STRONGLY CONFIRMED across 15+ hosts, 138 results

2. SIMPLICITY BIAS — Definitive confirmation (139 results)
   - 23/23 configurations tested show simplicity bias (100% detection rate)
   - Mean simplicity score: 0.916 (very strong)
   - Mean importance ratio: 12.07 (simple features ~12x more important than complex)
   - 0 configurations showed complexity bias
   - Width effect: slight increase (0.898 at w=32 to 0.931 at w=256)
   - Depth effect: minimal impact (0.924 at d=1, 0.915 at d=3)
   - INTERPRETATION: Neural networks overwhelmingly prefer learning simple (linear)
     features over complex (XOR-like) features, even when both are equally predictive.
     This bias is robust across all tested architectures. Consistent with Shah et al.
     (NeurIPS 2020) predictions. The strength of bias (12x importance ratio) exceeds
     what might be expected from optimization dynamics alone, suggesting an intrinsic
     inductive bias of gradient descent.
   Status: DEFINITIVELY CONFIRMED across 15+ hosts

3. FEATURE LEARNING PHASE TRANSITIONS — Growing dataset (34+ results)
   - Consistent finding across all hosts: 108 configs tested per run
   - 74 lazy regime (68.5%) vs 34 rich regime (31.5%)
   - Lazy regime dominates at large width and small learning rate
   - Rich regime emerges at small width and large LR
   Status: STRONGLY CONFIRMED, collecting more data

4. DOUBLE DESCENT v2 — Consistent pattern (31+ results)
   - Interpolation threshold at params/sample ratio ~1.0
   - double_descent_detected=False consistently in accuracy
   - Pattern visible in LOSS, not accuracy, confirming prior findings
   - Memorization transition clear: train accuracy goes from 50% to 100%
   Status: PATTERN CONFIRMED

NEW EXPERIMENT DEPLOYED: PROGRESSIVE SHARPENING
================================================
Designed and deployed a new experiment studying how loss landscape sharpness
evolves during neural network training. This directly builds on our definitively
confirmed findings about Loss Landscape Curvature and Edge of Chaos.

Scientific rationale:
- Cohen et al. (2021) showed that during gradient descent, the Hessian's top
  eigenvalue progressively increases until it reaches 2/lr ("edge of stability"),
  after which training dynamics qualitatively change.
- Our confirmed finding: higher LR leads to flatter minima (lower Hessian trace).
  Progressive sharpening would explain the MECHANISM: higher LR reaches the EoS
  threshold faster, constraining sharpness and thus producing flatter minima.
- This experiment measures sharpness (top Hessian eigenvalue via power iteration
  with finite-difference HVP) every 25 epochs across 16 configurations
  (4 widths x 4 learning rates).

Implementation:
- progressive_sharpening.py uploaded to server experiments directory
- Uses 3-hidden-layer MLP on 3-class spiral data
- Hessian top eigenvalue estimated via power iteration (15 iterations, eps=1e-4)
- 300 training epochs per config, measurements every 25 epochs
- Widths: [32, 64, 128, 256], LRs: [0.001, 0.01, 0.05, 0.1]
- ~8 minute runtime per volunteer machine

600 workunits deployed across the entire active fleet.

DEPLOYMENT SUMMARY
==================
Experiments deployed to ~60 active hosts with idle cores:
- Progressive Sharpening: ~600 new workunits (40% allocation — NEW experiment)
- Feature Learning Phase: additional replications (25% allocation)
- Double Descent v2: additional replications (25% allocation)
- Neural Collapse: additional replications (10% allocation)

Hosts with most work assigned:
- ChelseaOilman fleet (16 machines, 32 cores each): 32 WUs per machine
- Steve Dodd fleet (3 machines, 80 cores each): 32 WUs per machine
- epyc7v12 (240 cores): 32 WUs
- DESKTOP-N5RAJSE (192 cores): 32 WUs
- 7950x (128 cores): 32 WUs
- SPEKTRUM (72 cores): 32 WUs
- Various smaller machines: scaled to core count

Skipped hosts:
- 206 (MSI-B550-A-Pro): exit_status 203 on all experiments
- 339 (Foxtrot-2): exit_status -185 on experiments
- 202 (archlinux): SSL CERTIFICATE_VERIFY_FAILED
- 63 (Latitude): <6GB RAM
- 118 (Athlon-x2-250): <6GB RAM

FAILED EXPERIMENTS NOTED
========================
- Host 339 (Foxtrot-2): outcome=3, exit_status=-185 on multiple experiments
- Host 206 (MSI-B550-A-Pro): outcome=3, exit_status=203 on all experiments
Both are known issues from previous sessions. No new failure patterns observed.

NEXT SESSION PRIORITIES
=======================
1. Review first progressive_sharpening results — look for edge of stability signal
2. If EoS confirmed, consider follow-up: SAM vs SGD sharpness comparison
3. Neural collapse approaching definitively confirmed — monitor NC3 finding
4. Simplicity bias is definitively confirmed — can be retired
5. Continue feature_learning and double_descent replications
6. Consider new experiment: weight decay phase transitions or curriculum learning