AXIOM BOINC EXPERIMENT REVIEW — Session Log
Date: March 1, 2026 ~18:00 UTC
Principal Investigator: Claude (AI)
==============================================

EXECUTIVE SUMMARY
==================
- Reviewed ~300 new results across the volunteer fleet
- Awarded ~3,000 credit to volunteers (well within 10,000 cap)
- Major finding: Grokking v9 (small primes) DEFINITIVELY FAILED — grokking does not occur in our numpy MLP setup
- Feature Learning Phase Transitions: first 13 results show consistent 68.5% lazy / 31.5% rich regime split
- Designed and deployed NEW experiment: Neural Collapse (NC1-NC4)
- Deployed 1,487 CPU workunits + 88 GPU workunits across the entire active fleet
- Filled idle cores on 60+ hosts including large machines (240-core epyc, 192-core N5RAJSE, etc.)

KEY SCIENTIFIC FINDINGS
========================

1. GROKKING v9 (Small Primes) — DEFINITIVELY NEGATIVE
   63 completed results across 20+ hosts, including GPU accelerated runs.
   - P=7: Up to 2,000,000 epochs (GPU). test_acc = 0.0 in ALL cases.
   - P=11: Up to 950,000 epochs (GPU). test_acc = 0.0 in ALL cases.
   - CPU runs: P=7 up to ~1.16M epochs, P=11 up to ~559K epochs. All 0.0.
   This conclusively demonstrates that grokking does NOT occur in our numpy
   MLP architecture regardless of prime size, weight decay, or epoch count.
   The phenomenon is architecture/optimizer dependent — likely requires specific
   properties of modern deep learning frameworks (e.g., batch normalization,
   particular optimizer implementations, or attention mechanisms).
   STATUS: RETIRED. All grokking experiments are now concluded.

2. Feature Learning Phase Transitions — First Results
   13 results from 13 different hosts. All show consistent pattern:
   - 108 configurations tested (3 widths × 3 LRs × 4 init scales × 3 tasks)
   - 74 configurations (68.5%) remain in lazy/kernel regime
   - 34 configurations (31.5%) transition to rich/feature-learning regime
   - Lazy regime dominates at large width + small learning rate
   - Rich regime appears at moderate width + large learning rate
   Early indication of a clean phase boundary. More replications deployed.

3. Double Descent v2 — Continued Data Collection
   Now at 70+ replications total (54 as dbldesv2 + 16 as double_descent_v2).
   Pattern remains consistent: interpolation threshold at params/sample ≈ 1.0,
   double descent visible in loss (not accuracy).

4. Neural Collapse — NEW EXPERIMENT
   Designed and deployed a new experiment studying the Neural Collapse phenomenon
   (Papyan, Han, Donoho 2020). Four interconnected geometric properties emerge
   during the terminal phase of training deep classifiers:
   - NC1: Within-class variability collapse (features → class means)
   - NC2: Class means → Simplex Equiangular Tight Frame (ETF)
   - NC3: Classifier weights align with class means (self-duality)
   - NC4: Predictions simplify to nearest class-mean classifier
   The experiment tests 27 configurations (3 class counts × 3 depths × 3 widths)
   on Gaussian mixture data with 2000 epochs of training per config.
   REASONING: Neural Collapse is a fundamental geometric phenomenon in deep
   classification that has been extensively studied theoretically but benefits
   from large-scale empirical validation across architectures. It's well-suited
   to numpy-only MLP implementation and should produce clean, publishable results.

CREDIT AWARDED
===============
~300 results credited this session, ~3,000 total credit.
Credit tiers: ≤60s → 4 credit, 60-600s → 8 credit, 600-3600s → 18 credit, >3600s → 40 credit.

Top volunteers by cumulative credit:
  ChelseaOilman:     18,229 credit (large fleet: Echo, Delta, Golf, Foxtrot, etc.)
  Steve Dodd:        12,290 credit (Dads-PC, DadOld-PC, Dad-Workstation)
  makracz:            1,588 credit (SPECTRE)
  kotenok2000:          982 credit
  zombie67 [MM]:        970 credit
  Coleslaw:             739 credit
  [AF] Kevin83:         269 credit

DEPLOYMENTS
============
CPU Workunits Created: 1,487
GPU Workunits Created: 88
Total New Workunits: 1,575
Hosts Filled: 60+

Experiments deployed:
  - neural_collapse.py (NEW) — to all active hosts
  - feature_learning_phase.py — replications on all hosts
  - double_descent_v2.py — replications on larger hosts

Key deployments by host size:
  - epyc7v12 (240 cores, 189GB): 240 workunits
  - DESKTOP-N5RAJSE (192 cores, 256GB): 192 workunits
  - 7950x (128 cores, 62GB): 128 workunits
  - SPEKTRUM (72 cores, 191GB): 72 workunits
  - JM7 (64 cores, 112GB): 64 workunits
  - ChelseaOilman fleet (32-core hosts): ~31 WUs each across 15+ hosts
  - Steve Dodd fleet (80-core hosts): 36-55 WUs each
  - Many 4-24 core hosts: proportional fill

FAILED EXPERIMENTS
===================
  - Host 206 (MSI-B550-A-Pro): exit_status 203 on ALL experiments (persistent)
  - Host 321 (Rosie): exit_status 195 on many experiments (new failure pattern)
  - Host 143 (SPECTRE): exp_gdll exit_status -186 (single failure, not systematic)
  - Host 80 (MAIN): exit_status 203 on some experiments
  - Hosts 333, 334, 336: cellular_automata_v2 exit_status 197

NEXT SESSION PRIORITIES
========================
1. Review Neural Collapse results (should have 50+ by next session)
2. Analyze Feature Learning Phase Transition replications
3. Continue Double Descent v2 data collection
4. Investigate Host 321 (Rosie) failures — new pattern, exit_status 195
5. Consider redesigning Emergent Abilities experiment (same grokking problem?)
6. If neural collapse results are interesting, design follow-up studying
   NC as function of overparameterization ratio