============================================================
AXIOM EXPERIMENT RESULTS — March 1, 2026 2:30 AM
============================================================

PREVIOUSLY RECORDED RESULT IDs (do not re-record these):
1509027, 1509028, 1509029, 1509030, 1509031, 1509034, 1509035, 1509036,
1509037, 1509039, 1509040, 1509041, 1509042, 1509044, 1509045, 1509046,
1509048, 1509049, 1509050, 1509051, 1509052, 1509054

CREDITED RESULT IDs (do not re-credit these):
1509034 (10cr ChelseaOilman), 1509035 (15cr philip-in-hongkong),
1509036 (15cr philip-in-hongkong), 1509037 (75cr Coleslaw),
1509039 (50cr makracz), 1509040 (15cr makracz), 1509041 (30cr makracz),
1509042 (10cr makracz), 1509044 (25cr zioriga), 1509045 (75cr ChelseaOilman),
1509046 (5cr Vato), 1509048 (5cr Drago75), 1509049 (5cr Coleslaw),
1509050 (5cr Steve Dodd), 1509051 (15cr Drago75), 1509052 (25cr Steve Dodd),
1509054 (15cr Vato)

SUMMARY
-------
New results this session: 3
Total completed (all time): 21 successful, 1 GPU failure (v6.02)
Total pending: 6 (in-progress on hosts)
Credit awarded this session: 55

NEW RESULTS (ranked by scientific interest)
-------------------------------------------
1. [1509052] WEIGHT INITIALIZATION LANDSCAPE
   Host: Dads-PC (80 CPUs, 128GB, Windows)
   User: Steve Dodd
   Runtime: 98.3s
   Credit: 25 (Good — comprehensive 13-scheme comparison with clear ranking)
   Findings:
   - 13 initialization schemes tested on a [20,128,128,64,64,6] architecture (32k params)
   - Clear ranking: lecun_normal (34.5%) > orthogonal (31.1%) > identity_like (28.3%)
     > sparse_90pct (28.1%) > xavier_uniform (24.5%)
   - 4 schemes fail completely (zero gradient flow): uniform_0.001, uniform_0.01,
     uniform_1.0, zeros — stuck at 16.7% (random chance)
   - Gradient flow ratio perfectly predicts success: 0.0 flow = no learning
   - SURPRISE: He initialization (designed for ReLU) performs WORSE than
     Xavier/lecun — he_normal reaches only 6.2% test accuracy despite 100% train
   - He init causes massive overfitting — 100% train but the worst test
     performance among the schemes that learn at all
   - lecun_normal beats xavier — consistent with theory (fan_in-only scaling
     suits ReLU input layers better)
   - identity_like in the top 3 is unexpected — suggests that preserving the
     input signal through the layers helps
   - All models overfit severely (100% train, 6-35% test) — the task may be easy
     to fit but hard to generalize on
   Quality: Good — clear practical ranking; the gradient-flow analysis is
   informative and the He-init surprise is noteworthy
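For reference, the scale differences behind the ranking above can be sketched directly. The scheme names and layer sizes follow the result entry; the code itself is illustrative (standard textbook formulas for each initializer), not the experiment's actual implementation:

```python
import numpy as np

def init_weights(fan_in, fan_out, scheme, rng):
    """Draw one weight matrix under a named scheme (illustrative only)."""
    if scheme == "lecun_normal":      # std = sqrt(1 / fan_in)
        return rng.normal(0.0, np.sqrt(1.0 / fan_in), (fan_in, fan_out))
    if scheme == "he_normal":         # std = sqrt(2 / fan_in)
        return rng.normal(0.0, np.sqrt(2.0 / fan_in), (fan_in, fan_out))
    if scheme == "xavier_uniform":    # limit = sqrt(6 / (fan_in + fan_out))
        lim = np.sqrt(6.0 / (fan_in + fan_out))
        return rng.uniform(-lim, lim, (fan_in, fan_out))
    raise ValueError(f"unknown scheme: {scheme}")

# The reported architecture, layer by layer:
layers = [20, 128, 128, 64, 64, 6]
rng = np.random.default_rng(0)
for scheme in ("lecun_normal", "he_normal", "xavier_uniform"):
    stds = [init_weights(a, b, scheme, rng).std()
            for a, b in zip(layers, layers[1:])]
    print(scheme, [round(s, 3) for s in stds])
```

Note that he_normal draws weights sqrt(2)x larger than lecun_normal at every layer, which is one plausible reason it fits the training set fastest while generalizing worst here.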
2. [1509051] BENFORD LAW NEURAL WEIGHTS
   Host: ASUS (16 CPUs, 15GB, Linux Mint)
   User: Drago75
   Runtime: 13.2s
   Credit: 15 (Good — clean negative result across 6 architectures)
   Findings:
   - Neural network weights do NOT follow Benford's Law — a definitive negative result
   - 6 architectures tested: [20,16,5] through [20,256,128,64,32,5] (400 to 48k weights)
   - Only the tiniest network (400 weights) briefly passes Benford at the
     quarter and mid-training checkpoints (p=0.088, 0.111)
   - All larger networks strongly reject at every training stage (chi2: 35-1074, p≈0.000)
   - Chi2 grows proportionally with weight count — the Benford deviation is
     systematic, not noise
   - Tested at 5 training stages: init, quarter, mid, three-quarter, final
   - Xavier initialization already violates Benford (expected — symmetric
     distributions don't follow the log-uniform law)
   - Training does NOT push weights toward Benford — chi2 stays flat or increases
   Quality: Good — a well-documented negative result, scientifically useful
   (answers the question definitively)
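The Benford check described above can be sketched as a chi-square goodness-of-fit test on first significant digits. The weights below are random stand-ins (a symmetric Xavier-style uniform draw), not the experiment's networks, and the critical value is just the standard df=8 threshold at p=0.01:

```python
import numpy as np

BENFORD_P = np.log10(1 + 1 / np.arange(1, 10))  # P(first digit = d), d = 1..9
CHI2_CRIT_8DF = 20.09                           # df=8 critical value, p=0.01

def first_digits(weights):
    """First significant digit of each nonzero |weight|."""
    w = np.abs(np.asarray(weights, dtype=float).ravel())
    w = w[w != 0]
    # shift each |w| into [1, 10) and take the integer part
    d = (w / 10.0 ** np.floor(np.log10(w))).astype(int)
    return np.clip(d, 1, 9)  # guard against float edge cases

def benford_chi2(weights):
    """Chi-square statistic of observed digit counts vs Benford's law."""
    observed = np.bincount(first_digits(weights), minlength=10)[1:10]
    expected = BENFORD_P * observed.sum()
    return float(((observed - expected) ** 2 / expected).sum())

# Symmetric Xavier-style weights should reject Benford, as the log reports:
rng = np.random.default_rng(42)
chi2 = benford_chi2(rng.uniform(-0.5, 0.5, 48_000))
print(f"chi2 = {chi2:.1f}  (reject Benford if > {CHI2_CRIT_8DF})")
```

A symmetric uniform draw piles most first digits into 1-4, far from Benford's 30.1% share for digit 1 alone, so the statistic lands orders of magnitude above the critical value at this sample size.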
3. [1509054] BENFORD LAW NEURAL WEIGHTS
   Host: iand-r7-5800h3 (16 CPUs, 29GB, Windows)
   User: Vato
   Runtime: 223.9s
   Credit: 15 (Good — cross-validation confirmed)
   Findings:
   - CROSS-VALIDATION of result 1509051 — the results are IDENTICAL
   - All chi2 values match exactly (same random seed = deterministic experiment)
   - Confirms reproducibility across Linux Mint and Windows 11
   - Same negative result: weights do not follow Benford's Law
   - NOTE: future experiments should use host-dependent seeds so
     cross-validations are independent
   Quality: Good — confirms deterministic reproducibility, but the shared seed
   limits its value as independent validation

CREDIT LEDGER (this session)
-----------------------------
Steve Dodd (userid=56, host 123): +25 (Weight Init Landscape — good)
Drago75 (userid=15, host 253): +15 (Benford Law — good negative result)
Vato (userid=4, host 7): +15 (Benford Law — cross-validation)
TOTAL: 55

STILL PENDING (in-progress, state=4)
-------------------------------------
- exp_infobottleneck_gpu_host1: host 1 (Pyhelix, 16 CPUs)
- exp_lottery_host1_v2: host 1 (Pyhelix, 16 CPUs)
- exp_cellular_h15: host 15 (rose, 8 CPUs)
- exp_edge_of_chaos_host60: host 60 (dell, 8 CPUs)
- exp_critical_periods_h87: host 87 (Dad-Workstation, 80 CPUs)
- exp_neural_scaling_laws_h258: host 258 (edge, unknown)

REMAINING DEPLOYED (awaiting host check-in)
---------------------------------------------
~70 additional workunits created in prior sessions target hosts that haven't
contacted the server recently. They will be picked up when those hosts come
online.

KEY OBSERVATIONS
-----------------
1. Benford Law: ANSWERED — neural weights do not follow Benford's Law. Xavier
   init creates symmetric distributions that inherently violate Benford's
   log-uniform assumption, and SGD training doesn't change this. The
   tiny-network exception (400 weights at mid-training) is most likely
   small-sample noise slipping past the chi2 threshold.
2. Weight Init: lecun_normal > orthogonal > identity_like — but ALL schemes
   overfit severely. The ranking is informative for choosing an initializer,
   but the experiment needs stronger regularization or a harder task to
   separate test accuracies cleanly.
3. Cross-validation: the Benford experiment uses a fixed random seed, so it
   produces identical results on every host. Future experiments should fold
   the host ID into the seed for truly independent replications (e.g.,
   np.random.seed(42 + host_id)).

RECOMMENDATIONS FOR NEXT BATCH
-------------------------------
1. WAIT — 6 experiments are in progress and ~70 more await host check-in.
   Major results are expected from Critical Learning Periods (h87), LR Phase
   Transitions (h287), and Pruning Lottery (h296).
2. The Benford Law experiment is COMPLETE — no further replications needed.
   The question is definitively answered (negative).
3. The Weight Init experiment could benefit from:
   - dropout/L2 regularization to reduce overfitting
   - a harder multi-class problem
   - a batch-normalization interaction arm (some inits work better with batchnorm)
4. When cross-validating, inject host_id into the random seed so replications
   are independent.
5. Priority experiments still awaiting results:
   - Critical Learning Periods (h87) — the most scientifically interesting new experiment
   - LR Phase Transitions — should show a sharp generalization phase boundary
   - Pruning Lottery (edge-popup algorithm) — a strong lottery-ticket variant
   - Loss Landscapes — 2D visualization
   - Double Descent v2 — should now show the phenomenon with label noise

CROSS-VALIDATION STATUS
------------------------
Confirmed identical (deterministic, same seed):
- Benford Law: h253 and h7 — identical results (same seed, different OS)

Confirmed by multiple hosts (independent):
- Cellular Automata: 2 runs on host 267 (fitness 0.455 both times)

Awaiting cross-validation:
- Reservoir Computing: original h321 (excellent), replication pending on h249
- Information Bottleneck: original h209 (excellent), replication pending on h105
- Mode Connectivity: original h141, replication pending on h95
- Edge of Chaos v2: original h320 (excellent), replications pending on h61, h71
- Loss Landscapes: 3 hosts pending (h209, h143, h177)
- Weight Init: original h123 completed, replication pending on h269
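The host-dependent seeding recommended in the observations above might look like the following minimal sketch. `host_id` stands for whatever integer ID the server assigns; `SeedSequence` is used instead of the literal `42 + host_id` addition so that nearby host IDs still get well-separated streams:

```python
import numpy as np

BASE_SEED = 42  # shared experiment seed, as in the cross-validation note

def host_rng(host_id: int) -> np.random.Generator:
    # SeedSequence mixes (BASE_SEED, host_id) into a well-separated stream
    # per host, avoiding collisions/correlations from plain integer addition.
    return np.random.default_rng(np.random.SeedSequence([BASE_SEED, host_id]))

# Hosts 253 and 7 (the two Benford runs) now draw different initial weights:
w_h253 = host_rng(253).normal(size=5)
w_h7 = host_rng(7).normal(size=5)
assert not np.allclose(w_h253, w_h7)                       # independent draws
assert np.allclose(w_h253, host_rng(253).normal(size=5))   # reproducible per host
```

Each host's run stays exactly reproducible (rerun with the same host_id and you get the same stream), while replications across hosts become statistically independent rather than bit-identical.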