AXIOM
DISTRIBUTED AI

Model Quality
  • Bit accuracy: 58.9% (>50% = learning)
  • Experts online: 420/420
  • Quality: 30%
Live Streaming: 14,899,830 syncs | WUs Completed: 203/hr | Signal: 26
Model Quality Score
Bit Accuracy by Data Type (>50% = learning)
  • English: 54.5%
  • Python: 25.9%
  • JavaScript: 33.3%
  • JSON: 37.5%
  • HTML: 68.8%
  • CSV: 22.2%
  • Random (baseline): 38.7%
  • File Headers: 47.8%
Learning Rate: ↓ 0.0352 BPC/day
Trend Confidence: Low (R²=7.4%)
Data: 691 points over 70h
With R² below 10%, the trend is mostly noise. Need more data for a reliable estimate.
BPC: 1.0 = random guessing; lower = learning. Thick red line = evaluator (ground truth); thin red = worker self-tests.
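For reference, a toy version of the BPC score (illustrative only, not the evaluator's code), using the convention above that 1.0 = random guessing:

```python
import math

def bpc(probs_correct):
    """Average -log2(p) over the probabilities the model assigned to
    the correct bits. 0.5 everywhere (no signal) gives exactly 1.0;
    anything below 1.0 means the model is learning."""
    return -sum(math.log2(p) for p in probs_correct) / len(probs_correct)

# A model with no signal assigns p = 0.5 to every correct bit.
assert bpc([0.5] * 8) == 1.0
# A model with some signal scores below the random baseline.
assert bpc([0.6] * 8) < 1.0
```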

What is Axiom Distributed AI?

Getting Started
1. Download BOINC (standard client)
2. Add project: https://axiom.heliex.net
3. Done! Seed training data downloads automatically
The streaming wrapper handles suspend/resume automatically. Source code is available under the GPL.

HEBBIAN DISTRIBUTED LEARNING

"Neurons that fire together, wire together" — Axiom uses biologically-inspired Hebbian learning instead of traditional backpropagation.

The Model
  • 420 specialized experts
  • 42.6M parameters per expert
  • Transformer + self-attention
  • Total: 17.8B parameters
How It Works
  • Workers train on data in the contribute folder
  • Each worker self-tests model quality via BPC scoring
  • Learning is forward-only: no backpropagation
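The forward-only idea can be sketched in a few lines (a toy scalar version; the real experts are 42.6M-parameter transformers): weights move with input/output correlation, and no gradient is ever propagated backward.

```python
def hebbian_step(w, x, y, lr=0.01):
    """Plain Hebbian rule: weights whose inputs fire together with the
    output grow together. Forward-only: no backward pass exists."""
    return [wi + lr * xi * y for wi, xi in zip(w, x)]

w = [0.0, 0.0]
w = hebbian_step(w, x=[1.0, 0.0], y=1.0)
# Only the weight on the active input moved.
assert w == [0.01, 0.0]
```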
Training Data: 10MB of diverse public domain text is provided automatically. You can also add your own files to %USERPROFILE%\Axiom\contribute\ (text, code, documents).

Updates

Feb 9, 2026
v3.66: Differential learning rates - head layers (output + local prediction heads) now learn at 10x the rate of transformer body layers. Prevents representation collapse where attention/FFN layers homogenize all inputs to the same output. Body stays stable as a feature extractor while heads learn fast. Biases kept conservative to prevent class-prior domination. Expert weights reset to Xavier init.
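The v3.66 split can be sketched like this (the 10x ratio and the head/body distinction are from the note above; the base rate and the name-matching rule are illustrative assumptions):

```python
BODY_LR = 0.001          # transformer body: kept stable as a feature extractor
HEAD_LR = 10 * BODY_LR   # output + local prediction heads learn 10x faster

def lr_for(layer_name):
    # Hypothetical naming convention, for illustration only.
    return HEAD_LR if "head" in layer_name else BODY_LR

assert lr_for("output_head") == 0.01
assert lr_for("layer3_ffn") == 0.001
```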


Feb 9, 2026
v3.65: sqrt(N) gradient normalization - per-sample clipped matmul now divides by sqrt(batch_size) instead of batch_size. The meaningful signal in a batch of N samples lives at the sqrt(N) scale (like how coin-flip deviations scale with sqrt(N)). Dividing by N buried the signal; sqrt(N) preserves the consensus direction at natural strength. GPU and CPU now get consistent learning regardless of batch size.
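The coin-flip analogy above can be checked numerically (a toy sketch, not the worker's kernel): the sum of N random-sign corrections has magnitude on the order of sqrt(N), far below N, so dividing the batch total by N crushes it while dividing by sqrt(N) keeps it at a batch-size-independent strength.

```python
import math
import random

random.seed(0)  # deterministic for the demo

def summed_noise(n):
    """Sum of n random +/-1 'corrections' -- the coin-flip analogy.
    Its typical magnitude grows like sqrt(n), not n."""
    return sum(random.choice((-1, 1)) for _ in range(n))

n = 10_000
total = summed_noise(n)
assert abs(total) < n * 0.1            # nowhere near linear scale
assert abs(total / math.sqrt(n)) < 10  # sqrt-normalized: stays O(1)
```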


Feb 9, 2026
v3.64: GPU batch normalization fix - per-sample clipped matmul now averages over batch size instead of summing, and bias updates use mean instead of sum. Previously, GPU batches (512 samples) produced 16x larger weight updates than CPU (32 samples), causing weights to overshoot, oscillate, and decay to zero (uniform output = exactly 1.0 BPC). Now learning rate is consistent regardless of batch size.


Feb 9, 2026
v3.63: GPU memory optimization - adaptive VRAM management now frees GPU training data before self-test evaluation, preventing out-of-memory crashes on smaller GPUs (8-12GB). Increased VRAM reserve from 1GB to 1.5GB for expert count calculation. All platforms updated.


Feb 9, 2026
v3.61: Per-sample learning signals - GPU weight updates now clip each sample's contribution individually before summing, instead of clipping the batch total. When 512 samples' corrections are summed then clipped, the diverse per-input signals cancel out leaving random noise. Per-sample clipping preserves each input's unique learning direction. This is the actual fix for GPU workers reporting exactly 1.0 BPC.
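Why the clipping order matters can be shown in a toy 2-D sketch (unit clip norm and the sample values are illustrative, not the worker's parameters): nine samples agree on one direction, one outlier dominates another, and only per-sample clipping lets the consensus survive.

```python
import math

def clip_to_norm(v, c=1.0):
    """Scale a vector down to norm c if it exceeds c."""
    n = math.hypot(*v)
    return v if n <= c else [x * c / n for x in v]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

# Nine samples agree on +x; one outlier screams along +y.
samples = [[1.0, 0.0]] * 9 + [[0.0, 100.0]]

# Clip the batch total AFTER summing: the outlier dominates.
total = [0.0, 0.0]
for s in samples:
    total = add(total, s)
batch_clipped = clip_to_norm(total)     # points almost entirely along y

# Clip EACH sample before summing: the shared +x direction wins.
per_sample = [0.0, 0.0]
for s in samples:
    per_sample = add(per_sample, clip_to_norm(s))  # ends at [9.0, 1.0]

assert batch_clipped[1] > batch_clipped[0]
assert per_sample[0] > per_sample[1]
```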


Feb 9, 2026
v3.60: GPU learning fix - removed batch-size averaging from all Hebbian weight updates. GPU workers (batch 128-512) were averaging their learning signal across the batch, causing random corrections to cancel out and leaving the model frozen at exactly 1.0 BPC. CPU workers (batch 1) were unaffected. Now both GPU and CPU produce meaningful weight updates, with existing norm clipping controlling magnitude. All platforms updated (CPU + GPU, Linux + Windows).


Feb 9, 2026
v3.59: Error-modulated Hebbian learning - the learning rule now compares its prediction against the actual answer before updating weights. Previously, experts collapsed to constant output (always predicting the same bit regardless of input). Now, updates only fire when the model is wrong, forcing it to learn input-dependent patterns instead of just the majority class bias. All 420 experts reset to fresh initialization. All platforms updated (CPU + GPU, Linux + Windows).
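A minimal scalar sketch of the error-modulated rule (illustrative; the real update runs over full expert weight matrices): the prediction is compared with the answer first, so a correct model makes no update at all, which is what blocks collapse to a constant output.

```python
def error_hebbian_step(w, x, target, lr=0.1):
    """Error-modulated Hebbian update: only fire when the model is
    wrong, forcing input-dependent patterns instead of a constant
    majority-class output."""
    pred = 1.0 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0.0
    err = target - pred                       # 0 when correct
    return [wi + lr * err * xi for wi, xi in zip(w, x)]

w = [0.5, -0.5]
# Correct prediction: weights untouched.
assert error_hebbian_step(w, x=[1.0, 0.0], target=1.0) == w
# Wrong prediction: weights move toward the active input.
assert error_hebbian_step(w, x=[0.0, 1.0], target=1.0) != w
```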


Feb 8, 2026
Weight cap and model reset - accumulated weights are now hard-clamped to [-10, +10] after every update to prevent weight explosion. All 420 experts have been reset to fresh random initialization. Previously, some experts had grown to absmax values of 300+ which caused overconfident predictions (looks like low BPC but catastrophically wrong on new data). The weight cap keeps predictions calibrated and stable.
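The cap itself is one line (the [-10, +10] range is from the note above; everything else is a sketch):

```python
def clamp_weights(w, cap=10.0):
    """Hard-clamp every weight to [-cap, +cap] after each update,
    keeping predictions calibrated instead of overconfident."""
    return [max(-cap, min(cap, wi)) for wi in w]

# An exploded weight (absmax 300+) is pulled back to the cap;
# healthy weights pass through unchanged.
assert clamp_weights([300.0, -42.0, 3.5]) == [10.0, -10.0, 3.5]
```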


Feb 8, 2026
v3.57: Worker self-testing - every worker now evaluates its own model quality before submitting results. A random 2KB sample from the training data is held out, and after training the model is tested on it. Workers report their bits-per-character (BPC) score alongside each contribution. The coordinator aggregates these into a real-time model quality metric. Trained experts are scoring 0.50-0.68 BPC (vs 1.0 random baseline). All platforms updated (CPU + GPU, Linux + Windows).
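The holdout protocol can be sketched as follows (a contiguous 2 KB slice is assumed for simplicity; the real worker's split and scoring live in the client):

```python
import random

def self_test_split(data, holdout=2048, seed=None):
    """Hold out a random 2 KB slice for self-testing; train on the
    rest. After training, BPC is measured on the held-out slice and
    reported alongside the contribution."""
    rng = random.Random(seed)
    start = rng.randrange(0, len(data) - holdout)
    held = data[start:start + holdout]
    train = data[:start] + data[start + holdout:]
    return train, held

data = bytes(range(256)) * 20          # 5120 bytes of stand-in data
train, held = self_test_split(data, seed=1)
assert len(held) == 2048
assert len(train) + len(held) == len(data)
```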


Feb 8, 2026
v3.56: Local learning for transformer layers - each layer now receives direct supervised feedback from the training target, not just the final output head. Like giving every floor of a factory a direct phone line to the customer. Reduces reliance on signal propagating through all 6 layers. Optimized for last-token activation (64x less compute). All platforms updated (CPU + GPU, Linux + Windows).
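A scalar sketch of the "direct phone line" idea (scalars stand in for full activations and heads; names and rates are illustrative): every layer's activation feeds its own tiny head, trained directly against the target instead of waiting for signal to travel through all six layers.

```python
def local_head_updates(layer_acts, heads, target, lr=0.1):
    """Give each layer a local prediction head with direct supervised
    feedback from the training target (delta-rule step per layer)."""
    new_heads = []
    for h, w in zip(layer_acts, heads):
        pred = w * h
        w = w + lr * (target - pred) * h
        new_heads.append(w)
    return new_heads

heads = local_head_updates(layer_acts=[1.0, 2.0], heads=[0.0, 0.0], target=1.0)
# Every layer's head moved toward the target on its own.
assert all(w != 0.0 for w in heads)
```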


Feb 8, 2026
v3.51: Seed training data - workers now automatically download 10MB of diverse public domain text (Project Gutenberg) on startup, ensuring all volunteers have quality training data from day one. Data refreshes every 24 hours with a unique random mix from a 10GB+ server pool. Gossip gradient improvements: per-host naming, 10-minute decay, random peer gradient applied at task start. Entropy filter tightened to 7.0 bits/byte. Model evaluation now updates every 5 minutes.
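The entropy filter can be sketched with a standard Shannon byte-entropy computation (the 7.0 bits/byte threshold is from the note above; that the filter rejects data *above* the threshold, i.e. near-random compressed or encrypted payloads, is an assumption):

```python
import math
from collections import Counter

def bits_per_byte(data):
    """Shannon entropy of the byte distribution, in bits/byte.
    Uniform random bytes approach 8.0; plain text sits well below."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def passes_filter(data, threshold=7.0):
    # Assumed direction: keep low-entropy (compressible) data,
    # reject near-random payloads.
    return bits_per_byte(data) < threshold

assert bits_per_byte(b"aaaa") == 0.0
assert not passes_filter(bytes(range(256)))  # uniform: exactly 8 bits/byte
assert passes_filter(b"plain English text, mostly lowercase letters")
```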


Feb 8, 2026
Sample-based credit system - credits are now calculated from actual samples trained (0.001 credit per sample), replacing the old magnitude-based system. All existing user and host credits have been rescaled to match, bringing totals in line with other BOINC projects. Per-task credit is now visible in your task list. GPU workers earn more because they train more samples per task. Stats XML export active for third-party stats sites (Free-DC, BOINCstats).
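The formula is simple enough to state as code (the 0.001 credit/sample rate is from the note above; the sample counts below are illustrative, not measured task sizes):

```python
def task_credit(samples_trained, per_sample=0.001):
    """Sample-based credit: 0.001 credit per sample actually trained,
    so GPU tasks earn more simply by training more samples."""
    return samples_trained * per_sample

# e.g. 32,000 samples -> 32 credits; 512,000 samples -> 512 credits.
assert abs(task_credit(32_000) - 32.0) < 1e-9
assert abs(task_credit(512_000) - 512.0) < 1e-9
```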



Network Statistics
  • Contributors: 22
  • Total Syncs: 14,899,830
  • Experts: 420
  • Credits: 14,846,552
BOINC
© 2026 Axiom Project