Thread 'Very low GPU load !'

Message boards : General Discussion : Very low GPU load !
Message board moderation

To post messages, you must log in.

AuthorMessage
Jean-Luc

Send message
Joined: 8 Mar 26
Posts: 11
Credit: 12,663,836
RAC: 624,090
Message 260 - Posted: 10 Mar 2026, 18:26:20 UTC

Please,
When my RTX 5070 calculates Axiom work units (WUs), its load is 0%, and each WU completes in about 15 seconds (for a reward of 1 point per WU).
So I tried increasing the number of WUs calculated in parallel on my GPU.
And I'm seeing very strange behavior, never before observed in a BOINC project: on this "small" GPU I can run 64 WUs in parallel, or even 100. Calculation time for each WU: about 65 seconds. GPU load: still 0% (and still a reward of 1 point per WU).
Are you absolutely certain there isn't a setting to adjust so that your GPU WUs use the card (much) more efficiently?
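For reference, running several WUs per GPU is done with an app_config.xml in the project directory; a sketch of what I use (the app name here is illustrative, the real one is in client_state.xml):

```xml
<!-- app_config.xml in the project's data directory.
     After editing, use "Options > Read config files" in BOINC Manager.
     The app name below is a placeholder: check client_state.xml. -->
<app_config>
  <app>
    <name>axiom_gpu</name>
    <gpu_versions>
      <!-- 1/64th of the GPU per task, so 64 tasks run concurrently -->
      <gpu_usage>0.015625</gpu_usage>
      <cpu_usage>0.25</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```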
ID: 260 · 0      Reply Quote
WTBroughton

Send message
Joined: 6 Feb 26
Posts: 4
Credit: 519,898
RAC: 14,311
Message 261 - Posted: 10 Mar 2026, 21:10:48 UTC - in response to Message 260.  
Last modified: 10 Mar 2026, 21:17:48 UTC

Copied this from one of your output files:

[GPU] This GPU binary requires NVIDIA driver 525+ for CUDA 12.x

Don't know if it helps.

On my RTX 5070 I'm getting up to 750 credits depending on the experiment being run.
ID: 261 · 0      Reply Quote
Jean-Luc

Send message
Joined: 8 Mar 26
Posts: 11
Credit: 12,663,836
RAC: 624,090
Message 263 - Posted: 10 Mar 2026, 21:16:19 UTC - in response to Message 260.  

I'm correcting my previous message: running 64 WUs simultaneously on the GPU only works with the "exp_schnakenberg_crossed_feed_diffusion_gpu" WUs.
It does not work with the "exp_tailspike_covariance_localization_gpu" WUs, with which I can only run about ten in parallel.
ID: 263 · 0      Reply Quote
Jean-Luc

Send message
Joined: 8 Mar 26
Posts: 11
Credit: 12,663,836
RAC: 624,090
Message 264 - Posted: 10 Mar 2026, 21:21:36 UTC - in response to Message 261.  

In reply to WTBroughton's message of 10 Mar 2026:
Copied this from one of your output files:

[GPU] This GPU binary requires NVIDIA driver 525+ for CUDA 12.x

Don't know if it helps.

On my RTX 5070 I'm getting up to 750 credits depending on the experiment being run.


My driver version: 570.211.01, CUDA version: 12.8.
So it can't be coming from that.
Thanks!
ID: 264 · 0      Reply Quote
PyHelix
Volunteer moderator
Project administrator

Send message
Joined: 23 Jan 26
Posts: 85
Credit: 526,283
RAC: 12,766
Message 267 - Posted: 11 Mar 2026, 0:59:03 UTC - in response to Message 264.  

This is a known issue with newer GPUs running CUDA 12.8 (like your RTX 5070). The GPU binary your host had was built with CUDA 12.4 NVRTC builtins, but your system's NVRTC 12.8 looks for libnvrtc-builtins.so.12.8, which wasn't present. This caused every GPU kernel to fail to compile, so the experiments fell back to error handling and finished in seconds with minimal credit, hence the 0% GPU load.

We've deployed a fix (v6.25) that properly symlinks the system's NVRTC builtins so CuPy can find them. Your host should automatically download the new version on its next scheduler contact. The server will show me your GPU results as soon as the first one comes in, so I can verify everything is working correctly.
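If anyone wants to check a host by hand before the new version arrives, here is a rough sketch of the idea behind the fix (paths and version numbers are illustrative; the actual v6.25 packaging may differ):

```shell
# Hypothetical helper: make sure libnvrtc-builtins.so.<want> is visible,
# symlinking the <have> build if only that older one is installed.
# Prints "ok", "linked", or "missing".
check_nvrtc() {
    libdir=$1
    want=$2
    have=$3
    if [ -e "$libdir/libnvrtc-builtins.so.$want" ]; then
        echo "ok"
    elif [ -e "$libdir/libnvrtc-builtins.so.$have" ]; then
        # relative symlink, resolves within the same directory
        ln -s "libnvrtc-builtins.so.$have" "$libdir/libnvrtc-builtins.so.$want"
        echo "linked"
    else
        echo "missing"
    fi
}
```

Run it against the directory the runtime searches, e.g. `check_nvrtc /usr/local/cuda/lib64 12.8 12.4`; v6.25 does the equivalent automatically at task startup.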
ID: 267 · 0      Reply Quote
Jean-Luc

Send message
Joined: 8 Mar 26
Posts: 11
Credit: 12,663,836
RAC: 624,090
Message 273 - Posted: 11 Mar 2026, 14:06:07 UTC - in response to Message 267.  

Many thanks for your answer !
I'm waiting to see if the situation improves.
ID: 273 · 0      Reply Quote

Message boards : General Discussion : Very low GPU load !
Network Statistics
Powered by BOINC
© 2026 Axiom Project