Thread 'Inefficiency of frequent "Requested by project" (120s) scheduler polling'

Message boards : Suggestions : Inefficiency of frequent "Requested by project" (120s) scheduler polling

1 · 2 · Next

AuthorMessage
PyHelix
Volunteer moderator
Project administrator

Send message
Joined: 23 Jan 26
Posts: 85
Credit: 529,972
RAC: 13,294
Message 346 - Posted: 17 Mar 2026, 10:40:57 UTC

Good call — you're right that the polling interval was too aggressive. I've updated the server config to next_rpc_delay=3600 (1 hour) as you recommended. Clients will still contact the server immediately when they finish a task or their work buffer runs low, so there should be no change in work delivery — just less unnecessary overhead on your end.
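For anyone running their own BOINC project, this kind of change is a single element in the project's config.xml. A minimal sketch (the surrounding structure is abbreviated; check your own server's documentation before relying on it):

```xml
<boinc>
  <config>
    <!-- Tell clients not to poll the scheduler again for at least 3600 s -->
    <next_rpc_delay>3600</next_rpc_delay>
  </config>
</boinc>
```

Clients still contact the scheduler early on their own initiative (task completion, empty work buffer); this only suppresses the periodic "Requested by project" polls.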
ID: 346 · 0      Reply Quote
Lazarus-uk

Send message
Joined: 22 Feb 26
Posts: 7
Credit: 137,288
RAC: 11,360
Message 347 - Posted: 17 Mar 2026, 15:49:01 UTC - in response to Message 346.  
Last modified: 17 Mar 2026, 16:02:17 UTC

The only problem with this solution is that many of my CPU tasks end with EXIT_DISK_LIMIT_EXCEEDED, and the result is shown as 'Error while computing'. While the server does validate these as successful tasks, BOINC Manager sees them as a 'computation error' and backs off from contacting axiom for over 3 hours, which means work runs out. A couple of times this afternoon I have checked BOINC Manager and found the cores sitting idle.

It also doesn't help that the estimated run-time of CPU tasks is 3 h 26 min.

Windows 10, BOINC 8.2.8
ID: 347 · 0      Reply Quote
PyHelix
Volunteer moderator
Project administrator

Send message
Joined: 23 Jan 26
Posts: 85
Credit: 529,972
RAC: 13,294
Message 350 - Posted: 17 Mar 2026, 20:13:44 UTC - in response to Message 347.  

Both issues should be resolved now:

1. EXIT_DISK_LIMIT_EXCEEDED — The disk space limit for tasks was set too high (3 GB), which caused BOINC to flag the output as exceeding the limit even though the actual result files are tiny. This has been corrected to 500 MB.

2. Estimated run-time showing 3h 26m — The FLOPS estimate used by BOINC to calculate the estimated runtime was calibrated for GPU tasks (which run longer on faster hardware). CPU tasks actually complete in about 10 minutes on average. I have updated the estimate so new tasks should show a much more realistic ~10 minute runtime. Existing tasks in your queue have also been updated.

The back-off behavior you saw was BOINC reacting to the EXIT_DISK_LIMIT_EXCEEDED errors — with that fixed, your client should no longer see computation errors or idle out. Let me know if it clears up on your end.
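For the curious: BOINC's pre-DCF runtime estimate is roughly the workunit's FLOP estimate divided by the host's benchmarked speed, which is why a GPU-calibrated rsc_fpops_est inflates CPU predictions. A minimal sketch of that relationship (all numbers below are illustrative, not the project's actual values):

```python
def estimated_runtime_seconds(rsc_fpops_est: float, host_flops: float) -> float:
    """Rough BOINC-style estimate: workunit FLOP count / host benchmark speed."""
    return rsc_fpops_est / host_flops

host_flops = 4.0e9          # a ~4 GFLOPS CPU core (illustrative)
gpu_calibrated = 4.944e13   # FLOPs estimate yielding ~3 h 26 m on this host
cpu_realistic = 2.4e12      # FLOPs estimate yielding ~10 min

print(estimated_runtime_seconds(gpu_calibrated, host_flops))  # 12360.0 s ≈ 3 h 26 m
print(estimated_runtime_seconds(cpu_realistic, host_flops))   # 600.0 s = 10 min
```

Lowering rsc_fpops_est on the server is what brings the displayed estimate down; the client's duration correction factor then fine-tunes it from observed run times.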
ID: 350 · 0      Reply Quote
Lazarus-uk

Send message
Joined: 22 Feb 26
Posts: 7
Credit: 137,288
RAC: 11,360
Message 351 - Posted: 17 Mar 2026, 20:17:48 UTC - in response to Message 350.  

In reply to PyHelix's message of 17 Mar 2026:
Both issues should be resolved now:

1. EXIT_DISK_LIMIT_EXCEEDED — The disk space limit for tasks was set too high (3 GB), which caused BOINC to flag the output as exceeding the limit even though the actual result files are tiny. This has been corrected to 500 MB.

2. Estimated run-time showing 3h 26m — The FLOPS estimate used by BOINC to calculate the estimated runtime was calibrated for GPU tasks (which run longer on faster hardware). CPU tasks actually complete in about 10 minutes on average. I have updated the estimate so new tasks should show a much more realistic ~10 minute runtime. Existing tasks in your queue have also been updated.

The back-off behavior you saw was BOINC reacting to the EXIT_DISK_LIMIT_EXCEEDED errors — with that fixed, your client should no longer see computation errors or idle out. Let me know if it clears up on your end.



Will do. Thank you
ID: 351 · 0      Reply Quote
Lazarus-uk

Send message
Joined: 22 Feb 26
Posts: 7
Credit: 137,288
RAC: 11,360
Message 357 - Posted: 18 Mar 2026, 9:47:19 UTC

All CPU tasks are now ending in 'computation error':

18/03/2026 09:17:58 | axiom | Aborting task exp_graded_random_spring_gradon_threshold_bf005f_0: exceeded disk limit: 891.08MB > 476.84MB
18/03/2026 09:17:58 | axiom | Aborting task exp_graded_random_spring_gradon_threshold_800402_0: exceeded disk limit: 888.21MB > 476.84MB
18/03/2026 09:17:58 | axiom | Aborting task exp_foodweb_trophic_selfreg_rescue_threshold_a52311_0: exceeded disk limit: 889.26MB > 476.84MB
18/03/2026 09:17:58 | axiom | Aborting task exp_foodweb_trophic_selfreg_rescue_threshold_b714a9_0: exceeded disk limit: 891.08MB > 476.84MB

so I still get long project back-off times.
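(Side note: the 476.84MB figure in the log is exactly what a 500 MB decimal limit looks like when the client reports it in binary MiB. A quick check, assuming the server stores the bound in bytes:)

```python
# Convert a decimal-MB disk bound (stored server-side in bytes) to the MiB
# figure the BOINC client prints in its "exceeded disk limit" messages.
def bound_in_mib(bound_bytes: int) -> float:
    return round(bound_bytes / 2**20, 2)

print(bound_in_mib(500_000_000))    # 476.84 — matches the log above
print(bound_in_mib(2_000_000_000))  # 1907.35
```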

The CPU task duration correction factor (DCF) is now fixed and much closer to actual run times. I may try to increase the amount of cached work, but I prefer to keep caches as small as possible.

Windows 10, BOINC 8.2.8
ID: 357 · 0      Reply Quote
PyHelix
Volunteer moderator
Project administrator

Send message
Joined: 23 Jan 26
Posts: 85
Credit: 529,972
RAC: 13,294
Message 360 - Posted: 18 Mar 2026, 11:12:24 UTC

Hi Lazarus-uk, thanks for the follow-up — you are right, the 500 MB limit was still too tight for CPU tasks on Windows. The PyInstaller binary extracts temporary files that can push disk usage to ~900 MB on some systems.

I have just increased the disk limit to 2 GB for all CPU tasks (both new and existing workunits in the database), so this should stop happening on your next scheduler contact. I have also reset any backoff flags on your host so you should start receiving work again right away.

Sorry for the back and forth on this one — appreciate you reporting it.
ID: 360 · 0      Reply Quote
Lazarus-uk

Send message
Joined: 22 Feb 26
Posts: 7
Credit: 137,288
RAC: 11,360
Message 361 - Posted: 18 Mar 2026, 12:10:07 UTC - in response to Message 360.  

No problem. Happy to help out.
ID: 361 · 0      Reply Quote
Lazarus-uk

Send message
Joined: 22 Feb 26
Posts: 7
Credit: 137,288
RAC: 11,360
Message 362 - Posted: 18 Mar 2026, 19:50:40 UTC

18/03/2026 19:28:25 | axiom | Aborting task exp_foodweb_trophic_selfreg_rescue_threshold_b6df83_0: exceeded disk limit: 2253.97MB > 1907.35MB
18/03/2026 19:28:25 | axiom | Aborting task exp_pollinator_phenology_core_collapse_threshold_8c25ca_0: exceeded disk limit: 2253.97MB > 1907.35MB
18/03/2026 19:28:25 | axiom | Aborting task exp_pollinator_phenology_core_collapse_threshold_0e366f_0: exceeded disk limit: 2253.97MB > 1907.35MB
18/03/2026 19:28:25 | axiom | Aborting task exp_ssh_heavytail_bond_delocalization_threshold_686efa_0: exceeded disk limit: 2201.55MB > 1907.35MB


Sorry...

Many did complete successfully, but they seem to be getting bigger again.
ID: 362 · 0      Reply Quote
PyHelix
Volunteer moderator
Project administrator

Send message
Joined: 23 Jan 26
Posts: 85
Credit: 529,972
RAC: 13,294
Message 364 - Posted: 18 Mar 2026, 22:39:38 UTC - in response to Message 362.  
Last modified: 18 Mar 2026, 22:39:38 UTC

No need to apologize — thanks for catching this again! You are right, some experiments are producing larger temp files than I expected. I have just bumped the disk limit from 2 GB to 3 GB for all CPU tasks (new and existing). That should give plenty of headroom.

The fix is already live — your next scheduler contact will pick up the new limit and those tasks should stop aborting.
ID: 364 · 0      Reply Quote
Lazarus-uk

Send message
Joined: 22 Feb 26
Posts: 7
Credit: 137,288
RAC: 11,360
Message 372 - Posted: 19 Mar 2026, 14:11:53 UTC - in response to Message 370.  

In reply to Mr P Hucker's message of 19 Mar 2026:
In reply to PyHelix's message of 18 Mar 2026:
No need to apologize — thanks for catching this again! You are right, some experiments are producing larger temp files than I expected. I have just bumped the disk limit from 2 GB to 3 GB for all CPU tasks (new and existing). That should give plenty of headroom.

The fix is already live — your next scheduler contact will pick up the new limit and those tasks should stop aborting.

This is where you had it originally! You first said "The disk space limit for tasks was set too high (3 GB)"


Yes, I think if you allowed them 10 GB they would use it all and probably run for an hour. The tasks are getting longer the more disk you allow them to use. They need to have a defined stopping point.
ID: 372 · 0      Reply Quote
kotenok2000
Avatar

Send message
Joined: 28 Jan 26
Posts: 5
Credit: 407,633
RAC: 22,095
Message 401 - Posted: 21 Mar 2026, 6:12:05 UTC

I have read somewhere that AI chatbots sometimes infect operators with their speech patterns.

https://www.reddit.com/r/aiwars/comments/1qyfubi/ever_notice_serial_ai_users_have_these_speech/
ID: 401 · 0      Reply Quote
PyHelix
Volunteer moderator
Project administrator

Send message
Joined: 23 Jan 26
Posts: 85
Credit: 529,972
RAC: 13,294
Message 403 - Posted: 21 Mar 2026, 9:04:35 UTC - in response to Message 402.  

Yeah, I'm with you on that. A healthy amount of constructive criticism is warranted, and often needed. I don't want to control the narrative, but there were a couple of users getting into defamation territory on the project without proof. That's a hard line in my forums.

I've set up an LLM routine that reads posts and enforces the forum rules now. Borderline behaviour is tolerated but flagged. If it's consistent, those comments do get deleted and forwarded to my phone for review.
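(A rough outline of how such a routine could work; everything below — the names, the strike threshold, the stand-in for the LLM verdict — is hypothetical, not the actual implementation:)

```python
# Hypothetical sketch of an LLM-assisted moderation loop: borderline posts
# accumulate strikes; clear violations or repeat offenders get removed
# and escalated for human review.
FLAG_LIMIT = 3  # strikes before a borderline poster's comment is removed (illustrative)

def moderate(post: dict, strikes: dict) -> str:
    verdict = post["verdict"]  # stand-in for an LLM classification: "ok" | "borderline" | "violation"
    if verdict == "ok":
        return "keep"
    strikes[post["user"]] = strikes.get(post["user"], 0) + 1
    if verdict == "violation" or strikes[post["user"]] >= FLAG_LIMIT:
        return "delete_and_escalate"  # e.g. forward to the admin's phone for review
    return "flag"

strikes = {}
print(moderate({"user": "a", "verdict": "borderline"}, strikes))  # flag
print(moderate({"user": "a", "verdict": "violation"}, strikes))   # delete_and_escalate
```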
ID: 403 · 0      Reply Quote
PyHelix
Volunteer moderator
Project administrator

Send message
Joined: 23 Jan 26
Posts: 85
Credit: 529,972
RAC: 13,294
Message 407 - Posted: 21 Mar 2026, 15:10:28 UTC - in response to Message 404.  

I get what you are saying, and I hear it. However, in the case of the users involved, it was not a one-off joke but a number of accumulated posts over time that were borderline or did not follow the forum rules. And yes, LLMs will be used here; I am a one-person show running this project, and it helps me stay on top of things.

Zombie67 did make an excellent suggestion: use a separate account when responding with AI. I've started doing that. This post, and others from my personal account, are words written by me going forward.
ID: 407 · 0      Reply Quote

Powered by BOINC
© 2026 Axiom Project