The queries that pick up tasks from the scheduler sometimes degenerate when the
number of tasks fetched is too low, which hangs the runner for all other tasks.
Adding a lower bound on the fetch size (the runner only schedules tasks when its
buffer is less than 80% full) helps PostgreSQL use proper optimizations to pull tasks.
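The idea can be illustrated with a minimal sketch. This is not the code from swh/scheduler/celery_backend/runner.py; the function name `num_tasks_to_fetch` and its parameters are hypothetical, and the 80% figure is taken from the commit title. The runner skips the fetch entirely while the buffer is at least 80% full, so any fetch it does issue asks for more than 20% of the buffer capacity.

```python
def num_tasks_to_fetch(
    queue_length: int,
    max_queue_length: int,
    fill_threshold_percent: int = 80,
) -> int:
    """Return how many tasks to request from the scheduler, or 0 to skip.

    Hypothetical helper: all names here are illustrative assumptions.
    """
    # Integer arithmetic avoids float rounding at the threshold boundary.
    if queue_length * 100 >= max_queue_length * fill_threshold_percent:
        # Buffer is at least 80% full: a fetch now would be a small one,
        # which is the case the summary describes as degenerate.
        return 0
    # Buffer is less than 80% full: fill all free slots in one larger query.
    return max_queue_length - queue_length


if __name__ == "__main__":
    for length in (0, 799, 800, 1000):
        print(length, num_tasks_to_fetch(length, 1000))
```

With a capacity of 1000, the smallest fetch this sketch can issue is 201 tasks, which is the lower bound the summary refers to.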
Details
This has been running in production for a while.
Diff Detail
- Repository: rDSCH Scheduling utilities
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build has FAILED
Patch application report for D3164 (id=11239)
Could not rebase; Attempt merge onto 2ea919cd13...
Updating 2ea919c..92c0869
Fast-forward
 swh/scheduler/celery_backend/config.py | 6 ++++++
 swh/scheduler/celery_backend/runner.py | 6 +++++-
 2 files changed, 11 insertions(+), 1 deletion(-)
Changes applied before test
commit 92c08692867588c9512f5ff943b5b23ba4a59993
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue May 19 11:30:13 2020 +0200

    Celery runner: only schedule tasks when the buffer is less than 80% full

    The queries to pick up tasks from the scheduler sometimes degenerate when the
    number of tasks fetched is too low, which hangs the runner for all other tasks.

    Adding this lower bound helps postgresql use proper optimizations to pull tasks.

commit b83990613370c9b35505105f212213dd310903ba
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue May 19 11:12:55 2020 +0200

    Disable the azure http logger in the celery worker base config

    This is suboptimal (we should move all of this to a logconfig where we can set
    this stuff), but this is consistent with how we do things currently.
Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/5/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/5/console