In effect, this allows running 2 runners:
- one for recurring tasks
- one for save code now tasks
This should reduce the chance that scheduling tasks for save code now get stuck behind
the main scheduler runner.
Related to T3367
Differential D5826
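The split described above can be pictured with a minimal sketch. It is not the code from this diff: `Task`, `peek_ready` and `run_once` are hypothetical names, and the in-memory list stands in for the scheduler database. It only illustrates why a dedicated runner keeps a priority task from waiting behind a backlog of recurring ones.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    """Hypothetical stand-in for a scheduler task row."""

    id: int
    type: str
    priority: Optional[str] = None  # None: regular recurring task


def peek_ready(tasks: List[Task], with_priority: bool, limit: int) -> List[Task]:
    """Return up to `limit` ready tasks from one of the two classes only."""
    selected = [t for t in tasks if (t.priority is not None) == with_priority]
    return selected[:limit]


def run_once(tasks: List[Task], with_priority: bool, slots: int) -> List[int]:
    """One iteration of a runner: pick the tasks it would send to celery."""
    return [t.id for t in peek_ready(tasks, with_priority, slots)]


# Five recurring tasks queued ahead of a single save code now request.
pending = [Task(i, "load-git") for i in range(1, 6)]
pending.append(Task(6, "load-git", priority="high"))

# Each runner only looks at its own class of tasks, so the priority task
# is picked up even though only 3 slots are free per iteration.
recurring_ids = run_once(pending, with_priority=False, slots=3)
save_code_now_ids = run_once(pending, with_priority=True, slots=3)
```

With a single runner and 3 free slots, task 6 would have to wait until the recurring tasks ahead of it were scheduled; with two runners it is selected on the first iteration of the priority runner.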
runner: Separate scheduling tasks with and without priority concerns
Authored by ardumont on Jun 8 2021, 5:39 PM.
Test Plan: tox
Event Timeline

Comment Actions
Build is green
Patch application report for D5826 (id=20846)
Could not rebase; Attempt merge onto 9f7ab8fcdc...
Updating 9f7ab8f..b76c647
Fast-forward
 swh/scheduler/backend.py | 90 ++++++++++++++++++++++----------
 swh/scheduler/celery_backend/config.py | 23 +++++++-
 swh/scheduler/celery_backend/runner.py | 89 +++++++++++++++----------------
 swh/scheduler/cli/admin.py | 38 ++++++++++++--
 swh/scheduler/cli/origin.py | 65 +++++++++++++++++++++++
 swh/scheduler/interface.py | 19 +++++++
 swh/scheduler/tests/test_celery_tasks.py | 14 +++--
 7 files changed, 252 insertions(+), 86 deletions(-)
Changes applied before test
commit b76c647b4fedb6ad3811a2f3c034b996db7c2a79
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Tue Jun 8 17:36:28 2021 +0200

    runner: Separate scheduling tasks with and without priority concern

    Related to T3367

commit 974475fa08ebf9a31e68f89398633f97040f0d3e
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Thu Jun 3 16:03:26 2021 +0200

    send-to-celery: Add more options to allow scheduling of edge cases

    In the non optimal case, we may want to trigger specific case (not-yet enabled
    origins, origin from specific lister...).

    Related to T3350

commit 370ec4d66da913b409784bc949db402392594b0d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jun 2 15:59:15 2021 +0200

    Direct scheduling of origin visits in celery

    Summary:
    This stack of changes builds up to a CLI endpoint allowing us to schedule
    origin visits directly in Celery, bypassing the legacy scheduler entirely.

    This has zero test coverage save from old tests still passing, which is
    already something... It's being used on the actual production database to
    schedule actual tasks for git, npm and pypi.

    Included changes:
    - Drop duplicate docstring from backend
    - Make the origin visit scheduling cooldown configurable (Cosmetic changes)
    - Add a (longer) specific cooldown for failed origin visits
    - Add a specific cooldown for notfound origins
      Both of these changes prevent repeating visits on failing origins. This is
      necessary because, as we're using a consistent ordering with respect to the
      upstream information, we'd always be trying to load them, never reaching
      origins further down the stack. Listers should eventually disable these
      origins.
    - Add table sampling option to grab_next_visits
      Running common operations on all git origins is pretty intense. Using table
      sampling gives us the opportunity to at least schedule some jobs in
      (decently small) time.
    - Add a (very basic) scheduling policy for origins with no known last update
      This is especially useful for pypi, as well as some git hosters that do not
      provide the right info in their APIs. We will need to implement smarter
      heuristics to avoid repeated uneventful visits on these origins.
    - Split off the helper for available slots in a celery queue
      This is needed for the send-to-celery subcommand as well, so split it off
      of the runner module.
    - Add a swh scheduler origin send-to-celery subcommand
      Yes, finally!

    Test Plan: obviously needs at least /some/ test coverage.

    Reviewers: #reviewers

    Subscribers: ardumont

    Differential Revision: https://forge.softwareheritage.org/D5809

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/354/ for more details.

Comment Actions
Build is green
Patch application report for D5826 (id=20886)
Rebasing onto 9d2618db8f...
Current branch diff-target is up to date.
Changes applied before test
commit 091336179afad8c4f4b97ffed18644a076893efc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Tue Jun 8 17:36:28 2021 +0200

    runner: Separate scheduling tasks with and without priority concern

    Related to T3367

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/358/ for more details.

Comment Actions
Build is green
Patch application report for D5826 (id=20897)
Rebasing onto 9d2618db8f...
Current branch diff-target is up to date.
Changes applied before test
commit 4a2adc01fcfe4a63bbf06ae406a87851a12b931b
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Tue Jun 8 17:36:28 2021 +0200

    runner: Separate scheduling tasks with and without priority concern

    In effect, this will allow to run 2 runners:
    - one for recurring tasks
    - one for the save code now

    This should decrease the probability of the scheduling tasks for the save
    code now to be stuck behind the main scheduler runner.

    Related to T3367

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/359/ for more details.

Comment Actions
Build is green
Patch application report for D5826 (id=20908)
Rebasing onto 21c4279b99...
Current branch diff-target is up to date.
Changes applied before test
commit 0bafdccd09333aae5bdb81e496f0a09eabe51b35
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Tue Jun 8 17:36:28 2021 +0200

    runner: Separate scheduling tasks with and without priority concern

    In effect, this will allow to run 2 runners:
    - one for recurring tasks
    - one for the save code now

    This should decrease the probability of the scheduling tasks for the save
    code now to be stuck behind the main scheduler runner.

    Related to T3367

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/361/ for more details.

Comment Actions
Build is green
Patch application report for D5826 (id=20917)
Rebasing onto 21c4279b99...
Current branch diff-target is up to date.
Changes applied before test
commit f71a716f478ee8bfcf7f4e26f387768a89276deb
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Tue Jun 8 17:36:28 2021 +0200

    runner: Separate scheduling tasks with and without priority concern

    In effect, this will allow to run 2 runners:
    - one for recurring tasks
    - one for the save code now

    This should decrease the probability of the scheduling tasks for the save
    code now to be stuck behind the main scheduler runner.

    Related to T3367

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/362/ for more details.

Comment Actions
Build is green
Patch application report for D5826 (id=20925)
Rebasing onto 21c4279b99...
Current branch diff-target is up to date.
Changes applied before test
commit c7707b5c836c3f58bace115eb398599a989845aa
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Tue Jun 8 17:36:28 2021 +0200

    runner: Separate scheduling tasks with and without priority concern

    In effect, this will allow to run 2 runners:
    - one for recurring tasks
    - one for the save code now

    This should decrease the probability of the scheduling tasks for the save
    code now to be stuck behind the main scheduler runner.

    Related to T3367

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/363/ for more details.

Comment Actions
LGTM, still not a big fan of the usage of random in the tests ;), but otherwise, it matches what you explained to me this morning

Comment Actions
\o/
lol, yeah, but I'm not a big fan of hard-coding, say, the first element here, for example.