
Add a new cli endpoint to schedule recurrent visits in Celery
Closed, Public

Authored by ardumont on Oct 20 2021, 4:50 PM.

Details

Summary

For each known visit type, we run a loop which:

  • monitors the size of the relevant celery queue
  • schedules more visits of the relevant type once the number of available slots goes over a given threshold (currently set to 5% of the max queue size).
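A minimal sketch of the threshold check described above, assuming the current queue length has already been fetched from the broker. The function and parameter names are hypothetical and not taken from the diff; only the 5% default comes from the summary.

```python
def slots_to_fill(
    queue_length: int,
    max_queue_length: int,
    min_available_ratio: float = 0.05,  # 5% of the max queue size, per the summary
) -> int:
    """Return how many visits to schedule, or 0 while below the threshold."""
    # Free slots left in the queue (never negative, even if the queue overflowed)
    available = max(max_queue_length - queue_length, 0)
    threshold = max_queue_length * min_available_ratio
    # Only schedule once the number of free slots goes over the threshold
    return available if available > threshold else 0
```

With a max queue size of 1000, nothing is scheduled while fewer than 50 slots are free; once 100 slots are free, all 100 are filled in one go.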

The scheduling of visits combines multiple scheduling policies, for now
using static ratios set in the POLICY_RATIOS dict. We emit a warning
if the ratio of origins fetched for each policy is skewed with respect
to the original request (allowing, for now, manual adjustment of the
ratios).
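To illustrate the idea, here is a rough sketch of splitting a batch between policies and detecting a skewed result. The policy names, the ratio values, the `tolerance` parameter and both helper functions are assumptions made for the example; the actual contents of POLICY_RATIOS are not shown on this page.

```python
import logging
from typing import Dict

logger = logging.getLogger(__name__)

# Hypothetical ratios, standing in for the real POLICY_RATIOS dict
EXAMPLE_POLICY_RATIOS: Dict[str, float] = {"policy_a": 0.75, "policy_b": 0.25}


def split_by_ratio(total: int, ratios: Dict[str, float]) -> Dict[str, int]:
    """Split ``total`` slots between policies according to ``ratios``."""
    return {policy: int(total * ratio) for policy, ratio in ratios.items()}


def is_skewed(
    requested: Dict[str, int], fetched: Dict[str, int], tolerance: float = 0.1
) -> bool:
    """Warn and return True if a policy's fetched share drifts from the request."""
    total_requested = sum(requested.values())
    total_fetched = sum(fetched.values())
    if not total_requested or not total_fetched:
        return False
    for policy, count in requested.items():
        wanted = count / total_requested
        got = fetched.get(policy, 0) / total_fetched
        if abs(wanted - got) > tolerance:
            logger.warning(
                "Skewed ratio for %s: wanted %.2f, got %.2f", policy, wanted, got
            )
            return True
    return False
```

A warning of this kind only reports the drift; adjusting the ratios remains a manual step, as the summary says.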

The CLI endpoint spawns one thread for each visit type, which all handle
connections to RabbitMQ and the scheduler backend separately. For now,
we handle exceptions in the visit scheduling threads by (stupidly)
respawning the relevant thread directly. We should probably improve this
to give up after a specific number of tries.
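The suggested improvement (giving up after a number of tries) could look roughly like the sketch below. It is not the code from the diff: `run_with_respawn`, `MAX_RESPAWNS` and the way failures are reported back to the caller are all assumptions made for illustration.

```python
import threading
from typing import Callable, List

MAX_RESPAWNS = 3  # hypothetical limit, not a value from the diff


def run_with_respawn(
    visit_type: str,
    target: Callable[[str], None],
    max_respawns: int = MAX_RESPAWNS,
) -> int:
    """Run ``target`` in a thread, respawning it on failure.

    Gives up after ``max_respawns`` respawns and returns the number of attempts.
    """
    attempts = 0
    while attempts <= max_respawns:
        attempts += 1
        errors: List[Exception] = []

        def wrapper() -> None:
            # Catch the exception inside the thread so the caller can see it
            try:
                target(visit_type)
            except Exception as exc:
                errors.append(exc)

        thread = threading.Thread(target=wrapper, name=f"visits-{visit_type}")
        thread.start()
        thread.join()
        if not errors:
            break  # the thread ended normally, no respawn needed
    return attempts
```

Joining each thread before deciding whether to respawn keeps the sketch simple; a real supervisor for several visit types would more likely poll its threads instead of blocking on one.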

Co-authored-by: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>

Related to T3667

Test Plan
  • docker test run by @olasd
  • D6543: docker container update and scenario run (P1211)
  • tox

Diff Detail

Repository
rDSCH Scheduling utilities
Branch
arcpatch-D6520
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 24725
Build 38595: Phabricator diff pipeline on jenkins
Build 38594: arc lint + arc unit

Event Timeline


Amend with reimplementation from pair programming session

Build has FAILED

Patch application report for D6520 (id=23805)

Rebasing onto ecc0e2803e...

Current branch diff-target is up to date.
Changes applied before test
commit 2c2258eef10b2a83cfbff0642daa45ccd8dbfc39
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Mon Oct 25 19:02:48 2021 +0200

    Add a new cli endpoint to schedule recurrent visits in Celery
    
    For each known visit type, we run a loop which:
     - monitors the size of the relevant celery queue
     - schedules more visits of the relevant type once the number of
     available slots goes over a given threshold (currently set to 5% of the
     max queue size).
    
    The scheduling of visits combines multiple scheduling policies, for now
    using static ratios set in the `POLICY_RATIOS` dict. We emit a warning
    if the ratio of origins fetched for each policy is skewed with respect
    to the original request (allowing, for now, manual adjustment of the
    ratios).
    
    The CLI endpoint spawns one thread for each visit type, which all handle
    connections to RabbitMQ and the scheduler backend separately. For now,
    we handle exceptions in the visit scheduling threads by (stupidly)
    respawning the relevant thread directly. We should probably improve this
    to give up after a specific number of tries.
    
    Co-authored-by: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/478/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/478/console

ardumont retitled this revision from Bootstrap orchestrator in charge of scheduling recurring tasks to wip: Bootstrap scheduler cog in charge of scheduling recurring tasks. Oct 27 2021, 12:10 PM
ardumont edited the summary of this revision.
ardumont edited reviewers, added: olasd; removed: ardumont.

Fix test

Tests are still incomplete; I'm going to use Jenkins' coverage report to check what's missing.

Squash commit and use the correct commit range

Build has FAILED

Patch application report for D6520 (id=23841)

Could not rebase; Attempt merge onto ecc0e2803e...

Updating ecc0e28..59bcdc6
Fast-forward
 swh/scheduler/celery_backend/recurrent_visits.py | 297 +++++++++++++++++++++++
 swh/scheduler/cli/admin.py                       | 100 +++++++-
 swh/scheduler/tests/conftest.py                  |  12 +-
 swh/scheduler/tests/test_cli.py                  |   9 -
 swh/scheduler/tests/test_recurrent_visits.py     | 127 ++++++++++
 5 files changed, 534 insertions(+), 11 deletions(-)
 create mode 100644 swh/scheduler/celery_backend/recurrent_visits.py
 create mode 100644 swh/scheduler/tests/test_recurrent_visits.py
Changes applied before test
commit 59bcdc69a31ffa2d1fd8d1c60d5c9ac82d13328e
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Oct 27 12:38:39 2021 +0200

    fix tests

commit 12474caef814f3f19c303d7d00d549168ba58448
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Bootstrap orchestrator in charge of scheduling recurring tasks
    
    Summary:
    This new module is specifically in charge of scheduling regularly the recurring tasks
    (loader ones, either dvcs loader like: load-git, load-svn, or package loader:
    pypi, npm, ..., etc...).
    
    This new cli call will replace the manual processes started on scheduler nodes (saatchi,
    scheduler0.staging).
    
    Related to T3667
    
    Test Plan: todo
    
    Reviewers: #reviewers, ardumont
    
    Subscribers: olasd, vlorentz
    
    Maniphest Tasks: T3667
    
    Differential Revision: https://forge.softwareheritage.org/D6520

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/479/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/479/console

Build has FAILED

Patch application report for D6520 (id=23842)

Rebasing onto ecc0e2803e...

Current branch diff-target is up to date.
Changes applied before test
commit 1adc6a5a9679ebcf420f022a41039c7399d5f982
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Bootstrap orchestrator in charge of scheduling recurring tasks
    

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/480/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/480/console

Fix doc warning seen as error in the build.
(work around it really)

Build is green

Patch application report for D6520 (id=23844)

Rebasing onto ecc0e2803e...

Current branch diff-target is up to date.
Changes applied before test
commit 57ac752d441e4b971f9b756ef4ca32db45d1d525
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Bootstrap orchestrator in charge of scheduling recurring tasks
    

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/481/ for more details.

ardumont edited the test plan for this revision.
ardumont edited the summary of this revision.
ardumont edited the test plan for this revision.
ardumont edited the summary of this revision.
ardumont edited the test plan for this revision.

Add one test

Build is green

Patch application report for D6520 (id=23860)

Rebasing onto 0c7ef27b7e...

First, rewinding head to replay your work on top of it...
Applying: Bootstrap scheduler cog in charge of scheduling recurring tasks
Changes applied before test
commit dc3884273af71cd9fb8a3bed48b9fbcce438355b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Bootstrap scheduler cog in charge of scheduling recurring tasks
    
    This new module is specifically in charge of scheduling regularly the recurring tasks
    (loader ones, either dvcs loader like: load-git, load-svn, or package loader:
    pypi, npm, ..., etc...).
    
    It's triggered through a cli subcommand `swh scheduler schedule-recurrent` which will
    replace the manual processes started on scheduler nodes (saatchi, scheduler0.staging).
    
    Related to T3667

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/483/ for more details.

Build is green

Patch application report for D6520 (id=23861)

Rebasing onto 0c7ef27b7e...

First, rewinding head to replay your work on top of it...
Applying: Bootstrap scheduler cog in charge of scheduling recurring tasks
Changes applied before test
commit 45690b0e5d7ba356a87124172f93826095249df0
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Bootstrap scheduler cog in charge of scheduling recurring tasks
    

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/484/ for more details.

ardumont retitled this revision from wip: Bootstrap scheduler cog in charge of scheduling recurring tasks to Add a new cli endpoint to schedule recurrent visits in Celery. Oct 27 2021, 5:14 PM
ardumont edited the summary of this revision.
ardumont edited the test plan for this revision.

Use @olasd's commit message

Build is green

Patch application report for D6520 (id=23863)

Rebasing onto 0c7ef27b7e...

First, rewinding head to replay your work on top of it...
Applying: Add a new cli endpoint to schedule recurrent visits in Celery
Changes applied before test
commit fc4c46b7dfc2a658cf4480b7f7a9aebafb5b618e
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Add a new cli endpoint to schedule recurrent visits in Celery
    

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/485/ for more details.

Thanks for these updated tests! A few comments inline.

swh/scheduler/tests/test_recurrent_visits.py
66–69

So I wrote that one but I find it pretty terrible: it makes the command line exit as it tries to spawn the first thread. I guess at least it's testing the logic that pulls the task types out of the database?

This probably deserves an updated comment below. Something along the lines of "The actual scheduling threads won't spawn, they'll immediately terminate. This only exercises the logic to pull task types out of the database".

172

I assume that's immediately crashing with an exception for failing to initialize celery? The actual "thread termination" code doesn't seem to be covered.

(I don't have much of a suggestion to cover this better, I think we'll have a much better chance doing that in the docker integration tests).

swh/scheduler/tests/test_recurrent_visits.py
66–69

sure, done.

swh/scheduler/tests/test_recurrent_visits.py
172

oh yeah, i forgot about that...

I tried to set up some actual data so the test really does something (and I mocked the
celery build_app call in that stashed code), but it kind of repeats the other tests we
already have, namely the one calling the same function that all this threaded code
ends up calling (once you get to the bottom of spawn_visit_scheduler_thread ;)

So in the end, I dropped it.

Maybe I should give it another spin and simply mock that internal method call (well,
that and build_app) so it does not crash.

what do you think?

> (I don't have much of a suggestion to cover this better, I think we'll have a much
> better chance doing that in the docker integration tests).

yes, maybe.

Adapt test comment to be a bit more explicit

Build is green

Patch application report for D6520 (id=23864)

Rebasing onto 0c7ef27b7e...

First, rewinding head to replay your work on top of it...
Applying: Add a new cli endpoint to schedule recurrent visits in Celery
Changes applied before test
commit a3d31829f2a1a724fa6bda8639114bc880707eb1
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Add a new cli endpoint to schedule recurrent visits in Celery
    

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/486/ for more details.

swh/scheduler/tests/test_recurrent_visits.py
172

> Maybe I should give it another spin and simply mock that internal method call (well,
> that and build_app) so it does not crash.

Easier said than done.
I've updated the current no-op test we discussed so that it mocks build_app and no longer crashes.

Build is green

Patch application report for D6520 (id=23869)

Rebasing onto 0c7ef27b7e...

First, rewinding head to replay your work on top of it...
Applying: Add a new cli endpoint to schedule recurrent visits in Celery
Changes applied before test
commit 5687383ce45afcb0967df3f9e335e4fc4ce67b23
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Add a new cli endpoint to schedule recurrent visits in Celery
    

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/488/ for more details.

I'll let this stew for a bit so others get a chance to read through it if they'd like, but I think this would be ok to land.

swh/scheduler/tests/test_recurrent_visits.py
178–182

looks redundant

swh/scheduler/tests/test_recurrent_visits.py
178–182

indeed, fixing.

Build is green

Patch application report for D6520 (id=23874)

Rebasing onto 0c7ef27b7e...

First, rewinding head to replay your work on top of it...
Applying: Add a new cli endpoint to schedule recurrent visits in Celery
Changes applied before test
commit 1622a811f47744a06ceadf3da8b986ccf23ac5b6
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Add a new cli endpoint to schedule recurrent visits in Celery
    

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/489/ for more details.

I don't really understand what this is for, but lgtm.

In the docstrings, you need to replace all single-backticks with double-backticks.
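For context: in reStructuredText, double backticks mark an inline literal, while single backticks mark interpreted text that is resolved through the default role. A small illustration follows; the function itself is hypothetical and only its docstring matters.

```python
def example_ratio_for(policy: str, ratios: dict) -> float:
    """Return the ratio configured for ``policy``.

    ``ratios`` plays the role of the ``POLICY_RATIOS`` dict; unknown
    policies get ``0.0``.
    """
    return ratios.get(policy, 0.0)
```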

A few more nitpicks below.

swh/scheduler/celery_backend/recurrent_visits.py
80

using sum() on a list of lists copies the accumulator over and over, causing the operation to be quadratic instead of linear
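A short comparison of the two approaches, using made-up function names: `sum(lists, [])` builds a new list at every step, whereas `itertools.chain.from_iterable` walks each element once.

```python
from itertools import chain
from typing import List


def flatten_quadratic(lists: List[List[int]]) -> List[int]:
    # Each "+" copies the whole accumulator, hence the quadratic cost
    return sum(lists, [])


def flatten_linear(lists: List[List[int]]) -> List[int]:
    # chain.from_iterable yields every element exactly once
    return list(chain.from_iterable(lists))
```

Both return the same result; only the cost differs as the number of sublists grows.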

106–122

move constants to the top of the file.

It should be mentioned in TERMINATE's docstring that it's a singleton used for identity comparison.
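A sketch of what such a documented sentinel could look like. How TERMINATE is actually defined and consumed in the diff is not shown on this page, so the class, the queue and the `drain` helper below are assumptions for illustration.

```python
import queue
from typing import Any, List


class _Terminate:
    """Singleton sentinel asking a worker to stop.

    Only compare it by identity (``item is TERMINATE``), never by equality.
    """

    def __repr__(self) -> str:
        return "TERMINATE"


# Module-level constant, defined once at the top of the file
TERMINATE = _Terminate()


def drain(q: "queue.Queue[Any]") -> List[Any]:
    """Collect items from ``q`` until the TERMINATE sentinel shows up."""
    processed = []
    while True:
        item = q.get()
        if item is TERMINATE:  # identity comparison, as the docstring requires
            break
        processed.append(item)
    return processed
```

Anything queued after the sentinel is left untouched, which is why the identity check has to be the first thing done with each item.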

285–286

I don't really understand what this is for

That kind of remark always makes me a bit edgy.
What's not clear in the description that needs rework so it's clearer?

This is the main algo that will trigger scheduling of recurring tasks (the output of listers).
Without this, we don't actually schedule anything besides indexer, vault cooking, save
code now and deposit tasks (which are handled by the scheduler runner with the "ancient" architecture).

Is this ^ clearer?


Thanks for the other pointer, will look.

> What's not clear in the description that needs rework so it's clearer?

What isn't clear to me is why we need to implement this, because the scheduler already does it.

> This is the main algo that will trigger scheduling of recurring tasks (the output of listers).
> Without this, we don't actually schedule anything besides indexer, vault cooking, save
> code now and deposit tasks (which are handled by the scheduler runner with the "ancient" architecture).

Ah! So the goal is to replace the existing scheduler runner?

nvm, @olasd just explained to me this used to be done manually

>> What's not clear in the description that needs rework so it's clearer?

> What isn't clear to me is why we need to implement this, because the scheduler already does it.

No, it no longer does it since the scheduler/lister refactoring.
That refactoring is still ongoing, and @olasd and I are trying to finalize it.
And it's annoyingly taking forever... ¯\_(ツ)_/¯

>> This is the main algo that will trigger scheduling of recurring tasks (the output of listers).
>> Without this, we don't actually schedule anything besides indexer, vault cooking, save
>> code now and deposit tasks (which are handled by the scheduler runner with the "ancient" architecture).

> Ah! So the goal is to replace the existing scheduler runner?

That old one won't go away and will be in charge of the non-recurring tasks (what we called the oneshot).
This new cog is solely about the recurring ones.

>> What's not clear in the description that needs rework so it's clearer?

> What isn't clear to me is why we need to implement this, because the scheduler already does it.

In practice, what's currently "doing this" is a for loop that calls swh scheduler send-to-celery, which is started manually in a tmux on both staging and prod :-)

  • Adapt according to suggestions
  • drop unused --period flag in the cli

Build has FAILED

Patch application report for D6520 (id=23890)

Rebasing onto 0c7ef27b7e...

First, rewinding head to replay your work on top of it...
Applying: Add a new cli endpoint to schedule recurrent visits in Celery
Changes applied before test
commit 8013748008e695f7a9785bb7fd760356b3e563bf
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Add a new cli endpoint to schedule recurrent visits in Celery
    

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/490/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/490/console

douardda added a subscriber: douardda.

Overall it looks very good to me. There is room for improvement, for sure, but let's put this to work and see how it performs.

This revision is now accepted and ready to land. Oct 28 2021, 12:36 PM

Drop unused period parameter in cli...

Build is green

Patch application report for D6520 (id=23893)

Rebasing onto 0c7ef27b7e...

First, rewinding head to replay your work on top of it...
Applying: Add a new cli endpoint to schedule recurrent visits in Celery
Changes applied before test
commit d721cd9604c9e7d102fdaa53563f4c8b50c1c8ac
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Add a new cli endpoint to schedule recurrent visits in Celery
    

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/491/ for more details.

Build is green

Patch application report for D6520 (id=23897)

Rebasing onto 0c7ef27b7e...

Current branch diff-target is up to date.
Changes applied before test
commit 50d7fd7ff49b02654c1416f80e786acac3a980d5
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Oct 27 12:09:42 2021 +0200

    Add a new cli endpoint to schedule recurrent visits in Celery
    

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/492/ for more details.