This new module is specifically in charge of regularly scheduling the
recurring tasks (loader ones: either DVCS loaders such as load-git and
load-svn, or package loaders such as pypi, npm, ...).

This new CLI call will replace the manual processes started on the
scheduler nodes (saatchi, scheduler0.staging).

For each known visit type, we run a loop which:
- monitors the size of the relevant celery queue
- schedules more visits of the relevant type once the number of
  available slots goes over a given threshold (currently set to 5% of the
  max queue size).

The scheduling of visits combines multiple scheduling policies, for now
using static ratios set in the `POLICY_RATIOS` dict. We emit a warning
if the ratio of origins fetched for each policy is skewed with respect
to the original request (allowing, for now, manual adjustment of the
ratios).
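
As a rough illustration of the threshold check and the ratio-based split
described above, here is a minimal sketch. It is not the code from this
commit: the policy names, the ratio values, the helper names and the
`tolerance` parameter are all assumptions made for the example; only the
5% threshold and the `POLICY_RATIOS` name come from the description.

```python
# Hypothetical policy names and ratios, for illustration only; the real
# contents of POLICY_RATIOS are not shown in this message.
POLICY_RATIOS = {"policy_a": 0.5, "policy_b": 0.3, "policy_c": 0.2}

# Threshold stated in the description: 5% of the max queue size.
MIN_SLOTS_RATIO = 0.05


def available_slots(max_queue_length, current_queue_length):
    """Number of tasks that can still be sent to the queue."""
    return max(0, max_queue_length - current_queue_length)


def should_schedule(max_queue_length, current_queue_length):
    """True once the available slots go over the threshold."""
    slots = available_slots(max_queue_length, current_queue_length)
    return slots > max_queue_length * MIN_SLOTS_RATIO


def split_by_policy(num_slots, ratios):
    """Split the available slots between policies using static ratios."""
    return {policy: int(num_slots * ratio) for policy, ratio in ratios.items()}


def skewed_policies(requested, fetched, tolerance=0.1):
    """List policies whose share of fetched origins drifts from the request.

    The tolerance is an assumed parameter: a policy is reported when its
    share of the fetched origins differs from its requested share by more
    than this value.
    """
    total_requested = sum(requested.values())
    total_fetched = sum(fetched.values())
    if not total_requested or not total_fetched:
        return []
    skewed = []
    for policy, count in requested.items():
        wanted_share = count / total_requested
        actual_share = fetched.get(policy, 0) / total_fetched
        if abs(wanted_share - actual_share) > tolerance:
            skewed.append(policy)
    return skewed
```

In this sketch, a caller would log a warning for each policy returned by
`skewed_policies`, which is the signal to adjust the ratios by hand.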
The CLI endpoint spawns one thread for each visit type, which all handle
connections to RabbitMQ and the scheduler backend separately. For now,
we handle exceptions in the visit scheduling threads by (stupidly)
respawning the relevant thread directly. We should probably improve this
to give up after a specific number of tries.
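
One possible shape for the "give up after a specific number of tries"
improvement is sketched below. This is only an assumption about how it
could look, not what this commit implements: `supervise`, `max_tries`
and the thread naming are hypothetical.

```python
import logging
import threading

logger = logging.getLogger(__name__)


def supervise(target, visit_type, max_tries=3):
    """Run target(visit_type) in a thread, respawning it on failure.

    Returns the number of attempts used when the target finishes without
    raising, and raises RuntimeError after max_tries failed attempts.
    """
    for attempt in range(1, max_tries + 1):
        errors = []

        def wrapper():
            # Record the exception so the supervising code can see it.
            try:
                target(visit_type)
            except Exception as exc:
                errors.append(exc)

        thread = threading.Thread(
            target=wrapper, name=f"visits-{visit_type}", daemon=True
        )
        thread.start()
        thread.join()

        if not errors:
            return attempt
        logger.warning(
            "Visit scheduling thread for %s failed (attempt %d/%d): %r",
            visit_type, attempt, max_tries, errors[0],
        )

    raise RuntimeError(
        f"Giving up on visit type {visit_type} after {max_tries} tries"
    )
```

With one such supervisor per visit type, a persistent failure (for
example an unreachable backend) would surface as an error instead of
respawning forever.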
Co-authored-by: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Related to T3667