scheduler: Clean up dead code about priority/ratio
Closed, Migrated

Description

Following T3084, part of the priority code is no longer relevant.

Priority tasks are still used: the runner reroutes them to dedicated queues.
The number of tasks given by a per-priority ratio is no longer used; that is the
part that needs cleaning up.

Remove it from the code base.

Event Timeline

ardumont triaged this task as Normal priority. Apr 19 2021, 11:30 AM
ardumont created this task.
ardumont updated the task description.

Deployed the new scheduler version (0.13) on staging.

However, that seems to have slowed the scheduler runner down somehow:

Apr 20 10:02:22 scheduler0 swh[1499079]: INFO:swh.scheduler.celery_backend.runner:Grabbed 96 tasks index-origin-metadata
Apr 20 10:06:16 scheduler0 swh[1499079]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1768 tasks load-pypi
Apr 20 10:08:15 scheduler0 swh[1499079]: INFO:swh.scheduler.celery_backend.runner:Grabbed 296 tasks index-origin-metadata
Apr 20 10:13:57 scheduler0 swh[1499079]: INFO:swh.scheduler.celery_backend.runner:Grabbed 273 tasks index-origin-metadata

Compared with the pre-0.13 scheduler version:

Apr 20 09:58:49 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 10 tasks index-origin-metadata
Apr 20 09:59:01 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 10 tasks index-origin-metadata
Apr 20 09:59:14 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 11 tasks index-origin-metadata
Apr 20 09:59:26 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 11 tasks index-origin-metadata
Apr 20 09:59:38 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 10 tasks index-origin-metadata
Apr 20 09:59:50 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 9 tasks index-origin-metadata
Apr 20 10:00:02 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 9 tasks index-origin-metadata

To be fair, it also seems to grab more tasks per run now.
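
To put numbers on the two log excerpts, here is a minimal sketch that parses the "Grabbed N tasks" lines and computes the mean time between grabs and the tasks grabbed per minute. The helper names (`parse`, `summarize`) are made up for this comment, the year 2021 is assumed because the journal lines do not carry one, and the samples are tiny and mix task types, so treat the result as a rough indication only.

```python
import re
from datetime import datetime

# Log lines copied from the two excerpts in this task.
NEW_LOG = """\
Apr 20 10:02:22 scheduler0 swh[1499079]: INFO:swh.scheduler.celery_backend.runner:Grabbed 96 tasks index-origin-metadata
Apr 20 10:06:16 scheduler0 swh[1499079]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1768 tasks load-pypi
Apr 20 10:08:15 scheduler0 swh[1499079]: INFO:swh.scheduler.celery_backend.runner:Grabbed 296 tasks index-origin-metadata
Apr 20 10:13:57 scheduler0 swh[1499079]: INFO:swh.scheduler.celery_backend.runner:Grabbed 273 tasks index-origin-metadata
""".splitlines()

OLD_LOG = """\
Apr 20 09:58:49 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 10 tasks index-origin-metadata
Apr 20 09:59:01 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 10 tasks index-origin-metadata
Apr 20 09:59:14 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 11 tasks index-origin-metadata
Apr 20 09:59:26 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 11 tasks index-origin-metadata
Apr 20 09:59:38 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 10 tasks index-origin-metadata
Apr 20 09:59:50 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 9 tasks index-origin-metadata
Apr 20 10:00:02 scheduler0 swh[19642]: INFO:swh.scheduler.celery_backend.runner:Grabbed 9 tasks index-origin-metadata
""".splitlines()

LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) .*Grabbed (\d+) tasks (\S+)")
ASSUMED_YEAR = "2021"  # assumption: the journal lines do not include the year


def parse(lines):
    """Return a list of (timestamp, task_count, task_type) tuples."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything that is not a "Grabbed" line
        stamp = datetime.strptime(
            f"{ASSUMED_YEAR} {match.group(1)}", "%Y %b %d %H:%M:%S"
        )
        entries.append((stamp, int(match.group(2)), match.group(3)))
    return entries


def summarize(lines):
    """Summarize a log excerpt; needs at least two 'Grabbed' lines."""
    entries = parse(lines)
    elapsed = (entries[-1][0] - entries[0][0]).total_seconds()
    # The first line only marks the start of the window, so its count is
    # left out: we do not know how long that first grab took.
    grabbed = sum(count for _, count, _ in entries[1:])
    return {
        "elapsed_s": elapsed,
        "grabbed": grabbed,
        "mean_interval_s": elapsed / (len(entries) - 1),
        "tasks_per_min": grabbed / elapsed * 60,
    }


if __name__ == "__main__":
    for label, lines in (("0.13", NEW_LOG), ("pre-0.13", OLD_LOG)):
        print(label, summarize(lines))
```

On these excerpts that gives roughly 232 s between grabs and about 202 tasks per minute for 0.13, against roughly 12 s and about 49 tasks per minute before: each grab is much slower, but each one brings back far more tasks.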

Somewhat good news: the scheduling db was missing an index [1].

Adding it makes the scheduler runner much snappier (and happier).

[1]

"task_type_next_run_idx" btree (type, next_run) WHERE status = 'next_run_not_scheduled'::task_status

Deployed in production as well (no problem there, since the index was already present).
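
For the record, here is a small sketch of why a partial index like the one in [1] helps. It uses SQLite through Python's `sqlite3` module as a stand-in for PostgreSQL, purely so it can run anywhere; the `task` table layout and the query are assumptions for illustration, not the scheduler's actual schema or SQL. Only the index name, its columns and its `WHERE` condition come from [1].

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified task table (the real schema is not shown here).
conn.execute(
    """
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        status TEXT NOT NULL,
        next_run TEXT NOT NULL
    )
    """
)

# Same shape as the index in [1]: (type, next_run), restricted to the rows
# that are still waiting to be scheduled.
conn.execute(
    """
    CREATE INDEX task_type_next_run_idx
    ON task (type, next_run)
    WHERE status = 'next_run_not_scheduled'
    """
)

# Assumed query: grab the next tasks of one type that are due.
QUERY = """
    SELECT id FROM task
    WHERE type = ?
      AND status = 'next_run_not_scheduled'
      AND next_run <= ?
    ORDER BY next_run
    LIMIT ?
"""


def query_plan(connection, sql, params):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


plan = query_plan(conn, QUERY, ("index-origin-metadata", "2021-04-20T10:00:00", 10))

if __name__ == "__main__":
    for step in plan:
        print(step)
```

The status condition is written as a literal in the query so that the planner can see it matches the index condition. The index only covers pending tasks, so it stays small however many already-scheduled tasks accumulate in the table.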

ardumont claimed this task.