
Make the runner work with unregistered tasks.
AbandonedPublic

Authored by vlorentz on Dec 17 2018, 4:45 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Summary

So that the scheduler-runner does not depend on swh-indexer, swh-loader-*, etc.

Diff Detail

Repository
rDSCH Scheduling utilities
Branch
runner-no-task-registration
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 3117
Build 3990: tox-on-jenkins (Jenkins)
Build 3989: arc lint + arc unit

Event Timeline

Silently generating a random queue name that will never be read by any worker doesn't sound quite right :)

After talking with @douardda and looking at the current code, it seems that almost all tasks are sent to distinct queues (the exception being a few lister tasks that share a queue, though nothing functionally depends on that). It turns out that the duplication between backend_name and task_queue is never really used.

Our proposal:

  1. make the runner send all tasks to a queue named after the fully qualified task class name (identical to the backend_name). Adapt the router to do so as well (should be as easy as making it return {'queue': task}).
  2. make sure the worker subscribes to all the class name-based queues for the imported modules. Also allow it to subscribe to more queues if needed (which it is for the transition period).
  3. drop the now useless task_queue settings in all task classes.
  4. deploy workers with the implicit class name-based queue subscription, also draining the old queues.
  5. deploy the new runner, sending messages to class name-based queues.
  6. drop the now useless task_queues settings in all worker deployment manifests.
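
To illustrate step 1, here is a minimal sketch of what such a router could look like. It assumes a Celery-style function router (called with the task name as first argument); the function name and the example task name are hypothetical and not taken from the swh-scheduler code.

```python
# Sketch only: a function router that sends each task to a queue named
# after the task itself. The signature mirrors Celery's function-router
# convention; the names below are illustrative assumptions.

def route_to_task_queue(name, args=None, kwargs=None, options=None,
                        task=None, **kw):
    """Return routing options placing the task on a queue named like it."""
    return {'queue': name}


if __name__ == '__main__':
    # Hypothetical fully qualified task name, for illustration only.
    print(route_to_task_queue('swh.example.tasks.ExampleTask'))
```

With this shape, the runner and the router agree on the queue name without any per-task task_queue setting.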
In D839#17975, @olasd wrote:

Silently generating a random queue name that will never be read by any worker doesn't sound quite right :)

The idea was to make the workers pull from these queues as well.

Our proposal:

  1. make the runner send all tasks to a queue named after the fully qualified task class name (identical to the backend_name). Adapt the router to do so as well (should be as easy as making it return {'queue': task}).
  2. make sure the worker subscribes to all the class name-based queues for the imported modules. Also allow it to subscribe to more queues if needed (which it is for the transition period).
  3. drop the now useless task_queue settings in all task classes.
  4. deploy workers with the implicit class name-based queue subscription, also draining the old queues.
  5. deploy the new runner, sending messages to class name-based queues.
  6. drop the now useless task_queues settings in all worker deployment manifests.

Besides 3 and 6, that's also what I wanted to do; and I agree task_queue was not very useful, so I'm 100% in.
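
For step 2 of the proposal, the worker-side queue list could be computed along these lines. This is a rough sketch under stated assumptions: the helper name, the `extra_queues` parameter, and the example names are hypothetical, and skipping names that start with `celery.` assumes the framework's built-in tasks use that prefix.

```python
# Sketch only: derive the queues a worker should consume from, given the
# names of the tasks it has registered plus any extra (legacy) queues
# that still need draining during the transition period.

def queues_for_worker(registered_task_names, extra_queues=()):
    """Return one queue per registered task, then extra queues, deduplicated."""
    seen = set()
    queues = []
    for name in registered_task_names:
        # Assumption: built-in framework tasks are prefixed with "celery."
        # and do not need a dedicated queue.
        if name.startswith('celery.'):
            continue
        if name not in seen:
            seen.add(name)
            queues.append(name)
    for name in extra_queues:
        if name not in seen:
            seen.add(name)
            queues.append(name)
    return queues


if __name__ == '__main__':
    # Hypothetical task and queue names, for illustration only.
    print(queues_for_worker(
        ['swh.example.tasks.TaskA', 'swh.example.tasks.TaskB'],
        extra_queues=['old_shared_queue'],
    ))
```

Keeping the extra queues as an explicit argument makes it easy to drop them once the old queues are drained.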

  • fix
  • Remove queue name prefixes.