Page MenuHomeSoftware Heritage

celery: auto add tasks declared in the swh.workers entry point in task_modules
Needs ReviewPublic

Authored by douardda on May 22 2019, 2:46 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Summary

allows to declare worker tasks in a 'swh.workers' entry point. This later is expected to be a callable which returns a dict which 'tasks' key is
the module name of the celery tasks exported.

Some other 'hooks' could be provided as callable in this dictionnary,
typically db initialization stuff in listers could be handled that way.

Not sure what a decent 'API' for this shoudl look like for now, still
experimenting.

See D1504 for a use case implementation

Diff Detail

Repository
rDSCH Scheduling utilities
Branch
wip
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 5884
Build 8062: tox-on-jenkinsJenkins
Build 8061: arc lint + arc unit

Event Timeline

douardda created this revision.May 22 2019, 2:46 PM
ardumont edited the summary of this revision. (Show Details)May 23 2019, 9:44 AM
vlorentz added inline comments.
swh/scheduler/tests/conftest.py
40

Why not toplevel?

ardumont added inline comments.
swh/scheduler/tests/conftest.py
40

I think the rationale is to implement a poor man's lazy policy ;)

You should mention that with this, we can drop some part of the scheduler's configuration file.
The task_modules in the yaml can be dropped.

Which is something less to worry about in the configuration part... i think ;)

ardumont added inline comments.May 23 2019, 2:45 PM
swh/scheduler/celery_backend/config.py
204

I have a little trouble reading that.

Can you explicit what the entrypoint.load() returns...

... plugging brain cables...

Nevermind, i think i grokked it:

  • entrypoint.load() returns the function registered in the setup.py. For example, you dubbed it swh.lister.*.__init__.register in the other diff.
  • and then you call it, to actually access the dict the register function returns

Ok.

Maybe this would be more readable?

for ...:
    worker_register_fn = entrypoint.load()
    CONFIG['task_modules'].append(worker_register_fn()['tasks])

I'm fine with this btw.

Not sure what a decent 'API' for this shoudl look like for now, still experimenting.

neither do i.

vlorentz added inline comments.May 24 2019, 3:13 PM
swh/scheduler/tests/conftest.py
40

It's a test fixture, it will always run if that file is imported

ardumont added inline comments.Mon, Jul 1, 12:10 PM
swh/scheduler/tests/conftest.py
40

heh, sounds about right indeed.

douardda updated this revision to Diff 5762.Wed, Jul 10, 4:03 PM

rebased + a very small bit of doc

Now depends on D1714

douardda added inline comments.Wed, Jul 10, 4:11 PM
swh/scheduler/celery_backend/config.py
204

a more explicit version is fine for me, will do

swh/scheduler/tests/conftest.py
40

Why not toplevel?

Why toplevel?

douardda updated this revision to Diff 5765.Wed, Jul 10, 4:15 PM

More explicit code for the loading of swh.workers entrypoint

in fact not just a more explcit, allows registry['tasks'] to be either a single string or a list of strings.

vlorentz added inline comments.Thu, Jul 18, 11:19 AM
swh/scheduler/tests/conftest.py
40

Because imports should always be at the toplevel unless there's a reason not to.