
Use swh.scheduler instead of celery in the OriginHeadIndexer.
Closed, Public

Authored by vlorentz on Oct 26 2018, 10:43 AM.

Diff Detail

Repository
rDCIDX Metadata indexer
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

ardumont added inline comments.
swh/indexer/indexer.py
348

I would expect [dict] to mean a list of dicts.
Same for [str] on the line just before (which is already there).

swh/indexer/origin_head.py
73

jsyk, there is a swh.scheduler.utils.create_task_dict (granted, right now it misses the retries_left).
Not that I require changing that here ;)

In that regard, you might be interested in a discussion about creating tasks generically from the scheduler: T1197.

Note: T898 might also be related
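
For readers unfamiliar with the helper mentioned above, here is a minimal sketch of the kind of dictionary such a helper might build. `make_task_dict` is a hypothetical stand-in written for illustration, not the `swh.scheduler.utils.create_task_dict` implementation; the field layout, the task type name, and the optional `retries_left` argument (the field the comment says is currently missing) are assumptions.

```python
from datetime import datetime, timezone


def make_task_dict(task_type, policy, *args, retries_left=None, **kwargs):
    """Hypothetical stand-in for a task-dict helper (not the swh.scheduler one).

    Builds a plain dict describing a task to run as soon as possible.
    The field names used here are assumptions made for this sketch.
    """
    task = {
        "type": task_type,
        "policy": policy,
        "next_run": datetime.now(tz=timezone.utc),
        "arguments": {"args": list(args), "kwargs": kwargs},
    }
    # Only record retries_left when the caller supplies it.
    if retries_left is not None:
        task["retries_left"] = retries_left
    return task


# "example_task_type" and "oneshot" are illustrative values only.
task = make_task_dict("example_task_type", "oneshot", [1, 2, 3], retries_left=1)
```

Keeping the dict construction in one helper means a field such as `retries_left` only has to be added in one place.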

93

As I see this snippet popping up twice now... (in the same diff).
Maybe it'd be worth having an @property on a base class somewhere which does that?
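
The duplicated snippet itself is not shown in this thread, so the following is only a generic sketch of the suggestion: a lazily computed `@property` defined once on a base class and inherited by subclasses. All class, method, and attribute names here are hypothetical and do not come from swh.indexer.

```python
class BaseWithSharedHelper:
    """Hypothetical base class; every name here is illustrative only."""

    _helper = None

    def _build_helper(self):
        # Subclasses provide the object that was previously built inline.
        raise NotImplementedError

    @property
    def helper(self):
        # Build the helper on first access, then reuse the cached instance.
        if self._helper is None:
            self._helper = self._build_helper()
        return self._helper


class ExampleIndexer(BaseWithSharedHelper):
    """Hypothetical subclass standing in for a concrete indexer."""

    def _build_helper(self):
        return {"created_by": type(self).__name__}


indexer = ExampleIndexer()
first = indexer.helper
second = indexer.helper  # same cached object, built only once
```

Whether this pays off depends on finding a base class that both call sites actually share, which is the open question in the replies below.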

vlorentz added inline comments.
swh/indexer/origin_head.py
73

jsyk, there is a swh.scheduler.utils.create_task_dict

Perfect

In that regard, you might be interested in a discussion about creating tasks generically from the scheduler: T1197.

I don't think you meant that task

93

yeah, but I don't see any place where it makes sense to have it

  • Fix type in the docstrings.
  • Use swh.scheduler.utils.create_task_dict.
swh/indexer/origin_head.py
73

I don't think you meant that task

Right, T1157!
(we cannot amend remarks, /me is sad)

Sounds good.

Impacts:

  • swh-site configuration
  • new task_type(s) on swh-scheduler db
This revision is now accepted and ready to land.
Oct 26 2018, 7:06 PM
swh/indexer/origin_head.py
93

mmm, right.
I would have thought BaseIndexer. But I see that we could also share this in the BaseOrchestratorIndexer...
so nvm for now ;)

This revision was automatically updated to reflect the committed changes.