tasks: Migrate load-git task message format
Closed, Public

Authored by ardumont on Dec 9 2019, 9:16 AM.

Details

Summary

Pros:

  • This aligns the current behavior with other listers and loaders.
  • Reading task information will become clearer.
  • One step closer to unifying the workers' tasks. It may also allow a small refactoring of the scheduler's model (for example, adding a new url column to ease introspection and admin queries).

Cons:

  • The scheduler's recurring load-git tasks will need to be migrated (or maintained some other way, if a reasonable one exists).

Test Plan

tox

Diff Detail

Repository
rDLDG Git loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

anlambert added inline comments.
swh/loader/git/tasks.py
17–21

How about dropping the keyword-only argument constraint?

By using the following signature:

def load_git(url: str, base_url: Optional[str] = None) -> Dict[str, Any]:

you maintain backward compatibility with the recurring tasks already registered
in the scheduler database, while passing url as a keyword argument
is still allowed.
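
To illustrate the point, here is a minimal sketch. It assumes a plain function standing in for the real task (the body and the example URL are hypothetical, and the Celery task decorator is left out); it only shows that both calling styles reach the same parameters.

```python
from typing import Any, Dict, Optional


def load_git(url: str, base_url: Optional[str] = None) -> Dict[str, Any]:
    """Hypothetical stand-in for the task: echoes the arguments it received."""
    return {"url": url, "base_url": base_url}


# Positional call, as an already registered recurring task would issue it.
legacy = load_git("https://example.org/repo.git")

# Keyword call, matching the new task message format.
current = load_git(url="https://example.org/repo.git")

# Both styles bind to the same parameter, so neither needs a migration.
assert legacy == current
```

With a keyword-only signature (a bare `*` before `url`), the positional call above would raise a TypeError instead.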

The same thing should be done in the other loaders (for instance, there are recurring
mercurial loader tasks in the scheduler database).

This feels simpler to me than migrating task arguments format directly in the database.

swh/loader/git/tasks.py
17–21

Yes, ok.

Thanks for the feedback.

This feels simpler to me than migrating task arguments format directly in the database.

I feel like we will need to do so eventually.
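
For reference, a rough sketch of what such a migration could look like on a single task's arguments. It assumes the arguments are stored as a mapping with "args" and "kwargs" keys; the helper name `migrate_arguments` is hypothetical, and reading from or writing to the scheduler database is deliberately left out.

```python
from typing import Any, Dict


def migrate_arguments(arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Move a leading positional url into kwargs (hypothetical helper).

    Tasks that already carry a url keyword are returned unchanged.
    """
    args = list(arguments.get("args", []))
    kwargs = dict(arguments.get("kwargs", {}))
    if args and "url" not in kwargs:
        # Assumption: the first positional argument is the origin url.
        kwargs["url"] = args.pop(0)
    return {"args": args, "kwargs": kwargs}


old = {"args": ["https://example.org/repo.git"], "kwargs": {}}
new = migrate_arguments(old)
assert new == {"args": [], "kwargs": {"url": "https://example.org/repo.git"}}

# Running the helper again changes nothing, so a re-run is harmless.
assert migrate_arguments(new) == new
```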

Do not force the use of kwargs, but allow it.

This revision is now accepted and ready to land. Dec 9 2019, 2:09 PM