
implement listers as plugins
Needs Review (Public)

Authored by douardda on Wed, May 22, 2:49 PM.



This diff is mainly here to discuss how to implement such a thing
and what kind of 'plugin system' could be provided, especially for features
like listers, loaders and, more generally, scheduler-managed workers.

Depends on D1503.

Diff Detail

rDLS Listers
Lint Skipped
Unit Tests Skipped
Build Status
Buildable 5885
Build 8064: tox-on-jenkins (Jenkins)
Build 8063: arc lint + arc unit

Event Timeline

douardda created this revision. Wed, May 22, 2:49 PM
vlorentz added a subscriber: vlorentz. (Edited) Wed, May 22, 2:55 PM

Would it be possible to deduplicate the register code by putting all these functions in a single file directly in swh/lister/?

And you can replace the model import with Lister.MODEL; that's one less import.

anlambert added inline comments.

We should build that list dynamically instead of hardcoding it. This will make it easier to add new listers.

Iterating over the sub-directories of the swh/lister folder could do the trick (core and tests must be
excluded, though).

Another solution: import every submodule of swh.lister and check for the presence of the register
function to determine whether it is a plugin.
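For illustration, the second suggestion could look roughly like the sketch below. It is only a sketch: the helper name `discover_listers` and the `excluded` default are assumptions made for the example, and it relies on each plugin sub-module exposing a callable named `register`, as described above.

```python
import importlib
import pkgutil


def discover_listers(package, excluded=("core", "tests")):
    """Return {submodule_name: register_function} for a package.

    Hypothetical helper: every direct sub-module of `package` is imported,
    and it counts as a plugin only if it exposes a callable `register`.
    """
    found = {}
    for info in pkgutil.iter_modules(package.__path__):
        if info.name in excluded:
            continue
        module = importlib.import_module(f"{package.__name__}.{info.name}")
        register = getattr(module, "register", None)
        if callable(register):
            found[info.name] = register
    return found
```

Note that this approach imports every sub-module up front, including the ones that turn out not to be plugins.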


For npm, there are two models to initialize: swh.lister.npm.models.NpmModel and swh.lister.npm.models.NpmVisitModel

vlorentz added inline comments. Mon, May 27, 2:23 PM

This looks a lot like the code to generate SUPPORTED_LISTERS.

douardda added inline comments. Tue, Jun 4, 5:26 PM

No, you cannot; that's the whole point of this idea: being able to declare plugins without having to load every possible Python package or (recursively?) look for them in "some well-known places".

I did not want to force listers to be 'installed' in a 'swh.lister' namespace (in the sense of PEP 420).

With this method, based on entry points, a lister can be anywhere and does not need to live within our swh namespace, and it is effectively loaded only if needed.
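As a rough illustration of the entry-point lookup (not the code in this diff), the sketch below uses the standard library's `importlib.metadata`. The group name `swh.workers`, the plugin name format and the helper names are assumptions made for the example; the optional `candidates` argument exists only so the lookup can be exercised without installing a package.

```python
from importlib.metadata import entry_points


def iter_group(group):
    """Return the entry points that installed packages declare under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):  # Python 3.10 and later
        return list(eps.select(group=group))
    return list(eps.get(group, []))  # Python 3.8 and 3.9 return a dict


def load_register(name, group="swh.workers", candidates=None):
    """Import and return the object behind the entry point called `name`.

    Only the matching entry point is loaded; the packages behind the
    other entry points are never imported.
    """
    eps = iter_group(group) if candidates is None else candidates
    for ep in eps:
        if ep.name == name:
            return ep.load()
    raise LookupError(f"no plugin named {name!r} in group {group!r}")
```

A lister packaged anywhere would then only need to declare an entry point in that group in its own packaging metadata, pointing at whatever function it wants to use as its 'register' hook.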

I did implement the main swh.lister as plugins here mainly to show how it can be done. These default/basic listers could come with the main swh.lister package (preloaded) without using the plugin mechanism. This is debatable.

TBH, I'm far from convinced this 'register' function is fine as it is in this diff (neither the data structure returned by the function nor the function name; the latter can be anything, though, since it is fully defined in the entry point declaration).


I know... Not sure yet if it is a good idea to avoid it.


That's typically why I'm not convinced by the 'API' of the plugin loading mechanism here. The real initialization work is in fact done in the 'init' hook, which is a plain function and can thus initialize as many databases/tables as one wants.
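To make the point concrete, here is a minimal sketch of a 'register' function whose 'init' hook sets up two tables. Everything in it is hypothetical: the shape of the returned dict, the hook signature and the table names are invented for the example, and sqlite3 stands in for the real database layer only to keep the sketch self-contained.

```python
import sqlite3


def init(conn):
    """Hypothetical init hook: create every table this lister needs."""
    # Table names are invented for the example.
    conn.execute("CREATE TABLE IF NOT EXISTS npm_repo (uid TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE IF NOT EXISTS npm_visit (uid INTEGER PRIMARY KEY)")
    conn.commit()


def register():
    """Hypothetical plugin description returned to the plugin loader."""
    return {"lister": "npm", "init": init}
```

Because the hook is an ordinary function, a lister with a single model and one with several look the same to the loader; only the body of `init` differs.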