
Make indexers return a summary of their actions
Closed, Public

Authored by ardumont on Mar 3 2020, 4:21 PM.

Details

Reviewers
vlorentz
Group Reviewers
Reviewers
Summary

This also tries to improve the type annotations of the indexers a bit.

Depends on D2760

Test Plan

tox

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
make-indexers-summarize
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 10893
Build 16380: tox-on-jenkins (Jenkins)
Build 16379: arc lint + arc unit

Event Timeline

vlorentz requested changes to this revision. Mar 3 2020, 4:37 PM
vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/indexer/fossology_license.py
98

Why is data optional?

99

Why type: ignore?

127

Why type: ignore?

swh/indexer/indexer.py
287–302

Why did you remove it?

swh/indexer/metadata.py
92

Dict[str, int]

(same comment below)

337

also here

swh/indexer/tasks.py
18–39

Does anything currently use these results?

This revision now requires changes to proceed. Mar 3 2020, 4:37 PM
swh/indexer/fossology_license.py
98

Because we have a gazillion inconsistent indexers...
RevisionIndexer does not pass the data along...

99

Because I don't want to spend my week satisfying mypy.
Without it, mypy says that the class has no tool attribute...

Which is somewhat true, because it's not initialized in the __init__ method but in prepare, for valid reasons which I forgot (most likely to be able to deal with initialization in tests).

127

Same.

Also, note that I won't refactor the indexers any further right now...

I want to be able to graph what's happening right now.
The modifications I'm making are for getting there.
I'm trying to move as few cogs as possible...

swh/indexer/indexer.py
287–302

I forgot to post my diff comment, which explains it.

302

Because the subclasses do not share a unified call signature.
The only consistency is that indexers call the run method at some point.

swh/indexer/tasks.py
18–39

The status is used by the scheduler.
The other summaries will simply be displayed in logs.
So yes (for all tasks).

The specific tasks you are targeting with this comment no longer run for now.
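
To illustrate the kind of summary discussed here, a minimal sketch follows. It is not the code from this diff: the function names, the "example:add" counter key, and the "eventful"/"uneventful" status values are assumptions made for illustration only.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def persist(results: List[Dict[str, Any]]) -> Dict[str, int]:
    """Hypothetical storage step: report how many rows were added."""
    # The counter key name is an assumption, not the real storage API.
    return {"example:add": len(results)}


def run(ids: List[bytes]) -> Dict[str, Any]:
    """Index the given ids and return a summary of what happened."""
    # Assumed status values; the scheduler would read this field.
    summary: Dict[str, Any] = {"status": "uneventful"}
    results = [{"id": id_} for id_ in ids]
    if results:
        # Merge the Dict[str, int] counters into the summary.
        summary.update(persist(results))
        summary["status"] = "eventful"
    # Everything besides the status is only meant to be logged.
    logger.info("indexer summary: %s", summary)
    return summary
```

With this shape, a task can return the result of run directly: the status field is what a scheduler would consume, while the integer counters only end up in the logs.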

vlorentz added inline comments.
swh/indexer/fossology_license.py
99

you could add tool: Any in the class declaration
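
A minimal sketch of that suggestion, assuming a simplified indexer; ExampleIndexer, its methods, and the tool dictionary are hypothetical stand-ins, not the actual classes from swh/indexer:

```python
from typing import Any, Dict, Optional


class ExampleIndexer:
    # Class-level annotation only: no value is assigned here, but mypy
    # now knows the attribute exists even though it is set in prepare()
    # rather than in __init__().
    tool: Any

    def prepare(self) -> None:
        # Hypothetical tool registration result.
        self.tool = {"id": 1, "name": "example-tool"}

    def index(self, id: bytes, data: Optional[bytes] = None) -> Dict[str, Any]:
        # Accessing self.tool no longer needs a "type: ignore" comment.
        return {"id": id, "indexer_configuration_id": self.tool["id"]}
```

The trade-off is that Any disables type checking on that attribute; a more precise type could be substituted later without changing where the attribute is initialized.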

This revision is now accepted and ready to land. Mar 3 2020, 8:09 PM
swh/indexer/fossology_license.py
99

Thanks for that hint, I did not realize. I'll check that tomorrow ;)

Use correct range of commits :/

ardumont marked 2 inline comments as done.

Use correctly the correct range of commits T.T