Make OriginIndexer call storage.origin_get a single time for all origins.
Closed, Public

Authored by vlorentz on Feb 6 2019, 5:03 PM.

Diff Detail

Repository
rDCIDX Metadata indexer
Branch
OriginIndexer-origin_get-single-query
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 4055
Build 5328: tox-on-jenkins (Jenkins)
Build 5327: arc lint + arc unit

Event Timeline

douardda added a subscriber: douardda.

I see no link between this hunk and the diff's title. Did I miss something?
BTW, I do prefer the version before the diff: easier to read.

This revision now requires changes to proceed. Feb 7 2019, 10:04 AM

OK, I spoke a bit too fast. This does indeed do what it claims, but it's a bit cryptic as is. The 'double loop' on ids (zip + list comprehension) is confusing.

  • Make the code more readable.
ardumont added a subscriber: ardumont.
ardumont added inline comments.
swh/indexer/indexer.py
567

You do not need the zip call; you can use origin['id'] instead.

That might also avoid any problem caused by an inconsistency in the list's order.

This revision now requires changes to proceed. Feb 7 2019, 10:47 AM
vlorentz added inline comments.
swh/indexer/indexer.py
567

origin can be None.

ardumont added inline comments.
swh/indexer/indexer.py
567

OK, so the zip call is there to log the id of the missing origin.
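
For readers without the diff at hand, the pattern discussed in this thread can be sketched roughly as follows. This is not the code under review: FakeStorage, its origin_get signature, and fetch_origins are hypothetical stand-ins, and the sketch assumes that the storage returns one result per requested id, in request order, with None for an unknown origin.

```python
import logging

logger = logging.getLogger(__name__)


class FakeStorage:
    """Hypothetical stand-in for a storage backend (not the real swh API)."""

    def __init__(self, origins):
        self._origins = {origin["id"]: origin for origin in origins}
        self.calls = 0

    def origin_get(self, queries):
        # Assumption: one call resolves every query, and unknown ids come
        # back as None, in the same order as the queries.
        self.calls += 1
        return [self._origins.get(query["id"]) for query in queries]


def fetch_origins(storage, ids):
    """Resolve all ids with a single storage call, logging the missing ones."""
    results = storage.origin_get([{"id": id_} for id_ in ids])
    found = []
    # zip pairs each requested id with its result, so the id is still
    # available when the result is None (origin["id"] would raise TypeError).
    for id_, origin in zip(ids, results):
        if origin is None:
            logger.warning("Origin %s not found.", id_)
            continue
        found.append(origin)
    return found


storage = FakeStorage([
    {"id": 1, "url": "https://example.org/a"},
    {"id": 3, "url": "https://example.org/c"},
])
origins = fetch_origins(storage, [1, 2, 3])
```

Writing the pairing as an explicit loop rather than a zip inside a list comprehension is one way to address the readability concern raised earlier. The ordering concern still applies: zip is only correct if the results really do come back in request order.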

This revision is now accepted and ready to land. Feb 7 2019, 4:43 PM
This revision was automatically updated to reflect the committed changes.