Make the OriginMetadataIndexer fetch rev metadata from the storage instead of getting them via the scheduler.
Closed · Public

Authored by vlorentz on Fri, Nov 23, 4:42 PM.

Diff Detail

Repository
rDCIDX Object indexer
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vlorentz created this revision. Fri, Nov 23, 4:42 PM
ardumont accepted this revision. Fri, Nov 23, 5:11 PM

Some non-blocking remarks/questions.

swh/indexer/metadata.py
309

Wondering whether we are missing a tool id here (not an immediate problem, I think).

swh/indexer/origin_head.py
87

How come you don't need that anymore?

This revision is now accepted and ready to land. Fri, Nov 23, 5:11 PM
vlorentz marked 2 inline comments as done. Fri, Nov 23, 5:18 PM
vlorentz added inline comments.
swh/indexer/metadata.py
309

Good point, I didn't think of that.

That makes me realize this line only works with the mock idx storage: revision_metadata_get returns a list, possibly with more than one item per id when there are multiple tools.
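
To illustrate the point about multiple items per id, here is a minimal sketch, not the actual indexer code. The row shape (the `id`, `tool_id`, and `translated_metadata` keys) and the helper `group_by_revision` are assumptions made up for this example. It shows why code that expects one item per id silently drops rows when several tools have indexed the same revision.

```python
from collections import defaultdict


def group_by_revision(rows):
    """Group metadata rows by revision id (hypothetical helper).

    Each id maps to a list, since there may be one row per tool.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["id"]].append(row)
    return dict(grouped)


# Hypothetical rows; the key names are assumptions for illustration only.
rows = [
    {"id": b"rev1", "tool_id": 1, "translated_metadata": {"name": "foo"}},
    {"id": b"rev1", "tool_id": 2, "translated_metadata": {"name": "foo-alt"}},
    {"id": b"rev2", "tool_id": 1, "translated_metadata": {"name": "bar"}},
]

# Keeps every row, including both rows for b"rev1".
grouped = group_by_revision(rows)

# Assuming one item per id: later rows overwrite earlier ones for the same id.
naive = {row["id"]: row for row in rows}
```

With these rows, `grouped[b"rev1"]` holds both tools' entries, while `naive[b"rev1"]` keeps only the last one seen.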

swh/indexer/origin_head.py
87

D704 made it optional.

ardumont added inline comments. Fri, Nov 23, 5:23 PM
swh/indexer/origin_head.py
87

Yes, but why did you need it before and not now?

Also, I think it was opened only for that case ;)

vlorentz updated this revision to Diff 2229. Fri, Nov 23, 5:26 PM
  • Fix revision_metadata_get mock and its usage.
vlorentz marked an inline comment as done. Fri, Nov 23, 5:52 PM
vlorentz added inline comments.
swh/indexer/origin_head.py
87

Because the origin intrinsic metadata indexer now calls revision_metadata_get itself instead of getting revisions_metadata via the scheduler.

This revision was automatically updated to reflect the committed changes.