CLI: generalize 'map lookup' to lookup many identifiers at once
Closed, Public

Authored by zack on Nov 30 2019, 2:50 PM.

Details

Summary

Multiple identifiers (of either kind) can be passed either directly on the CLI
or via stdin. In the latter case, logical lines in stdin are preserved in
stdout.
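
As a rough illustration of the behaviour described above, here is a minimal sketch, not the code under review. It assumes that "either kind" means a textual PID or an integer node ID, and it replaces the on-disk maps with a toy dictionary. The identifiers and the names `resolve`, `lookup_args` and `lookup_stream` are hypothetical.

```python
import io

# Toy two-way mapping; these identifiers are made up for illustration.
PID_TO_NODE = {"swh:1:rev:aaaa": 7, "swh:1:dir:bbbb": 12}
NODE_TO_PID = {node: pid for pid, node in PID_TO_NODE.items()}


def resolve(identifier):
    """Map a node ID to its PID, or a PID to its node ID (assumed semantics)."""
    if identifier.isdigit():
        return NODE_TO_PID[int(identifier)]
    return str(PID_TO_NODE[identifier])


def lookup_args(identifiers):
    """Identifiers passed directly: one result per identifier."""
    return [resolve(i) for i in identifiers]


def lookup_stream(stream):
    """Identifiers read from a stream: one output line per input line."""
    return [" ".join(resolve(i) for i in line.split()) for line in stream]


# Two input lines yield two output lines, in the same order.
fake_stdin = io.StringIO("swh:1:rev:aaaa 12\n7\n")
for out_line in lookup_stream(fake_stdin):
    print(out_line)
```

With this toy data, the first input line carries two identifiers and the second carries one, and the output keeps that grouping.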

Closes T2112

Diff Detail

Repository
rDGRPH Compressed graph representation
Branch
feature/cli-stdin-lookup
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 9548
Build 14057: tox-on-jenkins (Jenkins)
Build 14056: arc lint + arc unit

Event Timeline

zack created this revision.

Updating D2379: CLI: generalize 'map lookup' to lookup many identifiers at once

seirl requested changes to this revision. Dec 4 2019, 2:32 PM
seirl added inline comments.
swh/graph/cli.py
213

Isn't there an overhead to the mmap call here? Couldn't the mappings be created in the closure instead maybe?
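
One possible reading of this suggestion, sketched under assumptions: map the file once in an enclosing function and let a nested function reuse the mapping on every call, so the `mmap` cost is paid only once. The fixed-size big-endian record layout and the names `make_lookup` and `record_size` are hypothetical, not taken from the diff.

```python
import mmap
import tempfile


def make_lookup(path, record_size=4):
    """Map the file once; the returned closure reuses that mapping."""
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def lookup(index):
        # No new mmap call here: only a slice of the existing mapping.
        start = index * record_size
        return int.from_bytes(mapped[start:start + record_size], "big")

    return lookup


# Build a small file of three 4-byte records to try it out.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for value in (10, 20, 30):
        tmp.write(value.to_bytes(4, "big"))

lookup = make_lookup(tmp.name)
print([lookup(i) for i in range(3)])
```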

229

for line in sys.stdin:

232

You're going to get a trailing space on every line here, which might become significant at some point. Couldn't you use str.join instead?
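
To illustrate the point (the reviewed loop is not shown here, so the first variant is only an assumed way of ending up with a trailing space):

```python
results = ["7", "12", "31"]

# Appending a separator after every item leaves a trailing space.
with_trailing = "".join(r + " " for r in results)

# str.join puts the separator only between items.
joined = " ".join(results)

print(repr(with_trailing))
print(repr(joined))
```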

This revision now requires changes to proceed. Dec 4 2019, 2:32 PM
zack marked 3 inline comments as done. Dec 6 2019, 12:08 PM
zack added inline comments.
swh/graph/cli.py
229

(WTH was I thinking?)

232

Done.

Note that, as a consequence of this change, logical lines that contain non-resolvable IDs will become ambiguous in the output feed (e.g., if you pass 4 IDs and one fails to resolve, you cannot tell from the output which one failed). IMO that's fine, as inputs with non-resolvable IDs are doomed to fail anyway.
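
A small sketch of that ambiguity, assuming that IDs which fail to resolve are simply left out of the joined output line. The table and the name `lookup_line` are hypothetical.

```python
# Hypothetical lookup table; the names and values are made up.
TABLE = {"a": "1", "b": "2", "c": "3", "d": "4"}


def lookup_line(line):
    """Resolve each identifier on a line, skipping the ones that fail."""
    resolved = [TABLE[i] for i in line.split() if i in TABLE]
    return " ".join(resolved)


# Two different inputs, each with one bad ID in a different position.
print(lookup_line("a b BAD d"))
print(lookup_line("BAD a b d"))
```

Both calls print the same three values, so the position of the failed ID cannot be recovered from the output alone.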

zack marked 2 inline comments as done.

Updating D2379: CLI: generalize 'map lookup' to lookup many identifiers at once

This revision is now accepted and ready to land. Dec 6 2019, 3:44 PM