
Fix inconsistent behavior of skipped_content_missing across backends.
ClosedPublic

Authored by vlorentz on Feb 12 2020, 4:09 PM.

Details

Summary

Two fixes:

  • the in-memory backend ignored None keys
  • the Cassandra backend yielded the input dicts as-is instead of dicts with just the hashes
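A minimal sketch of the behavior both fixes aim for, not the actual swh.storage code. The hash names in HASH_ALGORITHMS, the helper hashes_only, and the known_keys set are assumptions made for illustration; the real backends store and look up skipped contents differently. A None hash is kept as part of the lookup key rather than dropped, and each yielded dict carries only the hashes.

```python
# Assumed hash names; the real set lives in swh.model and may differ.
HASH_ALGORITHMS = ("sha1", "sha1_git", "sha256", "blake2s256")


def hashes_only(content):
    """Project a content dict down to its hashes, keeping None values."""
    return {algo: content.get(algo) for algo in HASH_ALGORITHMS}


def skipped_content_missing(contents, known_keys):
    """Yield the hashes of each content whose hash tuple is not in known_keys.

    known_keys is a hypothetical stand-in for a backend: a set of tuples
    ordered like HASH_ALGORITHMS, where None is a legitimate value.
    """
    for content in contents:
        hashes = hashes_only(content)
        key = tuple(hashes[algo] for algo in HASH_ALGORITHMS)
        if key not in known_keys:
            # Yield only the hashes, never the caller's dict as-is.
            yield hashes
```

With this shape, extra fields on the input (a skip reason, a length) never leak into the result, so every backend can be tested against the same expected output.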

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vlorentz created this revision. Feb 12 2020, 4:09 PM
olasd added a subscriber: olasd. Feb 14 2020, 4:32 PM

Looks fine save for a few comments/questions.

The "content key algorithm" stuff should probably be moved to a helper in swh.model, as we keep repeating it everywhere.

swh/storage/in_memory.py
262–267

Took me a while to understand this logic. Why not use _content_key_algorithm() here?

You could also make a dict with these keys, which would alleviate the n² nature of the lookup (even though it's probably not a big deal considering how tiny n is)
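One possible reading of the dict suggestion, sketched under assumptions: key_of, missing_by_scan and missing_by_index are hypothetical names, and the code under review may be structured differently. The first function rescans the stored items for every candidate; the second builds a dict once and does constant-time lookups.

```python
def key_of(content, algorithms=("sha1", "sha1_git", "sha256")):
    """Build a hashable lookup key from the chosen hash fields (None allowed)."""
    return tuple(content.get(algo) for algo in algorithms)


def missing_by_scan(candidates, stored):
    """Quadratic variant: compares every candidate against every stored item."""
    return [
        c for c in candidates
        if not any(key_of(c) == key_of(s) for s in stored)
    ]


def missing_by_index(candidates, stored):
    """Linear variant: index the stored items by key once, then look up."""
    index = {key_of(s): s for s in stored}
    return [c for c in candidates if key_of(c) not in index]
```

Both variants return the same result; the indexed one only pays off as the number of stored items grows, which matches the remark that it hardly matters for a tiny n.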

277–279

As you only iterate once, no need for the list(content)
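To illustrate the point about list(), here is a small sketch with a hypothetical helper named sha1s: a single for loop can consume any iterable directly, including a generator, so materializing it first only costs memory. A copy would be needed only if the input were traversed a second time.

```python
def sha1s(contents):
    """Yield the sha1 of each content, iterating over the input exactly once."""
    for content in contents:  # works for lists and one-shot generators alike
        yield content["sha1"]


def make_contents():
    """Hypothetical one-shot input: a generator rather than a list."""
    for value in (b"a", b"b", b"c"):
        yield {"sha1": value}
```

Calling list(sha1s(make_contents())) produces the three hashes without ever wrapping the input in list().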

swh/storage/tests/test_storage.py
342–355

??

olasd requested changes to this revision. Feb 14 2020, 4:32 PM
This revision now requires changes to proceed. Feb 14 2020, 4:32 PM
vlorentz updated this revision to Diff 9551. Feb 14 2020, 5:49 PM

apply comments

olasd accepted this revision. Feb 17 2020, 4:19 PM
This revision is now accepted and ready to land. Feb 17 2020, 4:19 PM