
Fix inconsistent behavior of skipped_content_missing across backends.
Closed, Public

Authored by vlorentz on Feb 12 2020, 4:09 PM.

Details

Summary

Two fixes:

  • the in-memory backend ignored None keys
  • the Cassandra backend yielded the input dicts as-is instead of dicts with just the hashes
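
To illustrate the behavior both fixes aim for, here is a minimal sketch, not the actual swh.storage code. The hash names in `HASH_ALGORITHMS`, the helper `hashes_only`, and the exact-match rule on all hashes are assumptions made for this example; the real backends may match differently.

```python
from typing import Dict, Iterable, Iterator, Optional

# Assumed hash names for illustration; the authoritative list lives in swh.model.
HASH_ALGORITHMS = ("sha1", "sha1_git", "sha256", "blake2s256")

Hashes = Dict[str, Optional[bytes]]


def hashes_only(content: dict) -> Hashes:
    """Copy only the hash fields; a missing or None hash stays None."""
    return {algo: content.get(algo) for algo in HASH_ALGORITHMS}


def skipped_content_missing(
    stored: Iterable[dict], contents: Iterable[dict]
) -> Iterator[Hashes]:
    """Yield the hashes of each queried content with no exact match in `stored`."""
    stored_hashes = [hashes_only(c) for c in stored]
    for content in contents:
        hashes = hashes_only(content)
        if hashes not in stored_hashes:
            # Yield a fresh dict with just the hashes, never the caller's input dict.
            yield hashes


stored = [
    {"sha1": b"a", "sha1_git": None, "sha256": b"c", "blake2s256": b"d",
     "reason": "too big"},
]
query = [
    {"sha1": b"a", "sha1_git": None, "sha256": b"c", "blake2s256": b"d", "length": 3},
    {"sha1": b"x", "sha1_git": None, "sha256": b"y", "blake2s256": b"z", "length": 5},
]
missing = list(skipped_content_missing(stored, query))
```

In this sketch a None hash takes part in the comparison like any other value, and extra fields such as `length` never leak into the result.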

Diff Detail

Repository
rDSTO Storage manager
Branch
consistent-skipped_content_missing
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 10612
Build 15871: tox-on-jenkins (Jenkins)
Build 15870: arc lint + arc unit

Event Timeline

Looks fine save for a few comments/questions.

The "content key algorithm" stuff should probably be moved to a helper in swh.model, as we keep repeating it everywhere.

swh/storage/in_memory.py
262–267

Took me a while to understand this logic. Why not use _content_key_algorithm() here?

You could also make a dict with these keys, which would alleviate the n² nature of the lookup (even though it's probably not a big deal considering how tiny n is)
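
The dict suggestion can be sketched roughly as follows. `content_key` is a hypothetical stand-in, not the real `_content_key_algorithm()`, and the hash names in `ALGOS` are assumed for the example.

```python
from typing import Dict, Iterable, List, Optional, Tuple

ALGOS = ("sha1", "sha1_git", "sha256", "blake2s256")  # assumed hash names

Key = Tuple[Tuple[Optional[bytes], str], ...]


def content_key(content: dict) -> Key:
    """Hypothetical key helper: a hashable tuple of (hash, algorithm) pairs."""
    return tuple((content.get(algo), algo) for algo in ALGOS)


def missing_by_scan(stored: Iterable[dict], queried: Iterable[dict]) -> List[dict]:
    """Nested scan: up to len(stored) * len(queried) key comparisons."""
    stored = list(stored)  # scanned once per queried item, so it must be reusable
    return [
        q for q in queried
        if not any(content_key(q) == content_key(s) for s in stored)
    ]


def missing_by_index(stored: Iterable[dict], queried: Iterable[dict]) -> List[dict]:
    """Build a dict keyed by content_key once, then do constant-time lookups."""
    index: Dict[Key, dict] = {content_key(s): s for s in stored}
    return [q for q in queried if content_key(q) not in index]


stored = [{"sha1": b"a", "sha256": b"b"}]
queried = [{"sha1": b"a", "sha256": b"b"}, {"sha1": b"z", "sha256": None}]
```

Both functions return the same items; the indexed version only changes how the lookup is done, which matters little when n is tiny.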

277–279

As you only iterate once, there is no need for the list(content).
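
A small sketch of the point about single iteration; the function and variable names here are made up for the example and do not come from the diff.

```python
from typing import Iterable, Iterator


def sha1s(contents: Iterable[dict]) -> Iterator[bytes]:
    """Single pass: works on lists and one-shot generators alike, no copy needed."""
    for content in contents:
        yield content["sha1"]


def one_shot() -> Iterator[dict]:
    yield {"sha1": b"a"}
    yield {"sha1": b"b"}


single_pass = list(sha1s(one_shot()))

# A second pass over the same generator sees nothing; only code that iterates
# more than once needs to materialize its input with list() first.
gen = one_shot()
first = [c["sha1"] for c in gen]
second = [c["sha1"] for c in gen]
```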

swh/storage/tests/test_storage.py
341–354

??

olasd requested changes to this revision. Feb 14 2020, 4:32 PM
This revision now requires changes to proceed. Feb 14 2020, 4:32 PM
This revision is now accepted and ready to land. Feb 17 2020, 4:19 PM