Make storage proxies use swh-model objects instead of dicts.
Closed, Public

Authored by vlorentz on Feb 12 2020, 4:49 PM.

Details

Summary

This means that instead of having the validation proxy right before the backend class,
it must now be at the beginning of pipelines.
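To illustrate why the ordering changes, here is a minimal sketch. All class names are hypothetical stand-ins and not the actual swh-storage or swh-model API. It assumes the validation step is what turns incoming dicts into model objects, so any proxy that uses attribute access has to sit after it.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Set


@dataclass(frozen=True)
class ToyContent:
    """Hypothetical stand-in for a swh-model Content object."""
    sha1: bytes
    length: int


class ToyBackend:
    """Hypothetical backend that receives model objects."""

    def __init__(self) -> None:
        self.stored: List[ToyContent] = []

    def content_add(self, contents: Iterable[ToyContent]) -> Dict[str, int]:
        contents = list(contents)
        self.stored.extend(contents)
        return {"content:add": len(contents)}


class ToyFilterProxy:
    """Hypothetical proxy using attribute access, so it needs objects, not dicts."""

    def __init__(self, storage: Any) -> None:
        self.storage = storage
        self.seen: Set[bytes] = set()

    def content_add(self, contents: Iterable[ToyContent]) -> Dict[str, int]:
        new = []
        for content in contents:
            if content.sha1 not in self.seen:
                self.seen.add(content.sha1)
                new.append(content)
        return self.storage.content_add(new)


class ToyValidateProxy:
    """Hypothetical validation step: converts dicts to objects at the pipeline entry."""

    def __init__(self, storage: Any) -> None:
        self.storage = storage

    def content_add(self, contents: Iterable[Dict[str, Any]]) -> Dict[str, int]:
        # Raises TypeError on unexpected or missing fields.
        return self.storage.content_add([ToyContent(**d) for d in contents])


# Validation comes first; every later step only ever sees objects.
pipeline = ToyValidateProxy(ToyFilterProxy(ToyBackend()))
result = pipeline.content_add(
    [
        {"sha1": b"a", "length": 1},
        {"sha1": b"a", "length": 1},
        {"sha1": b"b", "length": 2},
    ]
)
```

If the validation step were placed right before the backend instead, the filter step in this sketch would receive dicts and fail on `content.sha1`.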

Depends on D2663.

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

olasd added a subscriber: olasd.

Thanks!

I guess my comments are mostly nitpicks at this point, but feel free to ping me if you want a re-review after making the changes.

swh/storage/buffer.py
81–82

Renaming s to stats would be clearer to me. It'd be nice to do the renaming throughout, at some point.

You could also add a comment saying that if not stats: means that object_add didn't flush the buffers, and we should check for the volume of contents.

83–84

queue, content instead of q, c?

84

Also, unrelated to this diff, we should store and update the sum of content lengths in an attribute on the fly to avoid a gratuitous quadratic behavior.
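A rough sketch of what that could look like, assuming a simplified buffer. The class, its attributes, and the threshold name are hypothetical and do not mirror the real buffer proxy. The point is only that the byte total is updated on each add instead of being recomputed over the whole queue.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToyContent:
    """Hypothetical stand-in for a content model object."""
    sha1: bytes
    length: int


class ToyContentBuffer:
    """Hypothetical buffer that keeps a running total of buffered bytes."""

    def __init__(self, threshold_bytes: int) -> None:
        self.threshold_bytes = threshold_bytes
        self.queue: List[ToyContent] = []
        self.total_bytes = 0  # updated on each add, never recomputed
        self.flushed: List[List[ToyContent]] = []

    def content_add(self, content: ToyContent) -> bool:
        """Buffer one content; return True if this call triggered a flush."""
        self.queue.append(content)
        self.total_bytes += content.length  # O(1) instead of summing the queue
        if self.total_bytes >= self.threshold_bytes:
            self.flush()
            return True
        return False

    def flush(self) -> None:
        """Hand the queued contents over and reset the running total."""
        self.flushed.append(self.queue)
        self.queue = []
        self.total_bytes = 0


buf = ToyContentBuffer(threshold_bytes=10)
flags = [buf.content_add(ToyContent(sha1=bytes([i]), length=4)) for i in range(3)]
```

Summing the queue on every add costs time proportional to the queue length, so filling a buffer of n contents costs on the order of n squared operations; the running total keeps each add constant-time.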

85

At some point we should rename this min_batch_size to buffer_threshold or something clearer.

swh/storage/filter.py
45–57

Same comment as the one on D2663. Eventually, we should make sure to fix both of these to use the "content key" (which will be easier to do once we actually add that method to the Content/SkippedContent model objects).
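For illustration only, a sketch of filtering on such a key. The `content_key` method and the choice of fields are assumptions; the model objects do not have this method yet, and the real key may be defined differently.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class ToyContent:
    """Hypothetical stand-in for a Content model object."""
    sha1: bytes
    sha1_git: bytes
    sha256: bytes
    blake2s256: bytes

    def content_key(self) -> Tuple[bytes, ...]:
        # Assumption: the key is the tuple of all hashes.
        return (self.sha1, self.sha1_git, self.sha256, self.blake2s256)


def filter_missing(
    contents: Iterable[ToyContent], known_keys: Set[Tuple[bytes, ...]]
) -> List[ToyContent]:
    """Keep contents whose key is neither already known nor seen earlier in the batch."""
    seen = set(known_keys)
    missing = []
    for content in contents:
        key = content.content_key()
        if key not in seen:
            seen.add(key)
            missing.append(content)
    return missing


a = ToyContent(b"1", b"2", b"3", b"4")
b = ToyContent(b"1", b"2", b"other", b"4")  # same sha1, different sha256
batch_result = filter_missing([a, b, a], set())
```

In this sketch `a` and `b` are kept as distinct contents even though they share a sha1, which a filter keyed on a single hash would not do.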

70

Can I has return type?

This revision is now accepted and ready to land. Feb 14 2020, 5:51 PM
swh/storage/filter.py
70

cheeseburger typing