
Arbitrary slicing on PathSlicingObjStorage
Closed, Public

Authored by qcampos on Jun 14 2016, 5:54 PM.

Details

Summary

Allow the sha1 slicing of a content to be fully customizable.

For example, a content whose sha1 is "abcdef1234567890", in a storage configured with the slicing "0:2/0:5", will be stored under "root/ab/abcde/".
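The mapping in the example above can be sketched in a few lines. This is only an illustration of the idea, not the code under review: the function name `sliced_dir` and its parameters are made up for the sketch, and it assumes every part of the slicing spec has the form `start:end` with both numbers present.

```python
def sliced_dir(root, hex_obj_id, slicing):
    """Hypothetical helper: build the directory for an id from a slicing spec."""
    parts = []
    for spec in slicing.split("/"):          # "0:2/0:5" -> ["0:2", "0:5"]
        start, end = (int(n) for n in spec.split(":"))
        parts.append(hex_obj_id[start:end])  # "ab", then "abcde"
    return "/".join([root] + parts)


path = sliced_dir("root", "abcdef1234567890", "0:2/0:5")
print(path)  # root/ab/abcde
```

Each part of the spec is applied to the full id independently, which is why "0:5" yields "abcde" rather than continuing from where "0:2" stopped.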

Also, make the required changes in the swh.storage package to follow those modifications.

Diff Detail

Repository
rDSTO Storage manager
Branch
T433
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 207
Build 311: Software Heritage Python tests
Build 310: arc lint + arc unit

Event Timeline

qcampos retitled this revision from to Arbitrary slicing on PathSlicingObjStorage.
qcampos updated this object.
qcampos edited the test plan for this revision.
qcampos added a reviewer: olasd.
qcampos edited edge metadata.

Correct a docstring that was not up-to-date.

qcampos edited edge metadata.

Move a constant into the superclass.

qcampos edited edge metadata.

Add a default argument that was missing.

olasd requested changes to this revision. Jun 15 2016, 5:37 PM
olasd edited edge metadata.

A few comments inline before merging :)

swh/storage/objstorage/objstorage_pathslicing.py
91

Typo: 0:4 is only four characters long.

120

You could instantiate the slice objects here (use slice(*map(int, ...)) instead of tuple(map(int, ...))), and reuse them directly when constructing the path.

184–185

We should probably move that check to the instantiation of the storage rather than doing it on each access: the length of an object id is constant.

185–186

hex_obj_id[bounds] for bounds in self.bounds instead of unpacking start, end and repacking them.
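To make the two suggestions about slice objects concrete, here is a small sketch comparing both forms. It is an illustration under assumptions, not the reviewed code: the variable names are invented, and the slicing spec is assumed to be well formed. Python's built-in `slice` takes the two integers as separate arguments, so the `map` result is unpacked with `*`.

```python
hex_obj_id = "abcdef1234567890"
slicing = "0:2/0:5"

# Tuple form: the bounds must be unpacked and repacked on every use.
tuple_bounds = [tuple(map(int, spec.split(":"))) for spec in slicing.split("/")]
parts_from_tuples = [hex_obj_id[start:end] for start, end in tuple_bounds]

# Slice form: build the slice objects once, then index with them directly.
slice_bounds = [slice(*map(int, spec.split(":"))) for spec in slicing.split("/")]
parts_from_slices = [hex_obj_id[bounds] for bounds in slice_bounds]

print(parts_from_tuples)  # ['ab', 'abcde']
print(parts_from_slices)  # ['ab', 'abcde']
```

Both forms give the same parts; the slice form simply moves the parsing work to construction time.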

This revision now requires changes to proceed. Jun 15 2016, 5:37 PM

Didn't know I could create a slice object. Thanks!

swh/storage/objstorage/objstorage_pathslicing.py
184–185

Do we have a way, at instantiation, to know the size of a hash given the ID_HASH_ALGO algorithm without hard-coding it?

qcampos edited edge metadata.

Correct a typo, and use a slice object instead of unpacking [start, stop] manually.

swh/storage/objstorage/objstorage_pathslicing.py
184–185

Not really, no; we can add an ID_HASH_LENGTH variable next to ID_HASH_ALGO.
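A rough sketch of what this could look like, with the check done once at construction. Everything here is an assumption made for illustration: the class `PathSlicingSketch`, its method `obj_dir`, and the exact form of the check are not taken from the diff. The value 40 is the number of hexadecimal characters in a sha1 digest.

```python
ID_HASH_ALGO = "sha1"
ID_HASH_LENGTH = 40  # hex characters in a sha1 digest


class PathSlicingSketch:
    """Illustrative stand-in, not the real PathSlicingObjStorage."""

    def __init__(self, root, slicing):
        self.root = root
        self.bounds = [
            slice(*map(int, spec.split(":"))) for spec in slicing.split("/")
        ]
        # Validate once: ids have a constant length, so a slicing that fits
        # at construction time fits for every later access.
        for bound in self.bounds:
            if bound.stop > ID_HASH_LENGTH:
                raise ValueError(
                    "slice %d:%d exceeds id length %d"
                    % (bound.start, bound.stop, ID_HASH_LENGTH)
                )

    def obj_dir(self, hex_obj_id):
        # No per-access length check is needed here.
        return "/".join([self.root] + [hex_obj_id[b] for b in self.bounds])


storage = PathSlicingSketch("root", "0:2/0:5")
print(storage.obj_dir("abcdef1234567890" * 2 + "abcdef12"))  # root/ab/abcde
```

An invalid slicing such as "0:41" then fails when the storage is created instead of on the first read or write.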

qcampos marked 3 inline comments as done.
qcampos edited edge metadata.

Put the hash length test at initialization instead of doing it on each access.

swh/storage/objstorage/objstorage.py
8

That should be 40! :)

Correct the sha1 hexadecimal hash's length.
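The value 40 can be checked quickly with the standard library. This snippet is just a sanity check, separate from the diff: a sha1 digest is 20 bytes, and each byte is written as two hexadecimal characters.

```python
import hashlib

digest = hashlib.sha1(b"example content")
byte_len = digest.digest_size        # size of the raw digest in bytes
hex_len = len(digest.hexdigest())    # two hex characters per byte

print(byte_len, hex_len)  # 20 40
```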

swh/storage/objstorage/objstorage.py
8

Whoops! That's better indeed.

olasd edited edge metadata.
This revision is now accepted and ready to land. Jun 16 2016, 3:09 PM
This revision was automatically updated to reflect the committed changes.