Page MenuHomeSoftware Heritage

Refactor the objstorage module
ClosedPublic

Authored by qcampos on Jun 6 2016, 3:04 PM.

Details

Reviewers
olasd
Group Reviewers
Reviewers
Maniphest Tasks
T433: Refactor objstorage module
Commits
rDSTOC45eab09a42dc: Update an administration script to make it follow the changes of objstorage api
rDSTOC7c72d902f855: Update the storage module in order to follow objstorage refactoring
rDSTOC785592aadb36: Change the objstorage http server in order to follow the changes of the local…
rDSTOC90c716d5265a: Create a base API that define the objstorage behavior
rDSTOCc664995863c8: Add an implementation of the object storage api and a test class
rDSTOC96ff4ef5aeb1: Update the content integrity checker in order to follow objstorage changes
R65:785592aadb36: Change the objstorage http server in order to follow the changes of the local…
R65:7c72d902f855: Update the storage module in order to follow objstorage refactoring
R65:c664995863c8: Add an implementation of the object storage api and a test class
R65:45eab09a42dc: Update an administration script to make it follow the changes of objstorage api
R65:96ff4ef5aeb1: Update the content integrity checker in order to follow objstorage changes
R65:90c716d5265a: Create a base API that define the objstorage behavior
rDSTO45eab09a42dc: Update an administration script to make it follow the changes of objstorage api
rDSTO96ff4ef5aeb1: Update the content integrity checker in order to follow objstorage changes
rDSTO785592aadb36: Change the objstorage http server in order to follow the changes of the local…
rDSTO7c72d902f855: Update the storage module in order to follow objstorage refactoring
rDSTOc664995863c8: Add an implementation of the object storage api and a test class
rDSTO90c716d5265a: Create a base API that define the objstorage behavior
Summary

Create a base API that define the objstorage behavior

Also add a mixin test class for the contract verification.

Add an implementation of the object storage api and a test class

Update the storage module in order to follow objstorage refactoring

Ref T433

Diff Detail

Repository
rDSTO Storage manager
Branch
T433-objstorage-refactoring
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 159
Build 236: Software Heritage Python tests
Build 235: arc lint + arc unit

Event Timeline

qcampos retitled this revision from to Refactor the objstorage module.
qcampos updated this object.
qcampos edited the test plan for this revision. (Show Details)
qcampos added a reviewer: olasd.
qcampos edited edge metadata.
  • Update the storage module in order to follow objstorage refactoring
  • Update the content integrity checker in order to follow objstorage changes
  • Change the objstorage http server in order to follow the changes of the local version
  • Update an administration script to make it follow the changes of objstorage api
qcampos edited edge metadata.

Correction of a syntax error

ardumont added inline comments.
swh/storage/objstorage/objstorage_pathslicing.py
281

The warning of iter applies here. This check can take a looooonnnnggg time to finish.

I think this should be a file existence check here.
It will be faster than iterating over a long list of ids... (which is, from the current implementation's standpoint, the result of walking the disk)

swh/storage/objstorage/objstorage_pathslicing.py
281

I mean the __iter__ function...

Well, i took a look since olasd was not there yet.
Good to go for me.

Some small nitpicks:

  • maybe improve the test on the check function to avoid using the implicit __iter__ call. This must be slower that just checking the file's existence.
  • maybe clarify the variables on the test mentioned.

I'll let olasd review some more.

Cheers,

swh/storage/tests/objstorage_testing.py
47

I did not get the test's intent at first sight.
Can you please rename the variables to something more explicit?

For example:

  • content1 -> original_content_data
  • content2 -> corrupted_content_data
  • ra -> content2_id
  • rr -> content_restored_id

For me, the tests can help in documenting the code so we must strive to make them clear.

Of course, I apologize in advance if some existing are not ^^
It's a goal to achieve which is not always easy to reach!

swh/storage/objstorage/objstorage_pathslicing.py
281

Don't the in syntax only use the __contains__ method (which is a file existence check) ?

swh/storage/objstorage/objstorage_pathslicing.py
281

Indeed, nice catch!
Nothing to do then ^^

olasd edited edge metadata.

Thanks for this refactoring. There will be some followup once this lands :)

This revision is now accepted and ready to land.Jun 9 2016, 2:26 PM
qcampos edited edge metadata.

Update the diff after merge with D44.

Also, change test variable names following ardumont's comment.

This revision now requires review to proceed.Jun 13 2016, 11:46 AM
olasd edited edge metadata.
This revision is now accepted and ready to land.Jun 13 2016, 2:20 PM
This revision was automatically updated to reflect the committed changes.