Page MenuHomeSoftware Heritage

Refactor the objstorage module
ClosedPublic

Authored by qcampos on Jun 6 2016, 3:04 PM.

Details

Reviewers
olasd
Group Reviewers
Reviewers
Maniphest Tasks
T433: Refactor objstorage module
Commits
rDSTOC45eab09a42dc: Update an administration script to make it follow the changes of objstorage api
rDSTOC7c72d902f855: Update the storage module in order to follow objstorage refactoring
rDSTOC785592aadb36: Change the objstorage http server in order to follow the changes of the local…
rDSTOC90c716d5265a: Create a base API that define the objstorage behavior
rDSTOCc664995863c8: Add an implementation of the object storage api and a test class
rDSTOC96ff4ef5aeb1: Update the content integrity checker in order to follow objstorage changes
R65:785592aadb36: Change the objstorage http server in order to follow the changes of the local…
R65:7c72d902f855: Update the storage module in order to follow objstorage refactoring
R65:c664995863c8: Add an implementation of the object storage api and a test class
R65:45eab09a42dc: Update an administration script to make it follow the changes of objstorage api
R65:96ff4ef5aeb1: Update the content integrity checker in order to follow objstorage changes
R65:90c716d5265a: Create a base API that define the objstorage behavior
rDSTO45eab09a42dc: Update an administration script to make it follow the changes of objstorage api
rDSTO96ff4ef5aeb1: Update the content integrity checker in order to follow objstorage changes
rDSTO785592aadb36: Change the objstorage http server in order to follow the changes of the local…
rDSTO7c72d902f855: Update the storage module in order to follow objstorage refactoring
rDSTOc664995863c8: Add an implementation of the object storage api and a test class
rDSTO90c716d5265a: Create a base API that define the objstorage behavior
Summary

Create a base API that define the objstorage behavior

Also add a mixin test class for the contract verification.

Add an implementation of the object storage api and a test class

Update the storage module in order to follow objstorage refactoring

Ref T433

Diff Detail

Repository
rDSTO Storage manager
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 176
Build 267: Software Heritage Python tests
Build 266: arc lint + arc unit

Event Timeline

qcampos retitled this revision from to Refactor the objstorage module.
qcampos updated this object.
qcampos edited the test plan for this revision. (Show Details)
qcampos added a reviewer: olasd.
qcampos edited edge metadata.
  • Update the storage module in order to follow objstorage refactoring
  • Update the content integrity checker in order to follow objstorage changes
  • Change the objstorage http server in order to follow the changes of the local version
  • Update an administration script to make it follow the changes of objstorage api
qcampos edited edge metadata.

Correction of a syntax error

ardumont added inline comments.
swh/storage/objstorage/objstorage_pathslicing.py
281

The warning of iter applies here. This check can take a looooonnnnggg time to finish.

I think this should be a file existence check here.
It will be faster than iterating over a long list of ids... (which is, from the current implementation's standpoint, the result of walking the disk)

swh/storage/objstorage/objstorage_pathslicing.py
281

I mean the __iter__ function...

Well, i took a look since olasd was not there yet.
Good to go for me.

Some small nitpicks:

  • maybe improve the test on the check function to avoid using the implicit __iter__ call. This must be slower that just checking the file's existence.
  • maybe clarify the variables on the test mentioned.

I'll let olasd review some more.

Cheers,

swh/storage/tests/objstorage_testing.py
47

I did not get the test's intent at first sight.
Can you please rename the variables to something more explicit?

For example:

  • content1 -> original_content_data
  • content2 -> corrupted_content_data
  • ra -> content2_id
  • rr -> content_restored_id

For me, the tests can help in documenting the code so we must strive to make them clear.

Of course, I apologize in advance if some existing are not ^^
It's a goal to achieve which is not always easy to reach!

swh/storage/objstorage/objstorage_pathslicing.py
281

Don't the in syntax only use the __contains__ method (which is a file existence check) ?

swh/storage/objstorage/objstorage_pathslicing.py
281

Indeed, nice catch!
Nothing to do then ^^

olasd edited edge metadata.

Thanks for this refactoring. There will be some followup once this lands :)

This revision is now accepted and ready to land.Jun 9 2016, 2:26 PM
qcampos edited edge metadata.

Update the diff after merge with D44.

Also, change test variable names following ardumont's comment.

This revision now requires review to proceed.Jun 13 2016, 11:46 AM
olasd edited edge metadata.
This revision is now accepted and ready to land.Jun 13 2016, 2:20 PM
This revision was automatically updated to reflect the committed changes.