Page MenuHomeSoftware Heritage

Delete operation in objstorage
ClosedPublic

Authored by seirl on Aug 11 2017, 6:55 PM.

Diff Detail

Repository
rDOBJS Object storage
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

zack requested changes to this revision.Aug 12 2017, 8:31 PM
zack added a reviewer: olasd.
zack added a subscriber: zack.
zack added inline comments.
swh/objstorage/multiplexer/filter/resilient_filter.py
9–15 ↗(On Diff #786)

So, my understanding is that with this the default of the object storage (e.g., for all object storage instantiations that we already have in production) will be that deletes are allowed. To disable them, one will need to explicitly add the ResilientStorageFilter. Did I get that right?

If so, I'm unhappy about the design, because delete hadn't been implemented in the beginning on purpose, for obvious safety reasons. I'd rather not change that only because we are reusing the object storage in a new context, unexpected at a time (i.e., the vault, which needs to be able to do cache expiry).

So I'd rather see something that goes the other way. Default instantiations of the object storage should disallow deletes. Even better: they should not have a delete() method at all.

If one wants to have delete, they should do something explicit in the code, i.e., add a "filter" (although the name would be wrong in that context), or maybe just instantiate a different subclass that expose a delete() method (dunno, FragileObjectStorage or BrittleObjectStorage).

Thoughts?

This revision now requires changes to proceed.Aug 12 2017, 8:31 PM
swh/objstorage/multiplexer/filter/resilient_filter.py
9–15 ↗(On Diff #786)

I don't understand how you want to deal with all the objstorage implementations if the delete method is in a subclass. One subclass per implementation? FragileRemoteObjStorage, FragileMultiplexerObjStorage, FragilePathslicingObjStorage ...?

swh/objstorage/multiplexer/filter/resilient_filter.py
9–15 ↗(On Diff #786)

The goal is that the currently deployed instances of object storage — which do *not* have a delete method — still don't have it after this change is accepted. This way, even if we mess up the configuration, code that is currently in production cannot call delete() and remove objects from the main object storage.

To achieve that, you can either have one implementation of delete() for each subclass that needs it, but I agree that would be unfortunate. Or you can have an intermediate subclass that implements that method, and further subclasses that inherits from that. Or you can have a mixin that adds delete(), and mix it in wherever it's needed.

swh/objstorage/multiplexer/filter/resilient_filter.py
9–15 ↗(On Diff #786)

After some IRL discussions, we decided to:

  • keep all the methods in the class implementations
  • add a permission system for the methods with delete=False by default
  • remove the useless/redundant/unused filters
seirl edited edge metadata.
objstorage: add allow_delete permission parameter
zack requested changes to this revision.EditedSep 13 2017, 11:07 AM

This looks generally good to me.

I've only a couple of nits that I'd like to see fixed before accepting this:

  • the test for checking that deletion is not allowed when allow_delete is explicitly set to False is good. But I'd like to have an additional test, showing that by default deletions are not allowed. By the look of it I suspect that might need instantiating a dedicated object storage for that test, but that's a good idea anyhow
  • when stumbling on the code, it might be obscure why the delete methods are calling super() why others are not. It's worth adding comments there explaining that it is so because the parent class is responsible of checking for delete permission
This revision now requires changes to proceed.Sep 13 2017, 11:07 AM

follow zack's recommandations

(@olasd has already approved this, but it was still waiting for me)

This revision is now accepted and ready to land.Sep 14 2017, 10:35 AM
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.