swh.storage.buffer: Add buffering proxy storage implementation
ClosedPublic

Authored by ardumont on Oct 8 2019, 2:46 PM.

Diff Detail

Repository
rDSTO Storage manager
Branch
master
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 8173
Build 11784: tox-on-jenkins (Jenkins)
Build 11783: arc lint + arc unit

Event Timeline

vlorentz requested changes to this revision. Oct 8 2019, 3:00 PM (edited)

s/threshold/min_batch_size/

s/Sequence/Iterable/ because we don't do random access (see definition of sequences here: https://docs.python.org/3/glossary.html#term-sequence )
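To illustrate the distinction, here is a minimal sketch (the `add_all` helper is hypothetical and not part of swh.storage). A generator is an `Iterable` but not a `Sequence`, so a function that only loops over its argument accepts more inputs when annotated with `Iterable`:

```python
from collections.abc import Iterable as IterableABC, Sequence as SequenceABC
from typing import Dict, Iterable, List


def add_all(objects: Iterable[Dict]) -> List[Dict]:
    """Hypothetical helper: consumes its argument by iteration only,
    never by index, so Iterable is the narrowest honest annotation."""
    return [obj for obj in objects]


# A generator supports iteration but not random access (no __getitem__).
gen = ({"id": i} for i in range(3))
is_iterable = isinstance(gen, IterableABC)
is_sequence = isinstance(gen, SequenceABC)

added = add_all(gen)
```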

I think any call to a non-add method should trigger a flush of related objects before forwarding the call (eg. revision_missing flushes revisions).
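The flush-before-read idea could look roughly like the sketch below. All class names and the in-memory backend are assumptions made for illustration; this is not the code under review, only one way a proxy might flush buffered revisions before forwarding `revision_missing`:

```python
from typing import Dict, Iterable, List


class InMemoryBackend:
    """Hypothetical stand-in for the real storage backend."""

    def __init__(self) -> None:
        self.revisions: Dict[bytes, Dict] = {}

    def revision_add(self, revisions: Iterable[Dict]) -> None:
        for rev in revisions:
            self.revisions[rev["id"]] = rev

    def revision_missing(self, ids: Iterable[bytes]) -> List[bytes]:
        return [i for i in ids if i not in self.revisions]


class BufferingProxy:
    """Sketch of a proxy that batches revision_add calls."""

    def __init__(self, backend: InMemoryBackend, min_batch_size: int = 2) -> None:
        self.backend = backend
        self.min_batch_size = min_batch_size
        self._revisions: List[Dict] = []

    def revision_add(self, revisions: Iterable[Dict]) -> None:
        # Buffer the objects; only write once the batch is large enough.
        self._revisions.extend(revisions)
        if len(self._revisions) >= self.min_batch_size:
            self.flush_revisions()

    def flush_revisions(self) -> None:
        if self._revisions:
            self.backend.revision_add(self._revisions)
            self._revisions = []

    def revision_missing(self, ids: Iterable[bytes]) -> List[bytes]:
        # Flush first so the backend's answer accounts for buffered revisions.
        self.flush_revisions()
        return self.backend.revision_missing(ids)


backend = InMemoryBackend()
proxy = BufferingProxy(backend, min_batch_size=2)

proxy.revision_add([{"id": b"r1"}])
buffered_only = b"r1" not in backend.revisions  # still below min_batch_size

missing = proxy.revision_missing([b"r1", b"r2"])  # triggers the flush
```

Without the flush in `revision_missing`, the backend would report `b"r1"` as missing even though the caller has already added it through the proxy.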

swh/storage/buffer.py
20

.. code-block:: yaml

This revision now requires changes to proceed. Oct 8 2019, 3:00 PM

Also, I understand how that's currently useful, but FYI, batching content has absolutely no effect for Cassandra (the Cassandra backend breaks batches into individual records)

> Also, I understand how that's currently useful, but FYI, batching content has absolutely no effect for Cassandra (the Cassandra backend breaks batches into individual records)

That's the beauty of it!
The buffering storage is enabled through configuration, so we will be able to simply remove its configuration entry when we migrate to Cassandra!

swh/storage/buffer.py
20

I tried, that's too smart for me...

swh/storage/tests/test_buffer.py
19

this should check the content was not sent to the backend.

Same comment for other instances below

23

len(content[0]['data']) + len(content[1]['data']).

Same comment for other instances below
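The two test remarks above can be sketched as follows. This is only an illustration under stated assumptions: the `RecordingBackend` and `ContentBuffer` classes are hypothetical, and the sketch assumes the threshold is expressed in bytes of content data, which the thread does not confirm. It shows asserting that nothing reached the backend below the threshold, and deriving the threshold from the fixture rather than hardcoding a number:

```python
from typing import Dict, Iterable, List


class RecordingBackend:
    """Hypothetical backend that records everything it receives."""

    def __init__(self) -> None:
        self.contents: List[Dict] = []

    def content_add(self, contents: Iterable[Dict]) -> None:
        self.contents.extend(contents)


class ContentBuffer:
    """Hypothetical buffer: flushes once buffered data reaches min_batch_bytes."""

    def __init__(self, backend: RecordingBackend, min_batch_bytes: int) -> None:
        self.backend = backend
        self.min_batch_bytes = min_batch_bytes
        self._pending: List[Dict] = []

    def content_add(self, contents: Iterable[Dict]) -> None:
        self._pending.extend(contents)
        if sum(len(c["data"]) for c in self._pending) >= self.min_batch_bytes:
            self.backend.content_add(self._pending)
            self._pending = []


content = [
    {"sha1": b"a", "data": b"foo"},
    {"sha1": b"b", "data": b"barbaz"},
]
# Derive the threshold from the fixture instead of hardcoding a number.
threshold = len(content[0]["data"]) + len(content[1]["data"])

backend = RecordingBackend()
buf = ContentBuffer(backend, min_batch_bytes=threshold)

buf.content_add([content[0]])
assert backend.contents == []  # below threshold: nothing sent to the backend

buf.content_add([content[1]])
assert backend.contents == content  # threshold reached: batch written
```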

  • Fix code-block syntax
  • Add some more assertions on storage writing or not

BUILD has failed

./swh/storage/buffer.py:82:16: E999 SyntaxError: invalid syntax

Well, I fixed my local mypy, but the CI mypy (a different version) is not happy about it...
That's annoying.

Fix CI mypy, which was not happy (local mypy was OK)

Rebase on latest development

vlorentz requested changes to this revision. Oct 8 2019, 4:31 PM

You missed this comment:

s/threshold/min_batch_size/

s/Sequence/Iterable/ because we don't do random access (see definition of sequences here: https://docs.python.org/3/glossary.html#term-sequence )

I think any call to a non-add method should trigger a flush of related objects before forwarding the call (eg. revision_missing flushes revisions).

This revision now requires changes to proceed. Oct 8 2019, 4:31 PM

> You missed this comment:

I saw it and then I forgot.

Adapt according to last missing review points

This revision was not accepted when it landed; it landed in state Needs Review. Oct 8 2019, 4:42 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.

> This revision was not accepted when it landed; it landed in state Needs Review.

Damn! I messed up.
I checked out one commit prior to my latest master, intending to push only up to that commit...
I ran git push, but that pushed my whole master anyway...

Oh well, I'll go back to using branches...