Page MenuHomeSoftware Heritage

storage: Add proxy storage with retry policy
ClosedPublic

Authored by ardumont on Jan 9 2020, 5:27 PM.

Details

Summary

This implements the actual behavior the loader-core (dvcs ones) exposes. This
will allow to configure other loaders with the same retry (currently limited to
content_add here for discussion purposes [1]).

This should allow to:

  • take care of issues like [2] (hash collision in package loaders otherwise taken care of in dvcs loaders)
  • clean up some more the current loader-core (original loader-core wraps the storage's entrypoints with send_{content,origin,...} methods which adds that retry policy) [3]

[1] I will complete the implementation if we agree to continue in that
direction

[2] https://sentry.softwareheritage.org/share/issue/d4f1208b7eec4b43b11e38494ff039cc/

[3] https://forge.softwareheritage.org/source/swh-loader-core/browse/master/swh/loader/core/loader.py$331-361

Test Plan

tox

Diff Detail

Repository
rDSTO Storage manager
Branch
add-retry-proxy-storage
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 10031
Build 14842: tox-on-jenkinsJenkins
Build 14841: arc lint + arc unit

Event Timeline

vlorentz requested changes to this revision.Jan 9 2020, 5:52 PM
vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/storage/__init__.py
18–19

nitpick: for consistency, it should be a substantive, like "retryer" (which is a neologism, for lack of a better word)

62

same

swh/storage/retrying.py
20 ↗(On Diff #8923)

the name should indicate it's a boolean predicate. eg. should_retry_loading

25–29 ↗(On Diff #8923)

This should be a module-level constant, and the name (and comment) should describe what the list is used for. eg. RETRY_EXCEPTIONS.

27 ↗(On Diff #8923)

Why retry on IntegrityError?

31–32 ↗(On Diff #8923)

For some reason, this conditional is twisting my brain. Could you add a comment?

Or remove the not to switch the two branches

38–42 ↗(On Diff #8923)

Can't traceback.format_exc be used here?

59–61 ↗(On Diff #8923)

and content_metadata_add

swh/storage/tests/test_retrying.py
34–58 ↗(On Diff #8923)

unittest.mock has a neat feature to remove the need for the _count_content_add function:

def test_retrying_proxy_storage_with_retry(sample_data, mocker):
    """Multiple retries for hash collision and psycopg2 error

    """
    mock_memory = mocker.patch('swh.storage.in_memory.Storage.content_add')
    mock_memory.side_effect = [
        HashCollision('Refuse to add content'),
        psycopg2.IntegrityError('Refuse differently to add content'),
        {'content:add': 1}
    ]

Example:

>>> m = unittest.mock.Mock(side_effect=[Exception('foo'), Exception('bar'), 'baz'])
>>> m()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/unittest/mock.py", line 960, in __call__
    return _mock_self._mock_call(*args, **kwargs)
  File "/usr/lib/python3.7/unittest/mock.py", line 1024, in _mock_call
    raise result
Exception: foo
>>> m()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.7/unittest/mock.py", line 960, in __call__
    return _mock_self._mock_call(*args, **kwargs)
  File "/usr/lib/python3.7/unittest/mock.py", line 1024, in _mock_call
    raise result
Exception: bar
>>> m()
'baz'

;)

This revision now requires changes to proceed.Jan 9 2020, 5:52 PM
swh/storage/__init__.py
18–19

hum, i found this ugly.

Plus, we do not really see that consistency (among all storages i mean).

swh/storage/retrying.py
20 ↗(On Diff #8923)

ok, although, here it should even be named should_retry_adding, it was initially from the loader so loading made sense there.

27 ↗(On Diff #8923)

I don't remember what lead to this.

Here is the original function though (loader-core's loader, slightly adapted) [1]

[1] https://forge.softwareheritage.org/source/swh-loader-core/browse/master/swh/loader/core/loader.py$24-50

38–42 ↗(On Diff #8923)

Maybe, i don't really know ;)

59–61 ↗(On Diff #8923)

yes, like i said in the description, the implementation is incomplete for now.

I will complete the other _add endpoint (and the tests) if we agree to integrate this.

I did not want to spend too much time on adding all endpoints (and the gazillion tests attached to it).

swh/storage/tests/test_retrying.py
34–58 ↗(On Diff #8923)

Oh yeah, i forgot.

That's awesome! Thanks!

swh/storage/retrying.py
27 ↗(On Diff #8923)

Ah, but as the comment says it all.
It could so happens that 2 workers load the same repository mostly at the same time (for some reason).
It happened in the past at least.

That could make concurrency transaction issues (because the other transaction got committed faster and then you have integrity failure on content for example, content already inserted and what not).

Retrying a new transaction should make the issue disappear as the _add methods are filtering data prior to inserting them (at least for the dag objects iirc).

Adapt according to review:

  • Rename should_retry_adding function
  • Extract exception that triggers retry into a module variable
  • Improve tests using rightfully the .side_effect method
  • Add tests to should_retry_adding function
anlambert added inline comments.
swh/storage/retry.py
22

I think we should only catch HashCollision exceptions here in order to not be tightly linked to a specific storage backend (here PostgreSQL).

At the moment, we want to retry on add operations when two different workers concurrently write the same object (content, directory, release, snapshot, revision) in a table.

Nevertheless, it means that some adaptation needs to be made in storage backend implementations as currently the HashCollision exception is only raised for the content add case.

swh/storage/__init__.py
18–19

Also, see retry as a verb like buffer and filter are, maybe that will help.

swh/storage/retry.py
22

For other dag objects, I'm not sure HashCollision convey the right information
though. Maybe a new exception DuplicatedEntry or something.

In any case, that should go in another diff.

Otherwise, this one will be a tag big.

This revision is now accepted and ready to land.Jan 10 2020, 2:32 PM

Improve docstrings and sentences