Page MenuHomeSoftware Heritage

Add svn_retry decorator to mitigate possible network errors
ClosedPublic

Authored by anlambert on Mar 23 2022, 4:26 PM.

Details

Summary

The subversion loader was not really resilient to network errors.
In particular, if a network error (for instance a connection reset
by peer) happened during one of the following subversion command
executed by the loader: export, checkout, propget or remote access
creation, the loading was aborted.

So instead of aborting the loading process, prefer to retry the
subversion command that failed due to a network error with the
hope that it will not happen again.

To do so introduce a svn_retry decorator based on the use of
the tenacity module. It enables to retry a subversion command
at most three times using exponential backoff as wait policy.

Should mitigate:

Diff Detail

Repository
rDLDSVN Subversion (SVN) loader
Branch
svn-retry
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 27752
Build 43441: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 43440: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D7416 (id=26831)

Rebasing onto 74c010656a...

Current branch diff-target is up to date.
Changes applied before test
commit 8f10b5c05bece7dd0dddd1d725301fbadb94f723
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Jan 5 14:42:18 2022 +0100

    Add svn_retry decorator to mitigate possible network errors
    
    The subversion loader was not really resilient to network errors.
    In particular, if a network error (for instance a connection reset
    by peer) happened during one of the following subversion command
    executed by the loader: export, checkout, propget or remote access
    creation, the loading was aborted.
    
    So instead of aborting the loading process, prefer to retry the
    subversion command that failed due to a network error with the
    hope that it will not happen again.
    
    To do so introduce a svn_retry decorator based on the use of
    the tenacity module. It enables to retry a subversion command
    at most three times using exponential backoff as wait policy.

See https://jenkins.softwareheritage.org/job/DLDSVN/job/tests-on-diff/309/ for more details.

Update: Improve debug logging for export and checkout command.

Build is green

Patch application report for D7416 (id=26838)

Rebasing onto 74c010656a...

Current branch diff-target is up to date.
Changes applied before test
commit 104fb313570aa9ac6b7b9bb6de68103531aeb848
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Jan 5 14:42:18 2022 +0100

    Add svn_retry decorator to mitigate possible network errors
    
    The subversion loader was not really resilient to network errors.
    In particular, if a network error (for instance a connection reset
    by peer) happened during one of the following subversion command
    executed by the loader: export, checkout, propget or remote access
    creation, the loading was aborted.
    
    So instead of aborting the loading process, prefer to retry the
    subversion command that failed due to a network error with the
    hope that it will not happen again.
    
    To do so introduce a svn_retry decorator based on the use of
    the tenacity module. It enables to retry a subversion command
    at most three times using exponential backoff as wait policy.

See https://jenkins.softwareheritage.org/job/DLDSVN/job/tests-on-diff/310/ for more details.

ardumont added a subscriber: ardumont.

Nice

Couple of questions/remarks inline.

swh/loader/svn/svn.py
277

what's that one?

swh/loader/svn/svn_retry.py
30

is that somehow a union of startswith(string0) or startswith(string1)... ?
(if so neat)

swh/loader/svn/tests/test_svn_retry.py
90

why do you still need it, isn't that mocked already with the autouse svn_retry_sleep_mocker fixture above?

This revision is now accepted and ready to land.Mar 24 2022, 10:18 AM
anlambert added inline comments.
swh/loader/svn/svn.py
277

I simply wrapped the parameters of client.checkout, you can find the details here.

Maybe I can reference the name of the C function of the subversion API in the docstring
if one wants the parameter details ?

swh/loader/svn/svn_retry.py
30

Exactly, see documentation of str.startswith.
I discovered that feature thanks to @vlorentz.

swh/loader/svn/tests/test_svn_retry.py
90

The svn_retry_sleep_mocker fixture is local to test_externals.py file and does not apply to this one.

And I need the mock reference to assert the calls.

swh/loader/svn/svn.py
277

thx and yes please, it'd less of a hassle to find it back.

Update:

  • rename get_remote_access method to remote_access
  • add references to subversion C API functions in docstrings for getting details about parameters
  • fix commit message, max number of attempts is five not three

Build is green

Patch application report for D7416 (id=26856)

Rebasing onto 74c010656a...

Current branch diff-target is up to date.
Changes applied before test
commit 6c83c506151e1a63e186e135608454859b4ac66b
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Jan 5 14:42:18 2022 +0100

    Add svn_retry decorator to mitigate possible network errors
    
    The subversion loader was not really resilient to network errors.
    In particular, if a network error (for instance a connection reset
    by peer) happened during one of the following subversion command
    executed by the loader: export, checkout, propget or remote access
    creation, the loading was aborted.
    
    So instead of aborting the loading process, prefer to retry the
    subversion command that failed due to a network error with the
    hope that it will not happen again.
    
    To do so introduce a svn_retry decorator based on the use of
    the tenacity module. It enables to retry a subversion command
    at most five times using exponential backoff as wait policy.

See https://jenkins.softwareheritage.org/job/DLDSVN/job/tests-on-diff/311/ for more details.