launchpad: Manage unhandled exception when reading a page of results
Abandoned · Public

Authored by ardumont on Feb 17 2022, 11:06 AM.

Details

Summary

Prior to this commit, the listing could fail when reading a page of results from
Launchpad. This commit traps the exception and lets the listing continue. In effect, we
may lose part of the data when listing incrementally.

We do need the full listing to happen regularly anyway, so that sounds like a fair
trade-off.

This also allows the listing to finish when such issues happen.
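
As a rough illustration of the trade-off described above, here is a minimal, self-contained sketch (not the actual lister code). It assumes a lazily fetched page that can fail part-way through iteration; `PageReadError`, `flaky_page` and `origins_from_page` are hypothetical names, with `PageReadError` standing in for `lazr.restfulclient.errors.RestfulError`.

```python
import logging
from typing import Iterable, Iterator, List

logger = logging.getLogger(__name__)


class PageReadError(Exception):
    """Hypothetical stand-in for lazr.restfulclient.errors.RestfulError."""


def flaky_page(urls: List[str], fail_at: int) -> Iterator[str]:
    """Simulate a lazily fetched page that fails once `fail_at` items were read."""
    for index, url in enumerate(urls):
        if index == fail_at:
            raise PageReadError("simulated failure while fetching the next batch")
        yield url


def origins_from_page(page: Iterable[str]) -> Iterator[str]:
    """Yield origins from a page; stop early instead of crashing on a read error."""
    try:
        # The try block must wrap the iteration itself: with a lazy page,
        # the error is raised while looping, not when the page is created.
        for url in page:
            yield url
    except PageReadError as exc:
        # Entries after the failure point are lost for this run.
        logger.warning("Could not read the rest of the page: %s", exc)


urls = ["lp:project-a", "lp:project-b", "lp:project-c", "lp:project-d"]
listed = list(origins_from_page(flaky_page(urls, fail_at=2)))
print(listed)
```

In this sketch the run finishes with the first two entries only; the remaining ones would have to be picked up by a later full listing.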

Related to T3945
Depends on D7194

Test Plan

tox (existing tests pass; I don't clearly see how to simulate that error in tests, so this sounds good to me as is)

docker

Without this commit, this can happen:

swh-lister_1                        | [2022-02-17 09:39:28,685: INFO/MainProcess] Task swh.lister.launchpad.tasks.IncrementalLaunchpadLister[51954eb9-5c55-467e-9091-729361ab4ef8] received
swh-lister_1                        | [2022-02-17 09:45:31,974: ERROR/ForkPoolWorker-1] Task swh.lister.launchpad.tasks.IncrementalLaunchpadLister[51954eb9-5c55-467e-9091-729361ab4ef8] raised unexpected: RestfulError()
swh-lister_1                        | Traceback (most recent call last):
swh-lister_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 450, in trace_task
swh-lister_1                        |     R = retval = fun(*args, **kwargs)
swh-lister_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/scheduler/task.py", line 55, in __call__
swh-lister_1                        |     result = super().__call__(*args, **kwargs)
swh-lister_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/celery/app/trace.py", line 731, in __protected_call__
swh-lister_1                        |     return self.run(*args, **kwargs)
swh-lister_1                        |   File "/src/swh-lister/swh/lister/launchpad/tasks.py", line 27, in list_launchpad_incremental
swh-lister_1                        |     return lister.run().dict()
swh-lister_1                        |   File "/src/swh-lister/swh/lister/pattern.py", line 130, in run
swh-lister_1                        |     full_stats.origins += self.send_origins(origins)
swh-lister_1                        |   File "/src/swh-lister/swh/lister/pattern.py", line 233, in send_origins
swh-lister_1                        |     for batch_origins in grouper(origins, n=1000):
swh-lister_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/core/utils.py", line 47, in grouper
swh-lister_1                        |     for _data in itertools.zip_longest(*args, fillvalue=stop_value):
swh-lister_1                        |   File "/src/swh-lister/swh/lister/launchpad/lister.py", line 149, in get_origins_from_page
swh-lister_1                        |     for repo in repos:
swh-lister_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/lazr/restfulclient/resource.py", line 819, in __iter__
swh-lister_1                        |     next_get = self._root._browser.get(URI(next_link))
swh-lister_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/lazr/restfulclient/_browser.py", line 439, in get
swh-lister_1                        |     response, content = self._request(url, extra_headers=headers)
swh-lister_1                        |   File "/srv/softwareheritage/venv/lib/python3.7/site-packages/lazr/restfulclient/_browser.py", line 429, in _request
swh-lister_1                        |     raise error
swh-lister_1                        | lazr.restfulclient.errors.RestfulError

Diff Detail

Repository: rDLS Listers
Branch: master
Lint: No Linters Available
Unit: No Unit Test Coverage
Build Status: Buildable 26943
Build 42132: Phabricator diff pipeline on Jenkins
Build 42131: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D7195 (id=26075)

Could not rebase; Attempt merge onto 31b4429ced...

Updating 31b4429..cdecccc
Fast-forward
 swh/lister/launchpad/lister.py                     | 151 ++++++++++++++++-----
 .../tests/data/launchpad_bzr_response.json         | 126 +++++++++++++++++
 swh/lister/launchpad/tests/test_lister.py          | 132 ++++++++++++++----
 3 files changed, 348 insertions(+), 61 deletions(-)
 create mode 100644 swh/lister/launchpad/tests/data/launchpad_bzr_response.json
Changes applied before test
commit cdecccca7799958c06cd1fd11560d6dd5bb53f9d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Feb 17 11:02:47 2022 +0100

    lauchpad: Manage unhandled exception when reading page of result
    
    Prior to this commit, the listing could fail when reading the page result in lauchpad.
    This now traps the exception and let the listing continue. In effect, this could mean we
    lose some part of data when listing incrementally.
    
    We do need the full listing to happen regularly anyway. So that sounds like a fair
    trade-off.
    
    This also allows the listing to finish in case of those issues happening.
    
    Related to T3945

commit ac55637e8228424c98bc551ec70c24bea345b9ed
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Feb 17 09:52:19 2022 +0100

    lauchpad: Manage unhandled exception when listing
    
    Prior to this commit, the listing could fail when reading a page of data in lauchpad.
    This now traps the exception and let the listing continue. If the page is empty, it's
    now no longer accounted for.
    
    This actually allows the listing to finish in case of issues.
    
    Related to T3945

commit 262f9369c837e293f8389dd9f7a6a965c09f621e
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Feb 16 17:56:13 2022 +0100

    launchpad: Allow bzr origins listing
    
    Related to T3945

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/457/ for more details.

anlambert added a subscriber: anlambert.
anlambert added inline comments.
swh/lister/launchpad/lister.py, line 152

Use this instead:

from lazr.restfulclient.errors import RestfulError

from swh.lister.utils import retry_if_exception, throttling_retry


def retry_if_restful_error(retry_state):
    # Retry only when the failed attempt raised a RestfulError
    return retry_if_exception(retry_state, lambda e: isinstance(e, RestfulError))


@throttling_retry(retry=retry_if_restful_error)
def get_origins_from_page(self, page: LaunchpadPageType) -> Iterator[ListedOrigin]:
    ...  # existing method body unchanged
This revision now requires changes to proceed.
Feb 17 2022, 11:23 AM
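
The retry idea from the inline comment above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the implementation of `throttling_retry` in `swh.lister.utils`: `retry_on`, `TransientError` and `fetch_page` are hypothetical names, and the backoff policy (delay doubling after each failed attempt) is an assumption.

```python
import time
from functools import wraps
from typing import Any, Callable, List


class TransientError(Exception):
    """Hypothetical stand-in for a recoverable error such as RestfulError."""


def retry_on(
    predicate: Callable[[Exception], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Retry the decorated call while `predicate(exc)` is true, with backoff."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Give up on the last attempt or on unrelated errors.
                    if attempt == attempts or not predicate(exc):
                        raise
                    # Assumed policy: the delay doubles after each failure.
                    sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator


calls = {"count": 0}


@retry_on(lambda e: isinstance(e, TransientError), attempts=3, base_delay=0.0)
def fetch_page() -> List[str]:
    """Fail twice, then return the page content as a plain list."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("simulated transient failure")
    return ["lp:project-a", "lp:project-b"]


result = fetch_page()
print(result, calls["count"])
```

The sketch decorates a function that returns a list on purpose: a decorator of this kind only sees exceptions raised during the call itself. If the decorated function is a generator, errors raised later, while the caller iterates, occur after the wrapper has already returned and are not retried.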

Right, I'll close this one and amend the previous one with the suggested change for both
(I intended to add that kind of change after this diff, once the unmanaged exceptions were handled).