Page MenuHomeSoftware Heritage

phabricator: Ensure request errors are raised as exceptions
ClosedPublic

Authored by anlambert on Jan 14 2021, 5:24 PM.

Details

Summary

This ensures that a celery task will be marked as failed if a request error
happens when listing origins.

Depends on D4864

Diff Detail

Repository
rDLS Listers
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D4865 (id=17234)

Could not rebase; Attempt merge onto c782275296...

Updating c782275..8e39fc4
Fast-forward
 swh/lister/phabricator/lister.py                   |   15 +-
 .../phabricator/tests/data/api_empty_response.json |    8 -
 .../phabricator/tests/data/api_first_response.json | 2538 --------------------
 .../data/api_first_response_other_instance.json    | 2336 ------------------
 .../phabricator/tests/data/api_next_response.json  | 2354 ------------------
 .../data/api_response_undefined_protocol.json      |   60 -
 ...oldest,attachments[uris]=1,after=,api.token=foo |    1 -
 .../data/phabricator_api_repositories_page1.json   | 1246 ++++++++++
 .../data/phabricator_api_repositories_page2.json   | 1308 ++++++++++
 swh/lister/phabricator/tests/test_lister.py        |  116 +-
 swh/lister/phabricator/tests/test_tasks.py         |    2 +-
 11 files changed, 2669 insertions(+), 7315 deletions(-)
 delete mode 100644 swh/lister/phabricator/tests/data/api_empty_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_first_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_first_response_other_instance.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_next_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_response_undefined_protocol.json
 delete mode 120000 swh/lister/phabricator/tests/data/https_forge.softwareheritage.org/api_diffusion.repository.search,order=oldest,attachments[uris]=1,after=,api.token=foo
 create mode 100644 swh/lister/phabricator/tests/data/phabricator_api_repositories_page1.json
 create mode 100644 swh/lister/phabricator/tests/data/phabricator_api_repositories_page2.json
Changes applied before test
commit 8e39fc4586607f690c3529a0180e25c88d35d505
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 17:23:07 2021 +0100

    phabricator: Ensure request errors are raised as exceptions
    
    This ensures that a celery task will be marked as failed if a request error
    happens when listing origins.

commit fbc02034165a82c77891ab20dbd06cc5cca1e682
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 14:44:32 2021 +0100

    phabricator: Add test for new lister implementation
    
    Also remove no more used JSON files.

commit 370a04dbf23a8ecc78af438bb1f066630a7b3d8d
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 13:50:48 2021 +0100

    phabricator: Allow to pass forge base URL as lister parameter

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/95/ for more details.

This revision is now accepted and ready to land.Jan 14 2021, 7:16 PM
olasd requested changes to this revision.Jan 15 2021, 11:26 AM
olasd added a subscriber: olasd.

Could we keep the (more verbose) logger.warning call too? I don't remember what the HTTPError traceback contains when such an issue happens.

This revision now requires changes to proceed.Jan 15 2021, 11:26 AM

Build is green

Patch application report for D4865 (id=17268)

Could not rebase; Attempt merge onto c782275296...

Updating c782275..bc3dd1a
Fast-forward
 swh/lister/phabricator/lister.py                   |   25 +-
 .../phabricator/tests/data/api_empty_response.json |    8 -
 .../phabricator/tests/data/api_first_response.json | 2538 --------------------
 .../data/api_first_response_other_instance.json    | 2336 ------------------
 .../phabricator/tests/data/api_next_response.json  | 2354 ------------------
 .../data/api_response_undefined_protocol.json      |   60 -
 ...oldest,attachments[uris]=1,after=,api.token=foo |    1 -
 .../data/phabricator_api_repositories_page1.json   | 1246 ++++++++++
 .../data/phabricator_api_repositories_page2.json   | 1308 ++++++++++
 swh/lister/phabricator/tests/test_lister.py        |  134 +-
 swh/lister/phabricator/tests/test_tasks.py         |    2 +-
 11 files changed, 2691 insertions(+), 7321 deletions(-)
 delete mode 100644 swh/lister/phabricator/tests/data/api_empty_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_first_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_first_response_other_instance.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_next_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_response_undefined_protocol.json
 delete mode 120000 swh/lister/phabricator/tests/data/https_forge.softwareheritage.org/api_diffusion.repository.search,order=oldest,attachments[uris]=1,after=,api.token=foo
 create mode 100644 swh/lister/phabricator/tests/data/phabricator_api_repositories_page1.json
 create mode 100644 swh/lister/phabricator/tests/data/phabricator_api_repositories_page2.json
Changes applied before test
commit bc3dd1ad156085cc76a8a8c15adc96189da1f7d1
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 17:23:07 2021 +0100

    phabricator: Ensure request errors are raised as exceptions
    
    This ensures that a celery task will be marked as failed if a request error
    happens when listing origins.

commit 01072ea7904ebb33964ed06e841a7d8e3985c7fc
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 14:44:32 2021 +0100

    phabricator: Add test for new lister implementation
    
    Also remove no longer used JSON files.

commit c5652905952a55afee31dfc126b73e05534de43d
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 13:50:48 2021 +0100

    phabricator: Allow to pass forge base URL as lister parameter

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/106/ for more details.

In D4865#121702, @olasd wrote:

Could we keep the (more verbose) logger.warning call too? I don't remember what the HTTPError traceback contains when such an issue happens.

The HTTPError thrown by raise_for_status contains at least the status code and the faulty URL plus a possible reason.
Not sure if we need to log the response content here but I can still add an utility method in base lister class to do so.

In D4865#121702, @olasd wrote:

Could we keep the (more verbose) logger.warning call too? I don't remember what the HTTPError traceback contains when such an issue happens.

The HTTPError thrown by raise_for_status contains at least the status code and the faulty URL plus a possible reason.
Not sure if we need to log the response content here but I can still add an utility method in base lister class to do so.

As far as I can tell, the reason is usually just a text description of the status code. Usually we'll also want to log the body of the error response too, as that can contain more useful information on *why* the request failed.

See for instance:

Python 3.9.1+ (default, Jan 10 2021, 15:42:50) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import requests

In [2]: while True:
   ...:     try:
   ...:         r = requests.get('https://api.github.com/repositories')
   ...:         r.raise_for_status()
   ...:     except Exception as e:
   ...:         print(e)
   ...:         print(r.content)
   ...:         break
   ...: 
403 Client Error: rate limit exceeded for url: https://api.github.com/repositories
b'{"message":"API rate limit exceeded for 90.44.76.174. (But here\'s the good news: Authenticated requests get a higher rate limit. Check out the documentation for more details.)","documentation_url":"https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting"}\n'
In D4865#122050, @olasd wrote:
In D4865#121702, @olasd wrote:

Could we keep the (more verbose) logger.warning call too? I don't remember what the HTTPError traceback contains when such an issue happens.

The HTTPError thrown by raise_for_status contains at least the status code and the faulty URL plus a possible reason.
Not sure if we need to log the response content here but I can still add an utility method in base lister class to do so.

As far as I can tell, the reason is usually just a text description of the status code. Usually we'll also want to log the body of the error response too, as that can contain more useful information on *why* the request failed.

See for instance:

Python 3.9.1+ (default, Jan 10 2021, 15:42:50) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import requests

In [2]: while True:
   ...:     try:
   ...:         r = requests.get('https://api.github.com/repositories')
   ...:         r.raise_for_status()
   ...:     except Exception as e:
   ...:         print(e)
   ...:         print(r.content)
   ...:         break
   ...: 
403 Client Error: rate limit exceeded for url: https://api.github.com/repositories
b'{"message":"API rate limit exceeded for 90.44.76.174. (But here\'s the good news: Authenticated requests get a higher rate limit. Check out the documentation for more details.)","documentation_url":"https://docs.github.com/rest/overview/resources-in-the-rest-api#rate-limiting"}\n'

Ack, will reinstate the logged warning then.

Rebase and reinstate warning when HTTP request ends up with error

Build is green

Patch application report for D4865 (id=17304)

Could not rebase; Attempt merge onto d1fbccd988...

Updating d1fbccd..a41c03e
Fast-forward
 swh/lister/phabricator/lister.py                   |   23 +-
 .../phabricator/tests/data/api_empty_response.json |    8 -
 .../phabricator/tests/data/api_first_response.json | 2538 --------------------
 .../data/api_first_response_other_instance.json    | 2336 ------------------
 .../phabricator/tests/data/api_next_response.json  | 2354 ------------------
 .../data/api_response_undefined_protocol.json      |   60 -
 ...oldest,attachments[uris]=1,after=,api.token=foo |    1 -
 .../data/phabricator_api_repositories_page1.json   | 1246 ++++++++++
 .../data/phabricator_api_repositories_page2.json   | 1308 ++++++++++
 swh/lister/phabricator/tests/test_lister.py        |  134 +-
 swh/lister/phabricator/tests/test_tasks.py         |    2 +-
 11 files changed, 2694 insertions(+), 7316 deletions(-)
 delete mode 100644 swh/lister/phabricator/tests/data/api_empty_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_first_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_first_response_other_instance.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_next_response.json
 delete mode 100644 swh/lister/phabricator/tests/data/api_response_undefined_protocol.json
 delete mode 120000 swh/lister/phabricator/tests/data/https_forge.softwareheritage.org/api_diffusion.repository.search,order=oldest,attachments[uris]=1,after=,api.token=foo
 create mode 100644 swh/lister/phabricator/tests/data/phabricator_api_repositories_page1.json
 create mode 100644 swh/lister/phabricator/tests/data/phabricator_api_repositories_page2.json
Changes applied before test
commit a41c03e4c8b9a5486386be7f9b325125557f49e7
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 17:23:07 2021 +0100

    phabricator: Ensure request errors are raised as exceptions
    
    This ensures that a celery task will be marked as failed if a request error
    happens when listing origins.

commit b743c36496f03a38a0b922a1c8e765a873a1c732
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 14:44:32 2021 +0100

    phabricator: Add test for new lister implementation
    
    Also remove no longer used JSON files.

commit d691c04eb80f3f70906fc2e61e19ef057cc312ba
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Thu Jan 14 13:50:48 2021 +0100

    phabricator: Allow to pass forge base URL as lister parameter

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/114/ for more details.

Perfect, thanks a lot!

This revision is now accepted and ready to land.Jan 18 2021, 5:27 PM