pypi.lister: Handle xml-rpc throttling properly
Abandoned, Public

Authored by ardumont on Jul 9 2021, 10:50 AM.

Details

Summary

An actual run in Docker made the throttling apparent almost immediately [1].

[1] xmlrpc.client.Fault: <Fault -32500: 'HTTPTooManyRequests: The action could not be performed because there were too many requests by the client. Limit may reset in 1 seconds.'>

Depends on D5977
Related to T3399
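
For illustration only, the throttling fault quoted above could be handled along the following lines. This is a minimal sketch using only the standard library, not the code from this diff: the helper names (is_throttled, reset_delay, call_with_retry), the retry count, and the linear back-off are all hypothetical, and treating fault code -32500 together with the "HTTPTooManyRequests" prefix as the throttling marker is an assumption drawn from the single message in the summary.

```python
import re
import time
import xmlrpc.client

# Assumption: the fault code and message prefix from the summary identify throttling.
THROTTLE_FAULT_CODE = -32500
RESET_RE = re.compile(r"Limit may reset in (\d+) seconds")


def is_throttled(fault):
    """Return True if the fault looks like the throttling error quoted in the summary."""
    return (
        fault.faultCode == THROTTLE_FAULT_CODE
        and "HTTPTooManyRequests" in fault.faultString
    )


def reset_delay(fault, default=1.0):
    """Extract the advertised reset delay in seconds, or fall back to a default."""
    match = RESET_RE.search(fault.faultString)
    return float(match.group(1)) if match else default


def call_with_retry(func, *args, max_attempts=5, sleep=time.sleep):
    """Call func(*args), retrying when the server reports throttling.

    The wait grows linearly with the attempt number (hypothetical policy).
    Any other fault, or a throttling fault on the last attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args)
        except xmlrpc.client.Fault as fault:
            if not is_throttled(fault) or attempt == max_attempts:
                raise
            sleep(reset_delay(fault) * attempt)
```

A call site might wrap a bound method of an xmlrpc.client.ServerProxy instance, for example call_with_retry(client.some_method, arg), where client and some_method stand in for whatever the lister actually calls. The sleep parameter is injectable so that tests can run without real delays.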

Test Plan

tox

Diff Detail

Repository
rDLS Listers
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 22543
Build 35132: Phabricator diff pipeline on jenkins
Build 35131: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D5983 (id=21568)

Could not rebase; Attempt merge onto 698be475e9...

Updating 698be47..92e7029
Fast-forward
 swh/lister/pypi/lister.py                        | 169 ++++++++++++---
 swh/lister/pypi/tasks.py                         |   4 +-
 swh/lister/pypi/tests/data/https_pypi.org/simple |  12 --
 swh/lister/pypi/tests/test_lister.py             | 259 +++++++++++++++++------
 4 files changed, 342 insertions(+), 102 deletions(-)
 delete mode 100644 swh/lister/pypi/tests/data/https_pypi.org/simple
Changes applied before test
commit 92e7029f8ed5ab658dac3f4e68d18f8c42075c3f
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Jul 9 10:42:57 2021 +0200

    pypi.lister: Handle xml-rpc throttling properly
    
    Related to T3399

commit 77f7da32e06361e8a4c860ad8c884582e5804796
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jul 7 16:14:09 2021 +0200

    Make PyPI lister incremental and complete in regards to last_update
    
    This rewrote the current implementation to actually use pypi's xml-rpc api which allows
    to be incremental. It also allows to fetch the last release date per package. This last
    part actually make it possible to update the "last_update" entry in the ListedOrigin
    model.
    
    Related to T3399

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/327/ for more details.