
package/utils: Add retry policy to download in case of throttling
ClosedPublic

Authored by anlambert on Sep 5 2022, 3:46 PM.

Details

Summary

Some HTTP download requests might be throttled by remote servers,
so add a retry mechanism with exponential backoff to fix tarball
downloads in some loaders.
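
For readers unfamiliar with the pattern, here is a minimal sketch of retrying with exponential backoff. It is not the code from this diff: `ThrottledError`, `retry_with_backoff` and all parameter names are hypothetical, and the sleep function is injectable so the example runs without waiting or touching the network.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical error standing in for an HTTP 429 response."""


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on ThrottledError with exponentially growing waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # give up: the caller sees the last throttling error
            sleep(delay)
            delay *= factor  # 1s, 2s, 4s, ... with the defaults
    raise AssertionError("unreachable")


# Demo with a fake download that is throttled twice before succeeding.
attempts = {"count": 0}


def fake_download() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError("429")
    return "nvi.tar.gz"


recorded_waits: List[float] = []
result = retry_with_backoff(fake_download, sleep=recorded_waits.append)
# recorded_waits is now [1.0, 2.0] and result is "nvi.tar.gz"
```

Capping the number of attempts keeps a persistently throttled origin from blocking a worker forever. Once the cap is reached, the error propagates as it did before.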

I observed this kind of issue with the AUR loader when testing
it in Docker; see below:

docker-swh-loader-1  | [2022-09-05 13:38:26,727: ERROR/ForkPoolWorker-74] Failed to load branch releases/1.79-1/nvi.tar.gz for https://aur.archlinux.org/packages/nvi
docker-swh-loader-1  | Traceback (most recent call last):
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 649, in load
docker-swh-loader-1  |     res = self._load_release(p_info, origin)
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 822, in _load_release
docker-swh-loader-1  |     dl_artifacts = self.download_package(p_info, tmpdir)
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 399, in download_package
docker-swh-loader-1  |     return [download(p_info.url, dest=tmpdir, filename=p_info.filename)]
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/utils.py", line 109, in download
docker-swh-loader-1  | ValueError: Fail to query 'https://aur.archlinux.org/cgit/aur.git/snapshot/nvi.tar.gz'. Reason: 429
docker-swh-loader-1  | [2022-09-05 13:38:26,728: DEBUG/ForkPoolWorker-74] default version: 1.79-1
docker-swh-loader-1  | [2022-09-05 13:38:26,728: DEBUG/ForkPoolWorker-74] extra branches: {}
docker-swh-loader-1  | [2022-09-05 13:38:26,728: DEBUG/ForkPoolWorker-74] releases: {'1.79-1': []}
docker-swh-loader-1  | [2022-09-05 13:38:26,728: DEBUG/ForkPoolWorker-74] snapshot: {'branches': {}}
docker-swh-loader-1  | [2022-09-05 13:38:26,729: DEBUG/ForkPoolWorker-74] snapshot: Snapshot(branches=ImmutableDict({}), id=hash_to_bytes('1a8893e6a86f444e8be8e7bda6cb34fb1735a00e'))
docker-swh-loader-1  | [2022-09-05 13:38:26,729: DEBUG/ForkPoolWorker-74] Flushing 1 objects of type snapshot
docker-swh-loader-1  | [2022-09-05 13:38:26,823: WARNING/ForkPoolWorker-74] 1 failed branches
docker-swh-loader-1  | [2022-09-05 13:38:26,823: WARNING/ForkPoolWorker-74] Failed branches: releases/1.79-1/nvi.tar.gz
docker-swh-loader-1  | [2022-09-05 13:38:26,827: INFO/ForkPoolWorker-74] Task swh.loader.package.aur.tasks.LoadAur[8fafddd4-9005-4609-a385-c02e546ce52e] succeeded in 0.3548696079997171s: {'status': 'uneventful', 'snapshot_id': '1a8893e6a86f444e8be8e7bda6cb34fb1735a00e'}
docker-swh-loader-1  | [2022-09-05 13:38:26,847: DEBUG/ForkPoolWorker-74] Loading config file /loader.yml
docker-swh-loader-1  | [2022-09-05 13:38:26,852: INFO/MainProcess] Task swh.loader.git.tasks.UpdateGitRepository[1b7cae47-9cc1-42e8-b5e8-84231efce0e8] received
docker-swh-loader-1  | [2022-09-05 13:38:26,915: DEBUG/ForkPoolWorker-74] last snapshot: None
docker-swh-loader-1  | [2022-09-05 13:38:26,916: DEBUG/ForkPoolWorker-74] package_info: AurPackageInfo(url='https://aur.archlinux.org/cgit/aur.git/snapshot/libomxil-component-xvideo.tar.gz', filename='libomxil-component-xvideo.tar.gz', directory_extrinsic_metadata=[], name='libomxil-component-xvideo', version='0.1-1', last_modified='2015-11-27T07:01:23+00:00')
docker-swh-loader-1  | [2022-09-05 13:38:27,068: ERROR/ForkPoolWorker-74] Failed to load branch releases/0.1-1/libomxil-component-xvideo.tar.gz for https://aur.archlinux.org/packages/libomxil-component-xvideo
docker-swh-loader-1  | Traceback (most recent call last):
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 649, in load
docker-swh-loader-1  |     res = self._load_release(p_info, origin)
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 822, in _load_release
docker-swh-loader-1  |     dl_artifacts = self.download_package(p_info, tmpdir)
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/loader.py", line 399, in download_package
docker-swh-loader-1  |     return [download(p_info.url, dest=tmpdir, filename=p_info.filename)]
docker-swh-loader-1  |   File "/src/swh-loader-core/swh/loader/package/utils.py", line 109, in download
docker-swh-loader-1  | ValueError: Fail to query 'https://aur.archlinux.org/cgit/aur.git/snapshot/libomxil-component-xvideo.tar.gz'. Reason: 429
docker-swh-loader-1  | [2022-09-05 13:38:27,069: DEBUG/ForkPoolWorker-74] default version: 0.1-1
docker-swh-loader-1  | [2022-09-05 13:38:27,069: DEBUG/ForkPoolWorker-74] extra branches: {}
docker-swh-loader-1  | [2022-09-05 13:38:27,069: DEBUG/ForkPoolWorker-74] releases: {'0.1-1': []}
docker-swh-loader-1  | [2022-09-05 13:38:27,069: DEBUG/ForkPoolWorker-74] snapshot: {'branches': {}}
docker-swh-loader-1  | [2022-09-05 13:38:27,069: DEBUG/ForkPoolWorker-74] snapshot: Snapshot(branches=ImmutableDict({}), id=hash_to_bytes('1a8893e6a86f444e8be8e7bda6cb34fb1735a00e'))
docker-swh-loader-1  | [2022-09-05 13:38:27,069: DEBUG/ForkPoolWorker-74] Flushing 1 objects of type snapshot
docker-swh-loader-1  | [2022-09-05 13:38:27,116: WARNING/ForkPoolWorker-74] 1 failed branches
docker-swh-loader-1  | [2022-09-05 13:38:27,116: WARNING/ForkPoolWorker-74] Failed branches: releases/0.1-1/libomxil-component-xvideo.tar.gz
docker-swh-loader-1  | [2022-09-05 13:38:27,118: INFO/ForkPoolWorker-74] Task swh.loader.package.aur.tasks.LoadAur[5ce421cd-1610-4554-b0a4-26b4791145d3] succeeded in 0.2711032420011179s: {'status': 'uneventful', 'snapshot_id': '1a8893e6a86f444e8be8e7bda6cb34fb1735a00e'}

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Branch
package-download-retry
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 31316
Build 48989: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 48988: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D8390 (id=30282)

Rebasing onto 68e68e3f92...

Current branch diff-target is up to date.
Changes applied before test
commit afc10d5c76c5054b4ef6e3ae554d9e7f6ade462b
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Mon Sep 5 15:08:34 2022 +0200

    package/utils: Add retry policy to download in case of throttling
    
    Some HTTP download requests might be throttled by remote servers
    so add retry mechanism with exponential backoff to fix tarball
    downloads in some loaders.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/857/ for more details.

ardumont added inline comments.
swh/loader/package/tests/test_utils.py
33–36

Is there no Retry-After header in the HTTP response?

Nope.

15:56 $ curl -i https://aur.archlinux.org/cgit/aur.git/snapshot/vim-pathogen-git.tar.gz
HTTP/2 429 
server: nginx
date: Mon, 05 Sep 2022 13:56:15 GMT
content-type: text/html
content-length: 162
strict-transport-security: max-age=31536000; includeSubdomains; preload

<html>
<head><title>429 Too Many Requests</title></head>
<body>
<center><h1>429 Too Many Requests</h1></center>
<hr><center>nginx</center>
</body>
</html>
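
To illustrate the point of this exchange, here is a hedged sketch of how a client could choose its wait time: honor `Retry-After` when the server sends it (in its delay-seconds form) and fall back to exponential backoff otherwise, as is needed for the response above. The function `throttle_wait` and its parameters are hypothetical and not part of this diff.

```python
from typing import Mapping


def throttle_wait(
    headers: Mapping[str, str],
    attempt: int,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
) -> float:
    """Return the seconds to wait before retry number `attempt` (1-based)."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after", "").strip()
    if retry_after.isdigit():
        # Delay-seconds form; the HTTP-date form is ignored in this sketch.
        return min(float(int(retry_after)), max_delay)
    # No usable hint from the server: exponential backoff, capped.
    return min(base_delay * 2 ** (attempt - 1), max_delay)


# Headers taken from the curl output above: no Retry-After is present.
aur_headers = {
    "server": "nginx",
    "content-type": "text/html",
    "content-length": "162",
}
first_wait = throttle_wait(aur_headers, attempt=1)  # 1.0
third_wait = throttle_wait(aur_headers, attempt=3)  # 4.0
```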

lgtm (and this unifies with how we deal with somewhat similar cases in the listers)

This revision is now accepted and ready to land. Sep 5 2022, 3:59 PM

Simplify exception check in test_download_fail_to_download

Build is green

Patch application report for D8390 (id=30287)

Rebasing onto 68e68e3f92...

Current branch diff-target is up to date.
Changes applied before test
commit 3ae2df31fe996e820109e0508f5f2a1ab48f9421
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Mon Sep 5 15:08:34 2022 +0200

    package/utils: Add retry policy to download in case of throttling
    
    Some HTTP download requests might be throttled by remote servers
    so add retry mechanism with exponential backoff to fix tarball
    downloads in some loaders.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/858/ for more details.