Page MenuHomeSoftware Heritage

package.utils: Define a timeout on download connections
ClosedPublic

Authored by ardumont on Apr 14 2020, 2:55 PM.

Details

Summary

To avoid stuck processes.

Without this, requests can be stuck indefinitely as per T2357.

Related to T2357

Test Plan

tox

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build has FAILED

Patch application report for D3017 (id=10723)

Rebasing onto 599b6fbadb...

Current branch diff-target is up to date.
Changes applied before test
commit 67db8fc4f89668ccb6ea111c253ae4a418f9e52c
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Tue Apr 14 14:53:03 2020 +0200

    package.utils: Define a timeout on download connections
    
    To avoid stuck processes.
    
    Related to T2357

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/23/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/23/console

Build has failed

o_O

14:56:13  ImportError while loading conftest '/var/lib/jenkins/workspace/DLDBASE/tests-on-diff/conftest.py'.
14:56:13  conftest.py:12: in <module>
14:56:13      from swh.storage.tests.conftest import *  # noqa
14:56:13  .tox/py3/lib/python3.7/site-packages/swh/storage/tests/conftest.py:34: in <module>
14:56:13      from swh.journal.writer.inmemory import InMemoryJournalWriter
14:56:13  E   ModuleNotFoundError: No module named 'swh.journal'

Now what!?

Reproduced with:

tox -r
vlorentz added a subscriber: vlorentz.

I believed it was the default. Oof.

https://2.python-requests.org/en/master/user/quickstart/#timeouts

Nearly all production code should use this parameter in nearly all requests

This revision is now accepted and ready to land.Apr 14 2020, 3:11 PM

I don't think we need a comment here, it's quite obvious

olasd added inline comments.
swh/loader/package/utils.py
90

Can you check that the timeout also applies to iter_content?

swh/loader/package/utils.py
90

the current implementation mentions this (navigating through my venv to the code within):

# Special case for urllib3.
if hasattr(self.raw, 'stream'):
    try:
        for chunk in self.raw.stream(chunk_size, decode_content=True):
            yield chunk
    except ProtocolError as e:
        raise ChunkedEncodingError(e)
    except DecodeError as e:
        raise ContentDecodingError(e)
    except ReadTimeoutError as e:
        raise ConnectionError(e)

So i guess it applies.

Is that check enough or did you have some more automated check in mind?

swh/loader/package/utils.py
90

Yeah, I just meant checking the documentation (I've seen conflicting reports); this looks fine, thanks.

This revision was landed with ongoing or failed builds.Apr 14 2020, 3:47 PM
This revision was automatically updated to reflect the committed changes.

damn, i forgot the build failed!