Page MenuHomeSoftware Heritage

package.utils: Define a timeout on download connections
ClosedPublic

Authored by ardumont on Apr 14 2020, 2:55 PM.

Details

Summary

To avoid stuck processes.

Without this, requests can be stuck indefinitely as per T2357.

Related to T2357

Test Plan

tox

Diff Detail

Event Timeline

Build has FAILED

Patch application report for D3017 (id=10723)

Rebasing onto 599b6fbadb...

Current branch diff-target is up to date.
Changes applied before test
commit 67db8fc4f89668ccb6ea111c253ae4a418f9e52c
Author: Antoine R. Dumont (@ardumont) <antoine.romain.dumont@gmail.com>
Date:   Tue Apr 14 14:53:03 2020 +0200

    package.utils: Define a timeout on download connections
    
    To avoid stuck processes.
    
    Related to T2357

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/23/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/23/console

Build has failed

o_O

14:56:13  ImportError while loading conftest '/var/lib/jenkins/workspace/DLDBASE/tests-on-diff/conftest.py'.
14:56:13  conftest.py:12: in <module>
14:56:13      from swh.storage.tests.conftest import *  # noqa
14:56:13  .tox/py3/lib/python3.7/site-packages/swh/storage/tests/conftest.py:34: in <module>
14:56:13      from swh.journal.writer.inmemory import InMemoryJournalWriter
14:56:13  E   ModuleNotFoundError: No module named 'swh.journal'

Now what!?

Reproduced with:

tox -r
vlorentz added a subscriber: vlorentz.

I believed it was the default. Oof.

https://2.python-requests.org/en/master/user/quickstart/#timeouts

Nearly all production code should use this parameter in nearly all requests

This revision is now accepted and ready to land.Apr 14 2020, 3:11 PM

I don't think we need a comment here, it's quite obvious

olasd added inline comments.
swh/loader/package/utils.py
90

Can you check that the timeout also applies to iter_content?

swh/loader/package/utils.py
90

the current implementation mentions this (navigating through my venv to the code within):

# Special case for urllib3.
if hasattr(self.raw, 'stream'):
    try:
        for chunk in self.raw.stream(chunk_size, decode_content=True):
            yield chunk
    except ProtocolError as e:
        raise ChunkedEncodingError(e)
    except DecodeError as e:
        raise ContentDecodingError(e)
    except ReadTimeoutError as e:
        raise ConnectionError(e)

So i guess it applies.

Is that check enough or did you have some more automated check in mind?

swh/loader/package/utils.py
90

Yeah, I just meant checking the documentation (I've seen conflicting reports); this looks fine, thanks.

This revision was landed with ongoing or failed builds.Apr 14 2020, 3:47 PM
This revision was automatically updated to reflect the committed changes.

damn, i forgot the build failed!