package/utils: Add FTP protocol support to download function
Closed · Public

Authored by anlambert on Sep 10 2021, 4:07 PM.

Details

Summary

Add support for downloading files over the FTP protocol by using
the urllib.request.urlopen function from the Python standard
library.

Related to T2687
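
To illustrate the summary, here is a minimal sketch of routing ftp:// URLs to urllib.request.urlopen. The helper names is_ftp_url and open_ftp and the 60-second default timeout are hypothetical and not taken from the diff; the actual change in swh/loader/package/utils.py may be structured differently.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def is_ftp_url(url: str) -> bool:
    """Return True when the URL uses the ftp scheme."""
    return urlparse(url).scheme == "ftp"


def open_ftp(url: str, timeout: float = 60):
    """Open an FTP URL and return a file-like response object.

    Hypothetical helper: other schemes are rejected so that the caller
    can keep using its existing HTTP client for them.
    """
    if not is_ftp_url(url):
        raise ValueError(f"not an FTP URL: {url}")
    # urlopen handles ftp:// URLs through the standard library's FTP handler.
    return urlopen(url, timeout=timeout)
```

The returned object supports read() and close(), which is what the streaming and cleanup points raised in the review rely on.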

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Branch
ftp-download-support
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 23508
Build 36673: Phabricator diff pipeline on jenkins
Build 36672: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D6239 (id=22574)

Rebasing onto 50b062adc7...

Current branch diff-target is up to date.
Changes applied before test
commit 6f4d279e694cb37c9e0bd9cc7c7e82c51d2c64d1
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Sep 10 15:43:22 2021 +0200

    package/utils: Add FTP protocol support to download function
    
    Add support to download file using FTP protocol through the
    use of the urllib.request.urlopen function from Python standard
    library.
    
    Related to T2687

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/534/ for more details.

vlorentz added inline comments.
swh/loader/package/utils.py
85–87

What about something like this to keep it streaming? (the response will be closed when decrefed anyway)

swh/loader/package/utils.py
85–87

Nice, just tested in docker and it works great, will update the diff then.

Update: also stream FTP responses.
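
A minimal sketch of what streaming a file-like response in fixed-size chunks could look like, so the whole payload is never held in memory at once. This is an assumption about the general approach, not the code from the diff; iter_chunks and CHUNK_SIZE are hypothetical names.

```python
import io
import itertools
from typing import BinaryIO, Iterator

CHUNK_SIZE = 4096  # hypothetical block size, not the value used in the diff


def iter_chunks(response: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Lazily yield successive chunks read from a file-like response."""
    # Each step of the generator performs one read(); nothing is read up front.
    reads = (response.read(chunk_size) for _ in itertools.count())
    # read() returns b"" at end of stream, which stops the iteration.
    return itertools.takewhile(bool, reads)


# Usage with an in-memory stand-in for the object returned by urlopen().
fake_response = io.BytesIO(b"a" * 10000)
sizes = [len(chunk) for chunk in iter_chunks(fake_response)]
```

Any object exposing read(n), such as the response returned by urllib.request.urlopen, can be passed in place of the in-memory stand-in.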

olasd added inline comments.
swh/loader/package/utils.py
85–87

We should call response.close() in both cases so the connection gets properly returned to the pool. We can do that after we're done streaming it (before checking the hashes).

swh/loader/package/utils.py
106

(add response.close() here)
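
A minimal sketch of the ordering suggested here: stream the response, close it, and only then check the hash. The function name save_and_verify, its parameters, and the choice of SHA-256 are assumptions for illustration and are not taken from the diff.

```python
import hashlib
import io
from typing import BinaryIO, Optional


def save_and_verify(
    response: BinaryIO,
    dest: BinaryIO,
    expected_sha256: Optional[str] = None,
    chunk_size: int = 4096,
) -> str:
    """Stream response into dest, close the response, then check the hash."""
    hasher = hashlib.sha256()
    try:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)
            dest.write(chunk)
    finally:
        # Close as soon as streaming ends, even if reading or writing failed.
        response.close()
    actual = hasher.hexdigest()
    # The hash comparison happens after the response has been released.
    if expected_sha256 is not None and actual != expected_sha256:
        raise ValueError(f"hash mismatch: expected {expected_sha256}, got {actual}")
    return actual


# Usage with in-memory stand-ins for the response and the destination file.
payload = b"x" * 10000
response = io.BytesIO(payload)
output = io.BytesIO()
digest = save_and_verify(response, output, hashlib.sha256(payload).hexdigest())
```

Closing in a finally block releases the response whether or not streaming succeeds, rather than relying on garbage collection.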

Build was aborted

Patch application report for D6239 (id=22576)

Rebasing onto 50b062adc7...

Current branch diff-target is up to date.
Changes applied before test
commit def3e4cbe4549db48a954eb1034da20b8bdd416c
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Sep 10 15:43:22 2021 +0200

    package/utils: Add FTP protocol support to download function
    
    Add support to download file using FTP protocol through the
    use of the urllib.request.urlopen function from Python standard
    library.
    
    Related to T2687

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/535/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/535/console

Build is green

Patch application report for D6239 (id=22577)

Rebasing onto 50b062adc7...

Current branch diff-target is up to date.
Changes applied before test
commit 1266b2e8c49a2fee32ef348e888f1b8134e62f51
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Sep 10 15:43:22 2021 +0200

    package/utils: Add FTP protocol support to download function
    
    Add support to download file using FTP protocol through the
    use of the urllib.request.urlopen function from Python standard
    library.
    
    Related to T2687

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/536/ for more details.

Update: explicitly close response once processed

Build is green

Patch application report for D6239 (id=22578)

Rebasing onto 50b062adc7...

Current branch diff-target is up to date.
Changes applied before test
commit 067450ab3784d9ff0aba6a66a8738ba90c7b86c5
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Sep 10 15:43:22 2021 +0200

    package/utils: Add FTP protocol support to download function
    
    Add support to download file using FTP protocol through the
    use of the urllib.request.urlopen function from Python standard
    library.
    
    Related to T2687

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/537/ for more details.

This revision is now accepted and ready to land. Sep 14 2021, 10:14 AM

Build is green

Patch application report for D6239 (id=22631)

Rebasing onto 0efaf7a0ef...

Current branch diff-target is up to date.
Changes applied before test
commit d5e54a5eea1e0d6c6fac7e87892b684a5f53d911
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Fri Sep 10 15:43:22 2021 +0200

    package/utils: Add FTP protocol support to download function
    
    Add support to download file using FTP protocol through the
    use of the urllib.request.urlopen function from Python standard
    library.
    
    Related to T2687

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/545/ for more details.