Page MenuHomeSoftware Heritage

nixguix: Use content-disposition from http head request if provided
ClosedPublic

Authored by ardumont on Oct 25 2022, 5:50 PM.

Details

Summary

As a last fallback after the content-type check, instead of raising immediately.

Related to T3781
Depends on D8773

Diff Detail

Repository
rDLS Listers
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 32592
Build 51057: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 51056: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D8774 (id=31632)

Could not rebase; Attempt merge onto 8355fee25f...

Merge made by the 'recursive' strategy.
 swh/lister/nixguix/lister.py                       | 47 +++++++++++++++++++---
 swh/lister/nixguix/tests/data/sources-success.json | 14 +++++++
 swh/lister/nixguix/tests/test_lister.py            | 27 +++++++++++--
 3 files changed, 78 insertions(+), 10 deletions(-)
Changes applied before test
commit 68e73bab04e71a57f64311f33c286377b27d34ff
Merge: 8355fee 3670b51
Author: Jenkins user <jenkins@localhost>
Date:   Tue Oct 25 15:50:39 2022 +0000

    Merge branch 'diff-target' into HEAD

commit 3670b519275047331cbc833f54f698ca534c0180
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 25 17:46:30 2022 +0200

    nixguix: Use content-disposition from http head request if provided
    
    As a last fallback after the content-type check, instead of raising immediately.
    
    Related to T3781

commit e1733421cdba831ba1619c05c89189d795fc8f49
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 25 17:04:29 2022 +0200

    nixguix: Deal with edge case url with version instead of extension
    
    Prior to this, some urls were detected as file because their version name were wrongly
    detected as extension, hence not matching tarball extensions.
    
    Related to T3781

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/831/ for more details.

Build is green

Patch application report for D8774 (id=31636)

Could not rebase; Attempt merge onto 8355fee25f...

Merge made by the 'recursive' strategy.
 swh/lister/nixguix/lister.py                       | 47 +++++++++++++++++++---
 swh/lister/nixguix/tests/data/sources-success.json | 14 +++++++
 swh/lister/nixguix/tests/test_lister.py            | 29 +++++++++++--
 3 files changed, 80 insertions(+), 10 deletions(-)
Changes applied before test
commit fdb4b698a10b16b191fb1a0db1a218518e32c574
Merge: 8355fee a479820
Author: Jenkins user <jenkins@localhost>
Date:   Tue Oct 25 16:17:07 2022 +0000

    Merge branch 'diff-target' into HEAD

commit a47982036af0b328f521ae3eeb12207fdc892fb0
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 25 17:46:30 2022 +0200

    nixguix: Use content-disposition from http head request if provided
    
    As a last fallback after the content-type check, instead of raising immediately.
    
    Related to T3781

commit 78c67d67e1bc9e767811d99729a11536991a3764
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 25 17:04:29 2022 +0200

    nixguix: Deal with edge case url with version instead of extension
    
    Prior to this, some urls were detected as file because their version name were wrongly
    detected as extension, hence not matching tarball extensions.
    
    Related to T3781

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/833/ for more details.

anlambert added inline comments.
swh/lister/nixguix/lister.py
257

not sure if we should assert here, content_disposition_type could have a value different from attachment while having a filename (see [spec}(https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition) ).

258

I think you should rather check if the header value contains filename= and split on that.

ardumont marked 2 inline comments as done.

Adapt according to last review

swh/lister/nixguix/lister.py
258

Sounds better indeed.

swh/lister/nixguix/tests/test_lister.py
253

added some more fields in there to check the parsing does the job ok.

This revision is now accepted and ready to land.Oct 26 2022, 12:02 PM

Build is green

Patch application report for D8774 (id=31639)

Could not rebase; Attempt merge onto 8355fee25f...

Updating 8355fee..81688ca
Fast-forward
 swh/lister/nixguix/lister.py                       | 50 +++++++++++++++++++---
 swh/lister/nixguix/tests/data/sources-success.json | 21 +++++++++
 swh/lister/nixguix/tests/test_lister.py            | 36 ++++++++++++++--
 3 files changed, 97 insertions(+), 10 deletions(-)
Changes applied before test
commit 81688ca17e667c693fbc46abadf9275bb99a54f1
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 25 17:46:30 2022 +0200

    nixguix: Use content-disposition from http head request if provided
    
    As a last fallback after the content-type check, instead of raising immediately.
    
    Related to T3781

commit 026fea21da49b30fda9c5adbb24c6d7a8c24c8df
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Oct 25 17:04:29 2022 +0200

    nixguix: Deal with edge case url with version instead of extension
    
    Prior to this, some urls were detected as file because their version name were wrongly
    detected as extension, hence not matching tarball extensions.
    
    Related to T3781

See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/836/ for more details.