
common/origin_save/origin_exists: Handle Internet Archive artifact URLs
Closed, Public

Authored by anlambert on Jun 11 2021, 4:43 PM.

Details

Summary

Some HTTP-hosted tarballs have been archived by the Internet Archive.
In that case, to check URL validity and get tarball metadata, HEAD requests
must follow redirects, and the last modified date and content length
must be read from different HTTP response headers.
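
As a rough illustration of the header lookup described above (not the code from this diff), the sketch below reads the content length and last modified date from the headers of a HEAD response, falling back to archive-specific header names when the standard ones are missing. The `X-Archive-Orig-` prefixed names and the helper names `first_header` and `tarball_metadata` are assumptions made for this example; the summary does not state which headers the change actually reads.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional, Sequence, Tuple

# Assumption: archived responses carry the original values under an
# "X-Archive-Orig-" prefix. These names are illustrative, not taken from the diff.
CONTENT_LENGTH_HEADERS = ("Content-Length", "X-Archive-Orig-Content-Length")
LAST_MODIFIED_HEADERS = ("Last-Modified", "X-Archive-Orig-Last-Modified")


def first_header(headers: Mapping[str, str], names: Sequence[str]) -> Optional[str]:
    """Return the first non-empty value among `names`, ignoring header case."""
    lowered = {key.lower(): value for key, value in headers.items()}
    for name in names:
        value = lowered.get(name.lower())
        if value:
            return value
    return None


def tarball_metadata(
    headers: Mapping[str, str],
) -> Tuple[Optional[int], Optional[datetime]]:
    """Extract (content_length, last_modified) from HEAD response headers.

    Either element is None when the header is absent or cannot be parsed.
    """
    content_length: Optional[int] = None
    raw_length = first_header(headers, CONTENT_LENGTH_HEADERS)
    if raw_length is not None:
        try:
            content_length = int(raw_length)
        except ValueError:
            content_length = None

    last_modified: Optional[datetime] = None
    raw_date = first_header(headers, LAST_MODIFIED_HEADERS)
    if raw_date is not None:
        try:
            last_modified = parsedate_to_datetime(raw_date)
        except (TypeError, ValueError):
            last_modified = None

    return content_length, last_modified
```

The `headers` mapping would come from a HEAD request that follows redirects, for example `requests.head(url, allow_redirects=True).headers` (with `requests`, HEAD does not follow redirects unless asked to).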

Related to T3365

Diff Detail

Repository
rDWAPPS Web applications
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D5859 (id=20967)

Rebasing onto da39599f34...

Current branch diff-target is up to date.
Changes applied before test
commit a0db251b3280852ebb078c848f05a07db5bab98f
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Fri Jun 11 16:36:55 2021 +0200

    common/origin_save/origin_exists: Handle Internet Archive artifact URLs
    
    Some HTTP hosted tarballs have been archived by the Internet Archive.
    In that case to check URL validity and get tarball metadata, HEAD requests
    must follow redirection and info regarding last modified date or content
    length must be retrieved from different HTTP response headers.
    
    Related to T3365

See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/864/ for more details.

ardumont added a subscriber: ardumont.

Awesome, thanks.

This revision is now accepted and ready to land. Jun 11 2021, 5:10 PM