
package.utils: Drop unneeded hashes from download computation
Closed, Public

Authored by ardumont on Dec 20 2019, 1:33 PM.

Details

Summary

This should fix an issue that currently occurs frequently in the Debian loader [1]:

  • The loader incorrectly reads the length from the Content-Length header.
  • We check the download size against that incorrect value.
  • So the download function fails, because it compares the local size with that value and finds a difference.

All of this exists only for the sha1_git computation, which needs the length.
But we do not really need that hash (nor blake2s256).

So simplifying by removing them will actually fix [1].
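
To illustrate why sha1_git depends on the length: a Git blob hash is the SHA-1 of a `blob <length>\0` header followed by the content, so a streaming computation must know the length before it reads the first byte. The sketch below is not the loader's code; `sha1_git_streaming` is a hypothetical helper, and the `length` argument stands in for the value taken from the Content-Length header.

```python
import hashlib
from typing import Iterable


def sha1_git_streaming(chunks: Iterable[bytes], length: int) -> str:
    """Git blob hash of streamed content (hypothetical helper).

    The header, which embeds the length, is hashed before any content,
    so the length has to be known up front.
    """
    hasher = hashlib.sha1()
    hasher.update(b"blob %d\0" % length)
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()


chunks = [b"hello ", b"world\n"]  # 12 bytes in total

# With the correct length, this matches `git hash-object` for the same content.
good = sha1_git_streaming(chunks, 12)

# With a wrong length (e.g. a misread header), the digest silently differs.
bad = sha1_git_streaming(chunks, 10)
```

A wrong length therefore produces a wrong hash, or a failed size check, even when the downloaded bytes are intact.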

[1] https://sentry.softwareheritage.org/share/issue/49d464bee7b24ad080598b32cd3eb9a8/

[2] https://sentry.softwareheritage.org/organizations/swh/issues/?project=9&query=is%3Aunresolved&sort=freq&statsPeriod=14d

Test Plan

tox

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

package.utils: Drop unneeded hashes from download computation

Drop the following computation hashes:

  • length (read from the Content-Length header, and sometimes read incorrectly because a header was missing from the initial request)
  • sha1_git
  • blake2s256
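
As a rough sketch of what the download computation looks like once these are dropped, the remaining hashes can be fed chunk by chunk without ever consulting a length. This is not the code from swh/loader/package/utils.py; `compute_hashes` is a hypothetical name, and the choice of sha1 and sha256 as the remaining algorithms is an assumption.

```python
import hashlib
from typing import Dict, Iterable, Sequence


def compute_hashes(
    chunks: Iterable[bytes],
    algos: Sequence[str] = ("sha1", "sha256"),  # assumed remaining hashes
) -> Dict[str, str]:
    """Hash streamed content with several algorithms at once.

    None of these algorithms needs the content length in advance,
    so no Content-Length header is involved.
    """
    hashers = {name: hashlib.new(name) for name in algos}
    for chunk in chunks:
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Chunks as they might arrive from an HTTP response iterator.
computed_hashes = compute_hashes([b"hello ", b"world"])
```

The resulting dict can be passed along directly, with no further transformation.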
ardumont retitled this revision from "debian.loader: Fix wrong request content-length received" to "package.utils: Drop unneeded hashes from download computation". (Dec 20 2019, 2:03 PM)
ardumont edited the summary of this revision.

Drop unneeded instructions about response.headers

Just one tiny inline change, but looks good, thanks!

swh/loader/package/utils.py
101–102

'checksums': computed_hashes,?

This revision is now accepted and ready to land. (Dec 20 2019, 2:17 PM)

Fix unneeded dict comprehension