
Handle corrupted package release tarballs
Closed, Public

Authored by anlambert on Tue, May 28, 2:14 PM.

Details

Summary

Some packages on the npm registry were uploaded with corrupted release tarballs.

This results in the following error reported by the loader:

[2019-05-08 03:27:54,488: ERROR/ForkPoolWorker-196] Loading failure, updating to `partial` status
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 895, in load
    more_data_to_fetch = self.fetch_data()
  File "/usr/lib/python3/dist-packages/swh/loader/npm/loader.py", line 203, in fetch_data
    data = next(self.new_versions)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 149, in prepare_package_versions
    version_data)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 182, in _prepare_package_version
    tarball.uncompress(filepath, path)
  File "/usr/lib/python3/dist-packages/swh/core/tarball.py", line 163, in uncompress
    raise ValueError('File %s is not a supported archive.' % tarpath)
ValueError: File /tmp/swh.loader.npm/swh.loader.npm.13oefc8p-17173/promjs/0.1.0/promjs-0.1.0.tgz is not a supported archive.

So skip the processing of the associated package release but keep a reference to it in the
produced snapshot (i.e. indicate that we found the release but could not process it).
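
The skip-but-keep-a-reference behaviour can be illustrated with a minimal, standalone sketch. It is not the code from this diff: it uses only the Python standard library `tarfile` module rather than `swh.core.tarball`, and the function names, the `releases/<version>` branch naming and the use of `None` as the marker for an unprocessed release are all assumptions made for illustration.

```python
import tarfile
import zlib


def list_tarball_members(tarpath):
    """Return the member names of a tarball, or None if it cannot be read.

    Hypothetical helper: stands in for the real uncompress step.
    """
    try:
        with tarfile.open(tarpath) as archive:
            return archive.getnames()
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        # Corrupted, truncated, missing or otherwise unreadable archive:
        # report it to the caller instead of aborting the whole load.
        return None


def build_branches(tarballs):
    """Map each release to its content, keeping unreadable releases as None.

    `tarballs` maps a version string to the path of its downloaded tarball.
    The "releases/<version>" naming is an assumption for this sketch.
    """
    branches = {}
    for version, tarpath in tarballs.items():
        # A None value records that the release was found but not processed.
        branches["releases/%s" % version] = list_tarball_members(tarpath)
    return branches
```

With this shape, one corrupted tarball no longer turns the whole visit into a failure: the remaining versions are still processed, and the corrupted one stays visible in the result with an empty target.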

Related: T1726

Test Plan

A new test has been added.

Diff Detail

Repository
rDLDNPM npm loader

Event Timeline

anlambert created this revision. Tue, May 28, 2:14 PM

Out of curiosity, did you check what the tarball actually looks like?

ardumont accepted this revision. Tue, May 28, 2:28 PM

Sounds fine btw

This revision is now accepted and ready to land. Tue, May 28, 2:28 PM
This revision was automatically updated to reflect the committed changes.