package.loader: catch the EOFError exception in uncompress function
Closed, Public

Authored by lewo on Mar 20 2020, 11:29 AM.

Details

Summary

This exception was seen on staging, but unfortunately we were not
able to reproduce it. It could be related to underlying storage
issues. Here is the traceback:

Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 291, in load
  self._load_revision(p_info, origin)
File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 368, in _load_revision
  uncompressed_path = self.uncompress(dl_artifacts, dest=tmpdir)
File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 211, in uncompress
  uncompress(a_path, dest=uncompressed_path)
File "/usr/lib/python3/dist-packages/swh/core/tarball.py", line 72, in uncompress
  shutil.unpack_archive(tarpath, extract_dir=dest)
File "/usr/lib/python3.7/shutil.py", line 999, in unpack_archive
  func(filename, extract_dir, **kwargs)
File "/usr/lib/python3.7/shutil.py", line 934, in _unpack_tarfile
  tarobj.extractall(extract_dir)
File "/usr/lib/python3.7/tarfile.py", line 2002, in extractall
  numeric_owner=numeric_owner)
File "/usr/lib/python3.7/tarfile.py", line 2044, in extract
  numeric_owner=numeric_owner)
File "/usr/lib/python3.7/tarfile.py", line 2114, in _extract_member
  self.makefile(tarinfo, targetpath)
File "/usr/lib/python3.7/tarfile.py", line 2163, in makefile
  copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
File "/usr/lib/python3.7/tarfile.py", line 247, in copyfileobj
  buf = src.read(bufsize)
File "/usr/lib/python3.7/bz2.py", line 178, in read
  return self._buffer.read(size)
File "/usr/lib/python3.7/_compression.py", line 68, in readinto
  data = self.read(len(byte_view))
File "/usr/lib/python3.7/_compression.py", line 99, in read
  raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached

This happened when decompressing the file
https://downloads.sourceforge.net//asio/asio-1.12.1.tar.bz2 but the
loader executed locally can decompress it without any issue.
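For illustration, here is a minimal sketch of what catching this exception around the unpack step could look like. It is not the code of this diff: `try_uncompress` and its `unpack` parameter are hypothetical names, and returning a boolean is only one possible way to report the failure.

```python
import logging
import shutil
import tarfile

logger = logging.getLogger(__name__)


def try_uncompress(archive_path, dest, unpack=shutil.unpack_archive):
    """Unpack archive_path into dest; return True on success, False otherwise.

    Hypothetical helper for illustration, not the actual swh.loader code.
    """
    try:
        unpack(archive_path, extract_dir=dest)
    except (EOFError, tarfile.ReadError, shutil.ReadError) as exc:
        # EOFError: the compressed stream stopped before its end-of-stream
        # marker, as in the traceback above. The two ReadError classes cover
        # archives that cannot be opened at all.
        logger.warning("Could not uncompress %s: %s", archive_path, exc)
        return False
    return True
```

Listing the exception types explicitly keeps unrelated bugs visible instead of hiding them behind a bare `except`.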

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Branch
catch-EOFError
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 11256
Build 17010: tox-on-jenkins (Jenkins)
Build 17009: arc lint + arc unit

Event Timeline

Did you try truncating a .tar file to reproduce the issue?

swh/loader/package/loader.py, line 381

You can add a test on the functional loader that patches the uncompress method to raise this exception.
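A rough sketch of such a test, using `unittest.mock`. `ToyLoader` is a hypothetical stand-in and not the swh.loader API; the status values are illustrative only.

```python
from unittest import mock


class ToyLoader:
    """Hypothetical stand-in for a package loader; not the swh.loader API."""

    def uncompress(self, archive_path, dest):
        raise NotImplementedError("real extraction is out of scope here")

    def load(self, archive_path, dest):
        try:
            self.uncompress(archive_path, dest)
        except EOFError:
            # Truncated archive: report a failed visit instead of crashing.
            return {"status": "failed"}
        return {"status": "eventful"}


def test_load_reports_failure_on_truncated_archive():
    loader = ToyLoader()
    error = EOFError(
        "Compressed file ended before the end-of-stream marker was reached"
    )
    # Replace uncompress on this instance so that calling it raises EOFError.
    with mock.patch.object(loader, "uncompress", side_effect=error) as patched:
        result = loader.load("asio-1.12.1.tar.bz2", "unused-dest")
    patched.assert_called_once()
    assert result == {"status": "failed"}
```

Patching the method avoids shipping a truncated archive as a test fixture, at the cost of not exercising the real decompression path.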

> Did you try truncating a .tar file to reproduce the issue?

No, and that's a good idea!
I just tried, but unfortunately I could not reproduce it ;(

I actually succeeded with a bigger truncated file: 70KB
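A possible way to script this reproduction is sketched below. The function names, payload size and truncation ratio are assumptions chosen for the example. One plausible reason a small truncated file does not trigger the error: if the cut falls inside the first bzip2 block, `tarfile` cannot even read the first header and reports a read error instead of letting `EOFError` through, so the sketch writes several small blocks and cuts after the first one.

```python
import io
import os
import shutil
import tarfile
import tempfile


def make_truncated_tar_bz2(directory, payload_size=500_000, keep_ratio=0.5):
    """Write a .tar.bz2 holding one incompressible file, then truncate it."""
    archive = os.path.join(directory, "truncated.tar.bz2")
    payload = os.urandom(payload_size)  # random bytes barely compress
    info = tarfile.TarInfo(name="payload.bin")
    info.size = len(payload)
    # compresslevel=1 selects 100 kB bzip2 blocks, so several blocks are written.
    with tarfile.open(archive, "w:bz2", compresslevel=1) as tar:
        tar.addfile(info, io.BytesIO(payload))
    # Keep only the first part of the file: the tar header is still readable,
    # but the end-of-stream marker is gone.
    with open(archive, "r+b") as f:
        f.truncate(int(os.path.getsize(archive) * keep_ratio))
    return archive


def reproduce():
    """Return the EOFError message if unpacking fails as expected, else None."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = make_truncated_tar_bz2(tmp)
        try:
            shutil.unpack_archive(archive, extract_dir=os.path.join(tmp, "out"))
        except EOFError as exc:
            return str(exc)
    return None
```

Calling `reproduce()` is expected to return the same message as in the traceback from the summary.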

I added a test for the EOFError.

D2860 and especially D2862 are going your way ;)

What about this diff? I think it's still valuable since it catches errors more precisely and adds a test case ;)
WDYT?

Yes, I think we can land it after D2860 and D2862 ;)

D2862 will remove the try/except in _load_revision and capture everything in the main loop.
So in effect, this diff will only add a test scenario and not touch the code.
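To make the idea concrete, here is a sketch of a main loop that captures every failure in one place. This is not the code of D2862: `load_all` and `load_one` are hypothetical names used only for the example.

```python
def load_all(packages, load_one):
    """Call load_one on each package and collect failures instead of aborting.

    Hypothetical illustration of a single try/except in the main loop.
    """
    failures = []
    for package in packages:
        try:
            load_one(package)
        except Exception as exc:  # broad on purpose: one bad package must not stop the rest
            failures.append((package, exc))
    return failures
```

With this shape, an `EOFError` raised deep inside the uncompress step is recorded for that package only, and the loop moves on to the next one.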

Needs a rebase now ;)

And then I'll accept it.

This revision now requires changes to proceed. Mar 23 2020, 1:03 PM

Update commit message to only mention the test

This revision is now accepted and ready to land. Mar 23 2020, 2:35 PM