archive-loader: add support for .tar.Z and .tar.lz tarball types
Closed, Migrated

Description

The current run of the archive loader (with GNU origins) shows that support for some tarball types is missing.

Examples follow (the list may not be exhaustive):

Nov 27 11:16:18 worker15 python3[13194]: [2019-11-27 11:16:18,693: ERROR/ForkPoolWorker-2] Fail to load https://ftp.gnu.org/gnu/gettext/
                                         Traceback (most recent call last):
                                           File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 299, in load
                                             dl_artifacts, dest=tmpdir)
                                           File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 212, in uncompress
                                             uncompress(a_path, dest=uncompressed_path)
                                           File "/usr/lib/python3/dist-packages/swh/core/tarball.py", line 164, in uncompress
                                             raise ValueError('File %s is not a supported archive.' % tarpath)
                                         ValueError: File /tmp/tmpx5y4rxia/gettext-0.19.4.tar.lz is not a supported archive.
Nov 27 11:18:13 worker12 python3[14766]: [2019-11-27 11:18:13,216: ERROR/ForkPoolWorker-2] Fail to load https://ftp.gnu.org/gnu/groff/
                                         Traceback (most recent call last):
                                           File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 299, in load
                                             dl_artifacts, dest=tmpdir)
                                           File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 212, in uncompress
                                             uncompress(a_path, dest=uncompressed_path)
                                           File "/usr/lib/python3/dist-packages/swh/core/tarball.py", line 164, in uncompress
                                             raise ValueError('File %s is not a supported archive.' % tarpath)
                                         ValueError: File /tmp/tmph7sqy64t/groff-1.02.tar.Z is not a supported archive.

Event Timeline

ardumont renamed this task from Improve tarball support to archive-loader: Improve tarball support. Nov 27 2019, 12:22 PM
ardumont triaged this task as Normal priority.
ardumont created this task.
ardumont updated the task description.
ardumont added a project: Core Loader.
zack renamed this task from archive-loader: Improve tarball support to archive-loader: add support for .tar.Z and .tar.lz tarball types. Nov 27 2019, 12:25 PM

loader-mercurial, for one, uses the patool dependency [1], which handles more archive types.

[1] https://github.com/wummel/patool#patool

shutil (standard library) could be more appropriate [1],
since its API allows registering additional unpack formats.

[1] https://docs.python.org/3.7/library/shutil.html#archiving-operations
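A minimal sketch of what registering the two missing formats with shutil could look like. This is not the actual swh.core.tarball code: the helper names (`_unpack_with_tar`, `register_extra_tarball_formats`) are hypothetical, and the sketch assumes a system `tar` that can detect lzip and compress archives on its own (GNU tar does this when the matching decompressors are installed).

```python
import shutil
import subprocess


def _unpack_with_tar(filename, extract_dir):
    """Hypothetical helper: delegate extraction to the system tar.

    Assumption: the installed tar auto-detects .lz and .Z compression
    and can call the matching external decompressor.
    """
    subprocess.run(["tar", "-xf", filename, "-C", extract_dir], check=True)


def register_extra_tarball_formats():
    """Register .tar.lz and .tar.Z with shutil, once per process."""
    # get_unpack_formats() returns (name, extensions, description) tuples.
    registered = {name for name, _, _ in shutil.get_unpack_formats()}
    for name, extensions in (("tar.lz", [".tar.lz"]), ("tar.Z", [".tar.Z"])):
        if name not in registered:
            shutil.register_unpack_format(
                name, extensions, _unpack_with_tar,
                description="%s archive (via system tar)" % name,
            )


register_extra_tarball_formats()
```

After registration, `shutil.unpack_archive(path, extract_dir)` dispatches on the file extension, so a file such as gettext-0.19.4.tar.lz would be handed to the helper instead of being rejected as an unknown format.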

ardumont claimed this task.

Deployed.