Several types of issues have been reported in our Sentry instance since the loading
started. They are opened here so they are publicly shareable [1] (not investigated yet).
I've added [3], which contains the full last log of the failing worker; it should give the failing origin plus the actual issue encountered.
- Problem during unpacking ...mssql-2.1.0.tbz. Reason: Unknown archive format '.../mssql-2.1.0.tbz'...
- https://sentry.softwareheritage.org/share/issue/e06cfad6d7b84dafabdb6c5f1e2ddb38/ IsADirectoryError([Errno 21] Is a directory: '/tmp/tmp1sq9qtky/')
- https://sentry.softwareheritage.org/share/issue/3a2db2bdcceb4421a29d235685b84e81/ OSError([Errno 36] File name too long: '/tmp/tmpd3fex7_2/weberizer-0.6.2.tar.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5BA2674WEWV2CIOD%2F20210917%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210917T070403Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=d5a0693b69f150d751c884bf44fc35568f7fb1339b038958b2021873b0d10cb8')
- Problem during unpacking /tmp/tmpnof815ac/ec34f9a8d1ee28130bed89ea486cf168 Reason: Unknown archive format '/tmp/tmpnof815ac/ec34f9a8d1ee28130bed89ea486cf168'
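The "File name too long" error above shows a local file name that includes the whole signed-URL query string. This has not been investigated, so the following is only a sketch of one possible workaround, assuming the download file name is derived from the full URL. The helper name `filename_from_url` is hypothetical and is not part of any existing loader code.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url: str) -> str:
    """Return the last path component of a URL, ignoring query and fragment.

    Hypothetical helper: the name and behavior are assumptions made for
    illustration only.
    """
    # urlsplit separates the path from the "?X-Amz-..." query string
    path = urlsplit(url).path
    name = posixpath.basename(unquote(path))
    if not name:
        raise ValueError(f"Cannot derive a file name from {url!r}")
    return name
```

With this approach, a signed URL ending in `weberizer-0.6.2.tar.gz?X-Amz-...` would be stored locally as `weberizer-0.6.2.tar.gz`.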
From afar, we cannot do much about the 404 errors (see P1117#7495 for the origins in question).
We have at least 2 unsupported archive formats: "rpm" and "tbz" ("tbz2"). Fixing those
sounds like the most important step. Plus, it's beneficial for other package loaders (e.g.
archive, cran, pypi, nixguix, ...).
The connection errors might be worked around by adding some retry decorators like those
already existing in the lister.
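The "Unknown archive format" message matches the one raised by Python's `shutil.unpack_archive`. As a minimal sketch, assuming the loaders unpack through `shutil` (not verified here), the `.tbz` extension could be registered as an extra unpack format. To my knowledge the standard library only associates `.tar.bz2` and `.tbz2` with bzip2 tarballs by default. The names `_unpack_tbz` and `register_tbz_format` are hypothetical; "rpm" would need a different approach and is not covered.

```python
import shutil
import tarfile


def _unpack_tbz(filename, extract_dir, **kwargs):
    """Unpack a bzip2-compressed tarball into extract_dir."""
    with tarfile.open(filename, "r:bz2") as tar:
        if hasattr(tarfile, "data_filter"):
            # Safer extraction on Python versions that support filters
            tar.extractall(extract_dir, filter="data")
        else:
            tar.extractall(extract_dir)


def register_tbz_format():
    """Register '.tbz' with shutil unless some format already claims it."""
    known_extensions = {
        ext for _, extensions, _ in shutil.get_unpack_formats() for ext in extensions
    }
    if ".tbz" not in known_extensions:
        shutil.register_unpack_format(
            "tbz", [".tbz"], _unpack_tbz, description="bzip2'ed tar-file (.tbz)"
        )
```

After calling `register_tbz_format()` once at import time, `shutil.unpack_archive("pkg.tbz", target_dir)` would dispatch to the tar/bzip2 unpacker instead of failing on the extension.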
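As an illustration of the retry idea, here is a minimal standard-library sketch of a decorator that retries on connection errors with exponential backoff. It is not the lister's actual decorator; the name `retry_on_connection_error` and its parameters are assumptions. Note that HTTP client libraries may raise their own exception classes, so the `exceptions` tuple would need adapting.

```python
import functools
import time


def retry_on_connection_error(
    max_attempts=3, base_delay=1.0, exceptions=(ConnectionError,), sleep=time.sleep
):
    """Retry the wrapped function on the given exceptions.

    Hypothetical helper: waits base_delay, 2*base_delay, 4*base_delay, ...
    between attempts and re-raises after max_attempts failures.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # give up: let the caller see the error
                    sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator
```

A download function decorated with `@retry_on_connection_error(max_attempts=5)` would then survive a few transient failures before the task is marked as failed.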
[1] Kibana is not publicly accessible, so sharing my dashboard was not that helpful...
[2] Full extract of all events "so far" in F4628907 (contains more than just opam tasks).
[3] F4628948
[4] http://kibana0.internal.softwareheritage.org:5601/goto/079dbfb481d31f3a86b8f41c3133e884 (staff only though)