staging: current opam loading issues
Closed, Migrated

Description

Multiple types of issues are currently reported in our sentry instance since the loading
started. Opened here so they are publicly shareable [1] (not investigated).

I've added [3], which contains the full latest log of the failing worker; it should give the failing origin plus the actual issue encountered.

  • https://sentry.softwareheritage.org/share/issue/3a2db2bdcceb4421a29d235685b84e81/ OSError([Errno 36] File name too long: '/tmp/tmpd3fex7_2/weberizer-0.6.2.tar.gz?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5BA2674WEWV2CIOD%2F20210917%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20210917T070403Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=d5a0693b69f150d751c884bf44fc35568f7fb1339b038958b2021873b0d10cb8')
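Judging from the path in that traceback, the presigned-URL query string ends up in the local file name. A minimal sketch of one way to avoid this, assuming the temporary file name is derived from the download URL (`filename_from_url` is a hypothetical helper, not the loader's actual code):

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url: str, default: str = "download") -> str:
    """Return the last path component of `url`, ignoring query and fragment.

    Hypothetical helper: falls back to `default` when the URL path has no
    usable last component (e.g. it ends with a slash).
    """
    path = urlsplit(url).path
    name = posixpath.basename(unquote(path))
    return name or default


# Example URL shaped like the one in the traceback (host and path are made up).
url = (
    "https://example.org/archives/weberizer-0.6.2.tar.gz"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=300"
)
print(filename_from_url(url))  # weberizer-0.6.2.tar.gz
```

Dropping the query string keeps the name well under the usual 255-byte limit for a single path component, while the full URL can still be used for the download itself.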

From afar, there is not much we can do about the 404 errors (see P1117#7495 for the origins in question).

We have at least 2 unsupported archive formats: "rpm" and "tbz" ("tbz2"). Fixing those
sounds like the most important step. Plus, it would benefit other package loaders (e.g.
archive, cran, pypi, nixguix, ...).
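For the "tbz" case, here is a minimal sketch assuming archives are unpacked through `shutil.unpack_archive` (an assumption, not confirmed in this task). The function names `unpack_bz2_tarball` and `register_tbz_format` are hypothetical; rpm would need a different approach and is not covered here.

```python
import shutil
import tarfile


def unpack_bz2_tarball(filename: str, extract_dir: str) -> None:
    """Extract a bzip2-compressed tarball into `extract_dir`.

    Note: tarballs from untrusted sources should be extracted with care;
    this sketch does no path sanitization of its own.
    """
    with tarfile.open(filename, mode="r:bz2") as archive:
        archive.extractall(extract_dir)


def register_tbz_format() -> None:
    """Teach shutil.unpack_archive about the '.tbz' extension (idempotent)."""
    registered = shutil.get_unpack_formats()
    names = {name for name, _exts, _desc in registered}
    extensions = {ext for _name, exts, _desc in registered for ext in exts}
    if "tbz" in names or ".tbz" in extensions:
        return  # already handled, nothing to do
    shutil.register_unpack_format(
        "tbz", [".tbz"], unpack_bz2_tarball, description="bzip2'ed tar-file (.tbz)"
    )
```

After calling `register_tbz_format()` once at start-up, `shutil.unpack_archive("pkg.tbz", "some/dir")` picks the format from the file extension like it does for the built-in ones.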

The connection errors might be worked around by adding retry decorators like those
that already exist in the lister.
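To illustrate the idea, here is a minimal stdlib-only retry decorator with exponential backoff. It is a sketch, not the lister's actual decorators; `retry_on_error` and its parameters are hypothetical names. The exception types are configurable because HTTP libraries often raise their own connection-error classes rather than the built-in `ConnectionError`.

```python
import functools
import time


def retry_on_error(
    exceptions=(ConnectionError,), max_attempts=3, base_delay=1.0, sleep=time.sleep
):
    """Retry the decorated function when one of `exceptions` is raised.

    Waits base_delay, then 2 * base_delay, and so on between attempts, and
    re-raises the last error once `max_attempts` calls have failed.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise
                    sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator


@retry_on_error(max_attempts=3, base_delay=0.5)
def fetch(url: str) -> bytes:
    """Placeholder for a download call that may raise ConnectionError."""
    raise ConnectionError(f"cannot reach {url}")
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.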

[1] kibana is not opened so my dashboard opening was not that helpful...

[2] full extract of all events "so far" in F4628907 (contains more than just opam tasks).

[3] F4628948

[4] http://kibana0.internal.softwareheritage.org:5601/goto/079dbfb481d31f3a86b8f41c3133e884 (staff only though)

Event Timeline

ardumont triaged this task as Normal priority. Aug 5 2021, 6:39 PM
ardumont created this task.
ardumont updated the task description.
ardumont updated the task description.

I'll trigger a new run of loading the opam origins in staging so the dataset of issues is updated.

P1158 and P1159 with some updated errors from the last run.

I'll update those tomorrow as the run is still ongoing.

The ingestion is done (queue for opam tasks in staging scheduler is empty).

I updated the pastes with the 404 [1] and unpacking [2] errors
(so they now reference all errors from that last run).

[1] P1158

[2] P1159

@anlambert fixed plenty of issues including some from this ticket (thanks a bunch).

I'm planning on deploying those changes soon (around noon; I'm on something else currently).

And I'll trigger another run on staging after that ;)

done.

Heads up, it seems the main issues mentioned above have subsided.
The run is still ongoing but the trend seems to be going the right way.

I'm looking at the kibana dashboard again [1]

[1] http://kibana0.internal.softwareheritage.org:5601/goto/82def232f7c606a05e2b451066a948e3

vlorentz updated the task description.

For the rpm support, [1] may help.

In the end, @anlambert pointed out to me that this error does not come from the opam loader.
Our sentry instance aggregates issues across package loaders: as they are all part of
swh-loader-core, they are seen as one.
So there is no need for rpm support in the end, as antoine fixed it for the pypi loader (by dismissing those archives, iirc).

Deployed the multiple fixes @anlambert and I made, along with @mclovin's upstream fixes.
We should be good.

Closing this now.

ardumont changed the task status from Open to Work in Progress. Sep 20 2021, 2:44 PM
ardumont closed this task as Resolved.
ardumont claimed this task.
ardumont moved this task from Backlog to Weekly backlog on the System administration board.
ardumont moved this task from Weekly backlog to in-progress on the System administration board.
ardumont moved this task from deployed/landed/monitoring to done on the System administration board.