
Loading tasks are currently hanging
Closed, Resolved (Public)

Description

After updating my docker environment, I noticed that every loading task hangs when trying to add an origin to the storage.

This is also why the docker tests have been failing for a couple of days: https://jenkins.softwareheritage.org/view/all/job/swh-docker-dev/

I managed to track down the instruction in swh-storage that makes every loading task hang: it is related to writing to the journal.

Judging by the date the issue first appeared in the docker tests, it seems related to the latest released version of swh-journal (0.0.16).

It looks like the issue also appears in production, as no "Save code now" requests have succeeded in the last four days: https://archive.softwareheritage.org/save/#requests

Event Timeline

anlambert triaged this task as High priority. Sep 18 2019, 2:42 PM
anlambert created this task.
olasd added a subscriber: olasd. Sep 18 2019, 3:03 PM

The production issue with the "Save code now" feature is a separate one: the git loaders are all stuck connected to bitbucket.org:443, waiting for it to send them data.

Looks like T1962 wasn't a mercurial-specific issue but a bitbucket.org-specific one, so we should implement the timeout for git clone as well...
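For illustration, a minimal sketch of such a timeout, assuming the clone is run through the `git` command-line client via Python's `subprocess` module. The helper names `run_with_timeout` and `clone_with_timeout` and the one-hour default are hypothetical, and the actual loader may fetch repositories through a different mechanism:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd; return True on success, False on failure or timeout.

    Hypothetical helper, not part of any swh package.
    """
    try:
        subprocess.run(
            cmd,
            check=True,
            timeout=timeout_seconds,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return False
    except subprocess.CalledProcessError:
        # The command exited with a non-zero status.
        return False
    return True


def clone_with_timeout(url, dest, timeout_seconds=3600):
    """Clone url into dest, giving up after timeout_seconds (assumed default)."""
    return run_with_timeout(["git", "clone", url, dest], timeout_seconds)


# Stand-in for a stalled remote: a command that sleeps longer than the timeout.
slow_cmd = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_timeout(slow_cmd, timeout_seconds=0.5))
```

Only the direct child process is killed on timeout; helper processes spawned by git may need separate handling, for example by running the clone in its own process group.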

The only swh.journal behavior change (save from the underlying library) is that the direct writer flushes all its additions to the journal before returning; I guess that has unintended consequences on transient environments.

olasd closed this task as Resolved. Sep 30 2019, 12:28 PM
olasd claimed this task.

I think both issues have been solved separately.