
hangs with: ERROR:root:Unknown SWHID: 'swh:1:dir:be2222d58268b1ba08e612ed57c21c703653e653'
Closed, Migrated

Description

While working on the tutorial (T2676), I've tried the following:

$ cd swhfs/  # mounted swhfs mountpoint
$ cd archive/swh:1:dir:83ae623f312d4b803329d6da9aaddb8c38b84278/  # dir of the original doom 3 gpl release
$ sloccount .

This works for a while, fetching objects, and eventually hangs with the following error in the log:

ERROR:root:Unknown SWHID: 'swh:1:dir:be2222d58268b1ba08e612ed57c21c703653e653'

That directory, however, does exist in the archive.

It might be an actual hang, or it might be that I've hit the rate limit (I wasn't using any auth token or the VPN). Either way, the failure mode is not good, as the user has no way of knowing what's happening.

And the only error message given is misleading.

Event Timeline

zack triaged this task as High priority.
zack created this task.
zack updated the task description.

It looks like a lot of unrelated errors are hidden behind "Unknown SWHID". For instance, using the Web API via the VPN while the VPN is down results in that error too, with the syscall hanging.
There seem to be two things to fix here:

  • more appropriate error reporting (rate limit exceeded, can't connect to API server, etc.)
  • the syscall should not hang, but rather return an appropriate failure to user space (note that these hangs are bad, of the kind you can't Ctrl-C out of)
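
To make the first point concrete, here is a minimal sketch of how API failures could be told apart instead of all being logged as "Unknown SWHID". The function name `classify_api_failure` and the choice of errno values are assumptions made for illustration, not what swh-fuse actually does.

```python
import errno
from typing import Optional, Tuple


def classify_api_failure(status: Optional[int]) -> Tuple[int, str]:
    """Map an HTTP status to an (errno, message) pair.

    Hypothetical helper: `status` is the HTTP status code of the Web API
    response, or None when no response was received at all (for example
    when the VPN is down).
    """
    if status is None:
        return errno.EHOSTUNREACH, "cannot connect to the API server"
    if status == 404:
        return errno.ENOENT, "object not found in the archive"
    if status == 429:
        return errno.EAGAIN, "rate limit exceeded"
    if status in (401, 403):
        return errno.EACCES, "authentication failed or not permitted"
    # Anything else is reported as a generic I/O error, with the status kept
    # in the message so the log is not misleading.
    return errno.EIO, f"unexpected API response (HTTP {status})"
```

With something like this, only a real 404 would be reported as a missing object, and the errno could be handed back to the caller rather than leaving the syscall hanging.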

In a different test, I saw the "Too Many Requests" error, but only after pressing Ctrl-C, alongside an asyncio error:

ERROR:root:Unknown name during lookup: 'HEAD'
   unique: 1866, error: -2 (No such file or directory), outsize: 16
^CERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-45' coro=<_session_loop() done, defined at /home/zack/.virtualenvs/swh/lib/python3.8/site-packages/_pyfuse3.py:28> exception=HTTPError('429 Client Error: Too Many Requests for url: https://archive.softwareheritage.org/api/1/content/sha1_git:642be81e2679d4524e736bafe6c4576360a9fd5c/raw/')>

This might be the key issue here: a missing (async) exception handler somewhere.
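
A rough sketch of what such a handler could look like, assuming the filesystem operations are coroutines: wrap each one so that any unexpected exception is logged right away and converted into an error the FUSE layer can return. `FilesystemError`, `report_failures` and `lookup` are hypothetical names used only for this illustration; `FilesystemError` stands in for whatever error type the FUSE binding expects.

```python
import asyncio
import errno
import functools
import logging


class FilesystemError(Exception):
    """Hypothetical stand-in for the error type the FUSE layer expects."""

    def __init__(self, errno_: int):
        super().__init__(errno_)
        self.errno = errno_


def report_failures(handler):
    """Log any unexpected exception and re-raise it as FilesystemError."""

    @functools.wraps(handler)
    async def wrapper(*args, **kwargs):
        try:
            return await handler(*args, **kwargs)
        except FilesystemError:
            raise  # already in the form the caller understands
        except Exception as exc:
            # Log the real cause immediately, not only after Ctrl-C.
            logging.error("%s failed: %s", handler.__name__, exc)
            raise FilesystemError(errno.EIO) from exc

    return wrapper


@report_failures
async def lookup(swhid: str) -> bytes:
    # Simulated failure standing in for a rate-limited Web API call.
    raise RuntimeError("429 Client Error: Too Many Requests")


def demo() -> int:
    """Run the failing lookup and return the errno that surfaces."""
    try:
        asyncio.run(lookup("swh:1:dir:be2222d58268b1ba08e612ed57c21c703653e653"))
    except FilesystemError as err:
        return err.errno
    return 0
```

In this sketch the task finishes with a failure the caller can act on, so the exception does not sit in the event loop as "Task exception was never retrieved".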

haltode changed the task status from Open to Work in Progress. Oct 19 2020, 11:03 AM
haltode moved this task from Backlog to In progress on the Software Heritage filesystem board.