vault timeout on cooking revision_gitfast for repositories with numerous number of revisions
Closed, Migrated (Edits Locked)

Description

From: Software Heritage Vault <bot@softwareheritage.org>
Date: Fri, Jul 26, 2019 at 14:34
Subject: Bundle failed: revision_gitfast e2fb236
To: <roberto@dicosmo.org>


You have requested the following bundle from the Software Heritage
Vault:

Object Type: revision_gitfast
Object ID: e2fb236a3a2b4d026e27c0e65e1f6d1898c6cbee

This bundle could not be cooked for the following reason:

Timeout reached while assembling the requested bundle

We apologize for the inconvenience.

Event Timeline

ardumont triaged this task as Normal priority. Jul 26 2019, 3:22 PM
ardumont created this task.

First, check and try to reproduce the failure. To do so, look up the id of the scheduler task behind the bundle and reschedule it:

$ psql service=swh-vault  # provided you have the right .pg_service.conf and .pgpass config
swh-vault=> select * from vault_bundle where type='revision_gitfast' and object_id='\xe2fb236a3a2b4d026e27c0e65e1f6d1898c6cbee';
   id    |       type       |                 object_id                  |  task_id  | task_status | sticky |          ts_created          | ts_done |        ts_last_access        |                     progress_msg
---------+------------------+--------------------------------------------+-----------+-------------+--------+------------------------------+---------+------------------------------+-------------------------------------------------------
 3297243 | revision_gitfast | \xe2fb236a3a2b4d026e27c0e65e1f6d1898c6cbee | 195003985 | failed      | f      | 2019-07-26 12:34:27.81315+00 |         | 2019-07-26 12:34:27.81315+00 | Timeout reached while assembling the requested bundle
(1 row)

$ workon swh
$ SCHEDULER_API_URL=http://saatchi.internal.softwareheritage.org:5008/; swh scheduler --url $SCHEDULER_API_URL task respawn 195003985
/home/tony/.virtualenvs/swh/lib/python3.7/site-packages/swh/scheduler/__init__.py:69: DeprecationWarning: Call to deprecated class SWHRemoteAPI. (Use the RPCClient instead) -- Deprecated since version 0.0.64.
  return SchedulerBackend(**args)
Respawn tasks ('195003985',)
ardumont renamed this task from "vault timeout on cooking revision_gitfast" to "vault timeout on cooking revision_gitfast for repositories with numerous number of revisions". Jul 26 2019, 3:29 PM
ardumont updated the task description.

The issue comes from the fact that the vault tries to retrieve the whole revision log in a single call to the storage API.

In the above case, the number of revisions to retrieve is quite large (> 38000), so the underlying request to the storage database times out, as the database is configured to enforce a request timeout.

The proper solution is to take the same approach as in T1177: retrieve the revision log in a paginated way on the client side, so that no single request runs long enough to reach the timeout.

Fortunately, revision walkers able to perform that task have already been implemented in the storage, so this should not be complicated to fix.
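
To illustrate the idea of client-side pagination, here is a minimal sketch. It is not the vault's actual fix and it does not use the existing revision walkers. FakeStorage, its revision_log(heads, limit) method, the revision dictionaries, and the page_size parameter are all hypothetical stand-ins, assumed only for this example. The client repeatedly asks for a bounded slice of the log and resumes from the parents it has not seen yet, so every individual request stays small.

```python
from collections import deque
from typing import Dict, Iterable, Iterator, List

Revision = Dict[str, object]  # hypothetical shape: {"id": str, "parents": [str, ...]}


class FakeStorage:
    """In-memory stand-in for a storage client (assumption, not the real API)."""

    def __init__(self, revisions: Iterable[Revision]) -> None:
        self._revs = {rev["id"]: rev for rev in revisions}
        self.calls = 0          # number of requests issued
        self.max_returned = 0   # largest page ever returned

    def revision_log(self, heads: List[str], limit: int) -> List[Revision]:
        """Return at most `limit` revisions reachable from `heads`, breadth first."""
        self.calls += 1
        unique_heads = list(dict.fromkeys(heads))
        seen, queue, page = set(unique_heads), deque(unique_heads), []
        while queue and len(page) < limit:
            rev = self._revs[queue.popleft()]
            page.append(rev)
            for parent in rev["parents"]:
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        self.max_returned = max(self.max_returned, len(page))
        return page


def iter_revision_log(storage, head: str, page_size: int = 1000) -> Iterator[Revision]:
    """Yield every revision reachable from `head`, one bounded request at a time."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    seen = set()
    frontier = [head]
    while frontier:
        page = storage.revision_log(frontier, limit=page_size)
        for rev in page:
            if rev["id"] not in seen:  # a later page may overlap an earlier one
                seen.add(rev["id"])
                yield rev
        # Resume from heads the page did not reach, plus parents of what it returned.
        candidates = list(frontier)
        for rev in page:
            candidates.extend(rev["parents"])
        frontier = [rid for rid in dict.fromkeys(candidates) if rid not in seen]


def build_example_history() -> List[Revision]:
    """Linear history r0..r9, plus a side branch b1 (from r4) merged in r10."""
    revs = [{"id": "r0", "parents": []}]
    revs += [{"id": f"r{i}", "parents": [f"r{i - 1}"]} for i in range(1, 10)]
    revs.append({"id": "b1", "parents": ["r4"]})
    revs.append({"id": "r10", "parents": ["r9", "b1"]})
    return revs


if __name__ == "__main__":
    storage = FakeStorage(build_example_history())
    ids = [rev["id"] for rev in iter_revision_log(storage, "r10", page_size=3)]
    print(len(ids), "revisions fetched in", storage.calls, "requests")
```

The client keeps the set of already seen ids, so overlapping pages (which happen around merges) never produce duplicates, and a smaller page_size trades more round trips for shorter individual requests.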