
Status "scheduled" over hours, but archiving doesn't seem to start
Closed, Resolved (Public)

Description

There are two scheduled save requests (3.9.2021, 02:08:04 and 3.9.2021, 06:58:36) for which the archiving process hasn't started. Possibly an error, but the "Info" row of the list of save requests gives no further information. Thanks for checking!

Update (3.9.2021, 08:40): the second mentioned request has succeeded.

Event Timeline

mdidas created this object in space S1 Public.
vlorentz triaged this task as Unbreak Now! priority. Sep 3 2021, 10:52 AM
vlorentz added a project: Save Code Now.
vlorentz added a subscriber: vlorentz.

@mdidas what are the repositories you requested?

(was looking into it)
I'd say this one qualifies (in UTC+2):

11:09:40 swh-web@belvedere:5432=> select * from save_origin_request where loading_task_status='scheduled' ;
+-------+-------------------------------+------------+----------------------------------------------+----------+-----------------+------------+---------------------+--------------+----------+
|  id   |         request_date          | visit_type |                  origin_url                  |  status  | loading_task_id | visit_date | loading_task_status | visit_status | user_ids |
+-------+-------------------------------+------------+----------------------------------------------+----------+-----------------+------------+---------------------+--------------+----------+
| 91474 | 2021-09-03 00:08:04.846377+00 | git        | https://github.com/beraute/Klamath-mountains | accepted |       398756284 | (null)     | scheduled           | (null)       | (null)   |
+-------+-------------------------------+------------+----------------------------------------------+----------+-----------------+------------+---------------------+--------------+----------+
(1 row)

For some unknown reason, that particular origin is scheduled, but I don't see any log for that run.
I re-triggered it and it finished almost immediately (uneventful).

And now it's no longer marked as scheduled in the save code now db.
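For what it's worth, the query above lists every request in the "scheduled" state, including ones that were only just submitted. A rough sketch of a variant that flags only requests stuck longer than some threshold is below. It is an illustration under assumptions, not what runs in production: it uses an in-memory SQLite table as a stand-in for the real PostgreSQL one, keeps only the columns it needs, truncates the request date to seconds, and the function name `stale_scheduled_requests`, the one-hour threshold, and the two example.org rows (including the "succeeded" status value) are made up for the example.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def stale_scheduled_requests(conn, now, max_age):
    """Return (id, origin_url, request_date) for requests still 'scheduled' after max_age."""
    # Dates are stored as ISO 8601 strings in UTC with identical formatting,
    # so plain string comparison orders them chronologically.
    cutoff = (now - max_age).isoformat()
    rows = conn.execute(
        "SELECT id, origin_url, request_date FROM save_origin_request "
        "WHERE loading_task_status = 'scheduled' AND request_date < ? "
        "ORDER BY request_date",
        (cutoff,),
    )
    return rows.fetchall()


# In-memory stand-in for the real table (assumption: only the columns used here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE save_origin_request ("
    "id INTEGER PRIMARY KEY, request_date TEXT, origin_url TEXT, "
    "loading_task_status TEXT)"
)
conn.executemany(
    "INSERT INTO save_origin_request VALUES (?, ?, ?, ?)",
    [
        # Row from the thread, request date truncated to seconds.
        (91474, "2021-09-03T00:08:04+00:00",
         "https://github.com/beraute/Klamath-mountains", "scheduled"),
        # Hypothetical rows, added only to show what gets filtered out.
        (2, "2021-09-03T08:50:00+00:00", "https://example.org/recent.git", "scheduled"),
        (3, "2021-09-03T01:00:00+00:00", "https://example.org/done.git", "succeeded"),
    ],
)

# 11:09:40 in UTC+2, i.e. the time shown in the psql prompt above.
now = datetime(2021, 9, 3, 9, 9, 40, tzinfo=timezone.utc)
stale = stale_scheduled_requests(conn, now, timedelta(hours=1))
print(stale)
```

Only request 91474 comes back here: the other scheduled request is younger than the threshold, and the third is no longer scheduled.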

I actually requested https://github.com/tue-alga/CoordinatedSchematization, which did not start for 45 minutes or so (which is unusual). I then noticed that the one above had also been in the "scheduled" state for hours and decided to open a task here.

> I actually requested https://github.com/tue-alga/CoordinatedSchematization, which did
> not start for 45 minutes or so (which is unusual). I then noticed that the one
> above had also been in the "scheduled" state for hours and decided to open a task here.

If there are more messages in the save code now queues than workers consuming them,
new repositories can be ingested with some delay. How long that delay is cannot be
determined in advance, because it depends on the repositories being ingested (some
can take hours).
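To make the effect concrete, here is a toy sketch of how a few long loads ahead in the queue can delay a small one. It assumes a strict first-in, first-out queue and a fixed number of workers, which may not match how the actual scheduler and workers behave; the function name `start_delays`, the durations, and the worker counts are invented for illustration.

```python
import heapq


def start_delays(durations, n_workers):
    """For each queued job (FIFO order), return how long it waits before a worker picks it up."""
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * n_workers
    heapq.heapify(free_at)
    delays = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker takes the job
        delays.append(start)
        heapq.heappush(free_at, start + duration)
    return delays


# Hypothetical backlog, durations in minutes: two 3-hour loads ahead of two small ones.
backlog = [180, 180, 1, 1]
print(start_delays(backlog, n_workers=2))  # [0.0, 0.0, 180.0, 180.0]
print(start_delays(backlog, n_workers=3))  # [0.0, 0.0, 0.0, 1.0]
```

With two workers, both small repositories wait three hours behind the long loads; with a third worker, they start almost immediately.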

I've opened a task to try and monitor this pattern [1].

Thanks for raising the concern.

[1] T3548

ardumont claimed this task.