Page MenuHomeSoftware Heritage

Unstuck nixguix loading visits
Closed, ResolvedPublic

Description

It's stuck. Investigate and fix.

14:51 <samplet> Hi SWH!  According to the Web interface, the nixguix loader hasn’t had a successful visit of Guix’s “sources.json” in over a month.
14:52 <samplet> That seems to be true of nixpkgs, too.
14:54 <+ardumont> samplet: yes, it's only partial because there are some types of origins referenced that's not dealt with yet
14:54 <+ardumont> and also because some origins ends up being 404
14:54 <+ardumont> so as there are lots of origins in those manifest, there is a high probability it won't succeed
14:55 <+ardumont> but it does its best to ingest the remaining part
14:55 <+anlambert> ardumont: samplet is right, check visits reports of guix for instance, they are all red since november https://archive.softwareheritage.org/browse/origin/visits/?origin_url=https://guix.gnu.org/sources.json
14:55 <samplet> ardumont: Ah.  That explains all the yellow!  But there’s nothing but red in November.
14:55 <+anlambert> and it seems the loader is stopped since one month
14:56 <+ardumont> heh, so that needs an unbreaking task ;)
14:56 <+ardumont> thx for the heads up
14:56 <+ardumont> (in any case ;)
14:57 <samplet> I noticed because Guix’s SWH coverage tracking shows a noticeable drop for commits from November.
14:57 <samplet> Thanks for taking a look!

As olasd [1] pointed out, maybe start by looking at the scheduler part. Is nixguix visit still scheduled?

[1]

15:03 <+olasd> is anything even scheduling the nixguix loading tasks now?

Event Timeline

ardumont triaged this task as High priority.Dec 2 2021, 2:59 PM
ardumont created this task.
ardumont added a project: Nixguix loader.

As olasd [1] pointed out, maybe start by looking at the scheduler part. Is nixguix visit still scheduled?

yes, it is.

That loader is a bit of a mix between a lister and a loader (it has no longer a lister to depend on).
So it's scheduled the old way (contrary to other loaders).

It was stuck [1] as it can be sometimes due to some unknown identified issue in the old scheduler policy way (we moved away from for the other loaders).
It's not unstuck [2].

[1]

15:08:58 softwareheritage-scheduler@belvedere:5432=> select * from task where type = 'load-nixguix' limit 10;
+-[ RECORD 1 ]-----+------------------------------------------------------------------------------------------------------+
| id               | 337282717                                                                                            |
| type             | load-nixguix                                                                                         |
| arguments        | {"args": [], "kwargs": {"url": "https://guix.gnu.org/sources.json"}}                                 |
| next_run         | 2021-11-02 14:40:30.010285+00                                                                        |
| current_interval | 1 day                                                                                                |
| status           | next_run_scheduled                                                                                   |
| policy           | recurring                                                                                            |
| retries_left     | 0                                                                                                    |
| priority         | (null)                                                                                               |
+-[ RECORD 2 ]-----+------------------------------------------------------------------------------------------------------+
| id               | 334411727                                                                                            |
| type             | load-nixguix                                                                                         |
| arguments        | {"args": [], "kwargs": {"url": "https://nix-community.github.io/nixpkgs-swh/sources-unstable.json"}} |
| next_run         | 2021-09-27 10:45:31.329689+00                                                                        |
| current_interval | 1 day                                                                                                |
| status           | next_run_scheduled                                                                                   |
| policy           | recurring                                                                                            |
| retries_left     | 0                                                                                                    |
| priority         | (null)                                                                                               |
+-[ RECORD 3 ]-----+------------------------------------------------------------------------------------------------------+
| id               | 337282718                                                                                            |
| type             | load-nixguix                                                                                         |
| arguments        | {"args": [], "kwargs": {"url": "https://guix.gnu.org/sources.json"}}                                 |
| next_run         | 2020-12-12 01:34:38.576948+00                                                                        |
| current_interval | 1 day                                                                                                |
| status           | disabled                                                                                             |
| policy           | recurring                                                                                            |
| retries_left     | 3                                                                                                    |
| priority         | (null)                                                                                               |
+------------------+------------------------------------------------------------------------------------------------------+

Time: 63.439 ms
15:09:10 softwareheritage-scheduler@belvedere:5432=> update task set status='next_run_not_scheduled', next_run=now() where task='load-nixguix' where status !='disab

[2]

15:12:14 softwareheritage-scheduler@belvedere:5432=> update task set status='next_run_not_scheduled', next_run=now() where type='load-nixguix' and status !='disabled';
UPDATE 2
Time: 38.280 ms
15:15:31 softwareheritage-scheduler@belvedere:5432=> select * from task where type = 'load-nixguix' and status != 'disabled' limit 10;
+-[ RECORD 1 ]-----+------------------------------------------------------------------------------------------------------+
| id               | 337282717                                                                                            |
| type             | load-nixguix                                                                                         |
| arguments        | {"args": [], "kwargs": {"url": "https://guix.gnu.org/sources.json"}}                                 |
| next_run         | 2021-12-02 14:12:17.543537+00                                                                        |
| current_interval | 1 day                                                                                                |
| status           | next_run_scheduled                                                                                   |
| policy           | recurring                                                                                            |
| retries_left     | 0                                                                                                    |
| priority         | (null)                                                                                               |
+-[ RECORD 2 ]-----+------------------------------------------------------------------------------------------------------+
| id               | 334411727                                                                                            |
| type             | load-nixguix                                                                                         |
| arguments        | {"args": [], "kwargs": {"url": "https://nix-community.github.io/nixpkgs-swh/sources-unstable.json"}} |
| next_run         | 2021-12-02 14:12:17.543537+00                                                                        |
| current_interval | 1 day                                                                                                |
| status           | next_run_scheduled                                                                                   |
| policy           | recurring                                                                                            |
| retries_left     | 0                                                                                                    |
| priority         | (null)                                                                                               |
+------------------+------------------------------------------------------------------------------------------------------+

Time: 5.474 ms

Now remains to determine why the last visits are red.
I'll have a look a tad later.

It seems the process just fails on an assertion [1].
So indeed, it does not go the full way...

It seems related to extid mapping...
This needs further digging.

@vlorentz, is that failing assertion still True? (since we moved from synthetic revision to release)

[1]

Dec 02 14:43:02 worker10 python3[2786815]: [2021-12-02 14:43:02,094: ERROR/ForkPoolWorker-1] Task swh.loader.package.nixguix.tasks.LoadNixguix[e6006578-c5e7-4a8f-9fa3-26936f760d7d] raised unexpected: AssertionError([CoreSWHID(namespace='swh', scheme_version=1, object_id=b'\xa3\xa4\x89\xa8\x9e\xef\x16pu\x13\x17\x10\r+\r9\x01\xbcS&', object_type=<ObjectType.REVISION: 'rev'>), CoreSWHID(namespace='swh', scheme_version=1, object_id=b'\xa3\xa4\x89\xa8\x9e\xef\x16pu\x13\x17\x10\r+\r9\x01\xbcS&', object_type=<ObjectType.REVISION: 'rev'>)])
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/scheduler/task.py", line 55, in __call__
    result = super().__call__(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 650, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/sentry_sdk/integrations/celery.py", line 161, in _inner
    reraise(*exc_info)
  File "/usr/lib/python3/dist-packages/sentry_sdk/_compat.py", line 57, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/sentry_sdk/integrations/celery.py", line 156, in _inner
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/loader/package/nixguix/tasks.py", line 14, in load_nixguix
    return NixGuixLoader.from_configfile(url=url).load()
  File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 576, in load
    known_extids, p_info, last_snapshot_targets
  File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 327, in resolve_object_from_extids
    assert len(extid_targets) == 1, extid_targets
AssertionError: [CoreSWHID(namespace='swh', scheme_version=1, object_id=b'\xa3\xa4\x89\xa8\x9e\xef\x16pu\x13\x17\x10\r+\r9\x01\xbcS&', object_type=<ObjectType.REVISION: 'rev'>), CoreSWHID(namespace='swh', scheme_version=1, object_id=b'\xa3\xa4\x89\xa8\x9e\xef\x16pu\x13\x17\x10\r+\r9\x01\xbcS&', object_type=<ObjectType.REVISION: 'rev'>)]

@vlorentz, is that failing assertion still True? (since we moved from synthetic revision to release)

When i asked, i forgot that the query is done per type (release or revision), i was then
inclined to think that we had old revisions targetting the same directory (target) and
the new releases doing the same hence failing that assertion. As it's done per object
type, my assumption was incorrect so ok for that.

However, nothing prevents anything from having multiple releases named differently which
targets the same directory (target). Then, that assertion will fail, so, in the end, I
don't think that assertion holds.

Ah yes, indeed. Make release_extid_targets a set to deduplicate target SWHIDs, and the assertion should be fine.

Deployed with loader.core v1.2.
I've just scheduled a new nixguix load run for both guix and nixpkgs.
Let's see if that unblocks it.

Deployed and triggered another scheduling.

ardumont changed the task status from Open to Work in Progress.Dec 3 2021, 5:39 PM

It finished during the week end [1].
As a secondary check, the scheduling is not stuck [2].

[1] P1234

[2]

09:37:53 softwareheritage-scheduler@belvedere:5432=> select * from task where type='load-nixguix' and status !='disabled';
+-[ RECORD 1 ]-----+------------------------------------------------------------------------------------------------------+
| id               | 334411727                                                                                            |
| type             | load-nixguix                                                                                         |
| arguments        | {"args": [], "kwargs": {"url": "https://nix-community.github.io/nixpkgs-swh/sources-unstable.json"}} |
| next_run         | 2021-12-06 18:28:52.511228+00                                                                        |
| current_interval | 1 day                                                                                                |
| status           | next_run_not_scheduled                                                                               |
| policy           | recurring                                                                                            |
| retries_left     | 3                                                                                                    |
| priority         | (null)                                                                                               |
+-[ RECORD 2 ]-----+------------------------------------------------------------------------------------------------------+
| id               | 337282717                                                                                            |
| type             | load-nixguix                                                                                         |
| arguments        | {"args": [], "kwargs": {"url": "https://guix.gnu.org/sources.json"}}                                 |
| next_run         | 2021-12-06 10:40:20.53655+00                                                                         |
| current_interval | 1 day                                                                                                |
| status           | next_run_not_scheduled                                                                               |
| policy           | recurring                                                                                            |
| retries_left     | 3                                                                                                    |
| priority         | (null)                                                                                               |
+------------------+------------------------------------------------------------------------------------------------------+