@douardda noticed that https://sentry.softwareheritage.org/share/issue/99d67860c3484c7ab709154962ca8eb6/ shows a considerable increase in the number of "NotFound" repositories on GitHub, since 2022-06-15 or 2022-06-16.
This may not be an issue, but I find it surprising.
I have looked at one such origin in particular: https://sentry.softwareheritage.org/organizations/swh/issues/10253/events/47b2b8f714364acea483c7e16f3a4ffb/
The loader started visiting on "2022-06-21 07:51:02,699" (according to breadcrumbs in Sentry).
The scheduler entry for this origin is:
softwareheritage-scheduler=> select * from listed_origins where url='https://github.com/Stanley-Ezeaku/kotlin';
-[ RECORD 1 ]----------+-----------------------------------------
lister_id              | 6632ef5e-322b-402b-8f28-d090f76ed6b7
url                    | https://github.com/Stanley-Ezeaku/kotlin
visit_type             | git
extra_loader_arguments | {}
enabled                | f
first_seen             | 2021-06-10 02:15:29.470435+00
last_seen              | 2022-06-21 07:51:03.813845+00
last_update            | 2020-02-27 09:11:58+00
and the associated lister:
softwareheritage-scheduler=> select * from listers where id='6632ef5e-322b-402b-8f28-d090f76ed6b7';
-[ RECORD 1 ]-+-------------------------------------
id            | 6632ef5e-322b-402b-8f28-d090f76ed6b7
name          | github
instance_name | github
created       | 2021-02-04 08:01:51.163997+00
current_state | {"last_seen_id": 490551028}
updated       | 2022-05-10 07:23:49.246279+00
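This is surprising: according to last_seen (2022-06-21 07:51:03.813845+00), the lister saw this origin about 1.1s after the loader started visiting it at 07:51:02,699 (or claimed to see it; this might be a lister bug). This assumes the Sentry breadcrumb timestamp is in UTC.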
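For reference, here is a minimal sketch of the arithmetic behind that gap, using the two timestamps quoted above. It assumes the Sentry breadcrumb time is in UTC (not confirmed), and the variable names are illustrative only. The `+00` offset from psql is written as `+00:00` so that `datetime.fromisoformat` also parses it on Python versions older than 3.11.

```python
from datetime import datetime, timezone

# Loader start time as shown in the Sentry breadcrumbs.
# Assumption: this timestamp is in UTC.
loader_start = datetime.strptime(
    "2022-06-21 07:51:02,699", "%Y-%m-%d %H:%M:%S,%f"
).replace(tzinfo=timezone.utc)

# last_seen from the listed_origins row ("+00" written as "+00:00").
last_seen = datetime.fromisoformat("2022-06-21 07:51:03.813845+00:00")

# Positive value: last_seen is later than the loader start.
gap_seconds = (last_seen - loader_start).total_seconds()
print(f"last_seen is {gap_seconds:.3f}s after the loader started")
```

If the breadcrumb time turns out to be in another timezone, the gap shifts by whole hours and the comparison above no longer says anything useful.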