T2993 has been deployed and its first run has completed.
However, we see discrepancies:
- The cache table references about 98M origins [1], but we expect around 131M [2]. The difference of ~33M is large and currently unexplained.
- Spot-checking the "linux" origin, for example, shows that the computed cache data is off as well [3].
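One way to narrow down the ~33M gap is to compare the two sets of origin URLs directly. The following is a minimal sketch, assuming the origin URLs behind [1] and [2] have each been exported to an iterable of strings; `missing_from_cache` and the toy URLs are hypothetical and not part of any existing tooling.

```python
from typing import Iterable, Set


def missing_from_cache(
    expected_urls: Iterable[str], cached_urls: Iterable[str]
) -> Set[str]:
    """Return URLs present in the archive export but absent from the cache export.

    Hypothetical helper: both arguments are assumed to be plain URL strings.
    """
    return set(expected_urls) - set(cached_urls)


# Toy data standing in for the two exports (not real query results).
expected = [
    "https://example.org/a",
    "https://example.org/b",
    "https://example.org/c",
]
cached = ["https://example.org/a", "https://example.org/c"]

missing = missing_from_cache(expected, cached)
print(sorted(missing))
```

At the real scale (over 100M URLs) holding both sets in memory may be too heavy; comparing two sorted exports in a streaming fashion would be one alternative.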
[1]
softwareheritage-scheduler=> select now(), count(*) from origin_visit_stats;
              now              |  count
-------------------------------+----------
 2021-01-28 08:34:40.152554+00 | 98231002
[2]
softwareheritage=> select count(distinct origin) from origin_visit_status where status in ('full', 'partial');
[3]
softwareheritage-scheduler=> select * from origin_visit_stats where url='https://github.com/torvalds/linux';
                url                | visit_type |         last_eventful         | last_uneventful |          last_failed          | last_notfound |               last_snapshot                | last_scheduled
-----------------------------------+------------+-------------------------------+-----------------+-------------------------------+---------------+--------------------------------------------+----------------
 https://github.com/torvalds/linux | git        | 2017-09-07 18:43:13.021746+00 |                 | 2018-08-23 11:53:06.553328+00 |               | \x3e3045be901bacc7594176e79ba13fe030f601e2 |
(1 row)

softwareheritage=> select * from origin_visit_status where origin=2 order by date desc limit 10;
 origin | visit |             date              | status  | metadata |                  snapshot                  | type
--------+-------+-------------------------------+---------+----------+--------------------------------------------+------
      2 |    67 | 2020-09-21 21:55:01.586191+00 | full    |          | \xc7beb2432b7e93c4cf6ab09cd194c7c1998df2f9 |
      2 |    67 | 2020-09-21 19:15:24.238712+00 | created |          |                                            |
      2 |    66 | 2020-09-21 17:12:11.930011+00 | partial |          |                                            |
      2 |    66 | 2020-09-21 17:07:41.94459+00  | created |          |                                            |
      2 |    65 | 2020-08-24 11:51:54.472736+00 | full    |          | \xb16664848afbd3e867e8fce516ef15c1772679b2 |
      2 |    65 | 2020-08-24 09:22:41.181224+00 | created |          |                                            |
      2 |    64 | 2020-03-19 23:29:59.614232+00 | full    |          | \x89eed60d46be8b8963a1a2268762aee5bbb41038 |
      2 |    63 | 2020-01-20 19:50:46.750039+00 | full    |          | \xcabcc7d7bf639bbe1cc3b41989e1806618dd5764 |
      2 |    62 | 2019-12-16 13:44:56.685885+00 | ongoing |          |                                            |
      2 |    61 | 2019-08-25 14:04:07.603463+00 | full    |          | \xeb8087624d47f6e8ee89692df041b2f568fb0e5f |
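The mismatch in [3] can be made explicit with a small check. This is only a sketch under a stated assumption: that the cached `last_snapshot` is expected to equal the snapshot of the most recent `full` visit status. The function name `latest_full_status` is hypothetical, and the rows are a subset copied from the output above, with the `\x` prefix dropped from the hashes.

```python
from datetime import datetime
from typing import Optional, Sequence, Tuple

# (visit, date, status, snapshot) with snapshot as hex without the "\x" prefix
StatusRow = Tuple[int, datetime, str, Optional[str]]


def latest_full_status(rows: Sequence[StatusRow]) -> Optional[StatusRow]:
    """Return the most recent row with status 'full' and a snapshot, if any."""
    full = [r for r in rows if r[2] == "full" and r[3] is not None]
    return max(full, key=lambda r: r[1]) if full else None


# Subset of the origin_visit_status rows shown in [3].
rows = [
    (67, datetime.fromisoformat("2020-09-21T21:55:01.586191+00:00"), "full",
     "c7beb2432b7e93c4cf6ab09cd194c7c1998df2f9"),
    (67, datetime.fromisoformat("2020-09-21T19:15:24.238712+00:00"), "created", None),
    (66, datetime.fromisoformat("2020-09-21T17:12:11.930011+00:00"), "partial", None),
    (65, datetime.fromisoformat("2020-08-24T11:51:54.472736+00:00"), "full",
     "b16664848afbd3e867e8fce516ef15c1772679b2"),
]

# last_snapshot value from the origin_visit_stats row shown in [3].
cached_last_snapshot = "3e3045be901bacc7594176e79ba13fe030f601e2"

latest = latest_full_status(rows)
# Assumption: a fresh cache would carry the snapshot of the latest full visit.
is_stale = latest is not None and latest[3] != cached_last_snapshot
print(latest[0] if latest else None, is_stale)
```

Under that assumption, the cache row for the linux origin lags behind the archive: visit 67 completed as `full` on 2020-09-21, while the cache still reports a `last_eventful` date in 2017.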
Related to T2993