When replication lag occurs on the replica db (the one used by the main archive),
some information may end up missing in the save code now UI.
This happens because the update routine currently selects the entries to update among
the save code now requests whose status is not final [1]. It then updates those entries
with the information read from the other backends at that instant (unaware of the lag,
which is an implementation detail), refreshes their status with what it got, and never
comes back to fill in what is missing. The selection relies only on the save code now
status, which has by then moved on to a final state, so those entries are never picked
up again.
This needs to be fixed by broadening the scope of the query that reads the save code now
entries to update, for example by also including the entries that have no visit_date or
visit status (even though their status is final). To avoid having too much information to
update, a limit of 1 month back should also be put in place.
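
A minimal sketch of the broadened selection, in plain Python rather than as the actual
query. The SaveRequest class, the status names and the select_requests_to_update helper
are hypothetical and only illustrate the idea. Two assumptions are made: an entry counts
as incomplete when either its visit_date or its visit status is missing, and the 1 month
limit (taken as 30 days) only applies to the entries whose status is already final.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Assumed status names, for illustration only.
NON_FINAL_STATUSES = {"not yet scheduled", "scheduled", "running"}
# Assumed reading of the "1 month back" limit.
MAX_AGE = timedelta(days=30)


@dataclass
class SaveRequest:
    """Hypothetical stand-in for a save code now entry."""

    origin_url: str
    request_date: datetime
    status: str
    visit_date: Optional[datetime] = None
    visit_status: Optional[str] = None


def needs_update(req: SaveRequest, now: datetime) -> bool:
    # Current behaviour: anything whose status is not final gets refreshed.
    if req.status in NON_FINAL_STATUSES:
        return True
    # Broadened scope: final status, but visit information is still missing,
    # and the request is recent enough to be worth retrying.
    incomplete = req.visit_date is None or req.visit_status is None
    recent = req.request_date >= now - MAX_AGE
    return incomplete and recent


def select_requests_to_update(
    requests: List[SaveRequest], now: datetime
) -> List[SaveRequest]:
    return [req for req in requests if needs_update(req, now)]
```

With this selection, an entry that reached a final status while the replica was lagging
is picked up again on the next runs until its visit information is filled in, or until
it falls out of the 1 month window.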
[1] https://forge.softwareheritage.org/source/swh-web/browse/master/swh/web/common/origin_save.py$613-615