Only record last_visited and last_successful in origin_visit_stats

Description

After using this schema for a while, it has become clear that all queries
can be implemented in terms of these two timestamps, rather than the four
original last_eventful, last_uneventful, last_failed and last_notfound
timestamps.

This ends up simplifying the logic within the journal client, as well as
that of the grab_next_visits query builder.

To make this change work, we also stop considering out-of-order messages
altogether in journal_client. This welcome simplification comes with an
accuracy tradeoff, which is explained in the updated documentation of the
journal client:

.. [1] Ignoring out of order messages makes the initialization of the
   origin_visit_status table (from a full journal) less deterministic: only the
   `last_visit`, `last_visit_state` and `last_successful` fields are guaranteed
   to be exact, the `next_position_offset` field is a best effort estimate
   (which should converge once the client has run for a while on in-order
   messages).
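
The behaviour described in this note can be illustrated with a minimal
sketch. This is not the actual journal_client code: the VisitStats class,
the update_stats function and the "successful" / "failed" state strings are
hypothetical stand-ins, and next_position_offset is left out because its
update rule is not described here. The sketch only shows the idea of dropping
any message older than the last recorded visit while keeping last_visit,
last_visit_state and last_successful up to date.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class VisitStats:
    """Hypothetical, simplified stand-in for one origin_visit_stats row."""

    last_visit: Optional[datetime] = None
    last_visit_state: Optional[str] = None
    last_successful: Optional[datetime] = None


def update_stats(stats: VisitStats, visit_date: datetime, state: str) -> VisitStats:
    """Fold one visit message into the stats, ignoring out-of-order messages.

    Assumption: `state` is already one of "successful", "failed" or
    "not_found" (illustrative names only).
    """
    # A message strictly older than the last recorded visit is dropped.
    if stats.last_visit is not None and visit_date < stats.last_visit:
        return stats

    updated = replace(stats, last_visit=visit_date, last_visit_state=state)
    if state == "successful":
        # Only successful visits move the last_successful timestamp.
        updated = replace(updated, last_successful=visit_date)
    return updated


if __name__ == "__main__":
    day1 = datetime(2021, 7, 1, tzinfo=timezone.utc)
    day2 = datetime(2021, 7, 2, tzinfo=timezone.utc)

    stats = update_stats(VisitStats(), day2, "successful")
    # The older message arrives late and is ignored.
    stats = update_stats(stats, day1, "failed")
    print(stats)
```

With this approach, replaying a journal whose messages are not in date order
can skip some visits entirely, which is why any field derived from the
full visit history would only be an estimate.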

Details

Provenance
olasd authored on Jul 23 2021, 11:48 AM
olasd pushed on Jul 23 2021, 3:58 PM
Differential Revision
D6018: Only record last_visited and last_successful in origin_visit_stats
Parents
rDSCH3ca0d659503f: test_journal_client: Unify test assertion like the rest
Build Status
Buildable 22735
Build 35454: test-and-build