
origin_visit: distinguish "fetch date" and "injection date"
Closed, Migrated

Description

The current schema for the origin_visit table does not distinguish between the date we fetched the data and the date we injected it into our archive.

This is particularly problematic for origins whose content we archived in bulk but only injected some time later, for instance the Google Code Mercurial repositories.

The fetch date is currently recorded in fetch_history, but that table is not joined with any other table, so you have to guess which entry in origin_visit corresponds to which fetch.
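To make the distinction concrete, here is a minimal sketch of a visit record that carries both dates. It is only an illustration: the `VisitRecord` class, its field names (`fetch_date`, `injection_date`), the example URL and the dates are all hypothetical and do not reflect the actual origin_visit schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class VisitRecord:
    """Hypothetical visit record; names are illustrative, not the real schema."""

    origin: str
    visit: int
    fetch_date: datetime      # when the data was retrieved from the origin
    injection_date: datetime  # when the data was loaded into the archive

    def injection_lag(self) -> timedelta:
        """Time elapsed between fetching the data and injecting it."""
        return self.injection_date - self.fetch_date


# Made-up example: content fetched in bulk, injected two months later.
record = VisitRecord(
    origin="https://example.org/some-repo",
    visit=1,
    fetch_date=datetime(2016, 1, 1, tzinfo=timezone.utc),
    injection_date=datetime(2016, 3, 1, tzinfo=timezone.utc),
)
print(record.injection_lag().days)
```

With both dates on the same record, the lag between fetch and injection can be read directly instead of being guessed from a separate table.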

Event Timeline

I _think_ this use case is solved by the origin_visit_status table (created vs. ongoing vs. completed). @vlorentz?