
FUSE: add history/by-date/ dir for revision objects
Closed, Migrated. Edits Locked.

Description

Revision objects known to swh-graph should have a history/by-date/ dir listing commits organized by timestamp, sharded as history/by-date/YYYY/MM/DD (cf. the [[ https://www.sciencedirect.com/science/article/pii/S2352711018300712 | commits-by-date directory of RepoFS ]] as related work).
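To illustrate the sharding layout, here is a minimal sketch of mapping a commit timestamp to its shard directory. The function name `by_date_shard` is hypothetical, and normalizing to UTC is an assumption of this sketch, not something decided in this task.

```python
from datetime import datetime, timezone


def by_date_shard(ts: datetime) -> str:
    """Return the history/by-date/YYYY/MM/DD shard for a commit timestamp.

    Hypothetical helper: the timestamp is converted to UTC first so that
    the same commit always lands in the same shard (an assumption of this
    sketch). Pass timezone-aware datetimes; naive ones are interpreted as
    local time by astimezone().
    """
    utc = ts.astimezone(timezone.utc)
    return f"history/by-date/{utc:%Y/%m/%d}"


if __name__ == "__main__":
    print(by_date_shard(datetime(2020, 3, 7, 12, 0, tzinfo=timezone.utc)))
```

Zero-padded month and day components keep a plain ls of each level sorted chronologically.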

Note: there are two possible timestamps to use for this: committer or author. We should pick a default, and possibly make it configurable at mount time. (Or do we want two different directories, by-author-date/ vs. by-committer-date/?)
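One possible shape for the configurable variant is sketched below. All names here (`RevisionDates`, `pick_date`, the `mode` values) are hypothetical, and the `"committer"` default is an arbitrary placeholder, since the default is still an open question.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RevisionDates:
    """Hypothetical container for the two timestamps of a revision."""
    author_date: datetime
    committer_date: datetime


def pick_date(dates: RevisionDates, mode: str = "committer") -> datetime:
    """Select the timestamp used for by-date sharding.

    `mode` stands in for a mount-time option; its default is a placeholder.
    """
    if mode == "author":
        return dates.author_date
    if mode == "committer":
        return dates.committer_date
    raise ValueError(f"unknown date mode: {mode!r}")


if __name__ == "__main__":
    dates = RevisionDates(
        author_date=datetime(2020, 1, 1, tzinfo=timezone.utc),
        committer_date=datetime(2020, 2, 1, tzinfo=timezone.utc),
    )
    print(pick_date(dates, "author"), pick_date(dates))
```

With two separate directories instead, the same selector could be called once per directory with a fixed mode.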

The main difficulty is that populating the dir will take time, as we need to fetch the metadata of each commit from the archive. Hence, population should be asynchronous and incremental, probably in a separate thread. Each time ls is performed in the dir (or any subdir), it should return what is already known up to that point. A dedicated .status file should be present under history/by-date/ during population, ideally returning a status report when cat .status is run; the file should be removed once population is complete.
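The behaviour described above could be sketched roughly as follows, using a background thread. This is an illustration only, not the swh-fuse implementation: `ByDatePopulator`, `fetch_date`, `listdir` and `status` are hypothetical names, and `fetch_date` stands in for the slow archive lookup.

```python
import threading
from collections import defaultdict
from datetime import datetime, timezone


class ByDatePopulator:
    """Hypothetical incremental populator for history/by-date/."""

    def __init__(self, revisions, fetch_date):
        self._revisions = list(revisions)
        self._fetch_date = fetch_date  # stand-in for the archive metadata call
        self._lock = threading.Lock()
        self._shards = defaultdict(list)  # "YYYY/MM/DD" -> revision ids
        self._done = 0
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def join(self):
        self._thread.join()

    def _run(self):
        for rev in self._revisions:
            date = self._fetch_date(rev)  # slow, so done outside the lock
            shard = date.strftime("%Y/%m/%d")
            with self._lock:
                self._shards[shard].append(rev)
                self._done += 1

    def listdir(self, shard):
        """Return a snapshot of what is known so far for one shard."""
        with self._lock:
            return list(self._shards.get(shard, []))

    def status(self):
        """Content of .status, or None once the file should disappear."""
        with self._lock:
            if self._done == len(self._revisions):
                return None
            return f"{self._done}/{len(self._revisions)} revisions processed\n"
```

Returning None from `status()` models removing the .status file when population is complete; a filesystem layer would hide the entry in that case.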