
FUSE: update cache with new origin visits
Closed, Resolved (Public)

Description

We should update cache when new visits are added from a specific origin.

Event Timeline

haltode triaged this task as Normal priority. Dec 2 2020, 2:46 PM
haltode created this task.
haltode created this object in space S1 Public.

The difficulty with this one is deciding when to re-query the backend to check for new visits. Doing it too often makes the cache of visit metadata useless. Doing it too seldom means missing new visits. Either way, we probably need to add a timestamp somewhere in the cache recording when the metadata was last fetched (which is not the same as the most recent visit timestamp).

Given the current rate at which the archive visits origins, it's probably pointless to re-fetch origin visits more than once per day.
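To make the idea concrete, here is a minimal sketch of a visits cache that stores a "last fetched" timestamp next to the cached data and only re-queries the backend once the entry is older than one day. This is not the swh-fuse implementation: the table name visits_cache, the functions init_cache and get_visits, and the fetch callback are all hypothetical names chosen for illustration, and the one-day interval is just the value suggested above.

```python
import json
import sqlite3
import time
from typing import Callable, List, Optional

# Assumed policy from the discussion: at most one re-fetch per day.
REFRESH_INTERVAL = 24 * 60 * 60  # seconds


def init_cache(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: fetched_at records when we last queried the
    # backend, independently of the timestamps of the visits themselves.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS visits_cache ("
        "url TEXT PRIMARY KEY, "
        "fetched_at REAL NOT NULL, "
        "visits TEXT NOT NULL)"
    )


def get_visits(
    conn: sqlite3.Connection,
    url: str,
    fetch: Callable[[str], List[dict]],
    now: Optional[float] = None,
    max_age: float = REFRESH_INTERVAL,
) -> List[dict]:
    """Return cached visits for url, re-fetching if the entry is stale."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT fetched_at, visits FROM visits_cache WHERE url = ?", (url,)
    ).fetchone()
    if row is not None and now - row[0] < max_age:
        # Entry is fresh enough: serve it without touching the backend.
        return json.loads(row[1])
    # Missing or stale entry: query the backend and record the fetch time.
    visits = fetch(url)
    conn.execute(
        "INSERT OR REPLACE INTO visits_cache (url, fetched_at, visits) "
        "VALUES (?, ?, ?)",
        (url, now, json.dumps(visits)),
    )
    conn.commit()
    return visits
```

Passing now explicitly is only there to make the staleness check easy to test; a real caller would let it default to the current time.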

This also raises the question of whether we want to let users selectively remove cache entries, e.g., a variant of "swh fs clean" that removes only part of the cache rather than all of it. One reasonable UI for that would be something like swh fs clean [ID]..., where each ID is either a SWHID or an origin URL; calling it would make swh-fuse remove only the parts of the cache related to those IDs. (This would be a separate task though.)
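As a rough illustration of the selective variant, the sketch below removes only the cache rows matching the given identifiers. It assumes a made-up cache layout with two tables, metadata_cache keyed by SWHID and visits_cache keyed by origin URL; neither the table names nor the clean_entries function come from swh-fuse. Identifiers are treated as SWHIDs when they start with the "swh:1:" prefix, and as origin URLs otherwise.

```python
import sqlite3
from typing import Iterable


def clean_entries(conn: sqlite3.Connection, ids: Iterable[str]) -> int:
    """Delete cache rows for the given SWHIDs / origin URLs.

    Returns the number of rows removed. Table and column names are
    hypothetical and only serve the example.
    """
    removed = 0
    for ident in ids:
        if ident.startswith("swh:1:"):
            # Looks like a SWHID: drop its cached metadata.
            cur = conn.execute(
                "DELETE FROM metadata_cache WHERE swhid = ?", (ident,)
            )
        else:
            # Otherwise treat it as an origin URL: drop its cached visits.
            cur = conn.execute(
                "DELETE FROM visits_cache WHERE url = ?", (ident,)
            )
        removed += cur.rowcount
    conn.commit()
    return removed
```

Everything not named on the command line stays in the cache, which is the point of the proposal: a stale origin can be refreshed without throwing away the rest.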

haltode changed the task status from Open to Work in Progress. Dec 15 2020, 1:32 PM
haltode moved this task from Backlog to In progress on the Software Heritage filesystem board.