FUSE: add support for origin visits
Closed, Resolved (Public)


Draft proposal/RFC for how to offer an origin/visit view in SwhFS:

  • we add a new top-level directory /origin (at the same level as /archive and /meta)
  • one can cd origin/<ORIGIN_URL>/, where ORIGIN_URL is a URL-encoded origin URL that can be passed to the /origin endpoint of the Web API
  • within origin/<ORIGIN_URL>/, a directory listing returns visit timestamps, one for each visit, analogous to the /origin/visit/ Web API endpoint (caveat: there might be multiple visits at the exact same timestamp, done with different loaders. But adding an additional layer origin/<ORIGIN_URL>/<VISIT_TYPE>/ seems like overkill, so I'm tempted to ignore this problem and, in case of conflict, arbitrarily pick one)
  • within origin/<ORIGIN_URL>/<VISIT_TIMESTAMP>/ we have the equivalent of the /visit Web API endpoint, with the following layout:
    • snapshot → points to the snapshot object as a symlink to archive/<SWHID>
    • meta.json → all the meta info returned by /visit (note that this is not a symlink to meta/<SWHID>, as visits are not identifiable by SWHIDs)
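
To make the proposed layout concrete, here is a minimal Python sketch of how the directory names and entries could be derived. It is not swh-fuse code: the function names are hypothetical, the visit dictionaries are assumed to carry `date`, `type` and `snapshot` fields loosely modeled on what the Web API returns for visits, and the "first visit wins" rule is just one way to pick arbitrarily on a timestamp conflict.

```python
import json
from urllib.parse import quote


def origin_dirname(origin_url: str) -> str:
    """URL-encode an origin URL so it fits in a single path component."""
    # safe="" also encodes "/", which would otherwise split the path.
    return quote(origin_url, safe="")


def visit_listing(visits):
    """Map each visit timestamp to one visit (hypothetical helper).

    If several visits share the exact same timestamp (e.g. different
    loaders), the first one seen is kept and the others are ignored.
    """
    by_date = {}
    for visit in visits:
        by_date.setdefault(visit["date"], visit)
    return by_date


def visit_entries(visit):
    """Entries of origin/<ORIGIN_URL>/<VISIT_TIMESTAMP>/ as name -> content.

    "snapshot" holds a symlink target, "meta.json" holds file content.
    """
    entries = {"meta.json": json.dumps(visit, indent=2)}
    # Assumption: a visit without a snapshot simply gets no symlink.
    if visit.get("snapshot"):
        swhid = f"swh:1:snp:{visit['snapshot']}"
        # Three levels up from the visit directory is the mount root.
        entries["snapshot"] = f"../../../archive/{swhid}"
    return entries


if __name__ == "__main__":
    print(origin_dirname("https://github.com/torvalds/linux"))
```

Using a relative symlink target keeps the link valid wherever the filesystem is mounted; an absolute target would need to know the mount point.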

Feedback welcome!

Note: it's clear that cd origin/<ORIGIN_URL> is not going to be practically useful for users on its own, because the chances of typing the origin URL exactly right are very slim. But we can add a separate command, either in swh-fuse or elsewhere, that mimics the Web UI search and returns the right URL, and then compose it with what is proposed in this task.
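
As a rough illustration of that composition, the sketch below turns search hits into paths under the proposed /origin directory. Everything here is an assumption for illustration: `candidate_paths` is a hypothetical helper, the substring match over a local list merely stands in for a real origin search, and `/mnt/swhfs` is a made-up mount point.

```python
from urllib.parse import quote


def candidate_paths(pattern, known_origins, mountpoint="/mnt/swhfs"):
    """Return SwhFS paths for every origin URL matching `pattern`.

    The substring filter is a stand-in for an actual search backend.
    """
    hits = [url for url in known_origins if pattern in url]
    # Each hit is URL-encoded so it becomes a single directory name.
    return [f"{mountpoint}/origin/{quote(url, safe='')}" for url in hits]


if __name__ == "__main__":
    origins = [
        "https://github.com/torvalds/linux",
        "https://gitlab.com/example/project",
    ]
    for path in candidate_paths("linux", origins):
        print(path)
```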

Event Timeline

zack triaged this task as Normal priority. (Mon, Nov 16, 9:50 PM)
zack lowered the priority of this task from Normal to Low.
zack created this task.
zack updated the task description. (Mon, Nov 16, 9:56 PM)
zack raised the priority of this task from Low to Normal. (Fri, Nov 20, 2:49 PM)
haltode changed the task status from Open to Work in Progress. (Tue, Nov 24, 2:28 PM)
haltode moved this task from Backlog to In progress on the Software Heritage filesystem board.