
FUSE: add support for origin visits
Closed, Migrated

Description

Draft proposal/RFC for how to offer an origin/visit view in SwhFS:

  • we add a new top-level directory /origin (at the same level as /archive and /meta)
  • one can cd origin/<ORIGIN_URL>/, where ORIGIN_URL is a URL-encoded origin URL that can be passed to the /origin endpoint of the Web API
  • within origin/<ORIGIN_URL>/, a directory listing returns visit timestamps, one per visit, analogous to the /origin/visit/ Web API endpoint. (Caveat: there might be multiple visits at the exact same timestamp, done with different loaders. Adding an additional layer origin/<ORIGIN_URL>/<VISIT_TYPE>/ seems like overkill, so I'm tempted to ignore this problem and, in case of conflict, arbitrarily pick one.)
  • within origin/<ORIGIN_URL>/<VISIT_TIMESTAMP>/ we have the equivalent of the /visit Web API endpoint, with the following layout:
    • snapshot → points to the snapshot object as a symlink to archive/<SWHID>
    • meta.json → all the meta info returned by /visit (note that this is not a symlink to meta/<SWHID>, as visits are not identifiable by SWHIDs)
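
To make the proposed layout concrete, here is a minimal Python sketch of how the directory tree could be derived from a list of visits. It is only an illustration, not SwhFS code: the helper names (encode_origin, visit_layout) are hypothetical, and the visit field names ("date", "type", "snapshot") are assumed to resemble what the Web API returns. Timestamp conflicts are resolved by keeping the first visit seen, as suggested above.

```python
from __future__ import annotations

import json
from urllib.parse import quote


def encode_origin(url: str) -> str:
    """Percent-encode an origin URL so it fits in a single path component."""
    # safe="" also encodes "/", which would otherwise split the path.
    return quote(url, safe="")


def visit_layout(origin_url: str, visits: list[dict]) -> dict[str, dict[str, str]]:
    """Map each visit directory to its entries (hypothetical helper).

    Each visit is assumed to be a dict with "date" and "snapshot" keys;
    "snapshot" is a hex identifier, or None if the visit produced none.
    """
    base = f"origin/{encode_origin(origin_url)}"
    layout: dict[str, dict[str, str]] = {}
    for visit in visits:
        visit_dir = f"{base}/{visit['date']}"
        if visit_dir in layout:
            # Same timestamp from another loader: keep the first one seen.
            continue
        entries = {
            # Regular file: the visit metadata serialized as JSON.
            "meta.json": json.dumps(visit, indent=2),
        }
        if visit.get("snapshot"):
            # Symlink target, expressed relative to the mount root.
            entries["snapshot"] = f"archive/swh:1:snp:{visit['snapshot']}"
        layout[visit_dir] = entries
    return layout
```

Encoding with safe="" matters here: leaving "/" unencoded would turn a single origin into several nested directories.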

Feedback welcome!

Note: it's clear that cd origin/<ORIGIN_URL> is not going to be practically useful for users on its own, because the chances of getting the origin URL exactly right are very slim. But we can add a separate command, either in swh-fuse or elsewhere, that mimics the Web UI search and returns the right URL, and then compose it with what is proposed in this task.
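
As a rough illustration of that composition, the sketch below turns a search pattern into candidate SwhFS directories. Everything in it is an assumption: the function name, the default mountpoint, and above all the in-memory list of origins, which merely stands in for whatever search backend the real command would query.

```python
from __future__ import annotations

from urllib.parse import quote


def matching_origin_dirs(
    pattern: str,
    known_origins: list[str],
    mountpoint: str = "/mnt/swhfs",  # hypothetical mountpoint
) -> list[str]:
    """Return the origin directories whose URL contains `pattern`."""
    # Case-insensitive substring match; a real search would be richer.
    hits = [url for url in known_origins if pattern.lower() in url.lower()]
    # Encode each matching URL into a single path component under /origin.
    return [f"{mountpoint}/origin/{quote(url, safe='')}" for url in sorted(hits)]
```

A user could then cd into one of the returned paths instead of typing the encoded URL by hand.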