Design documentation for the FUSE file representation.
Diff Detail
- Repository: rDGRPH Compressed graph representation
- Branch: fuse-design-doc
- Lint: No Linters Available
- Unit: No Unit Test Coverage
- Build Status: Buildable 15326
  - Build 23607: Phabricator diff pipeline on jenkins
  - Build 23606: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D3974 (id=14004)
Rebasing onto eaf0323a1c...
Current branch diff-target is up to date.
Changes applied before test
commit 5639ac580b443be5749efa80560e0a4e20208b3f
Author: Thibault Allançon <haltode@gmail.com>
Date: Thu Sep 17 13:27:19 2020 +0200
    WIP: fuse design doc
See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/32/ for more details.
Build is green
Patch application report for D3974 (id=14006)
Rebasing onto eaf0323a1c...
Current branch diff-target is up to date.
Changes applied before test
commit ff791874538d7155d38066e1d22cd03777faba69
Author: Thibault Allançon <haltode@gmail.com>
Date: Thu Sep 17 13:27:19 2020 +0200
    WIP: fuse design doc
See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/33/ for more details.
docs/fuse.rst | ||
---|---|---|
4–7 | What I meant in our IRC conversation about this is that readers don't even need to know about swh-graph. So, my suggestion for this intro text would be something like (links to be added):
| |
17–19 | Aside from the reference to the graph structure being "compressed" (which should go), why this? I think files should not be empty: when opening cnt files, one should get the file content. Also, file entry names will, at least in the general case, not be SWHIDs but the legitimate entry names. Maybe this "by default" part should just go away, and we describe the file naming below, case by case? | |
24–26 | if I understand correctly the state of the discussion with @seirl, we are now going towards:
Either way, all the endpoints we will need will be accessible via the Web API, so we can drop the conditionality on having storage or not. Also, local/remote distinction is no longer relevant, as we'll always access stuff via the Web API. There is potentially a conditionality on whether the Web API endpoints under /graph have access (or not) to the edge labels. I think the right way to go about it is some sort of graceful degradation in the code (e.g., all files have read-only perms, and we use SWHID names instead of entry names), rather than warn the user about it upfront. | |
58–65 | do we care? |
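
The graceful degradation suggested in the comment on lines 24–26 could look roughly like the sketch below. It is only an illustration: `Entry`, `make_entry`, and the `0o444`/`0o555` defaults are hypothetical names and values chosen here, not part of the design doc or of any existing swh module. The idea is that when the Web API cannot provide edge labels, an entry falls back to its SWHID as name and to read-only permissions, without warning the user upfront.

```python
import stat
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """Hypothetical VFS directory entry: a name plus a POSIX mode."""
    name: str
    mode: int


def make_entry(
    swhid: str,
    is_dir: bool,
    label_name: Optional[str] = None,
    label_perms: Optional[int] = None,
) -> Entry:
    """Build an entry, degrading gracefully when edge labels are missing.

    label_name and label_perms stand for what edge labels would provide;
    None means the (assumed) Web API endpoint had no access to them.
    """
    kind = stat.S_IFDIR if is_dir else stat.S_IFREG
    # Assumed read-only defaults; directories keep the search (x) bit.
    default_perms = 0o555 if is_dir else 0o444
    name = label_name if label_name is not None else swhid
    perms = label_perms if label_perms is not None else default_perms
    return Entry(name=name, mode=kind | perms)
```

With labels available, `make_entry(swhid, False, "README", 0o644)` keeps the legitimate entry name and permissions; without them, the same call with only `swhid` and `is_dir` yields a read-only entry named after the SWHID.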
Build is green
Patch application report for D3974 (id=14055)
Rebasing onto bc5614a2c6...
First, rewinding head to replay your work on top of it...
Applying: WIP: fuse design doc
Changes applied before test
commit df4ed1a6fde575aae50e23da82671528633decea
Author: Thibault Allançon <haltode@gmail.com>
Date: Thu Sep 17 13:27:19 2020 +0200
    WIP: fuse design doc
See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/38/ for more details.
docs/fuse.rst | ||
---|---|---|
18 | improvement for this one: "Each archive element (or, equivalently, node in the archive Merkle DAG) is represented as one entity in the virtual file system (VFS). The type and content of file system entities depend on the type of archive element being represented. For each supported node type we describe below the corresponding VFS representation." | |
20 | We can switch to a more structured/formal style now.
| |
23 | here we need to say:
| |
25 | Looks like an important one is missing here: the parent commit(s).
| |
28–30 | Both authorship info and timestamps recur in various types of VFS entries, so it might be worth factoring them out into separate sections and pointing to them from here. Stuff that will need to go in the factored-out sections is (at least):
| |
45 | There is no guarantee that target points to a source tree. In fact, rel objects can point to any kind of object. So:
| |
52–53 |
this is going to be a tricky one... |
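
To make the comment on line 45 concrete (a rel target can be any kind of object, so its VFS representation cannot be assumed to be a source tree), here is a minimal sketch that picks the entity kind from the target's SWHID type. The function names and the `ENTITY_KIND` table are hypothetical. Only "cnt files yield file content" comes from the review above; mapping every other type to a directory is an assumption made for illustration.

```python
import stat

# SWHID object types. Mapping "cnt" to a regular file follows the review
# comments; mapping the other types to directories is an ASSUMPTION.
ENTITY_KIND = {
    "cnt": stat.S_IFREG,
    "dir": stat.S_IFDIR,
    "rev": stat.S_IFDIR,
    "rel": stat.S_IFDIR,
    "snp": stat.S_IFDIR,
}


def swhid_type(swhid: str) -> str:
    """Return the object type of a SWHID such as 'swh:1:rel:<hash>'.

    Qualifiers after ';' are ignored. The hash itself is not validated.
    """
    core = swhid.split(";", 1)[0]
    parts = core.split(":")
    if (
        len(parts) != 4
        or parts[0] != "swh"
        or parts[1] != "1"
        or parts[2] not in ENTITY_KIND
    ):
        raise ValueError(f"not a valid SWHID: {swhid!r}")
    return parts[2]


def target_entity_kind(target_swhid: str) -> int:
    """Pick the VFS entity kind from the target's type, not the source's."""
    return ENTITY_KIND[swhid_type(target_swhid)]
```

A release pointing at a cnt object would then expose its target as a regular file, while one pointing at a dir or rev object would expose a directory.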