enable the deposit of a tarball with empty paths that will be defined using the associated metadata
Description
Status | Assigned | Task
---|---|---
Migrated | gitlab-migration | T855 support directory entries pointing to revisions when loading directories
Migrated | gitlab-migration | T1021 SWORD deposit of metadata about an existing SWH object
Migrated | gitlab-migration | T960 draft specs for deposit with incomplete tarball
Event Timeline
Goal: deposit a tarball for which part or all of the content is already in the SWH archive.
The directories/contents that are left out of the tarball must still appear in it as empty paths.
The list linking each such path to the corresponding object in the archive will be provided with the metadata:
./path/to/file.txt | swh:1:cnt:aaaaaaaaaaaaaaaaaaaaa... |
./path/to/dir/ | swh:1:dir:aaaaaaaaaaaaaaaaaaaaa... |
Note: the name of the file or the directory is given by the path and is not part of the identified object.
Checks on deposit:
- the paths in the metadata are explicit in the tarball
- the path name corresponds to the object type
- the paths in the tarball are empty
- the identifiers exist in the SWH archive
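The checks above could be sketched roughly as follows. This is only an illustration, not the deposit server's code: `check_deposit`, `normalize` and the `exists_in_archive` callback are hypothetical names, and the sketch assumes that a trailing-slash path maps to a `dir` identifier, that any other path maps to a `cnt` identifier, and that identifiers are full 40-hex-digit SWH identifiers.

```python
import io
import re
import tarfile
from typing import Callable, Dict, List

# Only content and directory identifiers are accepted in this sketch.
SWHID_RE = re.compile(r"^swh:1:(cnt|dir):([0-9a-f]{40})$")


def normalize(path: str) -> str:
    """Drop a leading './' and any trailing '/' so paths compare equal."""
    if path.startswith("./"):
        path = path[2:]
    return path.rstrip("/")


def check_deposit(
    tar: tarfile.TarFile,
    mapping: Dict[str, str],
    exists_in_archive: Callable[[str], bool],  # hypothetical archive lookup
) -> List[str]:
    """Return a list of human-readable errors (empty if all checks pass)."""
    errors: List[str] = []
    members = {normalize(m.name): m for m in tar.getmembers()}
    for path, swhid in mapping.items():
        key = normalize(path)
        match = SWHID_RE.match(swhid)
        if match is None:
            errors.append(f"{path}: malformed or unsupported identifier")
            continue
        member = members.get(key)
        if member is None:
            # Check 1: the path must be explicit in the tarball.
            errors.append(f"{path}: not present in the tarball")
            continue
        expected_dir = match.group(1) == "dir"
        kind_ok = member.isdir() if expected_dir else member.isfile()
        if not kind_ok:
            # Check 2: the path kind must match the object type.
            errors.append(f"{path}: path kind does not match object type")
        elif expected_dir and any(n.startswith(key + "/") for n in members):
            # Check 3 (directories): no member may live below the path.
            errors.append(f"{path}: directory is not empty")
        elif not expected_dir and member.size != 0:
            # Check 3 (files): the file must have no content.
            errors.append(f"{path}: file is not empty")
        if not exists_in_archive(swhid):
            # Check 4: the identifier must resolve in the archive.
            errors.append(f"{path}: {swhid} not found in the archive")
    return errors
```

Returning a list of errors rather than raising on the first failure lets the deposit be rejected with a complete report in one round trip.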
Load the data from the deposit:
- load the existing data
- create links from the path to the SWH object through the identifier
- calculate identifier of the new objects
This is the direction for the swiss-deposit. The description above will be included in docs/specs/half-deposit.rst
and docs/specs/meta-deposit.rst (deposits with only metadata).
great, thanks!
I'll be AFK for a while, so I can't check the diff, but if you (@moranegg ) can point me to the current version (on docs.s.o?, if it's deployed), I'll be happy to have a look before it's implemented
It is not yet deployed on the docs web pages, but I'll put a link here when it is.
Also, I'm planning some modifications to the swh XML schema for the swh-id properties (mentioned in T1152).
Regarding implementation, there are no plans to implement it in the near term; it is something to consider for the priority/yearly planning.
I can also open a review documentation subtask.
Oh, ok, thanks for clarifying. Yes, a dedicated review task assigned to me about this would be welcome (once you have a pointer to the spec that you think is ready for review).
We will indeed prioritize later but, as an advance sneak peek, we will want to have a working prototype of this by the end of 2018 (for the CCS deposit use case).