enable the deposit of a tarball with empty paths that will be defined using the associated metadata
|Migrated||gitlab-migration||T855 support directory entries pointing to revisions when loading directories|
|Migrated||gitlab-migration||T1021 SWORD deposit of metadata about an existing SWH object|
|Migrated||gitlab-migration||T960 draft specs for deposit with incomplete tarball|
Goal: deposit a tarball for which part or all of the content is already in the SWH archive
the paths to the missing directories/content must be provided as empty paths in the tarball
the list linking each path to the object in the archive will be provided with the metadata
Note: the name of the file or the directory is given by the path and is not part of the identified object.
Checks on deposit:
- the paths in the metadata are explicit in the tarball
- the path name corresponds to the object type
- the paths in the tarball are empty
- the identifiers exist in the SWH archive
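The checks above could be sketched as follows. This is a minimal illustration, not the deposit server's code: the function names, the `path_to_swhid` mapping and the `exists_in_archive` callable are all hypothetical, identifiers are assumed to look like `swh:1:cnt:<sha1>` or `swh:1:dir:<sha1>`, and "the path name corresponds to the object type" is read here as "a `dir` identifier must sit on a directory entry and a `cnt` identifier on a regular file".

```python
import io
import re
import tarfile

# Assumed identifier shape: swh:1:<cnt|dir>:<40 hex digits>
SWHID_RE = re.compile(r"^swh:1:(cnt|dir):[0-9a-f]{40}$")


def check_partial_deposit(tar, path_to_swhid, exists_in_archive):
    """Return a list of error strings; an empty list means every check passed.

    tar: an open tarfile.TarFile
    path_to_swhid: hypothetical mapping {path in tarball: identifier}
    exists_in_archive: hypothetical callable(identifier) -> bool
    """
    errors = []
    members = {m.name.rstrip("/"): m for m in tar.getmembers()}
    for raw_path, swhid in path_to_swhid.items():
        path = raw_path.rstrip("/")
        match = SWHID_RE.match(swhid)
        if match is None:
            errors.append(f"{path}: malformed identifier {swhid!r}")
            continue
        # Check 1: the path named in the metadata is explicit in the tarball.
        member = members.get(path)
        if member is None:
            errors.append(f"{path}: listed in the metadata but absent from the tarball")
            continue
        # Check 2: the tarball entry kind matches the object type.
        if match.group(1) == "dir":
            kind_ok = member.isdir()
        else:
            kind_ok = member.isfile()
        if not kind_ok:
            errors.append(f"{path}: entry kind does not match {swhid}")
        # Check 3: the path is empty (zero-byte file, or directory without children).
        if member.isfile() and member.size != 0:
            errors.append(f"{path}: file is not empty")
        if member.isdir() and any(n.startswith(path + "/") for n in members):
            errors.append(f"{path}: directory is not empty")
        # Check 4: the identifier is known to the archive.
        if not exists_in_archive(swhid):
            errors.append(f"{path}: {swhid} not found in the archive")
    return errors


def build_demo_tarball(entries):
    """Build an in-memory tarball from (name, is_dir, data) tuples (demo helper)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, is_dir, data in entries:
            info = tarfile.TarInfo(name)
            if is_dir:
                info.type = tarfile.DIRTYPE
                tar.addfile(info)
            else:
                info.size = len(data)
                tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return tarfile.open(fileobj=buffer, mode="r")
```

Returning a list of errors rather than raising on the first failure lets the deposit client see every problem in one round trip; whether the real service should behave that way is an open design choice.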
Load the data from the deposit:
- load the existing data
- create links from the path to the SWH object through the identifier
- calculate identifier of the new objects
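The last two steps can be illustrated with a short sketch. It assumes that content and directory identifiers are computed the Git-compatible way (a SHA1 over a `blob`/`tree` header plus payload), which means a parent directory's identifier depends only on the names, modes and identifiers of its entries. Under that assumption a linked path contributes the identifier taken from the metadata, and the archived object never needs to be re-read. The function names and the entry tuple layout are hypothetical, and every file is treated as a regular non-executable file for brevity.

```python
import hashlib


def git_object_id(kind, payload):
    """Hex SHA1 of a Git-style object: b"<kind> <length>\\0" followed by the payload."""
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def content_id(data):
    """Identifier of new file content that is actually present in the tarball."""
    return git_object_id("blob", data)


def directory_id(entries):
    """Identifier of a new directory.

    entries: iterable of (name, kind, hex_id) with kind in {"cnt", "dir"}.
    hex_id is either freshly computed (new object) or copied from the
    deposit metadata (path linked to an object already in the archive).
    """
    def sort_key(entry):
        name, kind, _ = entry
        # Git orders tree entries as if directory names ended with "/".
        return name.encode() + (b"/" if kind == "dir" else b"")

    payload = b""
    for name, kind, hex_id in sorted(entries, key=sort_key):
        # Simplification: no executable, symlink or revision entries.
        mode = b"40000" if kind == "dir" else b"100644"
        payload += mode + b" " + name.encode() + b"\0" + bytes.fromhex(hex_id)
    return git_object_id("tree", payload)


# Usage: "README" is new content shipped in the tarball, while "src" is an
# empty path linked to an existing directory via a placeholder identifier.
linked_src_id = "a" * 40
root_id = directory_id([
    ("README", "cnt", content_id(b"hello\n")),
    ("src", "dir", linked_src_id),
])
```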
This is the direction for the swiss-deposit. The description above will be included in docs/specs/half-deposit.rst
and docs/specs/meta-deposit.rst (deposits with only metadata).
It is not yet deployed on the docs web pages, but I'll put a link here when it is.
Also, I'm planning some modifications to the swh xml schema for the swh-id properties (mentioned in T1152).
Regarding implementation, there are no plans to implement it in the near term; it is something to consider for the priority/yearly planning.
I can also open a review documentation subtask.
Oh, ok, thanks for clarifying. Yes, a dedicated review task assigned to me about this would be welcome (once you have a pointer to the spec that you think is ready for review).
We will indeed prioritize later but, as an advance sneak peek, we will want to have a working prototype of this by the end of 2018 (for the CCS deposit use case).