diff --git a/docs/specs/metadata_example.xml b/docs/specs/metadata_example.xml index c681e559..7ae6c8e4 100644 --- a/docs/specs/metadata_example.xml +++ b/docs/specs/metadata_example.xml @@ -1,35 +1,31 @@ - "{http://www.w3.org/2005/Atom}author": { - "{http://www.w3.org/2005/Atom}email": "hal@ccsd.cnrs.fr", - "{http://www.w3.org/2005/Atom}name": "HAL" - }, HAL hal@ccsd.cnrs.fr hal hal-01243573 The assignment problem https://hal.archives-ouvertes.fr/hal-01243573 other identifier, DOI, ARK Domain description author1 Inria UPMC author2 Inria UPMC diff --git a/docs/specs/spec-meta-deposit.rst b/docs/specs/spec-meta-deposit.rst index 517757fc..887baef2 100644 --- a/docs/specs/spec-meta-deposit.rst +++ b/docs/specs/spec-meta-deposit.rst @@ -1,84 +1,98 @@ The metadata-deposit ==================== Goal ---- A client wishes to deposit only metadata about an object in the Software Heritage archive. -The meta-deposit is a special deposit where no content is +The metadata-deposit is a special deposit where no content is provided and the data transfered to Software Heritage is only the metadata about an object or several objects in the archive. Requirements ------------ The scope of the meta-deposit is different than the sparse-deposit. While a sparse-deposit creates a revision with referenced -directories and content files, the meta-deposit references one of the following: +directories and content files, the metadata-deposit references one of the +following: - origin - snapshot - revision - release A complete metadata example --------------------------- The reference element is included in the metadata xml atomEntry under the -swh namespace: +swh namespace (a link for the published schema will be provided during +the implementation of the metadata deposit): .. 
code:: xml HAL hal@ccsd.cnrs.fr hal hal-01243573 The assignment problem https://hal.archives-ouvertes.fr/hal-01243573 other identifier, DOI, ARK Domain description author1 Inria UPMC author2 Inria UPMC Examples by target type ^^^^^^^^^^^^^^^^^^^^^^^ +Reference an origin: -With ${type} in {snp (snapshot), rev (revision), rel (release) }: +.. code:: xml + + + + + + + + +Reference a snapshot, revision or release: .. code:: xml + With ${type} in {snp (snapshot), rev (revision), rel (release) }: Loading procedure ------------------ -In this case, the meta-deposit will be injected as a metadata entry at the -appropriate level (origin_metadata, revision_metadata, etc.). Contrary to the -complete and sparse deposit, there will be no object creation. +In this case, the metadata-deposit will be injected as a metadata entry at the +appropriate level (origin_metadata, revision_metadata, etc.) with the information +about the contributor of the deposit. Contrary to the complete and sparse +deposit, there will be no object creation. diff --git a/docs/specs/spec-sparse-deposit.rst b/docs/specs/spec-sparse-deposit.rst index e08f5728..ffa001ae 100644 --- a/docs/specs/spec-sparse-deposit.rst +++ b/docs/specs/spec-sparse-deposit.rst @@ -1,101 +1,102 @@ The sparse-deposit ================== Goal ---- A client wishes to transfer a tarball for which part of the content is already in the SWH archive. Requirements ------------ To do so, a list of paths with targets must be provided in the metadata and the paths to the missing directories/content should not be included in the tarball. The list will be referred to as the manifest list using the entry name 'bindings' in the metadata. +----------------------+-------------------------------------+ | path | swh-id | +======================+=====================================+ | path/to/file.txt | swh:1:cnt:aaaaaaaaaaaaaaaaaaaaa... 
| +----------------------+-------------------------------------+ | path/to/dir/ | swh:1:dir:aaaaaaaaaaaaaaaaaaaaa... | +----------------------+-------------------------------------+ Note: the *name* of the file or the directory is given by the path and is not part of the identified object. A concrete example ------------------ The manifest list is included in the metadata xml atomEntry under the -swh namespace: +swh namespace (a link for the published schema will be provided during +the implementation of the sparse deposit): .. code:: xml HAL hal@ccsd.cnrs.fr hal hal-01243573 The assignment problem https://hal.archives-ouvertes.fr/hal-01243573 other identifier, DOI, ARK Domain description author1 Inria UPMC author2 Inria UPMC Deposit verification -------------------- After checking the integrity of the deposit content and metadata, the following checks should be added: 1. validate the manifest list structure with a correct swh-id for each path (syntax check on the swh-id format) 2. verify that the path name corresponds to the object type 3. locate the identifiers in the SWH archive Each failing check should return a different error with the deposit and result in a 'rejected' deposit. Loading procedure ------------------ The injection procedure should include: - load the tarball new data - create new objects using the path name and create links from the path to the SWH object using the identifier - calculate identifier of the new objects at each level - return final swh-id of the new revision Invariant: the same content should yield the same swh-id, that's why a complete deposit with all the content and a sparse-deposit with the correct links will result with the same root directory swh-id. The same is expected with the revision swh-id if the metadata provided is identical.
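The first two checks listed under "Deposit verification" for the sparse-deposit can be sketched in Python as follows. This is a minimal illustration, not the deposit server's implementation: the function name `check_bindings`, the dict-shaped manifest, and the error strings are hypothetical, and the rule that a directory path ends with `/` is an assumption taken from the example table. The third check (locating the identifiers in the SWH archive) needs archive access and is left out.

```python
import re

# Syntax of a swh-id: scheme version 1, an object type, and a
# 40-character lowercase hexadecimal digest.
SWHID_RE = re.compile(r"swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}")


def check_bindings(bindings):
    """Check a manifest list given as a {path: swh-id} mapping.

    Returns a list of (path, error) pairs; an empty list means that
    every entry passed the syntax and object-type checks.
    """
    errors = []
    for path, swhid in bindings.items():
        # Check 1: syntax check on the swh-id format.
        match = SWHID_RE.fullmatch(swhid)
        if match is None:
            errors.append((path, "invalid swh-id syntax"))
            continue
        object_type = match.group(1)
        # Check 2: the path name must correspond to the object type.
        # Assumption: directory paths end with "/", as in the example table.
        expected = "dir" if path.endswith("/") else "cnt"
        if object_type != expected:
            errors.append(
                (path, f"expected a {expected} identifier, got {object_type}")
            )
    return errors
```

Returning one distinct error per failing entry follows the requirement that each failing check reports a different error; a caller could mark the deposit as 'rejected' whenever the returned list is non-empty.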