diff --git a/docs/specs/spec-meta-deposit.rst b/docs/specs/spec-meta-deposit.rst
index a4f700c8..517757fc 100644
--- a/docs/specs/spec-meta-deposit.rst
+++ b/docs/specs/spec-meta-deposit.rst
@@ -1,100 +1,84 @@
 The metadata-deposit
 ====================

 Goal
 ----
 A client wishes to deposit only metadata about an object in the Software
 Heritage archive.

 The meta-deposit is a special deposit where no content is
-deposited and the data transfered to Software Heritage is only
+provided and the data transferred to Software Heritage is only
 the metadata about an object or several objects in the archive.

+Requirements
+------------
 The scope of the meta-deposit is different than the
-sparse-deposit, while a sparse-deposit creates a revision with referenced
+sparse-deposit. While a sparse-deposit creates a revision with referenced
 directories and content files, the meta-deposit references one of the
 following:

 - origin
 - snapshot
 - revision
 - release

 A complete metadata example
 ---------------------------
 The reference element is included in the metadata xml atomEntry under the
 swh namespace:

 .. code:: xml

     HAL hal@ccsd.cnrs.fr hal hal-01243573 The assignment problem
     https://hal.archives-ouvertes.fr/hal-01243573 other identifier, DOI, ARK
     Domain description author1 Inria UPMC author2 Inria UPMC

-examples by target type
+Examples by target type
 ^^^^^^^^^^^^^^^^^^^^^^^
-snapshot
-*********
-.. code:: xml
-
-
-
-
-
+With ${type} in {snp (snapshot), rev (revision), rel (release)}:

-revision
-********
 .. code:: xml
-
+

-release
-*******
-.. code:: xml
-
-
-
-
-
 Loading procedure
 ------------------

 In this case, the meta-deposit will be injected as a metadata entry at the
-appropriate level (origin_metadata, revision_metadata, etc.) and won't result
-in the creation of a new object like with the complete deposit and the
-sparse-deposit.
+appropriate level (origin_metadata, revision_metadata, etc.). Contrary to the
+complete and sparse deposit, there will be no object creation.
diff --git a/docs/specs/spec-sparse-deposit.rst b/docs/specs/spec-sparse-deposit.rst
index 6fd12f5c..e08f5728 100644
--- a/docs/specs/spec-sparse-deposit.rst
+++ b/docs/specs/spec-sparse-deposit.rst
@@ -1,104 +1,101 @@
 The sparse-deposit
 ==================

 Goal
 ----
 A client wishes to transfer a tarball for which part of the content is
 already in the SWH archive.

 Requirements
 ------------
-To do so, the paths to the missing directories/content must be provided as
-empty paths in the tarball and the list linking each path to the object in the
-archive will be provided as part of the metadata. The list will be refered to
-as the manifest list.
+To do so, a list of paths with targets must be provided in the metadata and
+the paths to the missing directories/content should not be included
+in the tarball. The list will be referred to
+as the manifest list using the entry name 'bindings' in the metadata.

 +----------------------+-------------------------------------+
 | path                 | swh-id                              |
 +======================+=====================================+
 | path/to/file.txt     | swh:1:cnt:aaaaaaaaaaaaaaaaaaaaa...  |
 +----------------------+-------------------------------------+
 | path/to/dir/         | swh:1:dir:aaaaaaaaaaaaaaaaaaaaa...  |
 +----------------------+-------------------------------------+

 Note: the *name* of the file or the directory is given by the path and is not
 part of the identified object.

 A concrete example
 ------------------
 The manifest list is included in the metadata xml atomEntry under the
 swh namespace:

 .. code:: xml

     HAL hal@ccsd.cnrs.fr hal hal-01243573 The assignment problem
     https://hal.archives-ouvertes.fr/hal-01243573 other identifier, DOI, ARK
     Domain description author1 Inria UPMC author2 Inria UPMC

-The tarball sent with the deposit will contain the following empty paths:
-- path/to/file.txt
-- path/to/second_file.txt
-- path/to/dir/

 Deposit verification
 --------------------

 After checking the integrity of the deposit content and
 metadata, the following checks should be added:

-1. validate the manifest list structure with a swh-id for each path
-2. verify that the paths in the manifest list are explicit and empty in the tarball
-3. verify that the path name corresponds to the object type
-4. locate the identifiers in the SWH archive
+1. validate the manifest list structure with a correct swh-id for each path (syntax check on the swh-id format)
+2. verify that the path name corresponds to the object type
+3. locate the identifiers in the SWH archive

-Each one of the verifications should return a different error with the deposit
+Each failing check should return a different error with the deposit
 and result in a 'rejected' deposit.

 Loading procedure
 ------------------
 The injection procedure should include:

-- load the tarball data
+- load the new tarball data
 - create new objects using the path name and create links from the path to the
   SWH object using the identifier
 - calculate identifier of the new objects at each level
 - return final swh-id of the new revision

-Invariant: the same content should yield the same swhid, that's why a complete
-deposit with all the content and a sparse-deposit with the correct links will
-result with the same root directory swh-id and if the metadata are identical
-also with the same revision swh-id.
+Invariant: the same content should yield the same swh-id.
+That is why a complete deposit with all the content and
+a sparse-deposit with the correct links will result
+in the same root directory swh-id.
+The same is expected with the revision swh-id if the metadata provided is
+identical.
diff --git a/docs/specs/specs.rst b/docs/specs/specs.rst
index 608183c4..bb86993d 100644
--- a/docs/specs/specs.rst
+++ b/docs/specs/specs.rst
@@ -1,13 +1,13 @@
 .. _swh-deposit-specs:

-Software Heritage Deposit Specifications
-========================================
+Blueprint Specifications
+=========================

 .. toctree::
    :maxdepth: 1
    :caption: Contents:

    blueprint.rst
    spec-loading.rst
    spec-sparse-deposit.rst
    spec-meta-deposit.rst
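
Note on the sparse-deposit "Deposit verification" list: the first two checks (swh-id syntax, path name versus object type) can be run without access to the archive. The following is only a minimal sketch of how they might look, not the swh-deposit implementation. The names ``check_binding`` and ``check_manifest`` are hypothetical, the bindings are assumed to arrive as a plain path-to-swh-id mapping, and the rule "a trailing slash means a directory, anything else a content file" is an assumption read off the table in the spec. Only the core ``swh:1:<type>:<40 hex digits>`` form is accepted; qualifiers are ignored here.

```python
import re
from typing import Dict, List

# Core swh-id syntax only: swh:1:<object type>:<40 lowercase hex digits>.
# Qualifiers after ";" are deliberately out of scope for this sketch.
SWHID_RE = re.compile(r"swh:1:(cnt|dir|rev|rel|snp):[0-9a-f]{40}")


def check_binding(path: str, swhid: str) -> List[str]:
    """Return the error messages for one (path, swh-id) pair (hypothetical helper)."""
    match = SWHID_RE.fullmatch(swhid)
    if match is None:
        # Check 1: syntax check on the swh-id format.
        return ["invalid swh-id syntax for %r: %r" % (path, swhid)]

    object_type = match.group(1)
    # Check 2 (assumption): a trailing "/" marks a directory,
    # any other path is expected to point at a content object.
    expected = "dir" if path.endswith("/") else "cnt"
    if object_type != expected:
        return ["%r expects a %r object, got %r" % (path, expected, object_type)]
    return []


def check_manifest(bindings: Dict[str, str]) -> List[str]:
    """Collect the errors for every entry of the manifest list."""
    errors: List[str] = []
    for path, swhid in bindings.items():
        errors.extend(check_binding(path, swhid))
    return errors


if __name__ == "__main__":
    manifest = {
        "path/to/file.txt": "swh:1:cnt:" + "a" * 40,
        "path/to/dir/": "swh:1:cnt:" + "b" * 40,  # wrong type for a directory path
    }
    for error in check_manifest(manifest):
        print(error)
```

Returning a list of messages rather than raising on the first problem keeps each failing check distinguishable, which matches the requirement that each failing check returns a different error. The third check (locating the identifiers in the archive) needs an archive lookup and is left out.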