diff --git a/docs/specs/metadata_example.xml b/docs/specs/metadata_example.xml
index 59c5ed82..c681e559 100644
--- a/docs/specs/metadata_example.xml
+++ b/docs/specs/metadata_example.xml
@@ -1,38 +1,35 @@
"{http://www.w3.org/2005/Atom}author": {
"{http://www.w3.org/2005/Atom}email": "hal@ccsd.cnrs.fr",
"{http://www.w3.org/2005/Atom}name": "HAL"
},
HAL
hal@ccsd.cnrs.fr
hal
hal-01243573
The assignment problem
https://hal.archives-ouvertes.fr/hal-01243573
other identifier, DOI, ARK
Domain
description
author1
Inria
UPMC
author2
Inria
UPMC
-
-
- ./path/to/file.txt
- aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-
-
+
+
+
diff --git a/docs/specs/spec-meta-deposit.rst b/docs/specs/spec-meta-deposit.rst
index 9ba0c8ef..a4f700c8 100644
--- a/docs/specs/spec-meta-deposit.rst
+++ b/docs/specs/spec-meta-deposit.rst
@@ -1,104 +1,100 @@
-The meta-deposit
-================
+The metadata-deposit
+====================
Goal
----
A client wishes to deposit only metadata about an object in the Software
Heritage archive.
The meta-deposit is a special deposit where no content is
deposited and the data transferred to Software Heritage is only
the metadata about one or several objects in the archive.
The scope of the meta-deposit differs from that of the
sparse-deposit: while a sparse-deposit creates a revision with referenced
directories and content files, the meta-deposit references one of the following:
- origin
- snapshot
- revision
- release
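As an illustration of the four target types, the sketch below classifies a reference string. It assumes that snapshots, revisions and releases are referenced by identifiers of the form ``swh:1:snp:...``, ``swh:1:rev:...`` and ``swh:1:rel:...``, and that any other reference is an origin URL. The function name ``target_type`` is hypothetical and not part of the deposit API.

```python
# Hypothetical helper, not part of the deposit API: maps a reference
# string to one of the four target types listed above.
PREFIXES = {"snp": "snapshot", "rev": "revision", "rel": "release"}


def target_type(reference: str) -> str:
    """Return 'origin', 'snapshot', 'revision' or 'release'."""
    if reference.startswith("swh:1:"):
        parts = reference.split(":")
        # A meta-deposit may not target content or directory objects.
        if len(parts) < 4 or parts[2] not in PREFIXES:
            raise ValueError(f"unsupported target: {reference!r}")
        return PREFIXES[parts[2]]
    # Assumption: anything that is not a swh identifier is an origin URL.
    return "origin"
```

A content or directory identifier is rejected with ``ValueError``, since neither is listed as a valid target.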
A complete metadata example
---------------------------
The reference element is included in the metadata xml atomEntry under the
swh namespace:
.. code:: xml
HAL
hal@ccsd.cnrs.fr
hal
hal-01243573
The assignment problem
https://hal.archives-ouvertes.fr/hal-01243573
other identifier, DOI, ARK
Domain
description
author1
Inria
UPMC
author2
Inria
UPMC
- origin
- https://github.com/user/repo
+
Examples by target type
^^^^^^^^^^^^^^^^^^^^^^^
snapshot
*********
.. code:: xml
- snapshot
- swh:1:snp:aaaaaaaaaaaaaa...
+
revision
********
.. code:: xml
- revision
- swh:1:rev:aaaaa............
+
release
*******
.. code:: xml
- release
- swh:1:rel:aaaaaaaaaaaaaa....
+
Loading procedure
------------------
In this case, the meta-deposit will be injected as a metadata entry at the
appropriate level (origin_metadata, revision_metadata, etc.) and won't result
in the creation of a new object, unlike the complete deposit and the
sparse-deposit.
diff --git a/docs/specs/spec-sparse-deposit.rst b/docs/specs/spec-sparse-deposit.rst
index 534957a8..6fd12f5c 100644
--- a/docs/specs/spec-sparse-deposit.rst
+++ b/docs/specs/spec-sparse-deposit.rst
@@ -1,109 +1,104 @@
The sparse-deposit
==================
Goal
----
A client wishes to transfer a tarball for which part of the content is
already in the SWH archive.
Requirements
------------
To do so, the paths to the missing directories/content must be provided as
empty paths in the tarball and the list linking each path to the object in the
archive will be provided as part of the metadata. The list will be referred to
as the manifest list.
+----------------------+-------------------------------------+
| path                 | swh-id                              |
+======================+=====================================+
-| ./path/to/file.txt   | swh:1:cnt:aaaaaaaaaaaaaaaaaaaaa...  |
+| path/to/file.txt     | swh:1:cnt:aaaaaaaaaaaaaaaaaaaaa...  |
+----------------------+-------------------------------------+
-| ./path/to/dir/       | swh:1:dir:aaaaaaaaaaaaaaaaaaaaa...  |
+| path/to/dir/         | swh:1:dir:aaaaaaaaaaaaaaaaaaaaa...  |
+----------------------+-------------------------------------+
Note: the *name* of the file or the directory is given by the path and is not
part of the identified object.
A concrete example
------------------
The manifest list is included in the metadata xml atomEntry under the
swh namespace:
.. code:: xml
HAL
hal@ccsd.cnrs.fr
hal
hal-01243573
The assignment problem
https://hal.archives-ouvertes.fr/hal-01243573
other identifier, DOI, ARK
Domain
description
author1
Inria
UPMC
author2
Inria
UPMC
-
-
- ./path/to/file.txt
- swh:1:cnt:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-
-
- ./path/to/second_file.txt
- swh:1:cnt:bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
-
-
- ./path/to/dir/
- swh:1:dir:ddddddddddddddddddddddddddddddddd
-
-
+
+
+
+
+
+
The tarball sent with the deposit will contain the following empty paths:
- path/to/file.txt
- path/to/second_file.txt
- path/to/dir/
Deposit verification
--------------------
After checking the integrity of the deposit content and
metadata, the following checks should be added:
1. validate the manifest list structure with a swh-id for each path
2. verify that the paths in the manifest list are explicit and empty in the tarball
3. verify that the path name corresponds to the object type
4. locate the identifiers in the SWH archive
Each failed verification should return a distinct error for the deposit
and result in a 'rejected' deposit.
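Checks 1 to 3 can be sketched as follows. This is a minimal illustration, not the deposit server's implementation. It assumes the manifest list has already been parsed into ``(path, swh-id)`` pairs, that identifiers have the form ``swh:1:cnt:`` or ``swh:1:dir:`` followed by 40 hexadecimal characters, and that a directory path is recognised by its trailing ``/`` as in the example above. The name ``check_manifest`` is hypothetical. Check 4 (locating the identifiers in the archive) needs archive access and is left out.

```python
import re

# Assumed identifier shape: swh:1:<cnt|dir>:<40 lowercase hex characters>.
SWHID_RE = re.compile(r"^swh:1:(cnt|dir):([0-9a-f]{40})$")


def check_manifest(manifest, empty_paths):
    """Return a list of error messages; an empty list means checks 1-3 pass.

    manifest: list of (path, swhid) pairs (hypothetical in-memory form).
    empty_paths: set of the empty paths found in the tarball.
    """
    errors = []
    for path, swhid in manifest:
        # Check 1: every path comes with a well-formed swh-id.
        match = SWHID_RE.match(swhid or "")
        if not path or match is None:
            errors.append(f"invalid manifest entry: {path!r} -> {swhid!r}")
            continue
        # Check 2: the path is present and empty in the tarball.
        if path not in empty_paths:
            errors.append(f"path missing or not empty in tarball: {path!r}")
        # Check 3: directory paths (trailing '/') must point to 'dir' objects.
        if path.endswith("/") != (match.group(1) == "dir"):
            errors.append(f"path {path!r} does not match type {match.group(1)!r}")
    return errors
```

Returning one message per failed check keeps the errors distinguishable, so the deposit can be rejected with a specific reason.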
Loading procedure
------------------
The injection procedure should include:
- load the tarball data
- create new objects using the path name and create links from the path to the
SWH object using the identifier
- calculate the identifier of the new objects at each level
- return the final swh-id of the new revision
Invariant: the same content should yield the same swh-id. This is why a complete
deposit with all the content and a sparse-deposit with the correct links will
result in the same root directory swh-id and, if the metadata are identical,
in the same revision swh-id.
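The invariant follows from identifiers being computed from the object itself. As a small illustration for content objects only, the sketch below derives a ``swh:1:cnt:`` identifier the same way git hashes a blob (SHA-1 over a ``blob <length>`` header followed by the bytes). The function name ``content_swhid`` is hypothetical; directory and revision identifiers are computed from their own manifests and are not covered here.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Compute the content identifier of a byte string (git blob hashing)."""
    # The header binds the hash to the object type and its length.
    header = b"blob %d\0" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


# The same bytes always yield the same identifier, whether they arrive in a
# complete deposit or are referenced from a sparse-deposit manifest.
assert content_swhid(b"same bytes") == content_swhid(b"same bytes")
```

Because the file name is not part of the hashed data, two files with identical bytes share one identifier, which matches the note that the name is given by the path.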
diff --git a/docs/specs/swh.xsd b/docs/specs/swh.xsd
index 37ac2cca..4dbf0ac6 100644
--- a/docs/specs/swh.xsd
+++ b/docs/specs/swh.xsd
@@ -1,23 +1,41 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+