docs/specs/protocol-reference.rst

.. _deposit-protocol:

Protocol reference
==================

The swh-deposit protocol is an extension of the SWORDv2_ protocol, and the
swh-deposit client and server should work with any other SWORDv2-compliant
implementation which provides some :ref:`mandatory attributes <mandatory-attributes>`.

However, we define some extensions by means of extra tags in the Atom
entries, which should be used when interacting with the server to use it optimally.
This means the swh-deposit server should work with a generic SWORDv2 client, but
works much better with these extensions.

All these tags are in the ``https://www.softwareheritage.org/schema/2018/deposit``
XML namespace, denoted using the ``swhdeposit`` prefix in this section.

Origin creation with the ``<swhdeposit:create_origin>`` tag
-----------------------------------------------------------

Motivation
^^^^^^^^^^

This is the main extension we define.
This tag is used after a deposit is completed, to load it in the Software Heritage
archive.

The SWH archive references source code repositories by a URI, called the
:term:`origin` URL.
This URI is clearly defined when SWH pulls source code from such a repository,
but not for the push approach used by SWORD, as SWORD clients do not intrinsically
have a URL.

Usage
^^^^^

Instead, clients are expected to provide the origin URL themselves, by adding
a tag in the Atom entry they submit to the server, like this:

.. code:: xml

   <atom:entry xmlns:atom="http://www.w3.org/2005/Atom"
               xmlns:swh="https://www.softwareheritage.org/schema/2018/deposit">

     <!-- ... -->

   </atom:entry>

This will create an origin in the Software Heritage archive that points to
the source code artifacts of this deposit.

Semantics of origin URLs
^^^^^^^^^^^^^^^^^^^^^^^^

Origin URLs must be unique to an origin, i.e. to a software project.
The exact definition of a "software project" is left to the clients of the deposit.
Origin URLs should be designed so that future releases of the same software will have
the same origin URL.
As a guideline, consider that every GitHub/GitLab project is an origin,
and every package in Debian/NPM/PyPI is also an origin.

While origin URLs are not required to resolve to a source code artifact,
we recommend they point to a public resource describing the software project,
including a link to download its source code.
This is not a technical requirement, but it improves discoverability.

Clients may not submit arbitrary URLs; the server checks that each URL they submit
belongs to a "namespace" they own, known as the ``provider_url`` of the client.
For example, if a client has their ``provider_url`` set to ``https://example.org/foo/``,
they will only be able to submit deposits to origins whose URL starts with
``https://example.org/foo/``.
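
The namespace rule above can be illustrated with a minimal sketch. It assumes
the check is a plain string-prefix comparison; the function name
``origin_url_allowed`` is hypothetical, and the actual server may normalize
URLs or apply further checks before comparing them.

.. code:: python

   def origin_url_allowed(origin_url: str, provider_url: str) -> bool:
       """Return True if origin_url lies in the client's provider_url namespace.

       Assumption: "belongs to a namespace" means "starts with provider_url".
       """
       return origin_url.startswith(provider_url)


   # A client whose provider_url is https://example.org/foo/ may use this origin...
   print(origin_url_allowed("https://example.org/foo/my-project",
                            "https://example.org/foo/"))
   # ...but not one outside its namespace.
   print(origin_url_allowed("https://example.org/other/my-project",
                            "https://example.org/foo/"))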

Fallbacks
^^^^^^^^^

If the ``<swhdeposit:create_origin>`` tag is not provided (because the client is
either a generic SWORDv2 implementation or an old implementation of an swh-deposit
client), the server falls back to creating an origin URL based on the
``provider_url`` and the ``Slug`` header (as defined in the AtomPub_ specification)
by concatenating them.
If the ``Slug`` header is missing, the server generates one randomly.

This fallback is provided for compliance with SWORDv2_ clients, but we do not
recommend relying on it, as it usually creates origin URLs that are not meaningful.
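
As a rough illustration of this fallback, the sketch below concatenates the
``provider_url`` with the ``Slug`` value. The function name
``fallback_origin_url`` is hypothetical, and using a random UUID when the slug
is missing is an assumption; the text only says the server generates one
randomly.

.. code:: python

   import uuid
   from typing import Optional


   def fallback_origin_url(provider_url: str, slug: Optional[str] = None) -> str:
       """Build an origin URL when no create_origin tag was provided."""
       if slug is None:
           # Assumption: a random UUID stands in for the server-generated slug.
           slug = str(uuid.uuid4())
       # Plain concatenation, as described in the text above.
       return provider_url + slug


   print(fallback_origin_url("https://example.org/foo/", "my-project"))

An origin URL built from a random slug carries no information about the
project, which is why providing the tag explicitly is preferable.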

Adding releases to an origin, with the ``<swhdeposit:add_to_origin>`` tag
-------------------------------------------------------------------------

When depositing a source code artifact for an origin (i.e. software project) that
was already deposited before, clients should not use ``<swhdeposit:create_origin>``,
as the origin was already created by the original deposit;
``<swhdeposit:add_to_origin>`` should be used instead.

It is used in much the same way as ``<swhdeposit:create_origin>``.

This will create a new :term:`revision` object in the Software Heritage archive,
with the last deposit on this origin as its parent revision,
and reference it from the origin.

If the origin does not exist, the server returns an error.

Metadata
--------

Format
^^^^^^

While the SWORDv2 specification recommends the use of DublinCore_,
we prefer the CodeMeta_ vocabulary, as we already use it in other components
of Software Heritage.

While CodeMeta is designed for use in JSON-LD, it is easy to reuse its vocabulary
and embed it in an XML document, in three steps:

.. code:: xml

   <!-- ... -->
     <codemeta:author>
       <codemeta:name>Author 2</codemeta:name>
     </codemeta:author>
   </entry>

.. _mandatory-attributes:

Mandatory attributes
^^^^^^^^^^^^^^^^^^^^

All deposits must include:

* an ``<atom:author>`` tag with an ``<atom:name>`` and ``<atom:email>``, and
* either ``<atom:name>`` or ``<atom:title>``

We also highly recommend their CodeMeta equivalents, and any other relevant
metadata, but this is not enforced.
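
A client could check these requirements before submitting an entry. The
sketch below is one possible way to do so with Python's standard
``xml.etree.ElementTree`` module; the function name
``missing_mandatory_attributes`` is hypothetical, and it assumes the mandatory
tags are direct children of the Atom entry, which may differ from what the
server actually validates.

.. code:: python

   import xml.etree.ElementTree as ET
   from typing import List

   ATOM = "{http://www.w3.org/2005/Atom}"


   def missing_mandatory_attributes(entry_xml: str) -> List[str]:
       """List the mandatory Atom tags that are absent from an entry."""
       entry = ET.fromstring(entry_xml)
       missing = []

       # <atom:author> must exist and contain <atom:name> and <atom:email>.
       author = entry.find(f"{ATOM}author")
       if author is None:
           missing.append("atom:author")
       else:
           for child in ("name", "email"):
               if author.find(f"{ATOM}{child}") is None:
                   missing.append(f"atom:author/atom:{child}")

       # Either <atom:name> or <atom:title> must be present on the entry.
       if entry.find(f"{ATOM}name") is None and entry.find(f"{ATOM}title") is None:
           missing.append("atom:name or atom:title")

       return missing


   entry = """<entry xmlns="http://www.w3.org/2005/Atom">
     <title>Example project</title>
     <author><name>Author 1</name><email>author1@example.org</email></author>
   </entry>"""
   print(missing_mandatory_attributes(entry))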

.. _metatadata-only-deposit:

Metadata-only deposit
---------------------

The swh-deposit server can also be used without a source code artifact, only
to provide metadata that describes an arbitrary origin or object in
Software Heritage; this is known as extrinsic metadata.

Unlike regular deposits, there are no restrictions on URL prefixes,
so any client can provide metadata on any origin; and no restrictions on which
objects can be described.