-``:`` is used as separator between the logical parts of core identifiers. The ``swh``
-prefix makes explicit that these identifiers are related to *SoftWare
+``:`` is used as separator between the logical parts of core identifiers. The
+``swh`` prefix makes explicit that these identifiers are related to *SoftWare
Heritage*. ``1`` (``<scheme_version>``) is the current version of this
-identifier *scheme*; future editions will use higher version numbers, possibly
-breaking backward compatibility (but without breaking the resolvability of
-SWHIDs that conform to previous versions of the scheme).
+identifier *scheme*. Future editions will use higher version numbers, possibly
+breaking backward compatibility, but without breaking the resolvability of
+SWHIDs that conform to previous versions of the scheme.
A SWHID points to a single object, whose type is explicitly captured by
``<object_type>``:
@@ -151,23 +162,27 @@
quotes), a space, the length of the content as decimal digits, a NULL byte,
and the actual content of the file.
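
The content hashing described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation. It assumes the digest is SHA-1, as in Git blob hashing, and that the header starts with the ASCII string ``blob``. The function name ``content_swhid`` is hypothetical.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return the core SWHID of a content object (hypothetical helper).

    Assumption: the digest is SHA-1 over a Git-style blob header
    followed by the raw bytes of the file.
    """
    # Header: "blob", a space, the length as decimal digits, a NULL byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    # Core identifier: swh:<scheme_version>:<object_type>:<object_id>
    return "swh:1:cnt:" + digest


if __name__ == "__main__":
    # The empty file hashes to the well-known Git empty-blob identifier.
    print(content_swhid(b""))
    # swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Under the same assumption, the result should match the output of ``git hash-object`` on the same file, prefixed with ``swh:1:cnt:``.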
+
Qualifiers
-~~~~~~~~~~
+----------
``;`` is used as separator between the core identifier and the optional
-qualifiers, and optional qualifiers. Each qualifier is specified as a
+qualifiers, as well as between qualifiers. Each qualifier is specified as a
key/value pair, using ``=`` as a separator.
The following *context qualifiers* are available:
-* **origin** : the *software origin* where an object has been found or observed
+* **origin:** the *software origin* where an object has been found or observed
in the wild, as an URI;
-* **visit** : the core identifier of a *snapshot* corresponding to a specific
+
+* **visit:** the core identifier of a *snapshot* corresponding to a specific
*visit* of a repository containing the designated object;
-* **anchor** : a *designated node* in the Merkle DAG relative to which a *path
+
+* **anchor:** a *designated node* in the Merkle DAG relative to which a *path
to the object* is specified, as the core identifier of a directory, a
revision, a release or a snapshot;
-* **path** : the *absolute file path*, from the *root directory* associated to
+
+* **path:** the *absolute file path*, from the *root directory* associated to
the *anchor node*, to the object; when the anchor denotes a directory or a
revision, and almost always when it's a release, the root directory is
uniquely determined; when the anchor denotes a snapshot, the root directory
@@ -176,7 +191,7 @@
The following *fragment qualifier* is available:
-* **lines** : *line number(s)* of interest, usually within a content object
+* **lines:** *line number(s)* of interest, usually within a content object
We recommend to equip identifiers meant to be shared with as many qualifiers as
possible. While qualifiers may be listed in any order, it is good practice to
@@ -186,44 +201,69 @@
there, then the *anchor* qualifier is superfluous; similarly, if the *path* is
empty, it may be omitted.
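
The separators described in this section can be illustrated with a small parser. This is only a sketch under stated assumptions. The function name ``parse_swhid`` and the returned dictionary layout are hypothetical, no validation of object types or identifier length is performed, and percent-decoding is applied to the *path* qualifier only.

```python
from urllib.parse import unquote


def parse_swhid(swhid: str) -> dict:
    """Split a SWHID into its core parts and qualifiers (hypothetical helper)."""
    # ";" separates the core identifier from the qualifiers,
    # and the qualifiers from each other.
    core, *raw_qualifiers = swhid.split(";")
    # ":" separates the logical parts of the core identifier.
    scheme, version, object_type, object_id = core.split(":")
    qualifiers = {}
    for item in raw_qualifiers:
        if not item:
            continue  # tolerate an empty segment such as ";;"
        # "=" separates key and value; later "=" signs belong to the value.
        key, _, value = item.partition("=")
        # Assumption: only the path value is percent-decoded here.
        qualifiers[key] = unquote(value) if key == "path" else value
    return {
        "scheme": scheme,
        "version": version,
        "object_type": object_type,
        "object_id": object_id,
        "qualifiers": qualifiers,
    }


if __name__ == "__main__":
    parsed = parse_swhid(
        "swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b"
        ";origin=https://gitorious.org/ocamlp3l/ocamlp3l_cvs.git"
        ";path=/Examples/SimpleFarm/simplefarm.ml;lines=9-15"
    )
    print(parsed["object_type"], parsed["qualifiers"])
```

Splitting on ``;`` before splitting on ``:`` matters because qualifier values such as an origin URL may themselves contain colons.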
+
+Interoperability
+================
+
+
+URI scheme
+----------
+
+The ``swh`` URI scheme is registered at IANA for SWHIDs. The present document
+constitutes the scheme specification for this URI scheme.
+
+
Git compatibility
-~~~~~~~~~~~~~~~~~
+-----------------
SWHIDs for contents, directories, revisions, and releases are, at present,
compatible with the `Git <https://git-scm.com/>`_ way of `computing identifiers
<https://git-scm.com/book/en/v2/Git-Internals-Git-Objects>`_ for its objects.
The ``<object_id>`` part of a SWHID for a content object is the Git blob
identifier of any file with the same content; for a revision it is the Git
-commit identifier for the same revision, etc. This is not the case for snapshot
-identifiers, as Git does not have a corresponding object type.
+commit identifier for the same revision, etc. This is not the case for
+snapshot identifiers, as Git does not have a corresponding object type.
Note that Git compatibility is incidental and is not guaranteed to be
maintained in future versions of this scheme (or Git).
Examples
---------
+========
+
Core identifiers
-~~~~~~~~~~~~~~~~
+----------------
* ``swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2`` points to the content
of a file containing the full text of the GPL3 license
+
* ``swh:1:dir:d198bc9d7a6bcf6db04f476d29314f157507d505`` points to a directory
containing the source code of the Darktable photography application as it was
at some point on 4 May 2017
+
* ``swh:1:rev:309cf2674ee7a0749978cf8265ab91a60aea0f7d`` points to a commit in
the development history of Darktable, dated 16 January 2017, that added
undo/redo supports for masks
+
* ``swh:1:rel:22ece559cc7cc2364edc5e5593d63ae8bd229f9f`` points to Darktable
release 2.3.0, dated 24 December 2016
+
* ``swh:1:snp:c7c108084bc0bf3d81436bf980b46e98bd338453`` points to a snapshot
of the entire Darktable Git repository taken on 4 May 2017 from GitHub
+
Identifiers with qualifiers
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+---------------------------
-* The following `fully qualified SWHID <https://archive.softwareheritage.org/swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b;;origin=https://gitorious.org/ocamlp3l/ocamlp3l_cvs.git;visit=swh:1:snp:d7f1b9eb7ccb596c2622c4780febaa02549830f9;anchor=swh:1:rev:2db189928c94d62a3b4757b3eec68f0a4d4113f0;path=/Examples/SimpleFarm/simplefarm.ml;lines=9-15>`_ denotes the lines 9 to 15 of a file content that can be found at absolute path ``/Examples/SimpleFarm/simplefarm.ml`` from the root directory of the revision ``swh:1:rev:2db189928c94d62a3b4757b3eec68f0a4d4113f0`` that is contained in the snapshot ``swh:1:snp:d7f1b9eb7ccb596c2622c4780febaa02549830f9`` taken from the origin ``https://gitorious.org/ocamlp3l/ocamlp3l_cvs.git``
-* This is an example of `a fully qualified SWHID with a percent escaped file path <https://archive.softwareheritage.org/swh:1:cnt:f10371aa7b8ccabca8479196d6cd640676fd4a04;origin=https://github.com/web-platform-tests/wpt;visit=swh:1:snp:b37d435721bbd450624165f334724e3585346499;anchor=swh:1:rev:259d0612af038d14f2cd889a14a3adb6c9e96d96;path=/html/semantics/document-metadata/the-meta-element/pragma-directives/attr-meta-http-equiv-refresh/support/x%3Burl=foo/>`_
+tool, available from the `swh.model <https://pypi.org/project/swh.model/>`_
+Python package under the GPL license.
-An important property of SWHIDs is that a core identifier is *intrinsic*: it can
-be *computed from the object itself* using the `swh-identify <https://docs.softwareheritage.org/devel/swh-model/cli.html>`_ utility, or equivalently using standard git tools.
+SWHIDs are also automatically computed by Software Heritage for all archived
+objects as part of its archival activity, and can be looked up via the project
-Note that resolution via Identifiers.org currently only supports *core identifiers* due to `syntactic incompatibilities with qualifiers <http://identifiers.org/documentation#custom_requests>`_.