diff --git a/docs/api.rst b/docs/api.rst
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -25,51 +25,42 @@
 ---------------------
 
 The vault stores bundles corresponding to different kinds of objects (see
-:ref:`data-model`). The following object kinds are currently supported by the
-Vault:
+:ref:`data-model`).
 
-- directories
-- revisions
-- snapshots
+The URL fragment ``:bundletype/:swhid`` is used throughout the vault API to
+identify vault objects. See :ref:`persistent-identifiers` for details on
+the syntax and meaning of ``:swhid``.
 
-The URL fragment ``:objectkind/:objectid`` is used throughout the vault API to
-identify vault objects. The syntax and meaning of ``:objectid`` for the
-different object kinds is detailed below.
 
-In the case of revisions, a third parameter, ``:format``, must be used to
-specify the format of the resulting bundle. The URL fragment then becomes
-``:objectkind/:objectid/:format``.
+Bundle types
+------------
 
 
-Directories
-~~~~~~~~~~~
+Flat
+~~~~
 
-- object kind: ``directory``
-- URL fragment: ``directory/:dir_id``
+Flat bundles are simple tarballs that can be read without any specialized software.
 
-where ``:dir_id`` is a :py:func:`directory identifier
-`.
+When cooking directories, they are (very close to) the original directories that
+were ingested.
+When cooking other types of objects, they have multiple root directories,
+each corresponding to an original object (revision, ...).
 
-The only format available for a directory export is a gzip-compressed
-tarball. You can extract the resulting bundle using:
+This is typically only useful to cook directories; cooking other types of objects
+(revisions, releases, snapshots) is usually done with ``git-bare`` as it is
+more efficient and closer to the original repository.
+
+You can extract the resulting bundle using:
 
 .. code:: shell
 
     tar xaf bundle.tar.gz
 
 
-Revisions
-~~~~~~~~~
-
-- object kind: ``revision``
-- URL fragment: ``revision/:rev_id/:format``
-
-where ``:rev_id`` is a :py:func:`revision identifier
-` and ``:format`` is the export
-format.
+gitfast
+~~~~~~~
 
-The only format available for a revision export is ``gitfast``: a
-gzip-compressed `git fast-export
+A gzip-compressed `git fast-export
 `_. You can extract the resulting
 bundle using:
 
@@ -80,18 +71,27 @@
     git checkout HEAD
 
 
-Repository snapshots
-~~~~~~~~~~~~~~~~~~~~
+git-bare
+~~~~~~~~
+
+A tarball that can be decompressed to get a real git repository.
+It is without a checkout, so it is the equivalent of what one would get
+with ``git clone --bare``.
+
+This is the most flexible bundle type, as it allows perfectly recreating
+original git repositories, including branches.
+
+You can extract the resulting bundle using:
 
-.. TODO
+.. code:: shell
+
+    tar xaf bundle.tar.gz
 
-**(NOT AVAILABLE YET)**
+Then explore its content like a normal ("non-bare") git repository by cloning it:
 
-- object kind: ``snapshot``
-- URL fragment: ``snapshot/:snp_id``
+.. code:: shell
 
-where ``:snp_id`` is a :py:func:`snapshot identifier
-`.
+    git clone path/to/extracted/:swhid
 
 
 Cooking and status checking
@@ -103,8 +103,8 @@
 before it can be retrieved. Cooking is idempotent, and a no-op in
 between a previous cooking operation and expiration.
 
-.. http:post:: /vault/:objectkind/:objectid[/:format]
-.. http:get:: /vault/:objectkind/:objectid[/:format]
+.. http:post:: /vault/:bundletype/:swhid
+.. http:get:: /vault/:bundletype/:swhid
 
     **Request body**: optionally, an ``email`` POST parameter containing an
     e-mail to notify when the bundle cooking has ended.
@@ -119,7 +119,7 @@
     **Response:**
 
     :statuscode 200: bundle available for cooking, status of the cooking
-    :statuscode 400: malformed identifier hash or format
+    :statuscode 400: malformed SWHID
     :statuscode 404: unavailable bundle or object not found
 
     .. sourcecode:: http
@@ -129,9 +129,8 @@
 
         {
             "id": 42,
-            "fetch_url": "/api/1/vault/directory/:dir_id/raw/",
-            "obj_id": ":dir_id",
-            "obj_type": "directory",
+            "fetch_url": "/api/1/vault/flat/:swhid/raw/",
+            "swhid": ":swhid",
             "progress_message": "Creating tarball...",
             "status": "pending"
         }
@@ -145,10 +144,7 @@
 
     - ``fetch_url``: the URL that can be used for the retrieval of the bundle
 
-    - ``obj_type``: an internal identifier uniquely representing the object
-      kind and the format of the required bundle.
-
-    - ``obj_id``: the identifier of the requested bundle
+    - ``swhid``: the identifier of the requested bundle
 
     - ``progress_message``: a string describing the current progress of the
       cooking. If the cooking failed, ``progress_message`` will contain the
@@ -166,9 +162,7 @@
 
 Retrieve a specific bundle from the vault with:
 
-.. http:get:: /vault/:objectkind/:objectid[/:format]/raw
-
-    Where ``:format`` is optional, depending on the object kind.
+.. http:get:: /vault/:bundletype/:swhid/raw
 
     **Allowed HTTP Methods:** :http:method:`get`, :http:method:`head`,
     :http:method:`options`
diff --git a/docs/getting-started.rst b/docs/getting-started.rst
--- a/docs/getting-started.rst
+++ b/docs/getting-started.rst
@@ -19,10 +19,9 @@
 
 .. code:: shell
 
-    curl -X POST https://archive.softwareheritage.org/api/1/vault/directory/:dir_id/
+    curl -X POST https://archive.softwareheritage.org/api/1/vault/flat/:swhid/
 
-where ``:dir_id`` is a :py:func:`directory identifier
-`. This initial request and all
+where ``:swhid`` is a :ref:`persistent-identifier`. This initial request and all
 subsequent requests to this endpoint will return some JSON data containing
 information about the progress of bundle creation:
 
@@ -30,9 +29,8 @@
 
     {
         "id": 42,
-        "fetch_url": "/api/1/vault/directory/:dir_id/raw/",
-        "obj_id": ":dir_id",
-        "obj_type": "directory",
+        "fetch_url": "/api/1/vault/flat/:swhid/raw/",
+        "swhid": ":swhid",
         "progress_message": "Creating tarball...",
         "status": "pending"
     }
@@ -42,7 +40,7 @@
 
 .. code:: shell
 
-    curl -o bundle.tar.gz https://archive.softwareheritage.org/api/1/vault/directory/:dir_id/raw
+    curl -o bundle.tar.gz https://archive.softwareheritage.org/api/1/vault/flat/:swhid/raw
     tar xaf bundle.tar.gz
 
 E-mail notifications