diff --git a/docs/index.rst b/docs/index.rst
index a6009a15..cf9e0071 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,80 +1,81 @@
 .. _swh-deposit:

 Software Heritage - Deposit
 ===========================

 Push-based deposit of software source code artifacts and metadata to the
 Software Heritage (SWH) Archive.

 Description
 -----------

 Most of the software source code artifacts present in the SWH Archive are
 gathered by means of :term:`loader ` workers run by the SWH project from
 source code origins identified by :term:`lister ` workers. This is a pull
 mechanism: it's the responsibility of the SWH project to gather and collect
 source code artifacts that way.

 Alternatively, SWH allows its partners to push source code artifacts and
 metadata directly into the Archive with a push-based mechanism. By using this
 possibility, different actors holding software artifacts or metadata can
 preserve their assets without having to pass through an intermediate
 collaborative development platform that is already harvested by SWH (e.g.
 GitHub, GitLab, etc.).

 This mechanism is the `deposit`.

 The main idea is that the deposit is an authenticated access to an API
 allowing the user to provide source code artifacts -- with metadata -- to be
 ingested in the SWH Archive. The result is a :ref:`SWHID ` that can be used to
 uniquely and persistently identify that very piece of source code.

 This unique identifier can then be used to `reference the source code `_
 (e.g. in a `scientific paper `_) and retrieve it using the :ref:`vault `
 feature of the SWH Archive platform.

 Compared with simply asking SWH to archive a repository using the
 `save code now `_ feature, a piece of code uploaded using the deposit differs
 as follows:

 - a deposited artifact is provided by one of the SWH partners, which is
   regarded as a trusted authority,
 - a deposited artifact requires metadata properties describing the source
   code artifact,
 - a deposited artifact has a codemeta_ metadata entry attached to it,
 - a deposited artifact has the same visibility on the SWH Archive as a
   collected repository,
 - a deposited artifact can be searched for by its provided URL property on
   the SWH Archive,
 - the deposit API uses the `SWORD v2`_ API, and thus requires some tooling to
   send deposits to SWH. These tools are provided with this repository.

-See the :ref:`getting-started` page for more details on how to use the deposit
+
+See the :ref:`user-manual` page for more details on how to use the deposit
 client tools to push a deposit in the SWH Archive.

 .. _codemeta: https://codemeta.github.io/
 .. _`SWORD v2`: http://swordapp.org/sword-v2/


 .. toctree::
    :maxdepth: 2
    :caption: Contents:

-   getting-started
+   user-manual
    spec-api
    metadata
    dev-info
    sys-info
    specs/index
    tests/tests_HAL.rst


 Reference Documentation
 -----------------------

 .. toctree::
    :maxdepth: 2

    /apidoc/swh.deposit
diff --git a/docs/getting-started.rst b/docs/user-manual.rst
similarity index 91%
rename from docs/getting-started.rst
rename to docs/user-manual.rst
index 6aa57bbf..458c3a0f 100644
--- a/docs/getting-started.rst
+++ b/docs/user-manual.rst
@@ -1,285 +1,287 @@
-Getting Started
-===============
+User Manual
+===========

 This is a guide for how to prepare and push a software deposit with
 the `swh deposit` commands.

-The API is rooted at https://deposit.softwareheritage.org/1.
-
-For more details, see the `main documentation <./index.html>`__.

 Requirements
 ------------

-You need to be referenced on SWH's client list to have:
-
-* credentials (needed for the basic authentication step)
-
-  - in this document we reference ```` as the client's name and
-    ```` as its associated authentication password.
-
-* an associated collection_.
+You need to have an account on the Software Heritage deposit application to be
+able to use the service.

+Please `contact the Software Heritage team `_ for
+more information on how to get access to this service.

-.. _collection: https://bitworking.org/projects/atom/rfc5023#rfc.section.8.3.3
+For testing purposes, a test instance `is available
+`_ [#f1]_ and will be used in the examples below.

+Once you have an account, you should get a set of access credentials as a
+`login` and a `password` (identified as ```` and ```` in the
+remainder of this document).

-`Contact us for more information.
-`__

 Prepare a deposit
 -----------------

 * compress the files in a supported archive format:

   - zip: common zip archive (no multi-disk zip files).
   - tar: tar archive, either uncompressed or compressed with one of the
     following algorithms: gzip (`.tar.gz`, `.tgz`), bzip2 (`.tar.bz2`),
     or lzma (`.tar.lzma`)

 * (Optional) prepare a metadata file (more details :ref:`deposit-metadata`).

 Push deposit
 ------------

 You can push a deposit with:

 * a single deposit (archive + metadata):

   The user posts in one query a software source code archive and its
   associated metadata. The deposit is directly marked with status
   ``deposited``.

 * a multistep deposit:

   1. Create an incomplete deposit (marked with status ``partial``)
   2. Add data to a deposit (in multiple requests if needed)
   3. Finalize deposit (the status becomes ``deposited``)


 Single deposit
 ^^^^^^^^^^^^^^

 Once the files are ready for deposit, we want to do the actual deposit
 in one shot, sending exactly one POST query:

 * 1 archive (content-type ``application/zip`` or ``application/x-tar``)
 * 1 metadata file in atom xml format (``content-type: application/atom+xml;type=entry``)

 For this, we need to provide the:

 * arguments: ``--username 'name' --password 'pass'`` as credentials
 * archive's path (example: ``--archive path/to/archive-name.tgz``)
 * software's name (optional if a metadata filepath is specified and the
   artifact's name is included in the metadata file).
 * author's name (optional if a metadata filepath is specified and the authors
   are included in the metadata file). This can be specified multiple times in
   case of multiple authors.
 * (optionally) metadata file's path ``--metadata path/to/file.metadata.xml``.
 * (optionally) ``--slug 'your-id'`` argument, a reference to a unique
   identifier the client uses for the software object. If not provided, a UUID
   will be generated by SWH.

 You can do this with the following command:

 minimal deposit

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --author "Jane Doe" \
                         --author "John Doe" \
                         --name 'je-suis-gpl' \
                         --archive je-suis-gpl.tgz

 with client's external identifier (``slug``)

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --author "Jane Doe" \
                         --name 'je-suis-gpl' \
                         --archive je-suis-gpl.tgz \
                         --slug je-suis-gpl

 to a specific client's collection

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --author "Jane Doe" \
                         --name 'je-suis-gpl' \
                         --archive je-suis-gpl.tgz \
                         --collection 'second-collection'


 You just posted a deposit to your collection on Software Heritage.


 If everything went well, the successful response will contain the
 elements below:

 .. code:: shell

    {
      'deposit_status': 'deposited',
      'deposit_id': '7',
      'deposit_date': 'Jan. 29, 2018, 12:29 p.m.'
    }

 Note: As the deposit is in ``deposited`` status, you can no longer
 update the deposit after this query. Any further update attempt will be
 answered with a ``403 Forbidden`` response.

 If something went wrong, an equivalent response will be given with the
 `error` and `detail` keys explaining the issue, e.g.:

 .. code:: shell

    {
      'error': 'Unknown collection name xyz',
      'detail': None,
      'deposit_status': None,
      'deposit_status_detail': None,
      'deposit_swh_id': None,
      'status': 404
    }


 Multistep deposit
 ^^^^^^^^^^^^^^^^^

 The steps to create a multistep deposit:

 1. Create an incomplete deposit
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 First, use the ``--partial`` argument to declare that there is more to come:

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --archive foo.tar.gz \
                         --partial


 2. Add content or metadata to the deposit
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 Continue the deposit by passing, with the ``--deposit-id`` argument, the
 deposit id returned in the response to the first step. You can keep adding
 content or metadata as long as you use the ``--partial`` argument.

 To only add one new archive to the deposit:

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --archive add-foo.tar.gz \
                         --deposit-id 42 \
                         --partial

 To only add metadata to the deposit:

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --metadata add-foo.tar.gz.metadata.xml \
                         --deposit-id 42 \
                         --partial

 or:

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --name 'add-foo' --author 'someone' \
                         --deposit-id 42 \
                         --partial


 3. Finalize deposit
 ~~~~~~~~~~~~~~~~~~~

 On your last addition (same command as before), by not declaring it
 ``--partial``, the deposit will be considered completed.
 Its status will be changed to ``deposited``.


 Update deposit
 ----------------

 * replace deposit:

   - only possible if the deposit status is ``partial`` and a
     ``--deposit-id`` value is provided

   - by using the ``--replace`` flag

     - ``--metadata-deposit`` replaces associated existing metadata
     - ``--archive-deposit`` replaces associated archive(s)
     - by default, with no flag or both, you'll replace associated
       metadata and archive(s):

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --deposit-id 11 \
                         --archive updated-je-suis-gpl.tgz \
                         --replace

 * update a loaded deposit with a new version:

   - by using the external-id with the ``--slug`` argument, you will
     link the new deposit with its parent deposit:

 .. code:: shell

    $ swh deposit upload --username name --password secret \
                         --archive je-suis-gpl-v2.tgz \
                         --slug 'je-suis-gpl'



 Check the deposit's status
 --------------------------

 You can check the status of the deposit by using the ``--deposit-id`` argument:

 .. code:: shell

    $ swh deposit status --username name --password secret \
                         --deposit-id 11

 .. code:: json

    {
      'deposit_id': '11',
      'deposit_status': 'deposited',
      'deposit_swh_id': None,
      'deposit_status_detail': 'Deposit is ready for additional checks \
                                (tarball ok, metadata, etc...)'
    }

 The different statuses:

 - **partial**: multipart deposit is still ongoing
 - **deposited**: deposit completed
 - **rejected**: deposit failed the checks
 - **verified**: content and metadata verified
 - **loading**: loading in progress
 - **done**: loading completed successfully
 - **failed**: the deposit loading has failed

 When the deposit has been loaded into the archive, the status will be
 marked ``done``. The response will then also contain the ``deposit_swh_id``
 and ``deposit_swh_id_context`` entries. For example:

 .. code:: json

    {
      'deposit_id': '11',
      'deposit_status': 'done',
      'deposit_swh_id': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9',
      'deposit_swh_id_context': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9;origin=https://forge.softwareheritage.org/source/jesuisgpl/;visit=swh:1:snp:68c0d26104d47e278dd6be07ed61fafb561d0d20;anchor=swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb;path=/',
      'deposit_status_detail': 'The deposit has been successfully \
                                loaded into the Software Heritage archive'
    }
+
+
+
+.. rubric:: Footnotes
+
+.. [#f1] the test instance of the deposit is not yet available to external
+   users, but it should be available soon.