diff --git a/docs/user-manual.rst b/docs/user-manual.rst index 458c3a0f..e7a29438 100644 --- a/docs/user-manual.rst +++ b/docs/user-manual.rst @@ -1,287 +1,411 @@ +.. _user-manual: + User Manual =========== This is a guide for how to prepare and push a software deposit with the `swh deposit` commands. Requirements ------------ You need to have an account on the Software Heritage deposit application to be able to use the service. Please `contact the Software Heritage team `_ for more information on how to get access to this service. For testing purposes, a test instance `is available `_ [#f1]_ and will be used in the examples below. Once you have an account, you should get a set of access credentials as a `login` and a `password` (identified as ```` and ```` in the -remaining of this document.) +remainder of this document). A deposit account also comes with a "provider URL" +which is used by SWH to build the :term:`Origin URL` of deposits +created using this account. + + +Installation +------------ + +To install the `swh.deposit` command line tools, you need a working Python 3.7+ +environment. It is strongly recommended you use a `virtualenv +`_ for this. + +.. code:: console + + $ python3 -m virtualenv deposit + [...] + $ source deposit/bin/activate + (deposit)$ pip install swh.deposit + [...] + (deposit)$ swh deposit --help + Usage: swh deposit [OPTIONS] COMMAND [ARGS]... + + Deposit main command + + Options: + -h, --help Show this message and exit. + + Commands: + admin Server administration tasks (manipulate user or... + status Deposit's status + upload Software Heritage Public Deposit Client Create/Update... + (deposit)$ + +Note: in the examples below, we use the `jq`_ tool to make JSON output easier to read. +If you do not have it already, you can install it using your distribution's +packaging system. For example, on a Debian system: + +.. _jq: https://stedolan.github.io/jq/ + +.. 
code:: console + + $ sudo apt install jq Prepare a deposit ----------------- + * compress the files in a supported archive format: - zip: common zip archive (no multi-disk zip files). - tar: tar archive without compression or optionally any of the following compression algorithms: gzip (`.tar.gz`, `.tgz`), bzip2 (`.tar.bz2`), or lzma (`.tar.lzma`) * (Optional) prepare a metadata file (more details :ref:`deposit-metadata`): +Example: + +Assuming you want to deposit the source code of `belenios +`_ version 1.12: + +.. code:: console + + (deposit)$ wget https://gitlab.inria.fr/belenios/belenios/-/archive/1.12/belenios-1.12.zip + [...] + 2020-10-28 11:40:37 (4,56 MB/s) - ‘belenios-1.12.zip’ saved [449880/449880] + (deposit)$ + +Then you need to prepare a metadata file allowing you to give detailed +information on your deposited source code. A rather minimal Atom with Codemeta +file could be: + +.. code:: console + + (deposit)$ cat metadata.xml + + + Verifiable online voting system + belenios + belenios-01243065 + test-01243065 + https://gitlab.inria.fr/belenios/belenios + test + Online voting + Verifiable online voting system + 1.12 + opam + stable + ocaml + + GNU Affero General Public License + + + Belenios + belenios@example.com + + + Belenios Test User + + + + (deposit)$ + +Please read the :ref:`deposit-metadata` page for a more detailed view on the +metadata file formats and semantics. + + +Push a deposit +-------------- -Push deposit ------------- You can push a deposit with: * a single deposit (archive + metadata): The user posts in one query a software source code archive and associated metadata. The deposit is directly marked with status ``deposited``. * a multisteps deposit: 1. Create an incomplete deposit (marked with status ``partial``) 2. Add data to a deposit (in multiple requests if needed) 3. Finalize deposit (the status becomes ``deposited``) +Overall, a deposit goes through a series of steps, as follows: + +.. 
figure:: images/status.svg + :alt: + +The important thing to notice for now is that a deposit can be in one of the following states: + +partial: + the deposit is partially received + +expired: + deposit has been there too long and is now deemed + ready to be garbage collected + +deposited: + deposit is complete and is ready to be checked to ensure data consistency + +verified: + deposit is fully received, checked, and ready for loading + +loading: + loading is ongoing on swh's side + +done: + loading is successful + +failed: + loading has failed + + +When you push a deposit, it is either in the `deposited` state or in the +`partial` state if you asked for a partial upload. + + + Single deposit ^^^^^^^^^^^^^^ - -Once the files are ready for deposit, we want to do the actual deposit -in one shot, sending exactly one POST query: +Once the files are ready for deposit, we want to do the actual deposit in one +shot, i.e. sending both the archive (zip) file and the metadata file. * 1 archive (content-type ``application/zip`` or ``application/x-tar``) * 1 metadata file in atom xml format (``content-type: application/atom+xml;type=entry``) For this, we need to provide the: * arguments: ``--username 'name' --password 'pass'`` as credentials * archive's path (example: ``--archive path/to/archive-name.tgz``) -* software's name (optional if a metadata filepath is specified and the - artifact's name is included in the metadata file). -* author's name (optional if a metadata filepath is specified and the authors - are included in the metadata file). This can be specified multiple times in - case of multiple authors. -* (optionally) metadata file's path ``--metadata - path/to/file.metadata.xml``. -* (optionally) ``--slug 'your-id'`` argument, a reference to a unique identifier - the client uses for the software object. If not provided, A UUID will be - generated by SWH. 
+* metadata file path (example: ``--metadata path/to/metadata.xml``) + +to the `swh deposit upload` command. 
-You can do this with the following command: -minimal deposit -.. code:: shell +Example: - $ swh deposit upload --username name --password secret \ - --author "Jane Doe" \ - --author "John Doe" \ - --name 'je-suis-gpl' \ - --archive je-suis-gpl.tgz +To push the Belenios 1.12 archive we prepared previously to the testing instance of the +deposit: -with client's external identifier (``slug``) +.. code:: console -.. code:: shell + (deposit)$ ls + belenios-1.12.zip metadata.xml deposit + (deposit)$ swh deposit upload --username --password \ + --url https://deposit.staging.swh.network/1 \ + --slug belenios-01243065 \ + --archive belenios-1.12.zip \ + --metadata metadata.xml \ + --format json | jq + { + 'deposit_status': 'deposited', + 'deposit_id': '1', + 'deposit_date': 'Oct. 28, 2020, 1:52 p.m.', + 'deposit_status_detail': None + } - $ swh deposit upload --username name --password secret \ - --author "Jane Doe" \ - --name 'je-suis-gpl' \ - --archive je-suis-gpl.tgz \ - --slug je-suis-gpl + (deposit)$ -to a specific client's collection -.. code:: shell +You just posted a deposit to your main collection on Software Heritage (staging +area)! - $ swh deposit upload --username name --password secret \ - --author "Jane Doe" \ - --name 'je-suis-gpl' \ - --archive je-suis-gpl.tgz \ - --collection 'second-collection' +The returned value is a JSON dict, in which you will notably find the deposit +id (needed to check for its status later on) and the current status, which +should be `deposited` if no error has occurred. + +Note: As the deposit is in ``deposited`` status, you can no longer +update the deposit after this query. Any further update request will be answered with a 403 +(Forbidden) response. + +If something went wrong, an equivalent response will be given with the +`error` and `detail` keys explaining the issue, e.g.: -You just posted a deposit to your collection on Software Heritage +.. 
code:: console + { + 'error': 'Unknown collection name xyz', + 'detail': None, + 'deposit_status': None, + 'deposit_status_detail': None, + 'deposit_swh_id': None, + 'status': 404 + } -If everything went well, the successful response will contain the -elements below: -.. code:: shell +Once the deposit has been made, you can check its status using the `swh deposit +status` command: - { - 'deposit_status': 'deposited', - 'deposit_id': '7', - 'deposit_date': 'Jan. 29, 2018, 12:29 p.m.' - } +.. code:: console -Note: As the deposit is in ``deposited`` status, you can no longer -update the deposit after this query. It will be answered with a 403 -forbidden answer. + (deposit)$ swh deposit status --username --password \ + --url https://deposit.staging.swh.network/1 \ + --deposit-id 1 -f json | jq + { + "deposit_id": "1", + "deposit_status": "done", + "deposit_status_detail": "The deposit has been successfully loaded into the Software Heritage archive", + "deposit_swh_id": "swh:1:dir:63a6fc0ed8f69bf66ccbf99fc0472e30ef0a895a", + "deposit_swh_id_context": "swh:1:dir:63a6fc0ed8f69bf66ccbf99fc0472e30ef0a895a;origin=https://softwareheritage.org/belenios-01234065;visit=swh:1:snp:0ae536667689da7047bfb7aa9f37f5958e9f4647;anchor=swh:1:rev:17ad98c940104d45b6b6bd6fba9aa832eeb95638;path=/", + "deposit_external_id": "belenios-01234065" + } -If something went wrong, an equivalent response will be given with the -`error` and `detail` keys explaining the issue, e.g.: -.. code:: shell - { - 'error': 'Unknown collection name xyz', - 'detail': None, - 'deposit_status': None, - 'deposit_status_detail': None, - 'deposit_swh_id': None, - 'status': 404 - } +Multisteps deposit +^^^^^^^^^^^^^^^^^^ +In this case, the deposit is created by several requests, uploading objects +piece by piece. The steps to create a multisteps deposit: -multisteps deposit -^^^^^^^^^^^^^^^^^^^^^^^^^ -The steps to create a multisteps deposit: +1. Create a partial deposit +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -1. 
Create an incomplete deposit -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ First use the ``--partial`` argument to declare there is more to come -.. code:: shell +.. code:: console - $ swh deposit upload --username name --password secret \ - --archive foo.tar.gz \ - --partial + $ swh deposit upload --username name --password secret \ + --archive foo.tar.gz \ + --partial 2. Add content or metadata to the deposit ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Continue the deposit by using the ``--deposit-id`` argument given as a response for the first step. You can continue adding content or metadata while you use the ``--partial`` argument. To only add one new archive to the deposit: -.. code:: shell +.. code:: console - $ swh deposit upload --username name --password secret \ - --archive add-foo.tar.gz \ - --deposit-id 42 \ - --partial + $ swh deposit upload --username name --password secret \ + --archive add-foo.tar.gz \ + --deposit-id 42 \ + --partial To only add metadata to the deposit: -.. code:: shell - - $ swh deposit upload --username name --password secret \ - --metadata add-foo.tar.gz.metadata.xml \ - --deposit-id 42 \ - --partial +.. code:: console -or: - -.. code:: shell - - $ swh deposit upload --username name --password secret \ - --name 'add-foo' --author 'someone' \ - --deposit-id 42 \ - --partial + $ swh deposit upload --username name --password secret \ + --metadata add-foo.tar.gz.metadata.xml \ + --deposit-id 42 \ + --partial 3. Finalize deposit ~~~~~~~~~~~~~~~~~~~ On your last addition (same command as before), by not declaring it ``--partial``, the deposit will be considered completed. Its status will be -changed to ``deposited`` +changed to ``deposited``: + +.. 
code:: console + + $ swh deposit upload --username name --password secret \ + --metadata add-foo.tar.gz.metadata.xml \ + --deposit-id 42 Update deposit ----------------- +-------------- + * replace deposit: - only possible if the deposit status is ``partial`` and ``--deposit-id `` is provided - by using the ``--replace`` flag - ``--metadata-deposit`` replaces associated existing metadata - ``--archive-deposit`` replaces associated archive(s) - by default, with no flag or both, you'll replace associated metadata and archive(s): -.. code:: shell +.. code:: console - $ swh deposit upload --username name --password secret \ - --deposit-id 11 \ - --archive updated-je-suis-gpl.tgz \ - --replace + $ swh deposit upload --username name --password secret \ + --deposit-id 11 \ + --archive updated-je-suis-gpl.tgz \ + --replace * update a loaded deposit with a new version: - by using the external-id with the ``--slug`` argument, you will link the new deposit with its parent deposit: -.. code:: shell +.. code:: console $ swh deposit upload --username name --password secret \ --archive je-suis-gpl-v2.tgz \ --slug 'je-suis-gpl' Check the deposit's status -------------------------- You can check the status of the deposit by using the ``--deposit-id`` argument: -.. code:: shell +.. code:: console - $ swh deposit status --username name --password secret \ - --deposit-id 11 + $ swh deposit status --username name --password secret \ + --deposit-id 11 .. 
code:: json - { - 'deposit_id': '11', - 'deposit_status': 'deposited', - 'deposit_swh_id': None, - 'deposit_status_detail': 'Deposit is ready for additional checks \ - (tarball ok, metadata, etc...)' - } - -The different statuses: - -- **partial**: multipart deposit is still ongoing -- **deposited**: deposit completed -- **rejected**: deposit failed the checks -- **verified**: content and metadata verified -- **loading**: loading in-progress -- **done**: loading completed successfully -- **failed**: the deposit loading has failed + { + "deposit_id": 11, + "deposit_status": "deposited", + "deposit_swh_id": null, + "deposit_status_detail": "Deposit is ready for additional checks \ + (tarball ok, metadata, etc...)" + } When the deposit has been loaded into the archive, the status will be marked ``done``. In the response, will also be available the , . For example: .. code:: json - { - 'deposit_id': '11', - 'deposit_status': 'done', - 'deposit_swh_id': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9', - 'deposit_swh_id_context': 'swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9;origin=https://forge.softwareheritage.org/source/jesuisgpl/;visit=swh:1:snp:68c0d26104d47e278dd6be07ed61fafb561d0d20;anchor=swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb;path=/', - 'deposit_status_detail': 'The deposit has been successfully \ - loaded into the Software Heritage archive' - } + { + "deposit_id": 11, + "deposit_status": "done", + "deposit_swh_id": "swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9", + "deposit_swh_id_context": "swh:1:dir:d83b7dda887dc790f7207608474650d4344b8df9;\ + origin=https://forge.softwareheritage.org/source/jesuisgpl/;\ + visit=swh:1:snp:68c0d26104d47e278dd6be07ed61fafb561d0d20;\ + anchor=swh:1:rev:e76ea49c9ffbb7f73611087ba6e999b19e5d71eb;path=/", + "deposit_status_detail": "The deposit has been successfully \ + loaded into the Software Heritage archive" + } .. rubric:: Footnotes -.. 
[#f1] the test instance of the deposit is not yet available external users, +.. [#f1] the test instance of the deposit is not yet available to external users, but it should be available soon.
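
The multisteps workflow described in this manual (create a deposit with ``--partial``, add to it with ``--deposit-id``, then finalize by omitting ``--partial``) can be sketched as a small Python helper that only assembles the command lines. This is a minimal sketch and not part of `swh.deposit`: the function name ``upload_command`` is hypothetical, it relies only on the flags shown in the examples above, and the deposit id ``42`` stands in for whatever id the first call actually returns.

```python
import shlex
from typing import List, Optional


def upload_command(
    username: str,
    password: str,
    archive: Optional[str] = None,
    metadata: Optional[str] = None,
    deposit_id: Optional[int] = None,
    partial: bool = False,
) -> List[str]:
    """Build the argument list for one `swh deposit upload` call.

    Hypothetical helper: it only assembles flags documented in this manual
    and does not contact any deposit server.
    """
    cmd = ["swh", "deposit", "upload",
           "--username", username, "--password", password]
    if archive is not None:
        cmd += ["--archive", archive]
    if metadata is not None:
        cmd += ["--metadata", metadata]
    if deposit_id is not None:
        # Continue an existing deposit instead of creating a new one.
        cmd += ["--deposit-id", str(deposit_id)]
    if partial:
        # Keep the deposit in the `partial` state: more data will follow.
        cmd.append("--partial")
    return cmd


# Step 1: create a partial deposit from a first archive.
step1 = upload_command("name", "secret", archive="foo.tar.gz", partial=True)

# Step 2: add another archive; 42 is a placeholder for the id returned by step 1.
step2 = upload_command("name", "secret", archive="add-foo.tar.gz",
                       deposit_id=42, partial=True)

# Step 3: last addition, without --partial, so the deposit becomes `deposited`.
step3 = upload_command("name", "secret",
                       metadata="add-foo.tar.gz.metadata.xml", deposit_id=42)

for cmd in (step1, step2, step3):
    # Print each command in a shell-safe form for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list can be handed to ``subprocess.run`` once real credentials, file paths, and the deposit id from the first response are substituted.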