diff --git a/docs/dev-info.rst b/docs/dev-info.rst index 459ecf49..6e0a02bc 100644 --- a/docs/dev-info.rst +++ b/docs/dev-info.rst @@ -1,174 +1,174 @@ -Develop on swh-deposit +Hacking on swh-deposit ====================== There are multiple modes to run and test the server locally: * development-like (automatic reloading when code changes) * production-like (no reloading) * integration tests (no side effects) Except for the tests which are mostly side effects free (except for the database access), the other modes will need some configuration files (up to 2) to run properly. Database -------- swh-deposit uses a database to store the state of a deposit. The default db is expected to be called swh-deposit-dev. To simplify the use, the following makefile targets can be used: schema ~~~~~~ .. code:: shell make db-create db-prepare db-migrate data ~~~~ Once the db is created, you need some data to be injected (request types, client, collection, etc...): .. code:: shell make db-load-data db-load-private-data The private data are about having a user (``hal``) with a password (``hal``) who can access a collection (``hal``). Add the following to ``../private-data.yaml``: .. code:: yaml - model: deposit.depositclient fields: user_ptr_id: 1 collections: - 1 - model: auth.User pk: 1 fields: first_name: hal last_name: hal username: hal password: "pbkdf2_sha256$30000$8lxjoGc9PiBm$DO22vPUJCTM17zYogBgBg5zr/97lH4pw10Mqwh85yUM=" - model: deposit.depositclient fields: user_ptr_id: 1 collections: - 1 url: https://hal.inria.fr drop ~~~~ For information, you can drop the db: .. code:: shell make db-drop Development-like environment ---------------------------- Development-like environment needs one configuration file to work properly. Configuration ~~~~~~~~~~~~~ **``{/etc/softwareheritage | ~/.config/swh | ~/.swh}``/deposit/server.yml**: .. 
code:: yaml # dev option for running the server locally host: 127.0.0.1 port: 5006 # production authentication: activated: true white-list: GET: - / # 20 MiB max size max_upload_size: 20971520 Run ~~~ Run the local server, using the default configuration file: .. code:: shell make run-dev Production-like environment --------------------------- Production-like environment needs two configuration files to work properly. This is closer to what's actually running in production. Configuration ~~~~~~~~~~~~~ This expects the same file described in the previous chapter. Plus, an additional private **private.yml** file containing secret information that is not in the source code repository. **``{/etc/softwareheritage | ~/.config/swh | ~/.swh}``/deposit/private.yml**: .. code:: yaml secret_key: production-local db: name: swh-deposit-dev A production configuration file would look like: .. code:: yaml secret_key: production-secret-key db: name: swh-deposit-dev host: db port: 5467 user: user password: user-password Run ~~~ .. code:: shell make run Note: This expects the gunicorn3 package to be installed on the system Tests ----- To run the tests: .. code:: shell make test As explained, those tests are mostly side-effect free. The db part is dealt with by django. The remaining part which patches those side-effect behaviors is dealt with in the ``swh/deposit/tests/__init__.py`` module. Sum up ------ Prepare everything for your user to run: .. code:: shell make db-drop db-create db-prepare db-migrate db-load-private-data run-dev diff --git a/docs/getting-started.rst b/docs/getting-started.rst index 25714473..fb4cfac2 100644 --- a/docs/getting-started.rst +++ b/docs/getting-started.rst @@ -1,323 +1,269 @@ Getting Started =============== This is a guide for how to prepare and push a software deposit with the swh-deposit commands. The api is rooted at https://deposit.softwareheritage.org. For more details, see the `main documentation <./index.html>`__. 
Requirements ------------ You need to be referenced on SWH's client list to have: * a credential (needed for the basic authentication step) - in this document we reference ```` as the client's name and ```` as its associated authentication password. * an associated collection `Contact us for more information. `__ Prepare a deposit ----------------- * compress the files in a supported archive format: - zip: common zip archive (no multi-disk zip files). - tar: tar archive without compression or optionally any of the following compression algorithm gzip (.tar.gz, .tgz), bzip2 (.tar.bz2) , or lzma (.tar.lzma) * prepare a metadata file (`more details <./metadata.html>`__.): - specify metadata schema/vocabulary (CodeMeta is recommended) - specify *MUST* metadata (url, authors, software name and the external\_identifier) - add all available information under the compatible metadata term An example of an atom entry file with CodeMeta terms: .. code:: xml Je suis GPL 12345 forge.softwareheritage.org/source/jesuisgpl/ Yes, this is another implementation of "Hello, world!” when you run it. GPL https://www.gnu.org/licenses/gpl.html Reuben Thomas Maintainer Sami Kerola Maintainer -Check authentication with a service document request ----------------------------------------------------- - -Start with a simple request to check credentials and retrieve the -*collection iri* onto which the deposit will be pushed . - -.. code:: shell - - curl -i --user : https://deposit.softwareheritage.org/1/servicedocument/ - - -The successful response: -^^^^^^^^^^^^^^^^^^^^^^^^ -.. 
code:: shell - - HTTP/1.0 200 OK - Server: WSGIServer/0.2 CPython/3.5.3 - Content-Type: application/xml - - - - - 2.0 - 209715200 - - - The Software Heritage (SWH) Archive - - Software Collection - application/zip - application/x-tar - Collection Policy - Software Heritage Archive - Collect, Preserve, Share - false - http://purl.org/net/sword/package/SimpleZip - https://deposit.softwareheritage.org/1// - - - - -The error response 401 for Unauthorized access: -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -.. code:: shell - - curl -i https://deposit.softwareheritage.org/1// - HTTP/1.1 401 Unauthorized - Content-Type: application/xml - - - - Invalid username/password. - processing failed - - - API is protected by basic authentication - - - - - Push deposit ------------ You can push a deposit with: * a single deposit (archive + metadata): The user posts in one query a software source code archive and associated metadata. The deposit is directly marked with status ``deposited``. -* a multipart deposit: +* a multisteps deposit: 1. Create an incomplete deposit (marked with status ``partial``) 2. Add data to a deposit (in multiple requests if needed) 3. Finalize deposit (the status becomes ``deposited``) Single deposit ^^^^^^^^^^^^^^ Once the files are ready for deposit, we want to do the actual deposit in one shot, sending exactly one POST query: * 1 archive (content-type ``application/zip`` or ``application/x-tar``) * 1 metadata file in atom xml format (``content-type: application/atom+xml;type=entry``) For this, we need to provide: * the arguments: ``--username 'name' --password 'pass'`` as credentials * the name of the archive (example: ``path/to/archive-name.tgz``) * in the same location as the archive and with the following naming pattern for the metadata file: ``path/to/archive-name.metadata.xml`` * optionally, the --slug 'your-id' argument, a reference to a unique identifier the client uses for the software object. 
You can do this with the following command: minimal deposit .. code:: shell - $ swh-deposit --username 'name' --password 'pass' je-suis-gpl.tgz + $ swh-deposit --username name --password secret \ + --archive je-suis-gpl.tgz with the client's identifier .. code:: shell - $ swh-deposit --username 'name' --password 'pass' je-suis-gpl.tgz --sulg '123456' + $ swh-deposit --username name --password secret \ + --archive je-suis-gpl.tgz \ + --slug '123456' deposit to a specific client's collection .. code:: shell - $ swh-deposit --username 'name' --password 'pass' je-suis-gpl.tgz --collection 'second-collection' + $ swh-deposit --username name --password secret \ + --archive je-suis-gpl.tgz \ + --collection 'second-collection' You just posted a deposit to your collection on Software Heritage -If everything went well, a the successful response will contain the +If everything went well, the successful response will contain the elements below: -* ``HTTP/1.0 201 Created``: the deposit was created successfully -* Information about the deposit, such as: - - * deposit id - * deposit date - * deposit status will be ``deposited`` -* Entry points: - - * ``Location: /1///metadata/``: the EDIT-SE-IRI through - which we can update a deposit's metadata - * ``Location: /1///media/``: the EM-IRI through - which we can update a deposit's content - +.. code:: shell + { + 'deposit_status': 'deposited', + 'deposit_id': '7' + } Note: As the deposit is in ``deposited`` status, you cannot update the deposit after this query. Such an update will be answered with a 403 Forbidden response. -multipart deposit +multisteps deposit ^^^^^^^^^^^^^^^^^^^^^^^^^ -The steps to create a multipart deposit: +The steps to create a multisteps deposit: 1. Create an incomplete deposit ~~~~~~~~~~~~~~~~~~~ First use the ``--partial`` argument to declare there is more to come .. 
code:: shell - $ swh-deposit --username 'name' --password 'secret' --partial \ + $ swh-deposit --username name --password secret --partial \ --archive foo.tar.gz 2. Add content or metadata to the deposit ~~~~~~~~~~~~~~~~~~~ Continue the deposit by using the ``--deposit-id`` argument returned in the response to the first step. You can continue adding content or metadata while you use the ``--partial`` argument. .. code:: shell - $ swh-deposit --username 'name' --password 'secret' --partial \ - --deposit-id 42 --archive add-foo.tar.gz + $ swh-deposit --username name --password secret --partial \ + --archive add-foo.tar.gz \ + --deposit-id 42 + +In case you want to add only content without metadata: + +.. code:: shell + + $ swh-deposit --username name --password secret --partial \ + --archive add-foo.tar.gz \ + --archive-deposit \ + --deposit-id 42 + +If you want to add only metadata, use: + +.. code:: shell + + $ swh-deposit --username name --password secret --partial \ + --metadata add-foo.tar.gz.metadata.xml \ + --metadata-deposit \ + --deposit-id 42 3. Finalize deposit ~~~~~~~~~~~~~~~~~~~ On your last addition, by not declaring it as ``--partial``, the deposit will be considered as completed and its status will be changed to ``deposited``. -.. code:: shell - - $ swh-deposit --username 'name' --password 'secret' \ - --deposit-id 42 \ - --archive last-foo.tar.gz Update deposit ---------------- * replace deposit : - only possible if the deposit status is ``partial`` - by using the ``--replace`` argument + - you can replace only metadata with the --metadata-deposit flag + - or only the archive with --archive-deposit + - if neither is used, you'll replace metadata and content .. 
code:: shell - $ swh-deposit --username 'name' --password 'secret' --replace\ + $ swh-deposit --username name --password secret --replace\ --deposit-id 11 \ --archive updated-je-suis-gpl.tar.gz * update a loaded deposit with a new version: - by using the external-id with the ``--slug`` argument which will link the new deposit with its parent deposit .. code:: shell - $ swh-deposit --username 'name' --password 'pass' --slug '123456' \ + $ swh-deposit --username name --password secret --slug '123456' \ --archive je-suis-gpl-v2.tgz Check the deposit's status -------------------------- You can check the status of the deposit by using the ``--deposit-id`` argument: .. code:: shell -$ swh-deposit --login 'name' --pass 'secret' --deposit-id '11' --status - -Response: +$ swh-deposit --username name --password secret --deposit-id '11' --status -.. code:: xml +.. code:: json - - 9 - deposited - deposit is fully received and ready for loading - + { + 'deposit_id': '11', + 'deposit_status': 'deposited', + 'deposit_swh_id': None, + 'deposit_status_detail': 'Deposit is ready for additional checks \ + (tarball ok, metadata, etc...)' + } The different statuses: - *partial* : multipart deposit is still ongoing - *deposited*: deposit completed - *rejected*: deposit failed the checks - *verified*: content and metadata verified - *loading*: loading in-progress - *done*: loading completed successfully - *failed*: the deposit loading has failed When the deposit has been loaded into the archive, the status will be marked ``done``. In the response, will also be available the . For example: -.. code:: xml +.. 
code:: json - - 55 - done - The deposit has been successfully loaded into the Software Heritage archive - swh:1:rev:34898aa991c90b447c27d2ac1fc09f5c8f12783e - + { + 'deposit_id': '11', + 'deposit_status': 'done', + 'deposit_swh_id': 'swh:1:rev:34898aa991c90b447c27d2ac1fc09f5c8f12783e', + 'deposit_status_detail': 'The deposit has been successfully \ + loaded into the Software Heritage archive' + } diff --git a/docs/index.rst b/docs/index.rst index bfb35d51..98965b86 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,22 +1,22 @@ .. _swh-deposit: Software Heritage Deposit ========================= .. toctree:: - :maxdepth: 3 + :maxdepth: 1 :caption: Contents: getting-started.rst spec-api.rst metadata.rst spec-loading.rst dev-info.rst sys-info.rst Indices and tables ================== * :ref:`genindex` * :ref:`modindex` * :ref:`search` diff --git a/docs/sys-info.rst b/docs/sys-info.rst index 2e1e0ff6..582fbc7c 100644 --- a/docs/sys-info.rst +++ b/docs/sys-info.rst @@ -1,51 +1,51 @@ -Bootstrap swh-deposit on production -=================================== +Deployment of the swh-deposit +============================= As usual, the debian packaged is created and uploaded to the swh debian repository. Once the package is installed, we need to do a few things in regards to the database. Prepare the database setup (existence, connection, etc...). ----------------------------------------------------------- This is defined through the packaged ``swh.deposit.settings.production`` module and the expected **/etc/softwareheritage/deposit/private.yml**. As usual, the expected configuration files are deployed through our puppet manifest (cf. puppet-environment/swh-site, puppet-environment/swh-role, puppet-environment/swh-profile) Migrate/bootstrap the db schema ------------------------------- .. code:: shell sudo django-admin migrate --settings=swh.deposit.settings.production Load minimum defaults data -------------------------- .. 
code:: shell sudo django-admin loaddata --settings=swh.deposit.settings.production deposit_data This adds the minimal: - deposit request type 'archive' and 'metadata' - 'hal' collection Note: swh.deposit.fixtures.deposit\_data is packaged Add client and collection ------------------------- .. code:: shell python3 -m swh.deposit.create_user --platform production \ --collection \ --username \ --password This adds a user ```` which can access the collection ````. The password will be used for the authentication access to the deposit api. Note: This creation procedure needs to be improved.
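The three deployment steps above (migrate, load defaults, create a client) could be chained in a small wrapper. The following is only a sketch under stated assumptions: the function names ``run`` and ``bootstrap_deposit`` and the variables ``DRY_RUN``, ``COLLECTION``, ``CLIENT_USERNAME`` and ``CLIENT_PASSWORD`` are hypothetical names introduced for this example and are not part of swh-deposit. The commands themselves are the ones listed on this page. By default the sketch only prints what it would run.

```shell
#!/bin/sh
# Sketch only: chains the deployment commands from this page.
# DRY_RUN, COLLECTION, CLIENT_USERNAME and CLIENT_PASSWORD are
# hypothetical variable names used for illustration.

SETTINGS=swh.deposit.settings.production

# Print the command unless DRY_RUN=0, in which case execute it.
# Note: in dry-run mode the password is printed in clear text.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

bootstrap_deposit() {
    # Abort with a message if a required value is missing.
    : "${COLLECTION:?COLLECTION must be set}"
    : "${CLIENT_USERNAME:?CLIENT_USERNAME must be set}"
    : "${CLIENT_PASSWORD:?CLIENT_PASSWORD must be set}"

    # 1. migrate/bootstrap the db schema
    run sudo django-admin migrate --settings="$SETTINGS" &&
    # 2. load the minimum default data
    run sudo django-admin loaddata --settings="$SETTINGS" deposit_data &&
    # 3. add the client and its collection
    run python3 -m swh.deposit.create_user --platform production \
        --collection "$COLLECTION" \
        --username "$CLIENT_USERNAME" \
        --password "$CLIENT_PASSWORD"
}
```

With ``COLLECTION``, ``CLIENT_USERNAME`` and ``CLIENT_PASSWORD`` set, calling ``bootstrap_deposit`` prints the three commands in order; setting ``DRY_RUN=0`` as well would execute them. The steps are joined with ``&&`` so that a failed migration stops the sequence before any data is loaded.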