diff --git a/docs/README.rst b/docs/README.rst index 016e49bc..938585a6 100644 --- a/docs/README.rst +++ b/docs/README.rst @@ -1,71 +1,71 @@ Software Heritage - Deposit =========================== Simple Web-Service Offering Repository Deposit (S.W.O.R.D) is an interoperability standard for digital file deposit. This repository is both the `SWORD v2`_ Server and a deposit command-line client implementations. This implementation allows interaction between a client (a repository) and a server (SWH repository) to deposit software source code archives and associated metadata. Description ----------- Most of the software source code artifacts present in the SWH Archive are gathered by means of :term:`loader ` workers run by the SWH project from source code origins identified by :term:`lister ` workers. This is a pull mechanism: it's the responsibility of the SWH project to gather and collect source code artifacts that way. Alternatively, SWH allows its partners to push source code artifacts and metadata directly into the Archive with a push-based mechanism. By using this possibility different actors, holding software artifacts or metadata, can preserve their assets without having to pass through an intermediate collaborative development platform, which is already harvested by SWH (e.g. GitHub, GitLab, etc.). This mechanism is the `deposit`. The main idea is that the deposit is an authenticated access to an API allowing the user to provide source code artifacts -- with metadata -- to be ingested in the SWH Archive. The result of that is a :ref:`SWHID ` that can be used to uniquely and persistently identify that very piece of source code. This unique identifier can then be used to `reference the source code `_ (e.g. in a `scientific paper `_) and retrieve it using the :ref:`vault ` feature of the SWH Archive platform. 
The differences between a piece of code uploaded using the deposit and one archived by simply asking SWH to archive a repository using the `save code now `_ feature are: - a deposited artifact is provided by one of the SWH partners, which is regarded as a trusted authority, - a deposited artifact requires metadata properties describing the source code artifact, - a deposited artifact has a codemeta_ metadata entry attached to it, - a deposited artifact has the same visibility on the SWH Archive as a collected repository, - a deposited artifact can be searched with its provided url property on the SWH Archive, - the deposit API uses the `SWORD v2`_ API, thus requires some tooling to send deposits to SWH. These tools are provided with this repository. See the :ref:`user-manual` page for more details on how to use the deposit client command line tools to push a deposit in the SWH Archive. See the :ref:`swh-api-specifications` reference pages of the SWORDv2 API implementation in `swh.deposit` if you want to upload deposits using HTTP requests. Read the :ref:`metadata` chapter to get more details on what metadata are supported when doing a deposit. -See :ref:`swh-deposit-dev` if you want to hack the code of the `swh.deposit` module. +See :ref:`swh-deposit-dev-env` if you want to hack the code of the `swh.deposit` module. -See :ref:`swh-deposit-deployment` if you want to deploy your own copy of the +See :ref:`swh-deposit-prod-env` if you want to deploy your own copy of the `swh.deposit` stack. .. _codemeta: https://codemeta.github.io/ .. _`SWORD v2`: http://swordapp.org/sword-v2/ diff --git a/docs/spec-api.rst b/docs/api/api-documentation.rst similarity index 100% rename from docs/spec-api.rst rename to docs/api/api-documentation.rst diff --git a/docs/api/index.rst b/docs/api/index.rst new file mode 100644 index 00000000..2a715043 --- /dev/null +++ b/docs/api/index.rst @@ -0,0 +1,13 @@ +.. _swh-deposit-api: + +Deposit API +=========== + +.. 
toctree:: + :maxdepth: 2 + :caption: Contents: + + user-manual + api-documentation + metadata + use-cases diff --git a/docs/metadata.rst b/docs/api/metadata.rst similarity index 100% rename from docs/metadata.rst rename to docs/api/metadata.rst diff --git a/docs/specs/blueprint.rst b/docs/api/use-cases.rst similarity index 100% rename from docs/specs/blueprint.rst rename to docs/api/use-cases.rst diff --git a/docs/user-manual.rst b/docs/api/user-manual.rst similarity index 100% rename from docs/user-manual.rst rename to docs/api/user-manual.rst diff --git a/docs/cli.rst b/docs/cli.rst index fb7ebe9c..d004c79a 100644 --- a/docs/cli.rst +++ b/docs/cli.rst @@ -1,33 +1,35 @@ .. _swh-deposit-cli: Command-line interface ====================== Shared command-line interface ----------------------------- .. click:: swh.deposit.cli:deposit :prog: swh deposit :nested: short Administration utilities ------------------------ .. click:: swh.deposit.cli.admin:admin :prog: swh deposit admin :nested: full +.. _swh-deposit-cli-client: + Deposit client tools -------------------- .. click:: swh.deposit.cli.client:upload :prog: swh deposit :nested: full .. click:: swh.deposit.cli.client:status :prog: swh deposit :nested: full .. click:: swh.deposit.cli.client:metadata-only :prog: swh deposit :nested: full diff --git a/docs/index.rst b/docs/index.rst index 3655d546..bc7e9003 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,26 +1,21 @@ .. _swh-deposit: .. include:: README.rst .. toctree:: :maxdepth: 2 :caption: Contents: - user-manual - metadata - spec-api - dev-info - sys-info + api/index + internals/index specs/index - tests/tests_HAL.rst Reference Documentation ----------------------- .. 
toctree:: :maxdepth: 2 cli /apidoc/swh.deposit - authentication.rst diff --git a/docs/authentication.rst b/docs/internals/authentication.rst similarity index 100% rename from docs/authentication.rst rename to docs/internals/authentication.rst diff --git a/docs/dev-info.rst b/docs/internals/dev-environment.rst similarity index 97% rename from docs/dev-info.rst rename to docs/internals/dev-environment.rst index ecef1b49..7192ca6b 100644 --- a/docs/dev-info.rst +++ b/docs/internals/dev-environment.rst @@ -1,178 +1,178 @@ -.. _swh-deposit-dev: +.. _swh-deposit-dev-env: -Hacking on swh-deposit -====================== +Running swh-deposit locally +=========================== There are multiple modes to run and test the server locally: * development-like (automatic reloading when code changes) * production-like (no reloading) * integration tests (no side effects) Except for the tests which are mostly side effects free (except for the database access), the other modes will need some configuration files (up to 2) to run properly. Database -------- swh-deposit uses a database to store the state of a deposit. The default db is expected to be called swh-deposit-dev. To simplify the use, the following makefile targets can be used: schema ^^^^^^ .. code:: shell make db-create db-prepare db-migrate data ^^^^ Once the db is created, you need some data to be injected (request types, client, collection, etc...): .. code:: shell make db-load-data db-load-private-data The private data are about having a user (``hal``) with a password (``hal``) who can access a collection (``hal``). Add the following to ``../private-data.yaml``: .. 
code:: yaml - model: deposit.depositclient fields: user_ptr_id: 1 collections: - 1 - model: auth.User pk: 1 fields: first_name: hal last_name: hal username: hal password: "pbkdf2_sha256$30000$8lxjoGc9PiBm$DO22vPUJCTM17zYogBgBg5zr/97lH4pw10Mqwh85yUM=" - model: deposit.depositclient fields: user_ptr_id: 1 collections: - 1 url: https://hal.inria.fr drop ^^^^ For information, you can drop the db: .. code:: shell make db-drop Development-like environment ---------------------------- The development-like environment needs one configuration file to work properly. Configuration ^^^^^^^^^^^^^ **``{/etc/softwareheritage | ~/.config/swh | ~/.swh}``/deposit/server.yml**: .. code:: yaml # dev option for running the server locally host: 127.0.0.1 port: 5006 # production authentication: activated: true white-list: GET: - / # 20 MiB max size max_upload_size: 20971520 Run ^^^ Run the local server, using the default configuration file: .. code:: shell make run-dev Production-like environment --------------------------- The production-like environment needs an additional section in the configuration file to work properly. This is closer to what's actually running in production. Configuration ^^^^^^^^^^^^^ This expects the same file described in the previous section, plus an additional private section file containing private information that is not in the source code repository. **``{/etc/softwareheritage | ~/.config/swh | ~/.swh}``/deposit/private.yml**: .. code:: yaml private: secret_key: production-local db: name: swh-deposit-dev A production configuration file would look like: .. code:: yaml private: secret_key: production-secret-key db: name: swh-deposit-dev host: db port: 5467 user: user password: user-password Run ^^^ .. code:: shell make run Note: This expects the gunicorn3 package to be installed on the system Tests ----- To run the tests: .. code:: shell make test As explained, those tests are mostly side-effect free. The db part is dealt with by django. 
The remaining part, which patches those side-effect behaviors, is dealt with in the ``swh/deposit/tests/__init__.py`` module. Sum up ------ Prepare everything for your user to run: .. code:: shell make db-drop db-create db-prepare db-migrate db-load-private-data run-dev diff --git a/docs/internals/index.rst b/docs/internals/index.rst new file mode 100644 index 00000000..5b0affce --- /dev/null +++ b/docs/internals/index.rst @@ -0,0 +1,14 @@ +.. _swh-deposit-internals: + +Deposit internals +================= + +This chapter describes how swh-deposit works internally, +and how to run it (either in production or locally for development). + +.. toctree:: + :maxdepth: 1 + + dev-environment + prod-environment + authentication diff --git a/docs/sys-info.rst b/docs/internals/prod-environment.rst similarity index 96% rename from docs/sys-info.rst rename to docs/internals/prod-environment.rst index 6181bf41..69a4d28c 100644 --- a/docs/sys-info.rst +++ b/docs/internals/prod-environment.rst @@ -1,115 +1,115 @@ -.. _swh-deposit-deployment: +.. _swh-deposit-prod-env: -Deployment -========== +Production deployment +===================== The deposit is structured around 3 parts: - server: a django application exposing an xml api, talking to a postgresql backend (and optionally a keycloak instance) - worker(s): 1 worker service dedicated to checking that the deposit archive and metadata are correct (the checker), another worker service dedicated to actually ingesting the deposit into the swh archive. - - client: a python sĨript `swh deposit` command line interface. + - client: a python script `swh deposit` command line interface. All those are packaged in 3 separate debian packages, created and uploaded to the swh debian repository. The deposit server and workers configuration are managed by puppet (cf. 
puppet-environment/swh-site, puppet-environment/swh-role, puppet-environment/swh-profile) In the following document, we will focus on the server actions that may be needed once the server is installed or upgraded. Prepare the database setup (existence, connection, etc...). ----------------------------------------------------------- This is defined through the packaged module ``swh.deposit.settings.production`` and the expected **/etc/softwareheritage/deposit/server.yml** configuration file. Environment (production/staging) -------------------------------- `SWH_CONFIG_FILENAME` must be defined and point to the deposit server configuration file. So either 1. prefix the following commands or 2. export the environment variable in your shell session. For the remaining part of the documentation, we assume 2. has been configured. .. code:: shell export SWH_CONFIG_FILENAME=/etc/softwareheritage/deposit/server.yml Migrate the db schema --------------------- The debian package may integrate some new schema modifications. To run them: .. code:: shell sudo django-admin migrate --settings=swh.deposit.settings.production Add client and collection ------------------------- The deposit can be configured to use either the 1. django basic authentication framework or the 2. swh keycloak instance. If the server uses 2., the password is managed by keycloak so the option ``--password`` is ignored. * basic .. code:: shell swh deposit admin \ --config-file $SWH_CONFIG_FILENAME \ --platform production \ user create \ --collection \ --username \ --password This adds a user ```` which can access the collection ````. The password will be used for checking the authentication access to the deposit api (if 1. is used). 
Note: - If the collection does not exist, it is created alongside - The password, if required, is passed as plain text but stored encrypted Reschedule a deposit --------------------- If the loading failed for some reason, then after fixing and deploying the new deposit loader, you can reschedule the impacted deposit through: .. code:: shell swh deposit admin \ --config-file $SWH_CONFIG_FILENAME \ --platform production \ deposit reschedule \ --deposit-id This will: - check that the deposit's status is something reasonable (failed or done). That means that the checks have passed but something went wrong during the loading (failed: loading failed; done: loading succeeded but, for some reason such as a bug, we need to reschedule it) - reset the deposit's status to 'verified' (prior to any loading but after the checks which are fine) and remove the different archives' identifiers (swh-id, ...) - trigger the loading task again through the scheduler Integration checks ------------------ There are icinga checks running periodically on the `staging`_ and `production`_ instances. If any problem arises, expect those to notify the #swh-sysadm irc channel. .. _staging: https://icinga.softwareheritage.org/search?q=deposit#!/monitoring/service/show?host=pergamon.softwareheritage.org&service=staging%20Check%20deposit%20end-to-end .. _production: https://icinga.softwareheritage.org/search?q=deposit#!/monitoring/service/show?host=pergamon.softwareheritage.org&service=production%20Check%20deposit%20end-to-end diff --git a/docs/specs/index.rst b/docs/specs/index.rst index 234ad473..b6b641e3 100644 --- a/docs/specs/index.rst +++ b/docs/specs/index.rst @@ -1,13 +1,12 @@ .. _swh-deposit-specs: -Blueprint Specifications -========================= +Specifications +============== .. 
toctree:: :maxdepth: 1 :caption: Contents: - blueprint.rst spec-loading.rst protocol-reference.rst spec-meta-deposit.rst diff --git a/docs/tests/tests_HAL.rst b/docs/tests/tests_HAL.rst deleted file mode 100644 index 0a1eeb4c..00000000 --- a/docs/tests/tests_HAL.rst +++ /dev/null @@ -1,67 +0,0 @@ -Tests scenarios for client -========================== - -Scenarios for HAL- on HAL's platform ------------------------------------- - -The same procedure is used for all tests: - -Software Author: - -#. prepare content -#. fill out form -#. submit - -HAL moderator: - -#. review content submitted -#. check metadata fields on HAL -#. validate submission - -SWH side: - -1. check content in SWH: - - - directory was created - - revision was created - - release was created when releaseNotes and softwareVersion was included (new feature!) - - origin corresponds to HAL url - -2. check metadata fields on SWH (in revision) -3. check directory -4. check swh-id on HAL -5. check browsability when entering SWH artifact from HAL -6. check vault artifact recreation -7. 
access deposit's origin from SWH - -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| scenario | test case | data | result | exceptions or specific checks | -+=============+===========================================+==========+=========+=========================================+ -| submit code | content: .tar.gz | .zip | success | | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| submit code | content: .zip | .tar.gz | success | | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| submit code | content: no content | empty | fail | blocked on HAL | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| submit code | content: double compression (.zip in .zip)| .zip x 2 | fail | status `failed` on SWH | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| submit code | all metadata-single entry | metadata | success | check that all metadata is transmitted | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| submit code | multiple entries | metadata | success | languages / authors / descriptions | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| new version | new content- same metadata | content | success | check new swh-id in SWH and HAL | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| new version | same content- new metadata | metadata | ? 
| dead angle- doesn't arrives to SWH | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| new version | new content-new metadata | C & M | success | check artifacts history in revisions | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ -| submit code | deposit on another hal platform | C & M | success | | -+-------------+-------------------------------------------+----------+---------+-----------------------------------------+ - -Past known bugs: - -- v2 problem, where swh-id from first version is kept in the second version - instead of the new swh-id. -- when deposit workers are down- error 500 is returned on HAL without real - explanation (because there is no error on SWH- deposit status - stays `deposited`).