diff --git a/sysadm/mirror-operations/docker.rst b/sysadm/mirror-operations/docker.rst --- a/sysadm/mirror-operations/docker.rst +++ b/sysadm/mirror-operations/docker.rst @@ -11,8 +11,8 @@ Prerequisities -------------- -According you have a properly set up docker swarm cluster with support for the -`docker stack deploy +We assume that you have a properly set up docker swarm cluster with support for +the `docker stack deploy `_ command, e.g.: @@ -24,8 +24,9 @@ n9mfw08gys0dmvg5j2bb4j2m7 * host1 Ready Active Leader 18.09.7 -Note: on some systems (centos for example), making docker swarm works require -some permission tuning regarding the firewall and selinux. +Note: on some systems (centos for example), making docker swarm work requires some +permission tuning regarding the firewall and selinux. Please refer to `the upstream +docker-swarm documentation `_. In the following how-to, we will assume that the service `STACK` name is `swh` (this name is the last argument of the `docker stack deploy` command below). @@ -33,7 +34,7 @@ Several preparation steps will depend on this name. We also use `docker-compose `_ to merge compose -files, so make sure it iavailable on your system. +files, so make sure it is available on your system. You also need to clone the git repository: @@ -43,15 +44,15 @@ Set up volumes -------------- -Before starting the `swh` service, you may want to specify where the data -should be stored on your docker hosts. +Before starting the `swh` service, you will certainly want to specify where the +data should be stored on your docker hosts. By default docker will use docker volumes for storing databases and the content of the objstorage (thus put them in `/var/lib/docker/volumes`). -**Optional:** if you want to specify a different location to put a storage in, -create the storage before starting the docker service. For example for the -`objstorage` service you will need a storage named `_objstorage`: +**Optional:** if you want to specify a different location to put the data in, +you should create the docker volumes before starting the docker service. For +example, the `objstorage` service uses a volume named `_objstorage`: .. code-block:: bash @@ -62,20 +63,32 @@ swh_objstorage -If you want to deploy services like the `swh-objstorage` on several hosts, you -will need a shared storage area in which blob objects will be stored. Typically -a NFS storage can be used for this, or any existing docker volume driver like -`REX-Ray `_. This is not covered in this doc. +If you want to deploy services like the `objstorage` on several hosts, you will need a +shared storage area in which blob objects will be stored. Typically a NFS storage can be +used for this, or any existing docker volume driver like `REX-Ray +`_. This is not covered in this documentation. Please read the documentation of docker volumes to learn how to use such a device/driver as volume provider for docker. -Note that the provided `base-services.yaml` file have a few placement -constraints: containers that depends on a volume (db-storage and objstorage) -are stick to the manager node of the cluster, under the assumption persistent -volumes have been created on this node. Make sure this fits your needs, or -amend these placement constraints. +Note that the provided `base-services.yaml` file has placement constraints for the +`db-storage`, `db-web` and `objstorage` containers, that depend on the availability of +specific volumes (respectively `_storage-db`, `_web-db` and +`_objstorage`). These services are pinned to specific nodes using labels named +`org.softwareheritage.mirror.volumes.` (e.g. +`org.softwareheritage.mirror.volumes.objstorage`). +When you create a local volume for a given container, you should add the relevant label +to the docker swarm node metadata with: + +.. code-block:: bash + + docker node update \ + --label-add org.softwareheritage.mirror.volumes.objstorage=true \ + + +You have to set the node labels, or to adapt the placement constraints to your local +requirements, for the services to start. Managing secrets ---------------- @@ -85,17 +98,18 @@ Namely, you need to create a `secret` for: -- `postgres-password` +- `swh-mirror-db-postgres-password` +- `swh-mirror-web-postgres-password` For example: .. code-block:: bash - ~/swh-docker$ echo 'strong password' | docker secret create postgres-password - + ~/swh-docker$ xkcdpass -d- | docker secret create swh-mirror-db-postgres-password - [...] -Creating the swh base services +Spawning the swh base services ------------------------------ If you haven't done it yet, clone this git repository: @@ -105,52 +119,71 @@ ~$ git clone https://forge.softwareheritage.org/source/swh-docker.git ~$ cd swh-docker +This repository provides the docker compose/stack manifests to deploy all the relevant +services. + +.. note:: + + These manifests use a set of docker images `published in the docker hub + `_. By default, the manifests + will use the `latest` version of these images, but for production uses, you should + set the `SWH_IMAGE_TAG` environment variable to pin them to a specific version. + +To specify the tag to be used, simply set the SWH_IMAGE_TAG environment variable, like +so: + +.. code-block:: bash + + ~/swh-docker$ export SWH_IMAGE_TAG=20211022-121751 -then from within this repository, just type: +You can then spawn the base services using the following command: .. code-block:: bash ~/swh-docker$ docker stack deploy -c base-services.yml swh - Creating network swh-mirror_default - Creating config swh-mirror_storage - Creating config swh-mirror_objstorage - Creating config swh-mirror_nginx - Creating config swh-mirror_web - Creating service swh-mirror_grafana - Creating service swh-mirror_prometheus-statsd-exporter - Creating service swh-mirror_web - Creating service swh-mirror_objstorage - Creating service swh-mirror_db-storage - Creating service swh-mirror_memcache - Creating service swh-mirror_storage - Creating service swh-mirror_nginx - Creating service swh-mirror_prometheus + + Creating network swh_default + Creating config swh_storage + Creating config swh_objstorage + Creating config swh_nginx + Creating config swh_web + Creating service swh_grafana + Creating service swh_prometheus-statsd-exporter + Creating service swh_web + Creating service swh_objstorage + Creating service swh_db-storage + Creating service swh_memcache + Creating service swh_storage + Creating service swh_nginx + Creating service swh_prometheus + ~/swh-docker$ docker service ls - ID NAME MODE REPLICAS IMAGE PORTS - sz98tofpeb3j swh-mirror_db-storage global 1/1 postgres:11 - sp36lbgfd4qi swh-mirror_grafana replicated 1/1 grafana/grafana:latest - 7oja81jngiwo swh-mirror_memcache replicated 1/1 memcached:latest - y5te0gqs93li swh-mirror_nginx replicated 1/1 nginx:latest *:5081->5081/tcp - 79t3r3mv3qn6 swh-mirror_objstorage replicated 1/1 softwareheritage/base:20200918-133743 - l7q2zocoyvq6 swh-mirror_prometheus global 1/1 prom/prometheus:latest - p6hnd90qnr79 swh-mirror_prometheus-statsd-exporter replicated 1/1 prom/statsd-exporter:latest - jjry62tz3k76 swh-mirror_storage replicated 1/1 softwareheritage/base:20200918-133743 - jkkm7qm3awfh swh-mirror_web replicated 1/1 softwareheritage/web:20200918-133743 + + ID NAME MODE REPLICAS IMAGE PORTS + tc93talbe2tg swh_db-storage global 1/1 postgres:13 + 42q5jtxsh029 swh_db-web global 1/1 postgres:13 + rtlz62ok6s96 swh_grafana replicated 1/1 grafana/grafana:latest + jao3rt0et17n swh_memcache replicated 1/1 memcached:latest + rulxakqgu2ko swh_nginx replicated 1/1 nginx:latest *:5081->5081/tcp + q560pvw3q3ls swh_objstorage replicated 2/2 softwareheritage/base:20211022-121751 + a2h3ltaqdt56 swh_prometheus global 1/1 prom/prometheus:latest + lm24et9gjn2k swh_prometheus-statsd-exporter replicated 1/1 prom/statsd-exporter:latest + gwqinrao5win swh_storage replicated 2/2 softwareheritage/base:20211022-121751 + 7g46blmphfb4 swh_web replicated 1/1 softwareheritage/web:20211022-121751 This will start a series of containers with: - an objstorage service, - a storage service using a postgresql database as backend, -- a web app front end, +- a web app front end using a postgresql database as backend, - a memcache for the web app, - a prometheus monitoring app, - a prometeus-statsd exporter, - a grafana server, - an nginx server serving as reverse proxy for grafana and swh-web. -using the latest published version of the docker images by default. - +using the pinned version of the docker images. The nginx frontend will listen on the 5081 port, so you can use: @@ -160,43 +193,20 @@ .. warning:: - the 'latest' docker images work, it is highly recommended to - explicitly specify the version of the image you want to use. - -Docker images for the Software Heritage stack are tagged with their build date: - -.. code-block:: bash - - ~$ docker images -f reference='softwareheritage/*:20*' - REPOSITORY TAG IMAGE ID CREATED SIZE - softwareheritage web-20200819-112604 32ab8340e368 About an hour ago 339MB - softwareheritage base-20200819-112604 19fe3d7326c5 About an hour ago 242MB - softwareheritage web-20200630-115021 65b1869175ab 7 weeks ago 342MB - softwareheritage base-20200630-115021 3694e3fcf530 7 weeks ago 245MB - -To specify the tag to be used, simply set the SWH_IMAGE_TAG environment variable, like: - -.. code-block:: bash - - export SWH_IMAGE_TAG=20200819-112604 - docker deploy -c base-services.yml swh - -.. warning:: - - make sure to have this variable properly set for any later `docker deploy` - command you type, otherwise you running containers will be recreated using the - ':latest' image (which might **not** be the latest available version, nor - consistent amond the docker nodes on you swarm cluster). + Please make sure that the `SWH_IMAGE_TAG` variable is properly set for any later + `docker stack deploy` command you type, otherwise all the running containers will be + recreated using the ':latest' image (which might **not** be the latest available + version, nor consistent among the docker nodes on your swarm cluster). Updating a configuration ------------------------ -When you modify a configuration file exposed to docker services via the `docker +Configuration files are exposed to docker services via the `docker config` system. Unfortunately, docker does not support updating these config -objects, so you need to either: +objects, so you will need to either: - destroy the old config before being able to recreate them. That also means - you need to recreate every docker container using this config, or + you need to recreate every docker service using this config, or - adapt the `name:` field in the compose file. @@ -220,7 +230,7 @@ Note: since persistent data (databases and objects) are stored in volumes, you -can safely destoy and recreate any container you want, you will not loose any +can safely destoy and recreate any container you want, you will not lose any data. Or you can change the compose file like: @@ -276,19 +286,12 @@ .. code-block:: bash ~/swh-docker$ docker service update --image \ - softwareheritage/replayer:${SWH_IMAGE_TAG} ) \ + softwareheritage/replayer:${SWH_IMAGE_TAG} \ swh_graph-replayer -Set up a mirror -=============== - -.. warning:: - - you cannot "upgrade" an existing docker stack built from the base-services.yml file - to a mirror one; you need to recreate it; more precisely, you need to drop the - storage database before. This is due to the fact the storage database for a mirror is - not initialized the same way as the default storage database. +Set up the mirroring components +=============================== A Software Heritage mirror consists in base Software Heritage services, as described above, without any worker related to web scraping nor source code @@ -306,16 +309,16 @@ Copy these example files as plain yaml ones then modify them to replace the XXX markers with proper values (also make sure the kafka server list -is up to date.) Parameters to check/update are: +is up to date). The parameters to check/update are: -- `journal_client/brokers`: list of kafka brokers. -- `journal_client/group_id`: unique identifier for this mirroring session; +- `journal_client.brokers`: list of kafka brokers. +- `journal_client.group_id`: unique identifier for this mirroring session; you can choose whatever you want, but changing this value will make kafka start consuming messages from the beginning; kafka messages are dispatched among consumers with the same `group_id`, so in order to distribute the load among workers, they must share the same `group_id`. -- `journal_client/sasl.username`: kafka authentication username. -- `journal_client/sasl.password`: kafka authentication password. +- `journal_client."sasl.username"`: kafka authentication username. +- `journal_client."sasl.password"`: kafka authentication password. Then you need to merge the compose files "by hand" (due to this still `unresolved `_ @@ -338,19 +341,19 @@ .. code-block:: bash - ~/swh-docker$ docker stack deploy -c mirror.yml swh-mirror + ~/swh-docker$ docker stack deploy -c mirror.yml swh Graph replayer -------------- -To run the graph replayer compoenent of a mirror: +To run the graph replayer component of a mirror: .. code-block:: bash ~/swh-docker$ cd conf ~/swh-docker/conf$ cp graph-replayer.yml.example graph-replayer.yml - ~/swh-docker/conf$ # edit graph-replayer.yml files + ~/swh-docker/conf$ $EDITOR graph-replayer.yml ~/swh-docker/conf$ cd .. @@ -362,10 +365,10 @@ ~/swh-docker$ docker-compose \ -f base-services.yml \ -f graph-replayer-override.yml \ - config > graph-replayer.yml + config > stack-with-graph-replayer.yml ~/swh-docker$ docker stack deploy \ - -c graph-replayer.yml \ - swh-mirror + -c stack-with-graph-replayer.yml \ + swh [...] You can check everything is running with: @@ -373,35 +376,38 @@ .. code-block:: bash ~/swh-docker$ docker stack ls - NAME SERVICES ORCHESTRATOR - swh-mirror 11 Swarm + + NAME SERVICES ORCHESTRATOR + swh 11 Swarm + ~/swh-docker$ docker service ls - ID NAME MODE REPLICAS IMAGE PORTS - 88djaq3jezjm swh-mirror_db-storage replicated 1/1 postgres:11 - m66q36jb00xm swh-mirror_grafana replicated 1/1 grafana/grafana:latest - qfsxngh4s2sv swh-mirror_content-replayer replicated 1/1 softwareheritage/replayer:latest - qcl0n3ngr2uv swh-mirror_graph-replayer replicated 1/1 softwareheritage/replayer:latest - zn8dzsron3y7 swh-mirror_memcache replicated 1/1 memcached:latest - wfbvf3yk6t41 swh-mirror_nginx replicated 1/1 nginx:latest *:5081->5081/tcp - thtev7o0n6th swh-mirror_objstorage replicated 1/1 softwareheritage/base:latest - ysgdoqshgd2k swh-mirror_prometheus replicated 1/1 prom/prometheus:latest - u2mjjl91aebz swh-mirror_prometheus-statsd-exporter replicated 1/1 prom/statsd-exporter:latest - xyf2xgt465ob swh-mirror_storage replicated 1/1 softwareheritage/base:latest - su8eka2b5cbf swh-mirror_web replicated 1/1 softwareheritage/web:latest + + ID NAME MODE REPLICAS IMAGE PORTS + tc93talbe2tg swh_db-storage global 1/1 postgres:13 + 42q5jtxsh029 swh_db-web global 1/1 postgres:13 + rtlz62ok6s96 swh_grafana replicated 1/1 grafana/grafana:latest + 7hvn66um77wr swh_graph-replayer replicated 4/4 softwareheritage/replayer:20211022-121751 + jao3rt0et17n swh_memcache replicated 1/1 memcached:latest + rulxakqgu2ko swh_nginx replicated 1/1 nginx:latest *:5081->5081/tcp + q560pvw3q3ls swh_objstorage replicated 2/2 softwareheritage/base:20211022-121751 + a2h3ltaqdt56 swh_prometheus global 1/1 prom/prometheus:latest + lm24et9gjn2k swh_prometheus-statsd-exporter replicated 1/1 prom/statsd-exporter:latest + gwqinrao5win swh_storage replicated 2/2 softwareheritage/base:20211022-121751 + 7g46blmphfb4 swh_web replicated 1/1 softwareheritage/web:20211022-121751 If everything is OK, you should have your mirror filling. Check docker logs: .. code-block:: bash - ~/swh-docker$ docker service logs swh-mirror_graph-replayer + ~/swh-docker$ docker service logs swh_graph-replayer [...] or: .. code-block:: bash - ~/swh-docker$ docker service logs --tail 100 --follow swh-mirror_graph-replayer + ~/swh-docker$ docker service logs --tail 100 --follow swh_graph-replayer [...] @@ -429,7 +435,7 @@ config > content-replayer.yml ~/swh-docker$ docker stack deploy \ -c content-replayer.yml \ - swh-mirror + swh [...] @@ -447,12 +453,15 @@ config > mirror.yml ~/swh-docker$ docker stack deploy \ -c mirror.yml \ - swh-mirror + swh [...] -Scaling up services -------------------- +Getting your deployment production-ready +======================================== + +docker-stack scaling +-------------------- In order to scale up a replayer service, you can use the `docker scale` command. For example: @@ -464,16 +473,34 @@ will start 4 copies of the graph replayer service. -Notes: +Notes on the throughput of the mirroring process +------------------------------------------------ - One graph replayer service requires a steady 500MB to 1GB of RAM to run, so make sure you have properly sized machines for running these replayer containers, and to monitor these. -- The overall bandwidth of the replayer will depend heavily on the - `swh_storage` service, thus on the `swh_db-storage`. It will require some - network bandwidth for the ingress kafka payload (this can easily peak to - several hundreds of Mb/s). So make sure you have a correctly tuned database - and enough network bw. +- The graph replayer containers will require sufficient network bandwidth for the kafka + traffic (this can easily peak to several hundreds of megabits per second, and the + total volume of data fetched will be multiple tens of terabytes). + +- The biggest kafka topics are directory, revision and content, and will take the + longest to initially replay. + +Operational concerns for the Storage database +--------------------------------------------- + +The overall throughput of the mirroring process will depend heavily on the `swh_storage` +service, and on the performance of the underlying `swh_db-storage` database. You will +need to make sure that your database is `properly tuned +`_. + +You may also want to deploy your database directly to a bare-metal server rather than +have it managed within the docker stack. To do so, you will have to: -- Biggest topics are the directory, revision and content. +- modify the (merged) configuration of the docker stack to drop references to the + `db-storage` service (itself, and as dependency for the `storage` service) +- ensure that docker containers deployed in your swarm are able to connect to your + external database server +- override the environment variables of the `storage` service to reference the external + database server and dbname