diff --git a/docker/README.md b/docker/README.md
deleted file mode 100644
--- a/docker/README.md
+++ /dev/null
@@ -1,701 +0,0 @@
-# Docker environment
-
-This directory contains Dockerfiles to run a small Software Heritage instance
-on development machines.  The end goal is to smooth the contributors/developers
-workflow. Focus on coding, not configuring!
-
-WARNING: Running a Software Heritage instance on your machine can consume
-         quite a bit of resources: if you play a bit too hard (e.g., if you
-         try to list all GitHub repositories with the corresponding lister),
-         you may fill your hard drive, and consume a lot of CPU, memory and
-         network bandwidth.
-
-
-## Dependencies
-
-This uses docker with docker-compose, so ensure you have a working
-docker environment and docker-compose is installed.
-
-We recommend using the latest version of docker, so please read
-https://docs.docker.com/install/linux/docker-ce/debian/ for more details on how
-to install docker on your machine.
-
-On a debian system, docker-compose can be installed from Debian repositories:
-
-```
-~$ sudo apt install docker-compose
-```
-
-## Quick start
-
-First, change to the docker dir if you aren't there yet:
-
-```
-~$ cd swh-environment/docker
-```
-
-Then, start containers:
-
-```
-~/swh-environment/docker$ docker-compose up -d
-[...]
-Creating docker_amqp_1               ... done
-Creating docker_zookeeper_1          ... done
-Creating docker_kafka_1              ... done
-Creating docker_flower_1             ... done
-Creating docker_swh-scheduler-db_1   ... done
-[...]
-```
-
-This will build docker images and run them.
-Check everything is running fine with:
-
-```
-~/swh-environment/docker$ docker-compose ps
-                         Name                                       Command               State                                      Ports
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-docker_amqp_1                                    docker-entrypoint.sh rabbi ...   Up      15671/tcp, 0.0.0.0:5018->15672/tcp, 25672/tcp, 4369/tcp, 5671/tcp, 5672/tcp
-docker_flower_1                                  flower --broker=amqp://gue ...   Up      0.0.0.0:5555->5555/tcp
-docker_kafka_1                                   start-kafka.sh                   Up      0.0.0.0:9092->9092/tcp
-docker_swh-deposit-db_1                          docker-entrypoint.sh postgres    Up      5432/tcp
-docker_swh-deposit_1                             /entrypoint.sh                   Up      0.0.0.0:5006->5006/tcp
-[...]
-```
-
-The startup of some containers may fail the first time for dependency-related
-problems. If some containers failed to start, just run the `docker-compose up
--d` command again.
-
-If a container really refuses to start properly, you can check why using the
-`docker-compose logs` command. For example:
-
-```
-~/swh-environment/docker$ docker-compose logs swh-lister
-Attaching to docker_swh-lister_1
-[...]
-swh-lister_1                      | Processing /src/swh-scheduler
-swh-lister_1                      | Could not install packages due to an EnvironmentError: [('/src/swh-scheduler/.hypothesis/unicodedata/8.0.0/charmap.json.gz', '/tmp/pip-req-build-pm7nsax3/.hypothesis/unicodedata/8.0.0/charmap.json.gz', "[Errno 13] Permission denied: '/src/swh-scheduler/.hypothesis/unicodedata/8.0.0/charmap.json.gz'")]
-swh-lister_1                      |
-```
-
-Once all containers are running, you can use the web interface by opening
-http://localhost:5080/ in your web browser.
-
-At this point, the archive is empty and needs to be filled with some content.
-To do so, you can create tasks that will scrape a forge. For example, to inject
-the code from the https://0xacab.org gitlab forge:
-
-```
-~/swh-environment/docker$ docker-compose exec swh-scheduler \
-    swh scheduler task add list-gitlab-full \
-	  -p oneshot url=https://0xacab.org/api/v4
-
-Created 1 tasks
-
-Task 1
-  Next run: just now (2018-12-19 14:58:49+00:00)
-  Interval: 90 days, 0:00:00
-  Type: list-gitlab-full
-  Policy: oneshot
-  Args:
-  Keyword args:
-    url=https://0xacab.org/api/v4
-```
-
-This task will scrape the forge's project list and create subtasks to inject
-each git repository found there.
-
-This will take a bit af time to complete.
-
-To increase the speed at which git repositories are imported, you can spawn more
-`swh-loader-git` workers:
-
-```
-~/swh-environment/docker$ docker-compose exec swh-scheduler \
-    celery status
-listers@50ac2185c6c9: OK
-loader@b164f9055637: OK
-indexer@33bc6067a5b8: OK
-vault@c9fef1bbfdc1: OK
-
-4 nodes online.
-~/swh-environment/docker$ docker-compose exec swh-scheduler \
-    celery control pool_grow 3 -d loader@b164f9055637
--> loader@b164f9055637: OK
-        pool will grow
-~/swh-environment/docker$ docker-compose exec swh-scheduler \
-    celery inspect -d loader@b164f9055637 stats | grep prefetch_count
-       "prefetch_count": 4
-```
-
-Now there are 4 workers ingesting git repositories.
-You can also increase the number of `swh-loader-git` containers:
-
-```
-~/swh-environment/docker$ docker-compose up -d --scale swh-loader=4
-[...]
-Creating docker_swh-loader_2        ... done
-Creating docker_swh-loader_3        ... done
-Creating docker_swh-loader_4        ... done
-```
-
-## Updating the docker image
-
-All containers started by `docker-compose` are bound to a docker image named
-`swh/stack` including all the software components of Software Heritage.  When
-new versions of these components are released, the docker image will not be
-automatically updated. In order to update all Software Heritage components to
-their latest version, the docker image needs to be explicitly rebuilt by
-issuing the following command from within the `docker` directory:
-
-```
-~/swh-environment/docker$ docker build --no-cache -t swh/stack .
-```
-
-## Details
-
-This runs the following services on their respectively standard ports, all of
-the following services are configured to communicate with each other:
-
-- swh-storage-db: a `softwareheritage` instance db that stores the Merkle DAG,
-
-- swh-objstorage: Content-addressable object storage,
-
-- swh-storage: Abstraction layer over the archive, allowing to access all
-  stored source code artifacts as well as their metadata,
-
-- swh-web: the Software Heritage web user interface,
-
-- swh-scheduler: the API service as well as 2 utilities,
-  the runner and the listener,
-
-- swh-lister: celery workers dedicated to running lister tasks,
-
-- swh-loaders: celery workers dedicated to importing/updating source code
-  content (VCS repos, source packages, etc.),
-
-- swh-journal: Persistent logger of changes to the archive, with
-  publish-subscribe support.
-
-That means you can start doing the ingestion using those services using the
-same setup described in the getting-started starting directly at
-https://docs.softwareheritage.org/devel/getting-started.html#step-4-ingest-repositories
-
-
-### Exposed Ports
-
-Several services have their listening ports exposed on the host:
-
-- amqp: 5072
-- kafka: 5092
-- nginx: 5080
-
-And for SWH services:
-
-- scheduler API: 5008
-- storage API: 5002
-- object storage API: 5003
-- indexer API: 5007
-- web app: 5004
-- deposit app: 5006
-
-Beware that these ports are not the same as the ports used from within the
-docker network. This means that the same command executed from the host or from
-a docker container will not use the same urls to access services. For example,
-to use the `celery` utility from the host, you may type:
-
-```
-~/swh-environment/docker$ CELERY_BROKER_URL=amqp://:5072// celery status
-loader@61704103668c: OK
-[...]
-```
-
-To run the same command from within a container:
-
-```
-~/swh-environment/docker$ docker-compose exec swh-scheduler celery status
-loader@61704103668c: OK
-[...]
-```
-
-
-## Managing tasks
-
-One of the main components of the Software Heritage platform is the task system.
-These are used to manage everything related to background process, like
-discovering new git repositories to import, ingesting them, checking a known
-repository is up to date, etc.
-
-The task system is based on Celery but uses a custom database-based scheduler.
-
-So when we refer to the term 'task', it may designate either a Celery task or a
-SWH one (ie. the entity in the database). When we refer to simply a "task" in
-the documentation, it designates the SWH task.
-
-When a SWH task is ready to be executed, a Celery task is created to handle the
-actual SWH task's job. Note that not all Celery tasks are directly linked to a
-SWH task (some SWH tasks are implemented using a Celery task that spawns Celery
-subtasks).
-
-A (SWH) task can be `recurring` or `oneshot`. `oneshot` tasks are only executed
-once, whereas `recurring` are regularly executed. The scheduling configuration
-of these recurring tasks can be set via the fields `current_interval` and
-`priority` (can be 'high', 'normal' or 'low') of the task database entity.
-
-
-### Inserting a new lister task
-
-To list the content of a source code provider like github or a Debian
-distribution, you may add a new task for this.
-
-This task will (generally) scrape a web page or use a public API to identify
-the list of published software artefacts (git repos, debian source packages,
-etc.)
-
-Then, for each repository, a new task will be created to ingest this repository
-and keep it up to date.
-
-For example, to add a (one shot) task that will list git repos on the
-0xacab.org gitlab instance, one can do (from this git repository):
-
-```
-~/swh-environment/docker$ docker-compose exec swh-scheduler \
-    swh scheduler task add list-gitlab-full \
-	  -p oneshot url=https://0xacab.org/api/v4
-
-Created 1 tasks
-
-Task 12
-  Next run: just now (2018-12-19 14:58:49+00:00)
-  Interval: 90 days, 0:00:00
-  Type: list-gitlab-full
-  Policy: oneshot
-  Args:
-  Keyword args:
-    url=https://0xacab.org/api/v4
-```
-
-This will insert a new task in the scheduler. To list existing tasks for a
-given task type:
-
-```
-~/swh-environment/docker$ docker-compose exec swh-scheduler \
-  swh scheduler task list-pending list-gitlab-full
-
-Found 1 list-gitlab-full tasks
-
-Task 12
-  Next run: 2 minutes ago (2018-12-19 14:58:49+00:00)
-  Interval: 90 days, 0:00:00
-  Type: list-gitlab-full
-  Policy: oneshot
-  Args:
-  Keyword args:
-    url=https://0xacab.org/api/v4
-```
-
-To list all existing task types:
-
-```
-~/swh-environment/docker$ docker-compose exec swh-scheduler \
-  swh scheduler task-type list
-
-Known task types:
-load-svn-from-archive:
-  Loading svn repositories from svn dump
-load-svn:
-  Create dump of a remote svn repository, mount it and load it
-load-deposit:
-  Loading deposit archive into swh through swh-loader-tar
-check-deposit:
-  Pre-checking deposit step before loading into swh archive
-cook-vault-bundle:
-  Cook a Vault bundle
-load-hg:
-  Loading mercurial repository swh-loader-mercurial
-load-hg-from-archive:
-  Loading archive mercurial repository swh-loader-mercurial
-load-git:
-  Update an origin of type git
-list-github-incremental:
-  Incrementally list GitHub
-list-github-full:
-  Full update of GitHub repos list
-list-debian-distribution:
-  List a Debian distribution
-list-gitlab-incremental:
-  Incrementally list a Gitlab instance
-list-gitlab-full:
-  Full update of a Gitlab instance's repos list
-list-pypi:
-  Full pypi lister
-load-pypi:
-  Load Pypi origin
-index-mimetype:
-  Mimetype indexer task
-index-mimetype-for-range:
-  Mimetype Range indexer task
-index-fossology-license:
-  Fossology license indexer task
-index-fossology-license-for-range:
-  Fossology license range indexer task
-index-origin-head:
-  Origin Head indexer task
-index-revision-metadata:
-  Revision Metadata indexer task
-index-origin-metadata:
-  Origin Metadata indexer task
-
-```
-
-
-### Monitoring activity
-
-You can monitor the workers activity by connecting to the RabbitMQ console on
-`http://localhost:5080/rabbitmq` or the grafana dashboard on
-`http://localhost:5080/grafana`.
-
-If you cannot see any task being executed, check the logs of the
-`swh-scheduler-runner` service (here is a failure example due to the
-debian lister task not being properly registered on the
-swh-scheduler-runner service):
-
-```
-~/swh-environment/docker$ docker-compose logs --tail=10 swh-scheduler-runner
-Attaching to docker_swh-scheduler-runner_1
-swh-scheduler-runner_1    |     "__main__", mod_spec)
-swh-scheduler-runner_1    |   File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
-swh-scheduler-runner_1    |     exec(code, run_globals)
-swh-scheduler-runner_1    |   File "/usr/local/lib/python3.7/site-packages/swh/scheduler/celery_backend/runner.py", line 107, in <module>
-swh-scheduler-runner_1    |     run_ready_tasks(main_backend, main_app)
-swh-scheduler-runner_1    |   File "/usr/local/lib/python3.7/site-packages/swh/scheduler/celery_backend/runner.py", line 81, in run_ready_tasks
-swh-scheduler-runner_1    |     task_types[task['type']]['backend_name']
-swh-scheduler-runner_1    |   File "/usr/local/lib/python3.7/site-packages/celery/app/registry.py", line 21, in __missing__
-swh-scheduler-runner_1    |     raise self.NotRegistered(key)
-swh-scheduler-runner_1    | celery.exceptions.NotRegistered: 'swh.lister.debian.tasks.DebianListerTask'
-```
-
-
-## Using docker setup development and integration testing
-
-If you hack the code of one or more archive components with a virtual
-env based setup as described in the
-[[https://docs.softwareheritage.org/devel/developer-setup.html|developer
-setup guide]], you may want to test your modifications in a working
-Software Heritage instance. The simplest way to achieve this is to use
-this docker-based environment.
-
-If you haven't followed the
-[[https://docs.softwareheritage.org/devel/developer-setup.html|developer setup guide]],
-you must clone the the [swh-environment] repo in your `swh-environment`
-directory:
-
-```
-~/swh-environment$ git clone https://forge.softwareheritage.org/source/swh-environment.git .
-```
-
-Note the `.` at the end of this command: we want the git repository to be
-cloned directly in the `~/swh-environment` directory, not in a sub directory.
-Also note that if you haven't done it yet and you want to hack the source code
-of one or more Software Heritage packages, you really should read the
-[[https://docs.softwareheritage.org/devel/developer-setup.html|developer setup guide]].
-
-From there, we will checkout or update all the swh packages:
-
-```
-~/swh-environment$ ./bin/update
-```
-
-
-### Install a swh package from sources in a container
-
-It is possible to run a docker container with some swh packages installed from
-sources instead of using the latest published packages from pypi. To do this
-you must write a docker-compose override file (`docker-compose.override.yml`).
-An example is given in the `docker-compose.override.yml.example` file:
-
-``` yaml
-version: '2'
-
-services:
-  swh-objstorage:
-    volumes:
-      - "$HOME/swh-environment/swh-objstorage:/src/swh-objstorage"
-```
-
-The file named `docker-compose.override.yml` will automatically be loaded by
-`docker-compose`.
-
-This example shows the simplest case of the `swh-objstorage` package:
-you just have to mount it in the container in `/src` and the
-entrypoint will ensure every swh-* package found in `/src/` is
-installed (using `pip install -e` so you can easily hack your
-code). If the application you play with has autoreload support, there
-is no need to restart the impacted container.)
-
-
-### Using locally installed swh tools with docker
-
-In all examples above, we have executed swh commands from within a running
-container. Now we also have these swh commands locally available in our virtual
-env, we can use them to interact with swh services running in docker
-containers.
-
-For this, we just need to configure a few environment variables. First, ensure
-your Software Heritage virtualenv is activated (here, using virtualenvwrapper):
-
-```
-~$ workon swh
-(swh) ~/swh-environment$ export SWH_SCHEDULER_URL=http://127.0.0.1:5008/
-(swh) ~/swh-environment$ export CELERY_BROKER_URL=amqp://127.0.0.1:5072/
-```
-
-Now we can use the `celery` command directly to control the celery system
-running in the docker environment:
-
-```
-(swh) ~/swh-environment$ celery status
-vault@c9fef1bbfdc1: OK
-listers@ba66f18e7d02: OK
-indexer@cb14c33cbbfb: OK
-loader@61704103668c: OK
-
-4 nodes online.
-(swh) ~/swh-environment$ celery control -d loader@61704103668c pool_grow 3
-```
-
-And we can use the `swh-scheduler` command all the same:
-
-```
-(swh) ~/swh-environment$ swh scheduler task-type list
-Known task types:
-index-fossology-license:
-  Fossology license indexer task
-index-mimetype:
-  Mimetype indexer task
-[...]
-```
-
-
-### Make your life a bit easier
-
-When you use virtualenvwrapper, you can add postactivation commands:
-
-```
-(swh) ~/swh-environment$ cat >>$VIRTUAL_ENV/bin/postactivate <<'EOF'
-# unfortunately, the interface cmd for the click autocompletion
-# depends on the shell
-# https://click.palletsprojects.com/en/7.x/bashcomplete/#activation
-
-shell=$(basename $SHELL)
-case "$shell" in
-    "zsh")
-        autocomplete_cmd=source_zsh
-        ;;
-    *)
-        autocomplete_cmd=source
-        ;;
-esac
-
-eval "$(_SWH_COMPLETE=$autocomplete_cmd swh)"
-export SWH_SCHEDULER_URL=http://127.0.0.1:5008/
-export CELERY_BROKER_URL=amqp://127.0.0.1:5072/
-export COMPOSE_FILE=~/swh-environment/docker/docker-compose.yml:~/swh-environment/docker/docker-compose.override.yml
-alias doco=docker-compose
-
-EOF
-```
-
-This postactivate script does:
-
-- install a shell completion handler for the swh-scheduler command,
-- preset a bunch of environment variables
-
-  - `SWH_SCHEDULER_URL` so that you can just run `swh scheduler` against the
-    scheduler API instance running in docker, without having to specify the
-    endpoint URL,
-
-  - `CELERY_BROKER` so you can execute the `celery` tool (without cli options)
-    against the rabbitmq server running in the docker environment,
-
-  - `COMPOSE_FILE` so you can run `docker-compose` from everywhere,
-
-- create an alias `doco` for `docker-compose` because this is way too
-  long to type,
-
-So now you can easily:
-
-* Start the SWH platform:
-
-```
-  (swh) ~/swh-environment$ doco up -d
-  [...]
-```
-
-* Check celery:
-
-```
-  (swh) ~/swh-environment$ celery status
-  listers@50ac2185c6c9: OK
-  loader@b164f9055637: OK
-  indexer@33bc6067a5b8: OK
-```
-
-* List task-types:
-
-```
-  (swh) ~/swh-environment$ swh scheduler task-type list
-  [...]
-```
-
-* Get more info on a task type:
-
-```
-  (swh) ~/swh-environment$ swh scheduler task-type list -v -t load-hg
-  Known task types:
-  load-hg: swh.loader.mercurial.tasks.LoadMercurial
-    Loading mercurial repository swh-loader-mercurial
-    interval: 1 day, 0:00:00 [1 day, 0:00:00, 1 day, 0:00:00]
-    backoff_factor: 1.0
-    max_queue_length: 1000
-    num_retries: None
-    retry_delay: None
-```
-
-* Add a new task:
-
-```
-  (swh) ~/swh-environment$ swh scheduler task add load-hg \
-    origin_url=https://hg.logilab.org/master/cubicweb
-  Created 1 tasks
-  Task 1
-     Next run: just now (2019-02-06 12:36:58+00:00)
-     Interval: 1 day, 0:00:00
-     Type: load-hg
-     Policy: recurring
-     Args:
-     Keyword args:
-       origin_url: https://hg.logilab.org/master/cubicweb
-```
-
-* Respawn a task:
-
-```
-  (swh) ~/swh-environment$ swh scheduler task respawn 1
-```
-
-## Data persistence for a development setting
-
-The default `docker-compose.yml` configuration is not geared towards data persistence,
-but application testing.
-
-Volumes defined in associated images are anonymous and may get either unused or removed
-on the next `docker-compose up`.
-
-One way to make sure these volumes persist is to use named volumes.
-The volumes may be defined as follows in a `docker-compose.override.yml`.
-Note that volume definitions are merged with other compose files based on
-destination path.
-
-```
-services:
-  swh-storage-db:
-    volumes:
-      - "swh_storage_data:/var/lib/postgresql/data"
-  swh-objstorage:
-    volumes:
-      - "swh_objstorage_data:/srv/softwareheritage/objects"
-
-volumes:
-  swh_storage_data:
-  swh_objstorage_data:
-```
-
-This way, `docker-compose down` without the `-v` flag will not remove those volumes
-and data will persist.
-
-## Starting a kafka-powered mirror of the storage
-
-This repo comes with an optional `docker-compose.storage-mirror.yml`
-docker compose file that can be used to test the kafka-powered mirror
-mecanism for the main storage.
-
-This can be used like:
-
-```
-~/swh-environment/docker$ docker-compose -f docker-compose.yml -f docker-compose.storage-mirror.yml up -d
-[...]
-```
-
-Compared to the original compose file, this will:
-
-- overrides the swh-storage service to activate the kafka direct writer
-  on swh.journal.objects prefixed topics using thw swh.storage.master ID,
-- overrides the swh-web service to make it use the mirror instead of the
-  master storage,
-- starts a db for the mirror,
-- starts a storage service based on this db,
-- starts a replayer service that runs the process that listen to kafka to
-  keeps the mirror in sync.
-
-When using it, you will have a setup in which the master storage is used by
-workers and most other services, whereas the storage mirror will be used to
-by the web application and should be kept in sync with the master storage
-by kafka.
-
-
-Note that the object storage is not replicated here, only the graph storage.
-
-
-## Starting the backfiller
-
-Reading from the storage the objects <object-type> from within range
-[start-object, end-object] to the kafka topics.
-
-```
-(swh)$ docker-compose \
-             -f docker-compose.yml \
-             -f docker-compose.storage-mirror.yml \
-             -f docker-compose.storage-mirror.override.yml \
-             run \
-             swh-journal-backfiller \
-             snapshot \
-             --start-object 000000 \
-             --end-object 000001 \
-             --dry-run
-```
-
-
-## Using Sentry
-
-All entrypoints to SWH code (CLI, gunicorn, celery, ...) are, or should be,
-intrumented using Sentry. By default this is disabled, but if you run your
-own Sentry instance, you can use it.
-
-To do so, you must get a DSN from your Sentry instance, and set it as the
-value of `SWH_SENTRY_DSN` in the file `env/common_python.env`.
-You may also set it per-service in the `environment` section of each services
-in `docker-compose.override.yml`.
-
-
-## Caveats
-
-Running a lister task can lead to a lot of loading tasks, which can fill your
-hard drive pretty fast. Make sure to monitor your available storage space
-regularly when playing with this stack.
-
-Also, a few containers (`swh-storage`, `swh-xxx-db`) use a volume for storing
-the blobs or the database files. With the default configuration provided in the
-`docker-compose.yml` file, these volumes are not persistant. So removing the
-containers will delete the volumes!
-
-Also note that for the `swh-objstorage`, since the volume can be pretty big,
-the remove operation can be quite long (several minutes is not uncommon), which
-may mess a bit with the `docker-compose` command.
-
-If you have an error message like:
-
-  Error response from daemon: removal of container 928de3110381 is already in progress
-
-it means that you need to wait for this process to finish before being able to
-(re)start your docker stack again.
diff --git a/docker/README.rst b/docker/README.rst
new file mode 100644
--- /dev/null
+++ b/docker/README.rst
@@ -0,0 +1,666 @@
+Docker environment
+==================
+
+``swh-environment/docker/`` contains Dockerfiles to run a small Software Heritage
+instance on development machines. The end goal is to streamline the
+contributor and developer workflow. Focus on coding, not configuring!
+
+.. warning::
+   Running a Software Heritage instance on your machine can
+   consume quite a bit of resources: if you play a bit too hard (e.g., if
+   you try to list all GitHub repositories with the corresponding lister),
+   you may fill your hard drive, and consume a lot of CPU, memory and
+   network bandwidth.
+
+Dependencies
+------------
+
+This setup uses docker with docker-compose, so ensure you have a working
+docker environment and that docker-compose is installed.
+
+We recommend using the latest version of docker, so please read
+https://docs.docker.com/install/linux/docker-ce/debian/ for more details
+on how to install docker on your machine.
+
+On a Debian system, docker-compose can be installed from Debian
+repositories::
+
+   ~$ sudo apt install docker-compose
+
+Quick start
+-----------
+
+First, change to the docker dir if you aren’t there yet::
+
+   ~$ cd swh-environment/docker
+
+Then, start containers::
+
+   ~/swh-environment/docker$ docker-compose up -d
+   [...]
+   Creating docker_amqp_1               ... done
+   Creating docker_zookeeper_1          ... done
+   Creating docker_kafka_1              ... done
+   Creating docker_flower_1             ... done
+   Creating docker_swh-scheduler-db_1   ... done
+   [...]
+
+This will build docker images and run them. Check that everything is
+running fine with::
+
+   ~/swh-environment/docker$ docker-compose ps
+                            Name                                       Command               State                                      Ports
+   -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+   docker_amqp_1                                    docker-entrypoint.sh rabbi ...   Up      15671/tcp, 0.0.0.0:5018->15672/tcp, 25672/tcp, 4369/tcp, 5671/tcp, 5672/tcp
+   docker_flower_1                                  flower --broker=amqp://gue ...   Up      0.0.0.0:5555->5555/tcp
+   docker_kafka_1                                   start-kafka.sh                   Up      0.0.0.0:9092->9092/tcp
+   docker_swh-deposit-db_1                          docker-entrypoint.sh postgres    Up      5432/tcp
+   docker_swh-deposit_1                             /entrypoint.sh                   Up      0.0.0.0:5006->5006/tcp
+   [...]
+
+The startup of some containers may fail the first time because of
+dependency-related problems. If some containers fail to start, just
+run the ``docker-compose up -d`` command again.
+
+If a container really refuses to start properly, you can check why using
+the ``docker-compose logs`` command. For example::
+
+   ~/swh-environment/docker$ docker-compose logs swh-lister
+   Attaching to docker_swh-lister_1
+   [...]
+   swh-lister_1                      | Processing /src/swh-scheduler
+   swh-lister_1                      | Could not install packages due to an EnvironmentError: [('/src/swh-scheduler/.hypothesis/unicodedata/8.0.0/charmap.json.gz', '/tmp/pip-req-build-pm7nsax3/.hypothesis/unicodedata/8.0.0/charmap.json.gz', "[Errno 13] Permission denied: '/src/swh-scheduler/.hypothesis/unicodedata/8.0.0/charmap.json.gz'")]
+   swh-lister_1                      |
+
+Once all containers are running, you can use the web interface by
+opening http://localhost:5080/ in your web browser.
+
+At this point, the archive is empty and needs to be filled with some
+content. To do so, you can create tasks that will scrape a forge. For
+example, to inject the code from the https://0xacab.org gitlab forge::
+
+   ~/swh-environment/docker$ docker-compose exec swh-scheduler \
+       swh scheduler task add list-gitlab-full \
+         -p oneshot url=https://0xacab.org/api/v4
+
+   Created 1 tasks
+
+   Task 1
+     Next run: just now (2018-12-19 14:58:49+00:00)
+     Interval: 90 days, 0:00:00
+     Type: list-gitlab-full
+     Policy: oneshot
+     Args:
+     Keyword args:
+       url=https://0xacab.org/api/v4
+
+This task will scrape the forge’s project list and create subtasks to
+inject each git repository found there.
+
+This will take a bit of time to complete.
+
+To increase the speed at which git repositories are imported, you can
+spawn more ``swh-loader-git`` workers::
+
+   ~/swh-environment/docker$ docker-compose exec swh-scheduler \
+       celery status
+   listers@50ac2185c6c9: OK
+   loader@b164f9055637: OK
+   indexer@33bc6067a5b8: OK
+   vault@c9fef1bbfdc1: OK
+
+   4 nodes online.
+   ~/swh-environment/docker$ docker-compose exec swh-scheduler \
+       celery control pool_grow 3 -d loader@b164f9055637
+   -> loader@b164f9055637: OK
+           pool will grow
+   ~/swh-environment/docker$ docker-compose exec swh-scheduler \
+       celery inspect -d loader@b164f9055637 stats | grep prefetch_count
+          "prefetch_count": 4
+
+Now there are 4 workers ingesting git repositories. You can also
+increase the number of ``swh-loader-git`` containers::
+
+   ~/swh-environment/docker$ docker-compose up -d --scale swh-loader=4
+   [...]
+   Creating docker_swh-loader_2        ... done
+   Creating docker_swh-loader_3        ... done
+   Creating docker_swh-loader_4        ... done
+
+Updating the docker image
+-------------------------
+
+All containers started by ``docker-compose`` are bound to a docker image
+named ``swh/stack`` including all the software components of Software
+Heritage. When new versions of these components are released, the docker
+image will not be automatically updated. In order to update all Software
+Heritage components to their latest version, the docker image needs to
+be explicitly rebuilt by issuing the following command from within the
+``docker`` directory::
+
+   ~/swh-environment/docker$ docker build --no-cache -t swh/stack .
+
+Details
+-------
+
+This runs the following services on their respective standard ports;
+all of these services are configured to communicate with each
+other:
+
+-  swh-storage-db: a ``softwareheritage`` instance db that stores the
+   Merkle DAG,
+
+-  swh-objstorage: Content-addressable object storage,
+
+-  swh-storage: Abstraction layer over the archive, providing access to
+   all stored source code artifacts as well as their metadata,
+
+-  swh-web: the Software Heritage web user interface,
+
+-  swh-scheduler: the API service as well as 2 utilities, the runner and
+   the listener,
+
+-  swh-lister: celery workers dedicated to running lister tasks,
+
+-  swh-loaders: celery workers dedicated to importing/updating source
+   code content (VCS repos, source packages, etc.),
+
+-  swh-journal: Persistent logger of changes to the archive, with
+   publish-subscribe support.
+
+That means you can start ingesting with these services, using the same
+setup as described in the getting-started guide, starting directly at
+https://docs.softwareheritage.org/devel/getting-started.html#step-4-ingest-repositories
+
+Exposed Ports
+~~~~~~~~~~~~~
+
+Several services have their listening ports exposed on the host:
+
+-  amqp: 5072
+-  kafka: 5092
+-  nginx: 5080
+
+And for SWH services:
+
+-  scheduler API: 5008
+-  storage API: 5002
+-  object storage API: 5003
+-  indexer API: 5007
+-  web app: 5004
+-  deposit app: 5006
+
+Beware that these ports are not the same as the ports used from within
+the docker network. This means that the same command executed from the
+host or from a docker container will not use the same URLs to access
+services. For example, to use the ``celery`` utility from the host, you
+may type::
+
+   ~/swh-environment/docker$ CELERY_BROKER_URL=amqp://:5072// celery status
+   loader@61704103668c: OK
+   [...]
+
+To run the same command from within a container::
+
+   ~/swh-environment/docker$ docker-compose exec swh-scheduler celery status
+   loader@61704103668c: OK
+   [...]
+
+Managing tasks
+--------------
+
+One of the main components of the Software Heritage platform is the task
+system. Tasks are used to manage everything related to background
+processing, like discovering new git repositories to import, ingesting
+them, checking that a known repository is up to date, etc.
+
+The task system is based on Celery but uses a custom database-based
+scheduler.
+
+The term ‘task’ may therefore designate either a Celery task or a SWH
+one (i.e. the entity in the database). When the documentation refers to
+simply a “task”, it designates the SWH task.
+
+When a SWH task is ready to be executed, a Celery task is created to
+handle the actual SWH task’s job. Note that not all Celery tasks are
+directly linked to a SWH task (some SWH tasks are implemented using a
+Celery task that spawns Celery subtasks).
+
+A (SWH) task can be ``recurring`` or ``oneshot``. ``oneshot`` tasks are
+only executed once, whereas ``recurring`` are regularly executed. The
+scheduling configuration of these recurring tasks can be set via the
+fields ``current_interval`` and ``priority`` (can be ‘high’, ‘normal’ or
+‘low’) of the task database entity.
+
+Inserting a new lister task
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To list the content of a source code provider like GitHub or a Debian
+distribution, you may add a new lister task.
+
+This task will (generally) scrape a web page or use a public API to
+identify the list of published software artefacts (git repos, Debian
+source packages, etc.).
+
+Then, for each repository, a new task will be created to ingest this
+repository and keep it up to date.
+
+For example, to add a (one shot) task that will list git repos on the
+0xacab.org gitlab instance, one can do (from this git repository)::
+
+   ~/swh-environment/docker$ docker-compose exec swh-scheduler \
+       swh scheduler task add list-gitlab-full \
+         -p oneshot url=https://0xacab.org/api/v4
+
+   Created 1 tasks
+
+   Task 12
+     Next run: just now (2018-12-19 14:58:49+00:00)
+     Interval: 90 days, 0:00:00
+     Type: list-gitlab-full
+     Policy: oneshot
+     Args:
+     Keyword args:
+       url=https://0xacab.org/api/v4
+
+This will insert a new task in the scheduler. To list existing tasks for
+a given task type::
+
+   ~/swh-environment/docker$ docker-compose exec swh-scheduler \
+     swh scheduler task list-pending list-gitlab-full
+
+   Found 1 list-gitlab-full tasks
+
+   Task 12
+     Next run: 2 minutes ago (2018-12-19 14:58:49+00:00)
+     Interval: 90 days, 0:00:00
+     Type: list-gitlab-full
+     Policy: oneshot
+     Args:
+     Keyword args:
+       url=https://0xacab.org/api/v4
+
+To list all existing task types::
+
+   ~/swh-environment/docker$ docker-compose exec swh-scheduler \
+     swh scheduler task-type list
+
+   Known task types:
+   load-svn-from-archive:
+     Loading svn repositories from svn dump
+   load-svn:
+     Create dump of a remote svn repository, mount it and load it
+   load-deposit:
+     Loading deposit archive into swh through swh-loader-tar
+   check-deposit:
+     Pre-checking deposit step before loading into swh archive
+   cook-vault-bundle:
+     Cook a Vault bundle
+   load-hg:
+     Loading mercurial repository swh-loader-mercurial
+   load-hg-from-archive:
+     Loading archive mercurial repository swh-loader-mercurial
+   load-git:
+     Update an origin of type git
+   list-github-incremental:
+     Incrementally list GitHub
+   list-github-full:
+     Full update of GitHub repos list
+   list-debian-distribution:
+     List a Debian distribution
+   list-gitlab-incremental:
+     Incrementally list a Gitlab instance
+   list-gitlab-full:
+     Full update of a Gitlab instance's repos list
+   list-pypi:
+     Full pypi lister
+   load-pypi:
+     Load Pypi origin
+   index-mimetype:
+     Mimetype indexer task
+   index-mimetype-for-range:
+     Mimetype Range indexer task
+   index-fossology-license:
+     Fossology license indexer task
+   index-fossology-license-for-range:
+     Fossology license range indexer task
+   index-origin-head:
+     Origin Head indexer task
+   index-revision-metadata:
+     Revision Metadata indexer task
+   index-origin-metadata:
+     Origin Metadata indexer task
+
+Monitoring activity
+~~~~~~~~~~~~~~~~~~~
+
+You can monitor the workers' activity by connecting to the RabbitMQ
+console on ``http://localhost:5080/rabbitmq`` or the grafana dashboard
+on ``http://localhost:5080/grafana``.
+
+If you cannot see any task being executed, check the logs of the
+``swh-scheduler-runner`` service (here is a failure example due to the
+debian lister task not being properly registered on the
+swh-scheduler-runner service)::
+
+   ~/swh-environment/docker$ docker-compose logs --tail=10 swh-scheduler-runner
+   Attaching to docker_swh-scheduler-runner_1
+   swh-scheduler-runner_1    |     "__main__", mod_spec)
+   swh-scheduler-runner_1    |   File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
+   swh-scheduler-runner_1    |     exec(code, run_globals)
+   swh-scheduler-runner_1    |   File "/usr/local/lib/python3.7/site-packages/swh/scheduler/celery_backend/runner.py", line 107, in <module>
+   swh-scheduler-runner_1    |     run_ready_tasks(main_backend, main_app)
+   swh-scheduler-runner_1    |   File "/usr/local/lib/python3.7/site-packages/swh/scheduler/celery_backend/runner.py", line 81, in run_ready_tasks
+   swh-scheduler-runner_1    |     task_types[task['type']]['backend_name']
+   swh-scheduler-runner_1    |   File "/usr/local/lib/python3.7/site-packages/celery/app/registry.py", line 21, in __missing__
+   swh-scheduler-runner_1    |     raise self.NotRegistered(key)
+   swh-scheduler-runner_1    | celery.exceptions.NotRegistered: 'swh.lister.debian.tasks.DebianListerTask'
+
+Using the docker setup for development and integration testing
+--------------------------------------------------------------
+
+If you hack the code of one or more archive components with a virtual
+env based setup as described in the
+`developer setup guide <https://docs.softwareheritage.org/devel/developer-setup.html>`__,
+you may want to test your modifications in a working Software Heritage
+instance. The simplest way to achieve this is to use this docker-based
+environment.
+
+If you haven’t followed the
+`developer setup guide <https://docs.softwareheritage.org/devel/developer-setup.html>`__,
+you must clone the ``swh-environment`` repo in your
+``swh-environment`` directory::
+
+   ~/swh-environment$ git clone https://forge.softwareheritage.org/source/swh-environment.git .
+
+Note the ``.`` at the end of this command: we want the git repository to
+be cloned directly in the ``~/swh-environment`` directory, not in a
+subdirectory. Also note that if you haven’t done so yet and you want to
+hack the source code of one or more Software Heritage packages, you
+really should read the
+`developer setup guide <https://docs.softwareheritage.org/devel/developer-setup.html>`__.
+
+From there, we will check out or update all the swh packages::
+
+   ~/swh-environment$ ./bin/update
+
+Install a swh package from sources in a container
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It is possible to run a docker container with some swh packages
+installed from sources instead of using the latest published packages
+from pypi. To do this you must write a docker-compose override file
+(``docker-compose.override.yml``). An example is given in the
+``docker-compose.override.yml.example`` file:
+
+.. code:: yaml
+
+   version: '2'
+
+   services:
+     swh-objstorage:
+       volumes:
+         - "$HOME/swh-environment/swh-objstorage:/src/swh-objstorage"
+
+The file named ``docker-compose.override.yml`` will automatically be
+loaded by ``docker-compose``.
+
+This example shows the simplest case of the ``swh-objstorage`` package:
+you just have to mount it in the container in ``/src`` and the
+entrypoint will ensure every swh-\* package found in ``/src/`` is
+installed (using ``pip install -e`` so you can easily hack your code).
+If the application you play with has autoreload support, there is no
+need to restart the impacted container.
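+
+As a further sketch, several packages can be mounted into one service
+in the same way. The pairing below (``swh-core`` and ``swh-storage``
+mounted into the ``swh-storage`` service) is only an illustration and
+assumes both checkouts exist under ``$HOME/swh-environment``:
+
+.. code:: yaml
+
+   version: '2'
+
+   services:
+     swh-storage:
+       volumes:
+         # Each swh-* directory mounted under /src is picked up by the
+         # entrypoint and installed with "pip install -e".
+         - "$HOME/swh-environment/swh-core:/src/swh-core"
+         - "$HOME/swh-environment/swh-storage:/src/swh-storage"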
+
+Using locally installed swh tools with docker
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In all examples above, we have executed swh commands from within a
+running container. Now that these swh commands are also available
+locally in our virtual env, we can use them to interact with swh services
+running in docker containers.
+
+For this, we just need to configure a few environment variables. First,
+ensure your Software Heritage virtualenv is activated (here, using
+virtualenvwrapper)::
+
+   ~$ workon swh
+   (swh) ~/swh-environment$ export SWH_SCHEDULER_URL=http://127.0.0.1:5008/
+   (swh) ~/swh-environment$ export CELERY_BROKER_URL=amqp://127.0.0.1:5072/
+
+Now we can use the ``celery`` command directly to control the celery
+system running in the docker environment::
+
+   (swh) ~/swh-environment$ celery status
+   vault@c9fef1bbfdc1: OK
+   listers@ba66f18e7d02: OK
+   indexer@cb14c33cbbfb: OK
+   loader@61704103668c: OK
+
+   4 nodes online.
+   (swh) ~/swh-environment$ celery control -d loader@61704103668c pool_grow 3
+
+And we can use the ``swh scheduler`` command all the same::
+
+   (swh) ~/swh-environment$ swh scheduler task-type list
+   Known task types:
+   index-fossology-license:
+     Fossology license indexer task
+   index-mimetype:
+     Mimetype indexer task
+   [...]
+
+Make your life a bit easier
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When you use virtualenvwrapper, you can add postactivation commands::
+
+   (swh) ~/swh-environment$ cat >>$VIRTUAL_ENV/bin/postactivate <<'EOF'
+   # unfortunately, the interface cmd for the click autocompletion
+   # depends on the shell
+   # https://click.palletsprojects.com/en/7.x/bashcomplete/#activation
+
+   shell=$(basename $SHELL)
+   case "$shell" in
+       "zsh")
+           autocomplete_cmd=source_zsh
+           ;;
+       *)
+           autocomplete_cmd=source
+           ;;
+   esac
+
+   eval "$(_SWH_COMPLETE=$autocomplete_cmd swh)"
+   export SWH_SCHEDULER_URL=http://127.0.0.1:5008/
+   export CELERY_BROKER_URL=amqp://127.0.0.1:5072/
+   export COMPOSE_FILE=~/swh-environment/docker/docker-compose.yml:~/swh-environment/docker/docker-compose.override.yml
+   alias doco=docker-compose
+
+   EOF
+
+This postactivate script:
+
+-  installs a shell completion handler for the ``swh`` command,
+-  presets a bunch of environment variables:
+
+   -  ``SWH_SCHEDULER_URL`` so that you can just run ``swh scheduler``
+      against the scheduler API instance running in docker, without
+      having to specify the endpoint URL,
+
+   -  ``CELERY_BROKER_URL`` so you can execute the ``celery`` tool
+      (without cli options) against the rabbitmq server running in the
+      docker environment,
+
+   -  ``COMPOSE_FILE`` so you can run ``docker-compose`` from
+      anywhere,
+
+-  creates an alias ``doco`` for ``docker-compose`` because the latter
+   is way too long to type.
+
+So now you can easily:
+
+-  Start the SWH platform::
+
+     (swh) ~/swh-environment$ doco up -d
+     [...]
+
+-  Check celery::
+
+     (swh) ~/swh-environment$ celery status
+     listers@50ac2185c6c9: OK
+     loader@b164f9055637: OK
+     indexer@33bc6067a5b8: OK
+
+-  List task-types::
+
+     (swh) ~/swh-environment$ swh scheduler task-type list
+     [...]
+
+-  Get more info on a task type::
+
+     (swh) ~/swh-environment$ swh scheduler task-type list -v -t load-hg
+     Known task types:
+     load-hg: swh.loader.mercurial.tasks.LoadMercurial
+       Loading mercurial repository swh-loader-mercurial
+       interval: 1 day, 0:00:00 [1 day, 0:00:00, 1 day, 0:00:00]
+       backoff_factor: 1.0
+       max_queue_length: 1000
+       num_retries: None
+       retry_delay: None
+
+-  Add a new task::
+
+     (swh) ~/swh-environment$ swh scheduler task add load-hg \
+       origin_url=https://hg.logilab.org/master/cubicweb
+     Created 1 tasks
+     Task 1
+        Next run: just now (2019-02-06 12:36:58+00:00)
+        Interval: 1 day, 0:00:00
+        Type: load-hg
+        Policy: recurring
+        Args:
+        Keyword args:
+          origin_url: https://hg.logilab.org/master/cubicweb
+
+-  Respawn a task::
+
+     (swh) ~/swh-environment$ swh scheduler task respawn 1
+
+Data persistence for a development setting
+------------------------------------------
+
+The default ``docker-compose.yml`` configuration is not geared towards
+data persistence, but application testing.
+
+Volumes defined in associated images are anonymous and may end up either
+unused or removed on the next ``docker-compose up``.
+
+One way to make sure these volumes persist is to use named volumes. The
+volumes may be defined as follows in a ``docker-compose.override.yml``.
+Note that volume definitions are merged with other compose files based
+on destination path.
+
+::
+
+   services:
+     swh-storage-db:
+       volumes:
+         - "swh_storage_data:/var/lib/postgresql/data"
+     swh-objstorage:
+       volumes:
+         - "swh_objstorage_data:/srv/softwareheritage/objects"
+
+   volumes:
+     swh_storage_data:
+     swh_objstorage_data:
+
+This way, ``docker-compose down`` without the ``-v`` flag will not
+remove those volumes and data will persist.
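+
+The same pattern can be extended to other database containers. The
+sketch below assumes that the ``swh-scheduler-db`` service keeps its
+data in the same PostgreSQL path as ``swh-storage-db``; check the
+service definition before relying on it. The volume name
+``swh_scheduler_data`` is made up for this example:
+
+.. code:: yaml
+
+   services:
+     swh-scheduler-db:
+       volumes:
+         # Assumed data directory, copied from the swh-storage-db entry.
+         - "swh_scheduler_data:/var/lib/postgresql/data"
+
+   volumes:
+     swh_scheduler_data: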
+
+Starting a kafka-powered mirror of the storage
+----------------------------------------------
+
+This repo comes with an optional ``docker-compose.storage-mirror.yml``
+docker compose file that can be used to test the kafka-powered mirror
+mechanism for the main storage.
+
+This can be used like::
+
+   ~/swh-environment/docker$ docker-compose -f docker-compose.yml -f docker-compose.storage-mirror.yml up -d
+   [...]
+
+Compared to the original compose file, this will:
+
+-  override the swh-storage service to activate the kafka direct writer
+   on swh.journal.objects prefixed topics using the swh.storage.master
+   ID,
+-  override the swh-web service to make it use the mirror instead of
+   the master storage,
+-  start a db for the mirror,
+-  start a storage service based on this db,
+-  start a replayer service that runs the process that listens to kafka
+   to keep the mirror in sync.
+
+When using it, you will have a setup in which the master storage is used
+by workers and most other services, whereas the storage mirror will be
+used by the web application and should be kept in sync with the
+master storage by kafka.
+
+Note that the object storage is not replicated here, only the graph
+storage.
+
+Starting the backfiller
+-----------------------
+
+The backfiller reads objects within the range [start-object,
+end-object] from the storage and sends them to the kafka topics.
+
+::
+
+   (swh)$ docker-compose \
+                -f docker-compose.yml \
+                -f docker-compose.storage-mirror.yml \
+                -f docker-compose.storage-mirror.override.yml \
+                run \
+                swh-journal-backfiller \
+                snapshot \
+                --start-object 000000 \
+                --end-object 000001 \
+                --dry-run
+
+Using Sentry
+------------
+
+All entrypoints to SWH code (CLI, gunicorn, celery, …) are, or should
+be, instrumented using Sentry. By default this is disabled, but if you
+run your own Sentry instance, you can use it.
+
+To do so, you must get a DSN from your Sentry instance, and set it as
+the value of ``SWH_SENTRY_DSN`` in the file ``env/common_python.env``.
+You may also set it per-service in the ``environment`` section of each
+service in ``docker-compose.override.yml``.
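+
+A per-service override might look like the following sketch. The
+``swh-storage`` service is picked arbitrarily and the DSN is a
+placeholder, not a working value; replace it with the DSN given by your
+own Sentry instance:
+
+.. code:: yaml
+
+   version: '2'
+
+   services:
+     swh-storage:
+       environment:
+         # Placeholder DSN (assumption): use the one from your Sentry project.
+         SWH_SENTRY_DSN: "https://examplekey@sentry.example.org/1"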
+
+Caveats
+-------
+
+Running a lister task can lead to a lot of loading tasks, which can fill
+your hard drive pretty fast. Make sure to monitor your available storage
+space regularly when playing with this stack.
+
+Also, a few containers (``swh-storage``, ``swh-xxx-db``) use a volume
+for storing the blobs or the database files. With the default
+configuration provided in the ``docker-compose.yml`` file, these volumes
+are not persistent. So removing the containers will delete the volumes!
+
+Also note that for the ``swh-objstorage``, since the volume can be
+pretty big, the remove operation can be quite long (several minutes is
+not uncommon), which may mess a bit with the ``docker-compose`` command.
+
+If you have an error message like::
+
+   Error response from daemon: removal of container 928de3110381 is already in progress
+
+it means that you need to wait for this process to finish before being
+able to (re)start your docker stack again.