diff --git a/docs/architecture/mirror.rst b/docs/architecture/mirror.rst
index 7885df3..03643d0 100644
--- a/docs/architecture/mirror.rst
+++ b/docs/architecture/mirror.rst
@@ -1,133 +1,133 @@
.. _mirror:

Mirroring
=========

Description
-----------

A mirror is a full copy of the |swh| archive, operated independently of the
Software Heritage initiative. A minimal mirror consists of two parts:

- the graph storage (typically an instance of :ref:`swh.storage `), which
  contains the Merkle DAG structure of the archive, *except* the actual
  content of source code files (AKA blobs),

- the object storage (typically an instance of :ref:`swh.objstorage `), which
  contains all the blobs corresponding to archived source code files.

However, a usable mirror also needs to be accessible by others. As such, a
proper mirror should also allow users to:

- navigate the archive copy using a Web browser and/or the Web API (typically
  using :ref:`the web application `),

- retrieve data from the copy of the archive (typically using :ref:`the vault
  service `).

A mirror is initially populated, and then kept up to date, by consuming data
from the |swh| Kafka-based :ref:`journal ` and retrieving the blob objects
(file contents) from the |swh| :ref:`object storage `.

.. note:: A mirror does not have to be deployed using the |swh| software
   stack. Other technologies, including different storage methods, can be
   used. This documentation, however, focuses on mirror deployments based on
   the |swh| software stack.

-.. thumbnail:: images/mirror-architecture.svg
+.. thumbnail:: ../images/mirror-architecture.svg

   General view of the |swh| mirroring architecture.

Mirroring the Graph Storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The replication of the graph is based on a journal using Kafka_ as its event
streaming platform.

On the Software Heritage side, every addition made to the archive consists of
the addition of a :ref:`data-model` object. The new object is also serialized
as a msgpack_ bytestring, which is used as the value of a message added to a
Kafka topic dedicated to the object type.

The main Kafka topics for the |swh| :ref:`data-model` are:

- `swh.journal.objects.content`
- `swh.journal.objects.directory`
- `swh.journal.objects.metadata_authority`
- `swh.journal.objects.metadata_fetcher`
- `swh.journal.objects.origin_visit_status`
- `swh.journal.objects.origin_visit`
- `swh.journal.objects.origin`
- `swh.journal.objects.raw_extrinsic_metadata`
- `swh.journal.objects.release`
- `swh.journal.objects.revision`
- `swh.journal.objects.skipped_content`
- `swh.journal.objects.snapshot`

In order to set up a mirror of the graph, one needs to deploy a stack capable
of retrieving all these topics and storing their content reliably. For
example, a Kafka cluster configured as a replica of the main Kafka broker
hosted by |swh| would do the job (albeit not in a very useful manner by
itself).

A more useful mirror can be set up using the :ref:`storage ` component with
the help of the special service named `replayer` provided by the
-:doc:`apidoc/swh.storage.replay` module.
+:mod:`swh.storage.replay` module.

.. TODO: replace this previous link by a link to the 'swh storage replay'
   command once available, and ideally once
   https://github.com/sphinx-doc/sphinx/issues/880 is fixed

Mirroring the Object Storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

File contents (blobs) are *not* directly stored in messages of the
`swh.journal.objects.content` Kafka topic, which only contains metadata about
them, such as various kinds of cryptographic hashes.
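For illustration, a decoded message from this topic looks roughly like the
following (the field set mirrors :py:class:`swh.model.model.Content`; the
hash values below are made-up placeholders, truncated for readability):

.. code:: python

    {
     'sha1': b'\x43\x6c...',        # made-up, truncated hash values
     'sha1_git': b'\x55\x66...',
     'sha256': b'\x9f\x20...',
     'blake2s256': b'\x01\xbb...',
     'length': 1432,                # size of the blob in bytes
     'status': 'visible'
    }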
A separate component is in charge of replicating blob objects from the archive
and storing them in the local object storage instance. A dedicated
`swh-journal` client should subscribe to the `swh.journal.objects.content`
topic to get the stream of blob object identifiers, retrieve the corresponding
blobs from the main Software Heritage object storage, and store them in the
local object storage. A reference implementation for this component is
available in the :ref:`content replayer `.

Installation
------------

When using the |swh| software stack to deploy a mirror, a number of |swh|
software components must be installed (cf. architecture diagram above):

- a database to store the graph of the |swh| archive,
- the :ref:`swh-storage` component,
- an object storage solution (can be cloud-based or on a local filesystem like
  ZFS pools),
- the :ref:`swh-objstorage` component,
-- the :ref:`swh.storage.replay` service (part of the :ref:`swh-storage`
+- the :mod:`swh.storage.replay` service (part of the :ref:`swh-storage`
  package)
-- the :ref:`swh.objstorage.replayer.replay` service (from the
+- the :mod:`swh.objstorage.replayer.replay` service (from the
  :ref:`swh-objstorage-replayer` package).
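To make the role of the two replayer services more concrete, here is a
deliberately simplified sketch of the blob-replication loop performed by the
object storage replayer. This is *not* the reference implementation (use the
packaged services listed above); `journal_client`, `fetch_blob` and
`store_blob` are hypothetical stand-ins for the journal and object storage
clients:

.. code:: python

    def replay_content(journal_client, fetch_blob, store_blob):
        """Copy blobs referenced by the content topic into the local
        object storage.

        - journal_client: iterable of decoded messages from the
          `swh.journal.objects.content` topic (hypothetical helper)
        - fetch_blob: reads a blob from the main object storage, by hash
        - store_blob: writes a blob into the local object storage
        """
        for content in journal_client:
            # messages carry only metadata; the blob itself must be
            # fetched from the main object storage using its hash
            blob = fetch_blob(content["sha1"])
            store_blob(content["sha1"], blob)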
A `docker-swarm `_ based deployment solution is provided as a working example
of the mirror stack:

  https://forge.softwareheritage.org/source/swh-docker

It is strongly recommended to start from there before planning a
production-like deployment. See the `README `_ file of the `swh-docker `_
repository for details.

.. _Kafka: https://kafka.apache.org/
.. _msgpack: https://msgpack.org

diff --git a/docs/architecture/overview.rst b/docs/architecture/overview.rst
index df8f8b7..d9000ee 100644
--- a/docs/architecture/overview.rst
+++ b/docs/architecture/overview.rst
@@ -1,275 +1,275 @@
.. _architecture-overview:

Software Architecture Overview
==============================

From an end-user point of view, the |swh| platform consists of the
:term:`archive`, which can be accessed using the web interface or its REST
API. Behind the scenes (and the web app) are several components/services that
expose different aspects of the |swh| :term:`archive` as internal RPC APIs.
Each of these internal APIs has a dedicated database, usually PostgreSQL_.

A global (and incomplete) view of this architecture looks like:

.. thumbnail:: ../images/general-architecture.svg

   General view of the |swh| architecture.

.. _architecture-tier-1:

Core components
---------------

The following components are the foundation of the entire |swh| architecture,
as they fetch data, store it, and make it available to every other service.

Data storage
^^^^^^^^^^^^

The :ref:`Storage ` provides an API to store and retrieve elements of the
:ref:`graph `, such as directory structure, revision history, and their
respective metadata. It relies on the :ref:`Object Storage ` service to store
the content of source code files themselves.

Both the Storage and Object Storage are designed as abstractions over
possible backends. The former supports both PostgreSQL (the current solution
in production) and Cassandra (a more scalable option we are exploring). The
latter supports a large variety of "cloud" object storage services as
backends, as well as a simple local filesystem.

Task management
^^^^^^^^^^^^^^^

The :ref:`Scheduler ` manages the entire choreography of jobs/tasks in |swh|,
from detecting and ingesting repositories, to extracting metadata from them,
to repackaging repositories into small downloadable archives. It does this by
managing its own database of tasks that need to run (either periodically or
only once), and passing them to celery_ for execution on dedicated workers.

Listers
^^^^^^^

:term:`Listers ` are a type of task, run by the Scheduler, aiming at scraping
a web site, a forge, etc. to gather all the source code repositories it can
find, also known as :term:`origins `. For each source code repository found,
a :term:`loader` task is created.

The following sequence diagram shows the interactions between these
components when a new forge needs to be archived. This example depicts the
case of a gitlab_ forge, but any other supported source type would be very
similar.

-.. thumbnail:: images/tasks-lister.svg
+.. thumbnail:: ../images/tasks-lister.svg

As one might observe in this diagram, the lister does two things:

- it asks the forge (a gitlab_ instance in this case) for the list of known
  repositories, and
- it inserts one :term:`loader` task for each source code repository; that
  task will be in charge of importing the content of the repository.

Note that most listers usually work in incremental mode, meaning they store
in a dedicated database the current state of the listing of the forge. On
subsequent executions, the lister then asks only for new repositories.

Also note that if the lister inserts a new loading task for a repository for
which a loading task already exists, the existing task will be updated (if
needed) instead of creating a new task.

Loaders
^^^^^^^

:term:`Loaders ` are also a type of task, but they aim at importing or
updating a source code repository. They are the ones that insert :term:`blob`
objects in the :term:`object storage`, and nodes and edges in the
:ref:`graph `.

The sequence diagram below describes this second step of importing the
content of a repository. Once again, we take the example of a git repository,
but any other type of repository would be very similar.

-.. thumbnail:: images/tasks-git-loader.svg
+.. thumbnail:: ../images/tasks-git-loader.svg

Journal
^^^^^^^

The last core component is the :term:`Journal `, which is a persistent logger
of every change in the archive, with publish-subscribe_ support, using Kafka.
The Storage writes to it every time a new object is added to the archive, and
many components read from it to be notified of these changes. For example, it
allows the Scheduler to know how often software repositories are updated by
their developers, to decide when next to visit these repositories. It is also
the foundation of the :ref:`mirror` infrastructure, as it allows mirrors to
stay up to date.

.. _architecture-tier-2:

Other major components
----------------------

All the components we saw above are critical to the |swh| archive, as they
are in charge of archiving source code. But they are not enough to provide
another important feature of |swh|: making this archive accessible and
searchable by anyone.

Archive website and API
^^^^^^^^^^^^^^^^^^^^^^^

First of all, the archive website and API, also known as :ref:`swh-web `, is
the main entry point of the archive.

This is the component that serves https://archive.softwareheritage.org/,
which is the window into the entire archive, as it provides access to it
through a web browser or the HTTP API.

It does so by querying most of the internal APIs of |swh|: the Data Storage
(to display source code repositories and their content), the Scheduler (to
allow manual scheduling of loader tasks through the `Save Code Now `_
feature), and many of the other services we will see below.
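For instance, anyone can query the archive programmatically through this HTTP
API. The snippet below is a small illustration only; see
https://archive.softwareheritage.org/api/ for the authoritative list of
endpoints and parameters:

.. code:: python

    import requests

    # search origins whose URL matches a pattern, through the public Web API
    resp = requests.get(
        "https://archive.softwareheritage.org/api/1/origin/search/gnu/",
        params={"limit": 5},
    )
    resp.raise_for_status()
    for origin in resp.json():
        print(origin["url"])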
Internal data mining
^^^^^^^^^^^^^^^^^^^^

:term:`Indexers ` are a type of task aimed at crawling the content of the
:term:`archive` to extract derived information. This ranges from detecting
the MIME type or license of individual files, to reading all types of
metadata files at the root of repositories and storing them together in a
unified format, CodeMeta_.

All results computed by Indexers are stored in a PostgreSQL database, the
Indexer Storage.

Vault
^^^^^

The :term:`Vault ` is an internal API, in charge of cooking compressed
archives (zip or tgz) of archived objects on request (via swh-web). These
compressed objects are typically directories or repositories. Since this can
be a rather long process, it is delegated to an asynchronous (celery) task,
through the Scheduler.

.. _architecture-tier-3:

Extra services
--------------

Finally, |swh| provides additional tools that, although not necessary to
operate the archive, provide convenient interfaces or performance benefits.
It is therefore possible to have a fully-functioning archive without any of
these services (our :ref:`development Docker environment ` disables most of
these by default).

Search
^^^^^^

The :ref:`swh-search ` service complements both the Storage and the Indexer
Storage, to provide efficient advanced reverse-index search queries, such as
full-text search on origin URLs and metadata. This service is a recent
addition to the |swh| architecture, based on Elasticsearch, and is currently
in use only for URL search.

Graph
^^^^^

:ref:`swh-graph ` is also a recent addition to the architecture, designed to
complement the Storage using a specialized backend. It leverages WebGraph_ to
store a compressed in-memory representation of the entire graph, and provides
fast implementations of graph traversal algorithms.

Counters
^^^^^^^^

The `archive's landing page `_ features counts of the total number of
files/directories/revisions/... in the archive. Perhaps surprisingly,
counting unique objects at |swh|'s scale is hard, and a performance
bottleneck when implemented purely in the Storage's SQL database.
:ref:`swh-counters ` provides an alternative design to solve this issue, by
reading new objects from the Journal and counting them using Redis_'
HyperLogLog_ feature; it keeps the history of these counters over time using
Prometheus_ (a toy sketch of the HyperLogLog idea is shown at the end of this
section).

Deposit
^^^^^^^

The :ref:`Deposit ` is an alternative way to add content to the archive.
While listers and loaders, as we saw above, **discover** repositories and
**pull** artifacts into the archive, the Deposit allows trusted partners to
**push** the content of their repository directly to the archive, and is
internally loaded by the :mod:`Deposit Loader `.

The Deposit is centered on the SWORDv2_ protocol, which allows depositing
archives (usually TAR or ZIP) along with metadata in XML.

The Deposit has its own HTTP interface, independent of swh-web. It also has
its own SWORD client, which is specialized to interact with the Deposit
server.

Authentication
^^^^^^^^^^^^^^

While the archive itself is public, |swh| reserves some features to
authenticated clients, such as higher rate limits, access to experimental
APIs (currently: the Graph service), or the Deposit. This is managed
centrally by :ref:`swh-auth ` using Keycloak.
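As promised above, here is a toy illustration of the HyperLogLog counting
idea used by swh-counters. It assumes a Redis server running locally and only
demonstrates the principle; it is not swh-counters' actual code:

.. code:: python

    import redis

    r = redis.Redis()  # assumes a Redis server on localhost:6379

    # Feed (possibly duplicated) object ids into a HyperLogLog. PFADD is
    # idempotent, so replaying the same journal message twice does not
    # inflate the counter.
    r.pfadd("counters:content", b"id-1", b"id-2", b"id-1")

    # PFCOUNT returns the approximate number of distinct ids, using a
    # small, bounded amount of memory regardless of cardinality.
    print(r.pfcount("counters:content"))  # -> 2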
Web Client, Fuse, Scanner
^^^^^^^^^^^^^^^^^^^^^^^^^

SWH provides a few tools to access the archive via the API:

* :ref:`swh-web-client`, a command-line interface to authenticate with SWH,
  and a library to access the API from Python programs
* :ref:`swh-fuse`, a Filesystem in USErspace implementation, that exposes the
  entire archive as a regular directory on your computer
* :ref:`swh-scanner`, a work-in-progress tool to check which of the files in
  a project are already in the archive, without submitting them

Replayers and backfillers
^^^^^^^^^^^^^^^^^^^^^^^^^

As the Journal and the various databases may get out of sync for various
reasons (scrub of either of them, migration, database addition, ...), and
because some databases need to follow the content of the Journal (mirrors),
some places in the |swh| codebase contain tools known as "replayers" and
"backfillers", designed to keep them in sync:

-* the :ref:`Object Storage Replayer ` copies the content
+* the :mod:`Object Storage Replayer ` copies the content
  of one object storage to another. It first performs a full copy, then
  streams new objects using the Journal to stay up to date
* the Storage Replayer loads the entire content of the Journal into a
  Storage database, and also keeps them in sync. This is used for mirrors,
  and when creating a new database.
* the Storage Backfiller, which does the opposite. This was initially used
  to populate the Journal from the database; it is occasionally used when one
  needs to clear a topic in the Journal and recreate it.

.. _celery: https://www.celeryproject.org
.. _CodeMeta: https://codemeta.github.io/
.. _gitlab: https://gitlab.com
.. _PostgreSQL: https://www.postgresql.org/
.. _Prometheus: https://prometheus.io/
.. _publish-subscribe: https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern
.. _Redis: https://redis.io/
.. _SWORDv2: http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html
.. _HyperLogLog: https://redislabs.com/redis-best-practices/counting/hyperloglog/
.. _WebGraph: https://webgraph.di.unimi.it/

diff --git a/docs/archive-changelog.rst b/docs/archive-changelog.rst
index 115d583..509dc94 100644
--- a/docs/archive-changelog.rst
+++ b/docs/archive-changelog.rst
@@ -1,132 +1,132 @@
.. _archive-changelog:

Software Heritage --- Archive ChangeLog
=======================================

Below you can find a time-indexed list of notable events and changes to
archival policies in the Software Heritage Archive. Each of them might have
(had) an impact on how content is archived, and may explain apparent
statistical anomalies or other changes in archival behavior over time. They
are collected in this document for historical reasons.

2020
----

* **2020-10-06 - 2020-11-23:** source code crawlers were paused to avoid
  running out of disk space, due to an unexpected delay in the arrival of new
  storage hardware. Push archival (both deposit_ and `save code now`_)
  remained in operation.
  (tracking: `T2656 `_)
* **2020-09-15:** completed first archival of, and added to regular
  crawling, `GNU Guix System`_ (tracking: `T2594 `_)
* **2020-06-11:** completed integration with the IPOL_ journal, allowing
  paper authors to explicitly deposit_ source code to the archive
  (`announcement
-  `_)
+  `__)
* **2020-05-25:** completed first archival of, and added to regular
  crawling, NixOS_ (tracking: `T2411 `_)

2019
----

* **2019-09-10:** completed first archival of Bitbucket_ Git repositories
  and added Bitbucket as a regularly crawled forge (tracking: `T592 `_)
* **2019-06-30:** completed first archival of, and added to regular
  crawling, several GitLab_ instances: `0xacab.org `_, `framagit.org `_,
  `gite.lirmm.fr `_, `gitlab.common-lisp.net `_, `gitlab.freedesktop.org `_,
  `gitlab.gnome.org `_, `gitlab.inria.fr `_, `salsa.debian.org `_
* **2019-06-12:** completed first archival of CRAN_ packages and added CRAN
  as a regularly crawled package repository (tracking: `T1709 `_)
* **2019-06-11:** completed a full archival of GNU_ source code releases
  from `ftp.gnu.org`_, and added it to regular crawling (tracking:
  `T1722 `_)
* **2019-05-27:** completed a full archival of NPM_ packages and added it as
  a regularly crawled package repository (tracking: `T1378 `_)
* **2019-01-10:** enabled the `save code now`_ service, allowing users to
  explicitly request archival of a specific source code repository
  (`announcement
-  `_)
+  `__)

2018
----

* **2018-10-10:** completed first archival of PyPI_ packages and added PyPI
  as a regularly crawled package repository (`announcement
-  `_)
+  `__)
* **2018-09-25:** completed integration with HAL_, allowing paper authors to
  explicitly deposit_ source code to the archive (`announcement
-  `_)
+  `__)
* **2018-08-31:** completed first archival of public GitLab_ repositories
  from `gitlab.com `_ and added it as a regularly crawled forge (tracking:
  `T1111 `_)
* **2018-03-21:** completed archival of `Google Code`_ Mercurial
  repositories (tracking: `T682 `_)
* **2018-02-20:** completed archival of Debian_ packages and added Debian as
  a regularly crawled distribution (`announcement
-  `_)
+  `__)

2017
----

* **2017-10-02:** completed archival of `Google Code`_ Subversion
  repositories (tracking: `T617 `_)
* **2017-06-06:** completed archival of `Google Code`_ Git repositories
  (tracking: `T673 `_)

2016
----

* **2016-04-04:** completed archival of Gitorious_ (tracking: `T312 `_)

2015
----

* **2015-11-06:** archived all GNU_ source code releases from `ftp.gnu.org`_
  (tracking: `T90 `_)
* **2015-07-28:** started archiving public GitHub_ repositories

.. _Bitbucket: https://bitbucket.org
.. _CRAN: https://cran.r-project.org
.. _Debian: https://www.debian.org
.. _GNU Guix System: https://guix.gnu.org/
.. _GNU: https://www.gnu.org/
.. _GitHub: https://github.com
.. _GitLab: https://gitlab.com
.. _Gitorious: https://en.wikipedia.org/wiki/Gitorious
.. _Google Code: https://en.wikipedia.org/wiki/Google_Code
.. _HAL: https://hal.archives-ouvertes.fr
.. _IPOL: http://www.ipol.im
.. _NPM: https://www.npmjs.com
.. _NixOS: https://nixos.org/
.. _PyPI: https://pypi.org
.. _deposit: https://deposit.softwareheritage.org
.. _ftp.gnu.org: http://ftp.gnu.org
.. _save code now: https://save.softwareheritage.org

diff --git a/docs/contributing/code-review.rst b/docs/contributing/code-review.rst
index 94f1ec6..9d85ba7 100644
--- a/docs/contributing/code-review.rst
+++ b/docs/contributing/code-review.rst
@@ -1,52 +1,52 @@
.. _code-review:

Code Review
===========

This page documents code review practices used for Software Heritage
development.

Guidelines
----------

Please adhere to the following guidelines to perform and obtain code reviews
(CRs) in the context of Software Heritage development:

1. **CRs are strongly recommended** for any non-trivial code change, but not
   mandatory (nor enforced at the VCS level).
2. The CR :ref:`workflow ` is implemented using Phabricator/Differential.
3. Explicitly **suggest reviewer(s)** when submitting new CR requests: either
   the most knowledgeable person(s) for the target code or the general
   `reviewers `_ (which is the `default `_).
4. **Review anything you want**: no matter the suggested reviewer(s), feel
   free to review any outstanding CR.
5. **One LGTM is enough**: feel free to approve any outstanding CR.
6. **Review every day**: CRs should be timely, as fellow developers will be
   waiting for them. To make CRs sustainable, each developer should strive to
   dedicate a fixed minimum amount of CR time every (work) day.

For more detailed suggestions (and much more) on the motivational and
practical aspects of code reviews, see Good reads below.

Good reads
----------

Good reads on various angles of code review:

-* `Best practices `_ (Palantir) ← comprehensive and recommended read, especially if you're short on time
-* `Best practices `_ (Thoughtbot)
-* `Best practices `_ (Smart Bear)
+* `Best practices (Palantir) `_ ← comprehensive and recommended read, especially if you're short on time
+* `Best practices (Thoughtbot) `_
+* `Best practices (Smart Bear) `_
* `Review checklist `_ (Code Project)
* `Motivation: code quality `_ (Coding Horror)
* `Motivation: team culture `_ (Google & FullStory)
* `Motivation: humanizing peer reviews `_ (Wiegers)
* `Motivation: sharing knowledge `_ (Atlassian)

See also
--------

* :ref:`patch-submission`
* :ref:`python-style-guide`
* :ref:`git-style-guide`

diff --git a/docs/journal.rst b/docs/journal.rst
index 25746c4..f8db95a 100644
--- a/docs/journal.rst
+++ b/docs/journal.rst
@@ -1,673 +1,673 @@
.. _journal-specs:

Software Heritage Journal --- Specifications
============================================

The |swh| journal is a Kafka_-based stream of events for every added object
in the |swh| Archive and some of its related services, especially indexers.

Each topic_ will stream added elements for a given object type according to
the topic name.

Objects streamed in a topic are serialized versions of objects stored in the
|swh| Archive, as specified by the main |swh| :py:mod:`data model ` or the
:py:mod:`indexer object model `.

In this document we describe the expected messages in each topic, so a
potential consumer can easily cope with the |swh| journal without having to
read the source code or the |swh| :ref:`data model ` in detail (it is however
recommended to familiarize yourself with the latter).

Kafka message values are dictionary structures serialized as msgpack_, with a
few custom encodings. See the section `Kafka message format`_ below for a
complete description of the serialization format. Note that each example
given below shows the dictionary before it is serialized as a msgpack_ chunk.

Topics
------

There are several groups of topics:

- main storage Merkle-DAG related topics,
- other storage objects (not part of the Merkle DAG),
- indexer related objects (not yet documented below).

Topic prefixes can be either `swh.journal.objects` or
`swh.journal.objects_privileged` (see below).
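As a concrete starting point, the sketch below consumes one of these topics
with the plain confluent-kafka client and decodes the message values with
msgpack. The broker address and group id are placeholders, and a real
consumer would rather use the `swh.journal` client, which also handles the
custom encodings described in `Kafka message format`_:

.. code:: python

    import msgpack
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "broker.example.org:9092",  # placeholder
        "group.id": "my-consumer-group",                 # placeholder
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["swh.journal.objects.origin"])

    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None or msg.error():
            continue
        # message values are msgpack-encoded dictionaries
        origin = msgpack.unpackb(msg.value(), raw=False)
        print(origin["url"])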
Anonymized topics
+++++++++++++++++

For topics that transport messages with user information (name and email
address), namely `swh.journal.objects.release`_ and
`swh.journal.objects.revision`_, there are two versions: an anonymized topic,
in which user information is obfuscated, and a pristine version with clear
data. Access to pristine topics depends on ACLs linked to the credentials
used to connect to the Kafka cluster.

List of topics
++++++++++++++

- `swh.journal.objects.origin`_
- `swh.journal.objects.origin_visit`_
- `swh.journal.objects.origin_visit_status`_
- `swh.journal.objects.snapshot`_
- `swh.journal.objects.release`_
- `swh.journal.objects.privileged_release `_
- `swh.journal.objects.revision`_
- `swh.journal.objects.privileged_revision `_
- `swh.journal.objects.directory`_
- `swh.journal.objects.content`_
-- `swh.journal.objects.skippedcontent`_
+- `swh.journal.objects.skipped_content`_
- `swh.journal.objects.metadata_authority`_
- `swh.journal.objects.metadata_fetcher`_
- `swh.journal.objects.raw_extrinsic_metadata`_

Topics for Merkle-DAG objects
-----------------------------

These topics are for the various objects stored in the |swh| Merkle DAG; see
the :ref:`data model ` for more details.

`swh.journal.objects.snapshot`
++++++++++++++++++++++++++++++

Topic for :py:class:`swh.model.model.Snapshot` objects.

Message format:

- `branches` [dict] branches present in this snapshot,
- `id` [bytes] the intrinsic identifier of the
  :py:class:`swh.model.model.Snapshot` object

with `branches` being a dictionary whose keys are branch names [bytes], and
whose values are dictionaries of:

- `target` [bytes] intrinsic identifier of the targeted object
- `target_type` [string] the type of the targeted object (can be "content",
  "directory", "revision", "release", "snapshot" or "alias").

Example:

.. code:: python

    {
     'branches': {
       b'refs/pull/1/head': {
         'target': b'\x07\x10\\\xfc\xae\x1f\xb1\xf9\xb5\xad\x8bI\xf1G\x10\x9a\xba>8\x0c',
         'target_type': 'revision'
       },
       b'refs/pull/2/head': {
         'target': b'\x1a\x868-\x9b\x1d\x00\xfbd\xeaH\xc88\x9c\x94\xa1\xe0U\x9bJ',
         'target_type': 'revision'
       },
       b'refs/heads/master': {
         'target': b'\x7f\xc4\xfe4f\x7f\xda\r\x0e[\xba\xbc\xd7\x12d#\xf7&\xbfT',
         'target_type': 'revision'
       },
       b'HEAD': {
         'target': b'refs/heads/master',
         'target_type': 'alias'
       }
     },
     'id': b'\x10\x00\x06\x08\xe9E^\x0c\x9bS\xa5\x05\xa8\xdf\xffw\x88\xb8\x93^'
    }

`swh.journal.objects.release`
+++++++++++++++++++++++++++++

Topic for :py:class:`swh.model.model.Release` objects.

This topic is anonymized. The non-anonymized version of this topic is
`swh.journal.objects_privileged.release`.

Message format:

- `name` [bytes] name (typically the version) of the release
- `message` [bytes] message of the release
- `target` [bytes] identifier of the target object
- `target_type` [string] type of the target, can be "content", "directory",
  "revision", "release" or "snapshot"
- `synthetic` [bool] True if the :py:class:`swh.model.model.Release` object
  has been forged by the loading process; this flag is not used for the id
  computation,
- `author` [dict] the author of the release
- `date` [gitdate] the date of the release
- `id` [bytes] the intrinsic identifier of the
  :py:class:`swh.model.model.Release` object

Example:
.. code:: python

    {
     'name': b'0.3',
     'message': b'',
     'target': b'<\xd6\x15\xd9\xef@\xe0[\xe7\x11=\xa1W\x11h%\xcc\x13\x96\x8d',
     'target_type': 'revision',
     'synthetic': False,
     'author': {
       'fullname': b'\xf5\x8a\x95k\xffKgN\x82\xd0f\xbf\x12\xe8w\xc8a\xf79\x9e\xf4V\x16\x8d\xa4B\x84\x15\xea\x83\x92\xb9',
       'name': None,
       'email': None
     },
     'date': {
       'timestamp': {
         'seconds': 1480432642,
         'microseconds': 0
       },
       'offset': 180,
       'negative_utc': False
     },
     'id': b'\xd0\x00\x06u\x05uaK`.\x0c\x03R%\xca,\xe1x\xd7\x86'
    }

`swh.journal.objects.revision`
++++++++++++++++++++++++++++++

Topic for :py:class:`swh.model.model.Revision` objects.

This topic is anonymized. The non-anonymized version of this topic is
`swh.journal.objects_privileged.revision`.

Message format:

- `message` [bytes] the commit message for the revision
- `author` [dict] the author of the revision
- `committer` [dict] the committer of the revision
- `date` [gitdate] the revision date
- `committer_date` [gitdate] the revision commit date
- `type` [string] the type of the revision (can be "git", "tar", "dsc",
  "svn", "hg")
- `directory` [bytes] the intrinsic identifier of the directory this revision
  links to
- `synthetic` [bool] whether this :py:class:`swh.model.model.Revision` is
  synthetic or not,
- `metadata` [bytes] the metadata linked to this
  :py:class:`swh.model.model.Revision` (not part of the intrinsic identifier
  computation),
- `parents` [list[bytes]] list of parent
  :py:class:`swh.model.model.Revision` intrinsic identifiers
- `id` [bytes] intrinsic identifier of the
  :py:class:`swh.model.model.Revision`
- `extra_headers` [list[(bytes, bytes)]] TODO

Example:

.. code:: python

    {
     'message': b'I now arrange to be able to create a prettyprinted version of the Pascal\ncode to make review of translation of it easier, and I have thought a bit\nmore about coping with Pastacl variant records and the like, but have yet to\nimplement everything. lufylib.red is a place for support code.\n',
     'author': {
       'fullname': b'\xf3\xa7\xde7[\x8b#=\xe48\\/\xa1 \xed\x05NA\xa6\xf8\x9c\n\xad5\xe7\xe0"\xc4\xd5[\xc9z',
       'name': None,
       'email': None
     },
     'committer': {
       'fullname': b'\xf3\xa7\xde7[\x8b#=\xe48\\/\xa1 \xed\x05NA\xa6\xf8\x9c\n\xad5\xe7\xe0"\xc4\xd5[\xc9z',
       'name': None,
       'email': None
     },
     'date': {
       'timestamp': {'seconds': 1495977610, 'microseconds': 334267},
       'offset': 0,
       'negative_utc': False
     },
     'committer_date': {
       'timestamp': {'seconds': 1495977610, 'microseconds': 334267},
       'offset': 0,
       'negative_utc': False
     },
     'type': 'svn',
     'directory': b'\x815\xf0\xd9\xef\x94\x0b\xbf\x86<\xa4j^\xb65\xe9\xf4\xd1\xc3\xfe',
     'synthetic': True,
     'metadata': None,
     'parents': [
       b'D\xb1\xc8\x0f&\xdc\xd4 \x92J\xaf\xab\x19V\xad\xe7~\x18\n\x0c',
     ],
     'id': b'\x1e\x1c\x19\xb56x\xbc\xe5\xba\xa4\xed\x03\xae\x83\xdb@\xd0@0\xed\xc8'
    }

`swh.journal.objects.directory`
+++++++++++++++++++++++++++++++

Topic for :py:class:`swh.model.model.Directory` objects.

Message format:

- `entries` [list[dict]] the entries of the directory, each one being a
  dictionary of:

  - `name` [bytes] name of the entry
  - `type` [string] type of the entry (can be "file", "dir" or "rev")
  - `target` [bytes] intrinsic identifier of the object the entry points to
  - `perms` [int] permissions of the entry

- `id` [bytes] the intrinsic identifier of the
  :py:class:`swh.model.model.Directory` object

Example (the first entry's `name` and `target` are elided):

.. code:: python

    {
     'entries': [
       {'name': ...,
        'type': 'file',
        'target': ...,
        'perms': 33188},
       {'name': b'lib',
        'type': 'dir',
        'target': b'-\xb2(\x95\xe46X\x9f\xed\x1d\xa6\x95\xec`\x10\x1a\x89\xc3\x01U',
        'perms': 16384},
       {'name': b'package.json',
        'type': 'file',
        'target': b'Z\x91N\x9bw\xec\xb0\xfbN\xe9\x18\xa2E-%\x8fxW\xa1x',
        'perms': 33188}
     ],
     'id': b'eS\x86\xcf\x16n\xeb\xa96I\x90\x10\xd0\xe9&s\x9a\x82\xd4P'
    }

Other Objects Topics
--------------------

These topics are for objects of the |swh| archive that are not part of the
Merkle DAG but are essential parts of the archive; see the :ref:`data model `
for more details.

`swh.journal.objects.origin`
++++++++++++++++++++++++++++

Topic for :py:class:`swh.model.model.Origin` objects.

Message format:

- `url` [string] URL of the :py:class:`swh.model.model.Origin`

Example:
code:: python { "url": "https://github.com/vujkovicm/pml" } `swh.journal.objects.origin_visit` ++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.OriginVisit` objects. Message format: - `origin` [string] URL of the visited :py:class:`swh.model.model.Origin` - `date` [timestamp] date of the visit - `type` [string] type of the loader used to perform the visit - `visit` [int] number of the visit for this `origin` Example: .. code:: python { 'origin': 'https://pypi.org/project/wasp-eureka/', 'date': Timestamp(seconds=1606260407, nanoseconds=818259954), 'type': 'pypi', 'visit': 505} } `swh.journal.objects.origin_visit_status` +++++++++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.OriginVisitStatus` objects. Message format: - `origin` [string] URL of the visited :py:class:`swh.model.model.Origin` - `visit` [int] number of the visit for this `origin` this status concerns - `date` [timestamp] date of the visit status update - `status` [string] status (can be "created", "ongoing", "full" or "partial"), - `snapshot` [bytes] identifier of the :py:class:`swh.model.model.Snaphot` this visit resulted in (if `status` is "full" or "partial") - `metadata`: deprecated Example: .. code:: python { 'origin': 'https://pypi.org/project/stricttype/', 'visit': 524, 'date': Timestamp(seconds=1606260407, nanoseconds=818259954), 'status': 'full', 'snapshot': b"\x85\x8f\xcb\xec\xbd\xd3P;Z\xb0~\xe7\xa2(\x0b\x11'\x05i\xf7", 'metadata': None } Extrinsic Metadata related Topics --------------------------------- Extrinsic metadata is information about software that is not part of the source code itself but still closely related to the software. See :ref:`extrinsic-metadata-specification` for more details on the Extrinsic Metadata model. `swh.journal.objects.metadata_authority` ++++++++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.MetadataAuthority` objects. Message format: - `type` [string] - `url` [string] - `metadata` [dict] Examples: .. code:: python { 'type': 'forge', 'url': 'https://guix.gnu.org/sources.json', 'metadata': {} } { 'type': 'deposit_client', 'url': 'https://www.softwareheritage.org', 'metadata': {'name': 'swh'} } `swh.journal.objects.metadata_fetcher` ++++++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.MetadataFetcher` objects. Message format: - `type` [string] - `version` [string] - `metadata` [dict] Example: .. code:: python { 'name': 'swh.loader.package.cran.loader.CRANLoader', 'version': '0.15.0', 'metadata': {} } `swh.journal.objects.raw_extrinsic_metadata` ++++++++++++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.RawExtrinsicMetadata` objects. Message format: - `type` [string] - `target` [string] - `discovery_date` [timestamp] - `authority` [dict] - `fetcher` [dict] - `format` [string] - `metadata` [bytes] - `origin` [string] - `visit` [int] - `snapshot` [SWHID] - `release` [SWHID] - `revision` [SWHID] - `path` [bytes] - `directory` [SWHID] Example: .. 
.. code:: python

    {
     'type': 'snapshot',
     'id': 'swh:1:snp:f3b180979283d4931d3199e6171840a3241829a3',
     'discovery_date': Timestamp(seconds=1606260407, nanoseconds=818259954),
     'authority': {
       'type': 'forge',
       'url': 'https://pypi.org/',
       'metadata': {}
     },
     'fetcher': {
       'name': 'swh.loader.package.pypi.loader.PyPILoader',
       'version': '0.10.0',
       'metadata': {}
     },
     'format': 'pypi-project-json',
     'metadata': b'{"info":{"author":"Signaltonsalat","author_email":"signaltonsalat@gmail.com"}]}',
     'origin': 'https://pypi.org/project/schwurbler/'
    }

Kafka message format
--------------------

Each value of a Kafka message in a topic is a dictionary-like structure
encoded as a msgpack_ byte string. Keys are ASCII strings.

All values are encoded using the default msgpack type system, except for long
integers, for which we use a custom format based on msgpack's `extended
type`_ to prevent overflows while packing some objects.

Integer
+++++++

For long integers (that do not fit in the `[-(2**63), 2 ** 64 - 1]` range), a
custom `extended type`_ based encoding scheme is used. The `type` information
can be:

- `1` for positive (possibly long) integers,
- `2` for negative (possibly long) integers.

The payload is simply the bytes (big endian) representation of the absolute
value (always positive).

For example (adapted to standard integers for the sake of readability; these
values are small, so they will actually be encoded using the default msgpack
format for integers):

- `12345` would be encoded as the extension value `[1, [0x30, 0x39]]` (aka
  `0xd5013039`)
- `-42` would be encoded as the extension value `[2, [0x2A]]` (aka
  `0xd4022a`)

Datetime
++++++++

There are two types of dates that can be encoded in a Kafka message:

- dates for git-like objects (:py:class:`swh.model.model.Revision` and
  :py:class:`swh.model.model.Release`): these dates are part of the hash
  computation used as identifier in the Merkle DAG. In order to fully support
  git repositories, a custom encoding is required. These dates (coming from
  the git data model) are encoded as a dictionary with:

  - `timestamp` [dict] POSIX timestamp of the date, as a dictionary with 2
    keys (`seconds` and `microseconds`)
  - `offset` [int] offset of the date (in minutes)
  - `negative_utc` [bool] only True for the very edge case where the date has
    a zero but negative offset value (which does not make much sense, but the
    git format technically permits it)

  Example:

  .. code:: python

      {
       'timestamp': {'seconds': 1480432642, 'microseconds': 0},
       'offset': 180,
       'negative_utc': False
      }

  These are denoted as `gitdate` below.

- other dates (resulting from the |swh| processing stack) are encoded using
  msgpack's Timestamp_ extended type. These are denoted as `timestamp` below.

  Note that these dates used to be encoded as a dictionary (beware: keys are
  bytes):

  .. code:: python

      {
       b"swhtype": "datetime",
       b"d": '2020-09-15T16:19:13.037809+00:00'
      }

Person
++++++

:py:class:`swh.model.model.Person` objects represent a person in the |swh|
Merkle DAG, namely a :py:class:`swh.model.model.Revision` author or
committer, or a :py:class:`swh.model.model.Release` author.

:py:class:`swh.model.model.Person` objects are serialized as a dictionary
like:

.. code:: python

    {
     'fullname': 'John Doe <john.doe@example.com>',
     'name': 'John Doe',
     'email': 'john.doe@example.com'
    }

For anonymized topics, :py:class:`swh.model.model.Person` entities have been
anonymized prior to being serialized. The anonymized
:py:class:`swh.model.model.Person` object is a dictionary like:

.. code:: python

    {
     'fullname': <anonymized fullname>,
     'name': None,
     'email': None
    }

where `<anonymized fullname>` is computed from the original values as the
sha256 of the original's `fullname`.
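The encodings above can be checked with a few lines of Python using the
msgpack library; the sketch below decodes the two example values from the
Integer section and illustrates the anonymization rule for `fullname` (the
helper name `swh_ext_hook` is ours):

.. code:: python

    import hashlib

    import msgpack

    def swh_ext_hook(code, data):
        """Decode the journal's custom extended types for long integers."""
        if code == 1:   # positive (possibly long) integer
            return int.from_bytes(data, "big")
        if code == 2:   # negative (possibly long) integer
            return -int.from_bytes(data, "big")
        return msgpack.ExtType(code, data)  # leave unknown types untouched

    # the examples from the Integer section: 0xd5013039 and 0xd4022a
    assert msgpack.unpackb(b"\xd5\x01\x30\x39", ext_hook=swh_ext_hook) == 12345
    assert msgpack.unpackb(b"\xd4\x02\x2a", ext_hook=swh_ext_hook) == -42

    # the anonymized fullname is the sha256 digest of the original fullname
    anonymized = hashlib.sha256(b"John Doe <john.doe@example.com>").digest()
    assert len(anonymized) == 32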
.. _Kafka: https://kafka.apache.org
.. _topic: https://kafka.apache.org/documentation/#intro_concepts_and_terms
.. _msgpack: https://msgpack.org/
.. _`extended type`: https://github.com/msgpack/msgpack/blob/master/spec.md#extension-types
.. _`Timestamp`: https://github.com/msgpack/msgpack/blob/master/spec.md#timestamp-extension-type

diff --git a/requirements-swh-dev.txt b/requirements-swh-dev.txt
index 8347359..3c06a0e 100644
--- a/requirements-swh-dev.txt
+++ b/requirements-swh-dev.txt
@@ -1,30 +1,30 @@
# Add here internal Software Heritage dependencies, one per line.
# Dependencies need to be ordered in a way that ensures only
# development versions will be used (not the release ones hosted on PyPI).
#
# This is NOT in alphabetical order
../swh-core[http,db,logging]
../swh-auth[django]
../swh-model
../swh-journal
../swh-counters
../swh-objstorage[testing]
../swh-storage
../swh-objstorage-replayer
-../swh-scheduler
+../swh-scheduler[simulator]
../swh-deposit
../swh-graph
../swh-icinga-plugins
../swh-indexer
../swh-lister
../swh-loader-core
../swh-loader-git
../swh-loader-mercurial
../swh-loader-svn
../swh-search
../swh-vault
../swh-web
../swh-web-client
../swh-scanner
../swh-fuse

diff --git a/requirements-swh.txt b/requirements-swh.txt
index fe0d7ba..a693bc1 100644
--- a/requirements-swh.txt
+++ b/requirements-swh.txt
@@ -1,24 +1,24 @@
# Add here internal Software Heritage dependencies, one per line.
swh.auth[django]
swh.core[db,http]
swh.counters
swh.deposit[server]
swh.fuse
swh.graph
swh.indexer
swh.journal
swh.lister
swh.loader.core
swh.loader.git
swh.loader.mercurial
swh.loader.svn
swh.model
swh.objstorage[testing]
swh.objstorage.replayer
swh.scanner
-swh.scheduler
+swh.scheduler[simulator]
swh.search
swh.storage
swh.vault
swh.web
swh.web.client

diff --git a/requirements.txt b/requirements.txt
index 4f5324f..33e22ae 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,12 +1,13 @@
# Add here external Python modules dependencies, one per line. Module names
# should match https://pypi.python.org/pypi names. For the full spec of
# dependency lines, see https://pip.readthedocs.org/en/1.1/requirements.html
sphinx
sphinxcontrib-httpdomain
sphinxcontrib-images
sphinxcontrib-programoutput
sphinx-tabs
sphinx-reredirects
sphinx_rtd_theme
sphinx-click
myst-parser
+sphinx-celery

diff --git a/swh/docs/sphinx/conf.py b/swh/docs/sphinx/conf.py
index 892ac56..9314acb 100755
--- a/swh/docs/sphinx/conf.py
+++ b/swh/docs/sphinx/conf.py
@@ -1,182 +1,186 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
import os
from typing import Dict

import django

# General information about the project.
project = "Software Heritage - Development Documentation"
copyright = "2015-2021 The Software Heritage developers"
author = "The Software Heritage developers"

# -- General configuration ------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.napoleon",
    "sphinxcontrib.httpdomain",
    "sphinx.ext.extlinks",
    "sphinxcontrib.images",
    "sphinxcontrib.programoutput",
    "sphinx.ext.viewcode",
    "sphinx_tabs.tabs",
    "sphinx_rtd_theme",
    "sphinx.ext.graphviz",
    "sphinx_click.ext",
    "myst_parser",
    "sphinx.ext.todo",
    "sphinx_reredirects",
    "swh.docs.sphinx.view_in_phabricator",
+
+    # swh.scheduler inherits some attribute descriptions from celery that use
+    # custom crossrefs (eg. :setting:`task_ignore_result`)
+    "sphinx_celery.setting_crossref",
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

# The suffix(es) of source filenames.
# You can specify multiple suffixes as a list of strings:
source_suffix = ".rst"

# The master toctree document.
master_doc = "index"

# A string of reStructuredText that will be included at the beginning of every
# source file that is read.
# A bit hackish but should work both for each swh package and the whole swh-doc
rst_prolog = """
.. include:: /../../swh-docs/docs/swh_substitutions
"""

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = ""
# The full version, including alpha/beta/rc tags.
release = ""

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = "en"

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# These patterns also affect html_static_path and html_extra_path
exclude_patterns = ["_build", "swh-icinga-plugins/index.rst"]

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = True

# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_rtd_theme"

html_favicon = "_static/favicon.ico"

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
html_theme_options = {
    "collapse_navigation": True,
    "sticky_navigation": True,
}

html_logo = "_static/software-heritage-logo-title-motto-vertical-white.png"

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]

# make logo actually appear, avoiding gotcha due to alabaster default conf.
# https://github.com/bitprophet/alabaster/issues/97#issuecomment-303722935
html_sidebars = {
    "**": [
        "about.html",
        "globaltoc.html",
        "relations.html",
        "sourcelink.html",
        "searchbox.html",
    ]
}

# If not None, a 'Last updated on:' timestamp is inserted at every page
# bottom, using the given strftime format.
# The empty string is equivalent to '%b %d, %Y'.
html_last_updated_fmt = "%Y-%m-%d %H:%M:%S %Z"

# refer to the Python standard library.
intersphinx_mapping = {"python": ("https://docs.python.org/3", None)}

# Redirects for pages that were moved, so we don't break external links.
# Uses sphinx-reredirects
redirects = {
    "swh-deposit/spec-api": "api/api-documentation.html",
    "swh-deposit/metadata": "api/metadata.html",
    "swh-deposit/specs/blueprint": "../api/use-cases.html",
    "swh-deposit/user-manual": "api/user-manual.html",
    "architecture": "architecture/overview.html",
    "mirror": "architecture/mirror.html",
}

# -- autodoc configuration ----------------------------------------------
autodoc_default_flags = [
    "members",
    "undoc-members",
    "private-members",
    "special-members",
]
autodoc_member_order = "bysource"
autodoc_mock_imports = ["rados"]
modindex_common_prefix = ["swh."]

# For the todo extension. Todo and todolist produce output only if this is True
todo_include_todos = True

# for the extlinks extension, sub-projects should fill that dict
extlinks: Dict = {}


# XXX Kill this as soon as this PR is accepted and released
# https://github.com/sphinx-contrib/httpdomain/pull/19
def register_routingtable_as_label(app, document):
    from sphinx.locale import _  # noqa

    labels = app.env.domaindata["std"]["labels"]
    labels["routingtable"] = "http-routingtable", "", _("HTTP Routing Table")
    anonlabels = app.env.domaindata["std"]["anonlabels"]
    anonlabels["routingtable"] = "http-routingtable", ""


# hack to set the adequate django settings when building global swh doc
# to avoid autodoc build errors
def setup(app):
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "swh.docs.django_settings")
    django.setup()

    from distutils.version import StrictVersion  # noqa

    import pkg_resources  # noqa

    httpdomain = pkg_resources.get_distribution("sphinxcontrib-httpdomain")
    if StrictVersion(httpdomain.version) <= StrictVersion("1.7.0"):
        app.connect("doctree-read", register_routingtable_as_label)