diff --git a/docs/archive-changelog.rst b/docs/archive-changelog.rst index d24ca8e..115d583 100644 --- a/docs/archive-changelog.rst +++ b/docs/archive-changelog.rst @@ -1,132 +1,132 @@ .. _archive-changelog: Software Heritage --- Archive ChangeLog ======================================= Below you can find a time-indexed list of notable events and changes to archival policies in the Software Heritage Archive. Each of them might have (had) an impact on how content is archived and explain apparent statistical anomalies or other changes in archival behavior over time. They are collected in this document for historical reasons. 2020 ---- * **2020-10-06 - 2020-11-23:** source code crawlers have been paused to avoid an out of disk condition, due to an unexpected delay in the arrival of new storage hardware. Push archival (both deposit_ and `save code now`_) remained in operation. (tracking: `T2656 `_) * **2020-09-15:** completed first archival of, and added to regular crawling `GNU Guix System`_ (tracking: `T2594 `_) * **2020-06-11:** completed integration with the IPOL_ journal, allowing paper authors to explicitly deposit_ source code to the archive (`announcement `_) * **2020-05-25:** completed first archival of, and added to regular crawling NixOS_ (tracking: `T2411 `_) 2019 ---- * **2019-09-10:** completed first archival of Bitbucket_ Git repositories and added Bitbucket as a regularly crawled forge (tracking: `T592 `_) * **2019-06-30:** completed first archival of, and added to regular crawling, several GitLab_ instances: `0xacab.org `_, `framagit.org `_, `gite.lirmm.fr `_, `gitlab.common-lisp.net `_, `gitlab.freedesktop.org `_, `gitlab.gnome.org `_, `gitlab.inria.fr `_, `salsa.debian.org `_ * **2019-06-12:** completed first archival of CRAN_ packages and added CRAN as a regularly crawled package repository (tracking: `T1709 `_) * **2019-06-11:** completed a full archival of GNU_ source code releases from `ftp.gnu.org`_, and added it to regular crawling (tracking: `T1722 `_) -* **2019-05-27:** completed a full archival of NPM_ packages andded it as a +* **2019-05-27:** completed a full archival of NPM_ packages and added it as a regularly crawled package repository (tracking: `T1378 `_) * **2019-01-10:** enabled the `save code now`_ service, allowing users to explicitly request archival of a specific source code repository (`announcement `_) 2018 ---- * **2018-10-10:** completed first archival of PyPI_ packages and added PyPI as a regularly crawled package repository (`announcement `_) * **2018-09-25:** completed integration with HAL_, allowing paper authors to explicitly deposit_ source code to the archive (`announcement `_) * **2018-08-31:** completed first archival of public GitLab_ repositories from `gitlab.com `_ and added it as a regularly crawled forge (tracking: `T1111 `_) * **2018-03-21:** completed archival of `Google Code`_ Mercurial repositories. (tracking: `T682 `_) * **2018-02-20:** completed archival of Debian_ packages and added Debian as a regularly crawled distribution (`announcement `_) 2017 ---- * **2017-10-02:** completed archival of `Google Code`_ Subversion repositories (tracking: `T617 `_) * **2017-06-06:** completed archival of `Google Code`_ Git repositories (tracking: `T673 `_) 2016 ---- * **2016-04-04:** completed archival of the Gitorious_ (tracking: `T312 `_) 2015 ---- * **2015-11-06:** archived all GNU_ source code releases from `ftp.gnu.org`_ (tracking: `T90 `_) * **2015-07-28:** started archiving public GitHub_ repositories .. 
_Bitbucket: https://bitbucket.org .. _CRAN: https://cran.r-project.org .. _Debian: https://www.debian.org .. _GNU Guix System: https://guix.gnu.org/ .. _GNU: https://en.wikipedia.org/wiki/Google_Code .. _GitHub: https://github.com .. _GitLab: https://gitlab.com .. _Gitorious: https://en.wikipedia.org/wiki/Gitorious .. _Google Code: https://en.wikipedia.org/wiki/Google_Code .. _HAL: https://hal.archives-ouvertes.fr .. _IPOL: http://www.ipol.im .. _NPM: https://www.npmjs.com .. _NixOS: https://nixos.org/ .. _PyPI: https://pypi.org .. _deposit: https://deposit.softwareheritage.org .. _ftp.gnu.org: http://ftp.gnu.org .. _save code now: https://save.softwareheritage.org diff --git a/docs/contributing/phabricator.rst b/docs/contributing/phabricator.rst index 313bf0b..cca5d4d 100644 --- a/docs/contributing/phabricator.rst +++ b/docs/contributing/phabricator.rst @@ -1,289 +1,289 @@ .. highlight:: bash .. _patch-submission: Submitting patches ================== `Phabricator`_ is the tool that Software Heritage uses as its coding/collaboration forge. Software Heritage's Phabricator instance can be found at https://forge.softwareheritage.org/ .. _Phabricator: http://phabricator.org/ Code Review in Phabricator -------------------------- We use the Differential application of Phabricator to perform :ref:`code reviews ` in the context of Software Heritage. * we use Git and ``history.immutable=true`` (but beware as that is partly a Phabricator misnomer, read on) * when code reviews are required, developers will be allowed to push directly to master once an accepted Differential diff exists Configuration +++++++++++++ Arcanist configuration ^^^^^^^^^^^^^^^^^^^^^^ Authentication ~~~~~~~~~~~~~~ First, you should install Arcanist and authenticate it to Phabricator:: sudo apt-get install arcanist arc set-config default https://forge.softwareheritage.org/ arc install-certificate arc will prompt you to login into Phabricator via web (which will ask your personal Phabricator credentials). You will then have to copy paste the API token from the web page to arc, and hit Enter to complete the certificate installation. Immutability ~~~~~~~~~~~~ When using git, Arcanist by default mess with the local history, rewriting commits at the time of first submission. To avoid that we use so called `history immutability`_ .. _history immutability: https://secure.phabricator.com/book/phabricator/article/arcanist_new_project/#history-mutability-git To that end, you shall configure your ``arc`` accordingly:: arc set-config history.immutable true Note that this does **not** mean that you are forbidden to rewrite your local branches (e.g., with ``git rebase``). Quite the contrary: you are encouraged to locally rewrite branches before pushing to ensure that commits are logically separated and your commit history easy to bisect. The above setting just means that *arc* will not rewrite commit history under your nose. Enabling ``git push`` to our forge ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The way we've configured our review setup for continuous integration needs you to configure git to allow pushes to our forge. There's two ways you can do this : setting a ssh key to push over ssh, or setting a specific password for git pushes over https. SSH key for pushes ~~~~~~~~~~~~~~~~~~ In your forge User settings page (On the top right, click on your avatar, then click *Settings*), you have access to a *Authentication* > *SSH Public Keys* section (Direct link: ``hxxps://forge.softwareheritage.org/settings/user//page/ssh/``). 
You then have the option to upload a SSH public key, which will authenticate your pushes. You then need to configure ssh/git to use that key pair, for instance by editing the ``~/.ssh/config`` file. Finally, you should configure git to push over ssh when pushing to https://forge.softwareheritage.org, by running the following command:: git config --global url.git@forge.softwareheritage.org:.pushInsteadOf https://forge.softwareheritage.org This lets git know that it should use ``git@forge.softwareheritage.org:`` as a base url when pushing repositories cloned from forge.softwareheritage.org over https. VCS password for pushes ~~~~~~~~~~~~~~~~~~~~~~~ If you're not comfortable setting up SSH to upload your changes, you have the option of setting a VCS password. This password, *separate from your account password*, allows Phabricator to authenticate your uploads over HTTPS. In your forge User settings page (On the top right, click on your avatar, then click *Settings*), you need to use the *Authentication* > *VCS Password* section to set your VCS password (Direct link: ``hxxps://forge.softwareheritage.org/settings/user//page/vcspassword/``). If you still get a 403 error on push, this means you need a forge administrator to enable HTTPS pushes for the repository (which wasn't done by default in historical repositories). Please drop by on IRC and let us know! Workflow ++++++++ * work in a feature branch: ``git checkout -b my-feat`` * initial review request: hack/commit/hack/commit ; ``arc diff origin/master`` * react to change requests: hack/commit/hack/commit ; ``arc diff --update Dxx origin/master`` * landing change: ``git checkout master ; git merge my-feat ; git push`` Starting a new feature and submit it for review ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Use a **one branch per feature** workflow, with well-separated **logical commits** (:ref:`following those conventions `). Please open one diff per logical commit to keep the diff size to a minimum. .. code-block:: git checkout -b my-shiny-feature ... hack hack hack ... git commit -m 'architecture skeleton for my-shiny-feature' ... hack hack hack ... git commit -m 'my-shiny-feature: implement module foo' ... etc ... Please, follow the To **submit your code for review** the first time:: arc diff origin/master arc will prompt for a **code review message**. Provide the following information: * first line: *short description* of the overall work (i.e., the feature you're working on). This will become the title of the review * *Summary* field (optional): *long description* of the overall work; the field can continue in subsequent lines, up to the next field. This will become the "Summary" section of the review * *Test Plan* field (optional): write here if something special is needed to test your change * *Reviewers* field (optional): the (Phabricator) name(s) of desired reviewers. If you don't specify one (recommended) the default reviewers will be chosen * *Subscribers* field (optional): the (Phabricator) name(s) of people that will be notified about changes to this review request. In most cases it should be left empty For example:: mercurial loader Summary: first stab at a mercurial loader (T329) The implementation follows the plan detailed in F2F discussion with @foo. Performances seem decent enough for a first trial (XXX seconds for YYY repository that contains ZZZ patches). Test plan: Reviewers: Subscribers: foo After completing the message arc will submit the review request and tell you its number and URL:: [...] 
Created a new Differential revision: Revision URI: https://forge.softwareheritage.org/Dxx .. _arc-update: Updating your branch to reflect requested changes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Your feature might get accepted as is, YAY! Or, reviewers might request changes; no big deal! Use the Differential web UI to follow-up to received comments, if needed. To implement requested changes in the code, hack on your branch as usual by: * adding new commits, and/or * rewriting old commits with git rebase (to preserve a nice, easy to bisect history) * pulling on master and rebasing your branch against it if meanwhile someone landed commits on master: .. code-block:: git checkout master git pull git checkout my-shiny-feature git rebase master When you're ready to **update your review request**:: arc diff --update Dxx HEAD~ Arc will prompt you for a message: describe what you've changed w.r.t. the previous review request, free form. Your message will become the changelog entry in Differential for this new version of the diff. Differential only care about the code diff, and not about the commits or their order. Therefore each "update" can be a completely different series of commits, possibly rewritten from the previous submission. Dependencies between diffs ^^^^^^^^^^^^^^^^^^^^^^^^^^ Note that you can manage diff dependencies within the same module with the following keyword in the diff description:: Depends on Dxx That allows to keep a logical view in your diff. It's not strictly necessary (because the tooling now deals with it properly) but it might help reviewers or yourself to do so. Landing your change onto master ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Once your change has been approved in Differential, you will be able to land it onto the master branch. Before doing so, you're encouraged to **clean up your git commit history**, reordering/splitting/merging commits as needed to have separate logical commits and an easy to bisect history. Update the diff :ref:`following the prior section ` -(It'd be good to let the ci build finish to make sure everything is still green). +(It'd be good to let the CI build finish to make sure everything is still green). Once you're happy you can **push to origin/master** directly, e.g.:: git checkout master git merge --ff-only my-shiny-feature git push ``--ff-only`` is optional, and makes sure you don't unintentionally create a merge commit. Optionally you can then delete your local feature branch:: git branch -d my-shiny-feature Reviewing locally / landing someone else's changes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ You can do local reviews of code with arc patch:: arc patch Dxyz This will create a branch **arcpatch-Dxyz** containing the changes on your local checkout. You can then merge those changes upstream with:: git checkout master git merge --ff arcpatch-Dxyz git push origin master or, alternatively:: arc land --squash See also -------- * :ref:`code-review` for guidelines on how code is reviewed when developing for Software Heritage diff --git a/docs/glossary.rst b/docs/glossary.rst index d78bdea..c7a6bca 100644 --- a/docs/glossary.rst +++ b/docs/glossary.rst @@ -1,193 +1,193 @@ :orphan: .. _glossary: Glossary ======== .. glossary:: archive An instance of the |swh| data store. ark `Archival Resource Key`_ (ARK) is a Uniform Resource Locator (URL) that is a multi-purpose persistent identifier for information objects of any type. 
artifact
software artifact
    An artifact is one of many kinds of tangible by-products produced during the development of software.

content
blob
    A (specific version of a) file stored in the archive, identified by its cryptographic hashes (SHA1, "git-like" SHA1, SHA256) and its size. Also known as: :term:`blob`. Note: it is incorrect to refer to Contents as "files", because files are usually considered to be named, whereas Contents are nameless. It is only in the context of specific :term:`directories ` that :term:`contents ` acquire (local) names.

deposit
    A :term:`software artifact` that was pushed to the Software Heritage archive (unlike :term:`loaders `, which pull artifacts). A deposit is useful when you want to ensure a software release's source code is archived in SWH even if it is not published anywhere else. See also: the :ref:`swh-deposit` component, which implements a deposit client and server.

directory
    A set of named pointers to contents (file entries), directories (directory entries) and revisions (revision entries). All entries are associated to the local name of the entry (i.e., a relative path without any path separator) and permission metadata (e.g., ``chmod`` value or equivalent).

doi
    A Digital Object Identifier or DOI_ is a persistent identifier or handle used to uniquely identify objects, standardized by the International Organization for Standardization (ISO).

extrinsic metadata
    Metadata about software that is not shipped as part of the software source code, but is available instead via out-of-band means. For example, homepage, maintainer contact information, and popularity information ("stars") as listed on GitHub/GitLab repository pages. See also: :term:`intrinsic metadata`.

journal
    The :ref:`journal ` is the persistent logger of the |swh| architecture in charge of logging changes of the archive, with publish-subscribe_ support.

lister
    A :ref:`lister ` is a component of the |swh| architecture that is in charge of enumerating the :term:`software origin` (e.g., VCS, packages, etc.) available at a source code distribution place.

loader
    A :ref:`loader ` is a component of the |swh| architecture responsible for reading a source code :term:`origin` (typically a git
-   reposiitory) and import or update its content in the :term:`archive` (ie.
+   repository) and import or update its content in the :term:`archive` (i.e.
    add new file contents in the :term:`object storage` and the repository structure in the :term:`storage database`).

hash
cryptographic hash
checksum
digest
    A fixed-size "summary" of a stream of bytes that is easy to compute, and hard to reverse. (Cryptographic hash function Wikipedia article) Also known as: :term:`checksum`, :term:`digest`.

indexer
    A component of the |swh| architecture dedicated to producing metadata linked to the known :term:`blobs ` in the :term:`archive`.

intrinsic metadata
    Metadata about software that is shipped as part of the source code of the software itself or as part of related artifacts (e.g., revisions, releases, etc). For example, metadata that is shipped in `PKG-INFO` files for Python packages, `pom.xml` for Maven-based Java projects, `debian/control` for Debian packages, `metadata.json` for NPM, etc. See also: :term:`extrinsic metadata`.

objstore
objstorage
object store
object storage
    Content-addressable object storage. It is the place where the actual object :term:`blobs ` are stored.

origin
software origin
data source
    A location from which a coherent set of sources has been obtained, like a git repository, a directory containing tarballs, etc.

person
    An entity referenced by a revision as either the author or the committer of the corresponding change. A person is associated to a full name and/or an email address.

release
tag
milestone
    A revision that has been marked as noteworthy with a specific name (e.g., a version number), together with associated development metadata (e.g., author, timestamp, etc).

revision
commit
changeset
    A point in time snapshot of the content of a directory, together with associated development metadata (e.g., author, timestamp, log message, etc).

scheduler
    The component of the |swh| architecture dedicated to the management and the prioritization of the many tasks.

snapshot
    The state of all visible branches during a specific visit of an origin.

storage
storage database
    The main database of the |swh| platform in which all the elements of the :ref:`data-model` except the :term:`content` are stored as a :ref:`Merkle DAG `.

type of origin
    Information about the kind of hosting, e.g., whether it is a forge, a collection of repositories, a homepage publishing tarballs, or a one-shot source code repository. For all kinds of repositories, please specify which VCS system is in use (Git, SVN, CVS, etc.).

vault
vault service
    User-facing service that allows retrieving parts of the :term:`archive` as self-contained bundles (e.g., individual releases, entire repository snapshots, etc.)

visit
    The passage of |swh| on a given :term:`origin`, to retrieve all source code and metadata available there at the time. A visit object stores the state of all visible branches (if any) available at the origin at visit time; each of them points to a revision object in the archive. Future visits of the same origin will create new visit objects, without removing previous ones.

.. _blob: https://en.wikipedia.org/wiki/Binary_large_object
.. _DOI: https://www.doi.org
.. _`persistent identifier`: https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html#persistent-identifiers
.. _`Archival Resource Key`: http://n2t.net/e/ark_ids.html
.. _publish-subscribe: https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern

diff --git a/docs/journal.rst b/docs/journal.rst
index d1d3983..25746c4 100644
--- a/docs/journal.rst
+++ b/docs/journal.rst
@@ -1,673 +1,673 @@

.. _journal-specs:

Software Heritage Journal --- Specifications
============================================

-The |swh| journal is a kafka_-based stream of events for every added object in
+The |swh| journal is a Kafka_-based stream of events for every added object in
the |swh| Archive and some of its related services, especially indexers.

Each topic_ will stream added elements for a given object type according to the topic name.

Objects streamed in a topic are serialized versions of objects stored in the |swh| Archive specified by the main |swh| :py:mod:`data model ` or the :py:mod:`indexer object model `.

In this document we will describe the expected messages in each topic, so a potential consumer can easily cope with the |swh| journal without having to read the source code or the |swh| :ref:`data model ` in detail (it is however recommended to familiarize yourself with the latter).

Kafka message values are dictionary structures serialized as msgpack_, with a few custom encodings. See the section `Kafka message format`_ below for a complete description of the serialization format.
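As an illustration, here is a minimal consumer sketch using the ``confluent-kafka`` and ``msgpack`` Python packages; the broker address and group id are placeholders, and a real deployment would more likely rely on the ``swh.journal`` client. It also decodes the custom long-integer extension described in the `Kafka message format`_ section below.

.. code:: python

   import msgpack
   from confluent_kafka import Consumer

   def decode_ext(code, data):
       # Custom extension types used by the journal for long integers
       # (see "Kafka message format" below): 1 = positive, 2 = negative,
       # payload is the big-endian bytes of the absolute value.
       if code == 1:
           return int.from_bytes(data, "big")
       if code == 2:
           return -int.from_bytes(data, "big")
       return msgpack.ExtType(code, data)

   consumer = Consumer({
       "bootstrap.servers": "broker.example.org:9092",  # placeholder address
       "group.id": "my-journal-client",                 # placeholder group id
       "auto.offset.reset": "earliest",
   })
   consumer.subscribe(["swh.journal.objects.origin"])

   while True:
       message = consumer.poll(timeout=1.0)
       if message is None or message.error():
           continue
       # Each message value is a msgpack-serialized dictionary,
       # as shown in the examples below.
       origin = msgpack.unpackb(message.value(), raw=False, ext_hook=decode_ext)
       print(origin["url"])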
Note that each example given below show the dictionary before being serialized as a msgpack_ chunk. Topics ------ There are several groups of topics: - main storage Merkle-DAG related topics, - other storage objects (not part of the Merkle DAG), - indexer related objects (not yet documented below). Topics prefix can be either `swh.journal.objects` or `swh.journal.objects_privileged` (see below). Anonymized topics +++++++++++++++++ For topics that transport messages with user information (name and email address), namely `swh.journal.objects.release`_ and `swh.journal.objects.revision`_, there are 2 versions of those: one is an anonymized topic, in which user information are obfuscated, and a pristine version with clear data. Access to pristine topics depends on ACLs linked to credentials used to connect to the Kafka cluster. List of topics ++++++++++++++ - `swh.journal.objects.origin`_ - `swh.journal.objects.origin_visit`_ - `swh.journal.objects.origin_visit_status`_ - `swh.journal.objects.snapshot`_ - `swh.journal.objects.release`_ - `swh.journal.objects.privileged_release `_ - `swh.journal.objects.revision`_ - `swh.journal.objects.privileged_revision `_ - `swh.journal.objects.directory`_ - `swh.journal.objects.content`_ - `swh.journal.objects.skippedcontent`_ - `swh.journal.objects.metadata_authority`_ - `swh.journal.objects.metadata_fetcher`_ - `swh.journal.objects.raw_extrinsic_metadata`_ -Topics for Merkel-DAG objects +Topics for Merkle-DAG objects ----------------------------- These topics are for the various objects stored in the |swh| Merkle DAG, see the :ref:`data model ` for more details. `swh.journal.objects.snapshot` ++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.Snapshot` objects. Message format: - `branches` [dict] branches present in this snapshot, - `id` [bytes] the intrinsic identifier of the :py:class:`swh.model.model.Snapshot` object with `branches` being a dictionary which keys are branch names [bytes], and values a dictionary of: - `target` [bytes] intrinsic identifier of the targeted object - `target_type` [string] the type of the targeted object (can be "content", "directory", "revision", "release", "snapshot" or "alias"). Example: .. code:: python { 'branches': { b'refs/pull/1/head': { 'target': b'\x07\x10\\\xfc\xae\x1f\xb1\xf9\xb5\xad\x8bI\xf1G\x10\x9a\xba>8\x0c', 'target_type': 'revision' }, b'refs/pull/2/head': { 'target': b'\x1a\x868-\x9b\x1d\x00\xfbd\xeaH\xc88\x9c\x94\xa1\xe0U\x9bJ', 'target_type': 'revision' }, b'refs/heads/master': { 'target': b'\x7f\xc4\xfe4f\x7f\xda\r\x0e[\xba\xbc\xd7\x12d#\xf7&\xbfT', 'target_type': 'revision' }, b'HEAD': { 'target': b'refs/heads/master', 'target_type': 'alias' } }, 'id': b'\x10\x00\x06\x08\xe9E^\x0c\x9bS\xa5\x05\xa8\xdf\xffw\x88\xb8\x93^' } `swh.journal.objects.release` +++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.Release` objects. This topics is anonymized. The non-anonymized version of this topic is `swh.journal.objects_privileged.release`. 
Message format: - `name` [bytes] name (typically the version) of the release - `message` [bytes] message of the release - `target` [bytes] identifier of the target object - `target_type` [string] type of the target, can be "content", "directory", "revision", "release" or "snapshot" - `synthetic` [bool] True if the :py:class:`swh.model.model.Release` object has been forged by the loading process; this flag is not used for the id computation, - `author` [dict] the author of the release - `date` [gitdate] the date of the release - `id` [bytes] the intrinsic identifier of the :py:class:`swh.model.model.Release` object Example: .. code:: python { 'name': b'0.3', 'message': b'', 'target': b'<\xd6\x15\xd9\xef@\xe0[\xe7\x11=\xa1W\x11h%\xcc\x13\x96\x8d', 'target_type': 'revision', 'synthetic': False, 'author': { 'fullname': b'\xf5\x8a\x95k\xffKgN\x82\xd0f\xbf\x12\xe8w\xc8a\xf79\x9e\xf4V\x16\x8d\xa4B\x84\x15\xea\x83\x92\xb9', 'name': None, 'email': None }, 'date': { 'timestamp': { 'seconds': 1480432642, 'microseconds': 0 }, 'offset': 180, 'negative_utc': False }, 'id': b'\xd0\x00\x06u\x05uaK`.\x0c\x03R%\xca,\xe1x\xd7\x86' } `swh.journal.objects.revision` ++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.Revision` objects. This topics is anonymized. The non-anonymized version of this topic is `swh.journal.objects_privileged.revision`. Message format: - `message` [bytes] the commit message for the revision - `author` [dict] the author of the revision - `committer` [dict] the committer of the revision - `date` [gitdate] the revision date - `committer_date` [gitdate] the revision commit date - `type` [string] the type of the revision (can be "git", "tar", "dsc", "svn", "hg") - `directory` [bytes] the intrinsic identifier of the directory this revision links to - `synthetic` [bool] whether this :py:class:`swh.model.model.Revision` is synthetic or not, - `metadata` [bytes] the metadata linked to this :py:class:`swh.model.model.Revision` (not part of the intrinsic identifier computation), - `parents` [list[bytes]] list of parent :py:class:`swh.model.model.Revision` intrinsic identifiers - `id` [bytes] intrinsic identifier of the :py:class:`swh.model.model.Revision` - `extra_headers` [list[(bytes, bytes)]] TODO Example: .. code:: python { 'message': b'I now arrange to be able to create a prettyprinted version of the Pascal\ncode to make review of translation of it easier, and I have thought a bit\nmore about coping with Pastacl variant records and the like, but have yet to\nimplement everything. 
lufylib.red is a place for support code.\n', 'author': { 'fullname': b'\xf3\xa7\xde7[\x8b#=\xe48\\/\xa1 \xed\x05NA\xa6\xf8\x9c\n\xad5\xe7\xe0"\xc4\xd5[\xc9z', 'name': None, 'email': None }, 'committer': { 'fullname': b'\xf3\xa7\xde7[\x8b#=\xe48\\/\xa1 \xed\x05NA\xa6\xf8\x9c\n\xad5\xe7\xe0"\xc4\xd5[\xc9z', 'name': None, 'email': None }, 'date': { 'timestamp': {'seconds': 1495977610, 'microseconds': 334267}, 'offset': 0, 'negative_utc': False }, 'committer_date': { 'timestamp': {'seconds': 1495977610, 'microseconds': 334267}, 'offset': 0, 'negative_utc': False }, 'type': 'svn', 'directory': b'\x815\xf0\xd9\xef\x94\x0b\xbf\x86<\xa4j^\xb65\xe9\xf4\xd1\xc3\xfe', 'synthetic': True, 'metadata': None, 'parents': [ b'D\xb1\xc8\x0f&\xdc\xd4 \x92J\xaf\xab\x19V\xad\xe7~\x18\n\x0c', ], 'id': b'\x1e\x1c\x19\xb56x\xbc\xe5\xba\xa4\xed\x03\xae\x83\xdb@\xd0@0\xed\xc8', 'perms': 33188}, {'name': b'lib', 'type': 'dir', 'target': b'-\xb2(\x95\xe46X\x9f\xed\x1d\xa6\x95\xec`\x10\x1a\x89\xc3\x01U', 'perms': 16384}, {'name': b'package.json', 'type': 'file', 'target': b'Z\x91N\x9bw\xec\xb0\xfbN\xe9\x18\xa2E-%\x8fxW\xa1x', 'perms': 33188} ], 'id': b'eS\x86\xcf\x16n\xeb\xa96I\x90\x10\xd0\xe9&s\x9a\x82\xd4P' } Other Objects Topics -------------------- These topics are for objects of the |swh| archive that are not part of the Merkle DAG but are essential parts of the archive; see the :ref:`data model ` for more details. `swh.journal.objects.origin` ++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.Origin` objects. Message format: - `url` [string] URL of the :py:class:`swh.model.model.Origin` Example: .. code:: python { "url": "https://github.com/vujkovicm/pml" } `swh.journal.objects.origin_visit` ++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.OriginVisit` objects. Message format: - `origin` [string] URL of the visited :py:class:`swh.model.model.Origin` - `date` [timestamp] date of the visit - `type` [string] type of the loader used to perform the visit - `visit` [int] number of the visit for this `origin` Example: .. code:: python { 'origin': 'https://pypi.org/project/wasp-eureka/', 'date': Timestamp(seconds=1606260407, nanoseconds=818259954), 'type': 'pypi', 'visit': 505} } `swh.journal.objects.origin_visit_status` +++++++++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.OriginVisitStatus` objects. Message format: - `origin` [string] URL of the visited :py:class:`swh.model.model.Origin` - `visit` [int] number of the visit for this `origin` this status concerns - `date` [timestamp] date of the visit status update - `status` [string] status (can be "created", "ongoing", "full" or "partial"), - `snapshot` [bytes] identifier of the :py:class:`swh.model.model.Snaphot` this visit resulted in (if `status` is "full" or "partial") - `metadata`: deprecated Example: .. code:: python { 'origin': 'https://pypi.org/project/stricttype/', 'visit': 524, 'date': Timestamp(seconds=1606260407, nanoseconds=818259954), 'status': 'full', 'snapshot': b"\x85\x8f\xcb\xec\xbd\xd3P;Z\xb0~\xe7\xa2(\x0b\x11'\x05i\xf7", 'metadata': None } Extrinsic Metadata related Topics --------------------------------- Extrinsic metadata is information about software that is not part of the source code itself but still closely related to the software. See :ref:`extrinsic-metadata-specification` for more details on the Extrinsic Metadata model. `swh.journal.objects.metadata_authority` ++++++++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.MetadataAuthority` objects. 
Message format: - `type` [string] - `url` [string] - `metadata` [dict] Examples: .. code:: python { 'type': 'forge', 'url': 'https://guix.gnu.org/sources.json', 'metadata': {} } { 'type': 'deposit_client', 'url': 'https://www.softwareheritage.org', 'metadata': {'name': 'swh'} } `swh.journal.objects.metadata_fetcher` ++++++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.MetadataFetcher` objects. Message format: - `type` [string] - `version` [string] - `metadata` [dict] Example: .. code:: python { 'name': 'swh.loader.package.cran.loader.CRANLoader', 'version': '0.15.0', 'metadata': {} } `swh.journal.objects.raw_extrinsic_metadata` ++++++++++++++++++++++++++++++++++++++++++++ Topic for :py:class:`swh.model.model.RawExtrinsicMetadata` objects. Message format: - `type` [string] - `target` [string] - `discovery_date` [timestamp] - `authority` [dict] - `fetcher` [dict] - `format` [string] - `metadata` [bytes] - `origin` [string] - `visit` [int] - `snapshot` [SWHID] - `release` [SWHID] - `revision` [SWHID] - `path` [bytes] - `directory` [SWHID] Example: .. code:: python { 'type': 'snapshot', 'id': 'swh:1:snp:f3b180979283d4931d3199e6171840a3241829a3', 'discovery_date': Timestamp(seconds=1606260407, nanoseconds=818259954), 'authority': { 'type': 'forge', 'url': 'https://pypi.org/', 'metadata': {} }, 'fetcher': { 'name': 'swh.loader.package.pypi.loader.PyPILoader', 'version': '0.10.0', 'metadata': {} }, 'format': 'pypi-project-json', 'metadata': b'{"info":{"author":"Signaltonsalat","author_email":"signaltonsalat@gmail.com"}]}', 'origin': 'https://pypi.org/project/schwurbler/' } Kafka message format -------------------- -Each value of a kafka message in a topic is a dictionary-like structure +Each value of a Kafka message in a topic is a dictionary-like structure encoded as a msgpack_ byte string. Keys are ASCII strings. All values are encoded using default msgpack type system except for long integers for which we use a custom format using msgpack `extended type`_ to prevent overflow while packing some objects. Integer +++++++ For long integers (that do not fit in the `[-(2**63), 2 ** 64 - 1]` range), a custom `extended type`_ based encoding scheme is used. The `type` information can be: - `1` for positive (possibly long) integers, - `2` for negative (possibly long) integers. The payload is simply the bytes (big endian) representation of the absolute value (always positive). For example (adapted to standard integers for the sake of readability; these values are small so they will actually be encoded using the default msgpack format for integers): - `12345` would be encoded as the extension value `[1, [0x30, 0x39]]` (aka `0xd5013039`) - `-42` would be encoded as the extension value `[2, [0x2A]]` (aka `0xd4022a`) Datetime ++++++++ -There are 2 type of date that can be encoded in a kafka message: +There are 2 type of date that can be encoded in a Kafka message: - dates for git-like objects (:py:class:`swh.model.model.Revision` and :py:class:`swh.model.model.Release`): these dates are part of the hash computation used as identifier in the Merkle DAG. In order to fully support git repositories, a custom encoding is required. 
These dates (coming from the git data model) are encoded as a dictionary with:

  - `timestamp` [dict] POSIX timestamp of the date, as a dictionary with 2 keys (`seconds` and `microseconds`)
  - `offset` [int] offset of the date (in minutes)
  - `negative_utc` [bool] only True for the very edge case where the date has a zero but negative offset value (which does not make much sense, but the git format technically permits it)

  Example:

  .. code:: python

     {
       'timestamp': {'seconds': 1480432642, 'microseconds': 0},
       'offset': 180,
       'negative_utc': False
     }

  These are denoted as `gitdate` below.

- other dates (resulting from the |swh| processing stack) are encoded using msgpack's Timestamp_ extended type. These are denoted as `timestamp` below.

Note that these dates used to be encoded as a dictionary (beware: keys are bytes):

.. code:: python

   {
     b"swhtype": "datetime",
     b"d": '2020-09-15T16:19:13.037809+00:00'
   }

Person
++++++

:py:class:`swh.model.model.Person` objects represent a person in the |swh| Merkle DAG, namely a :py:class:`swh.model.model.Revision` author or committer, or a :py:class:`swh.model.model.Release` author.

:py:class:`swh.model.model.Person` objects are serialized as a dictionary like:

.. code:: python

   {
     'fullname': 'John Doe ',
     'name': 'John Doe',
     'email': 'john.doe@example.com'
   }

For anonymized topics, :py:class:`swh.model.model.Person` entities have been anonymized prior to being serialized. The anonymized :py:class:`swh.model.model.Person` object is a dictionary like:

.. code:: python

   {
     'fullname': ,
     'name': null,
     'email': null
   }

where the `` is computed from original values as a sha256 of the
-orignal's `fullname`.
+original's `fullname`.

-.. _kafka: https://kafka.apache.org
+.. _Kafka: https://kafka.apache.org
.. _topic: https://kafka.apache.org/documentation/#intro_concepts_and_terms
.. _msgpack: https://msgpack.org/
.. _`extended type`: https://github.com/msgpack/msgpack/blob/master/spec.md#extension-types
.. _`Timestamp`: https://github.com/msgpack/msgpack/blob/master/spec.md#timestamp-extension-type

diff --git a/docs/mirror.rst b/docs/mirror.rst
index b2fd491..eea9942 100644
--- a/docs/mirror.rst
+++ b/docs/mirror.rst
@@ -1,132 +1,132 @@

.. _mirror:

Mirroring
=========

Description
-----------

A mirror is a full copy of the |swh| archive, operated independently from the Software Heritage initiative.

A minimal mirror consists of two parts:

- the graph storage (typically an instance of :ref:`swh.storage `), which contains the Merkle DAG structure of the archive, *except* the actual content of source code files (AKA blobs),
- the object storage (typically an instance of :ref:`swh.objstorage `), which contains all the blobs corresponding to archived source code files.

However, a usable mirror also needs to be accessible by others. As such, a proper mirror should also allow you to:

- navigate the archive copy using a Web browser and/or the Web API (typically using :ref:`the web application `),
- retrieve data from the copy of the archive (typically using :ref:`the vault service `)

A mirror is initially populated and kept up to date by consuming data from the |swh| Kafka-based :ref:`journal ` and retrieving the blob objects (file content) from the |swh| :ref:`object storage `.

.. note:: It is not required that a mirror is deployed using the |swh| software stack. Other technologies, including different storage methods, can be used. But in this documentation we will focus on mirror deployments that use the |swh| software stack.

.. thumbnail:: images/mirror-architecture.svg

   General view of the |swh| mirroring architecture.

Mirroring the Graph Storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The replication of the graph is based on a journal using Kafka_ as the event streaming platform.

On the Software Heritage side, every addition made to the archive consists of the addition of a :ref:`data-model` object. The new object is also serialized as a msgpack_ bytestring which is used as the value of a message added to a Kafka topic dedicated to the object type.

The main Kafka topics for the |swh| :ref:`data-model` are:

- `swh.journal.objects.content`
- `swh.journal.objects.directory`
- `swh.journal.objects.metadata_authority`
- `swh.journal.objects.metadata_fetcher`
- `swh.journal.objects.origin_visit_status`
- `swh.journal.objects.origin_visit`
- `swh.journal.objects.origin`
- `swh.journal.objects.raw_extrinsic_metadata`
- `swh.journal.objects.release`
- `swh.journal.objects.revision`
- `swh.journal.objects.skipped_content`
- `swh.journal.objects.snapshot`

In order to set up a mirror of the graph, one needs to deploy a stack capable of retrieving all these topics and storing their content reliably. For example, a
-kafka cluster configured as a replica of the main kafka broker hosted by |swh|
+Kafka cluster configured as a replica of the main Kafka broker hosted by |swh|
would do the job (albeit not in a very useful manner by itself).

A more useful mirror can be set up using the :ref:`storage ` component with the help of the special service named `replayer` provided by the :doc:`apidoc/swh.storage.replay` module.

.. TODO: replace this previous link by a link to the 'swh storage replay' command once available, and ideally once https://github.com/sphinx-doc/sphinx/issues/880 is fixed

Mirroring the Object Storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

File contents (blobs) are *not* directly stored in messages of the `swh.journal.objects.content` Kafka topic, which only contains metadata about them, such as various kinds of cryptographic hashes.

A separate component is in charge of replicating blob objects from the archive and storing them in the local object storage instance. A separate `swh-journal` client should subscribe to the `swh.journal.objects.content` topic to get the stream of blob object identifiers, then retrieve corresponding blobs from the main Software Heritage object storage, and store them in the local object storage.

A reference implementation for this component is available in :ref:`content replayer `.

Installation
------------

When using the |swh| software stack to deploy a mirror, a number of |swh| software components must be installed (cf. architecture diagram above):

- a database to store the graph of the |swh| archive,
- the :ref:`swh-storage` component,
- an object storage solution (can be cloud-based or on a local filesystem like ZFS pools),
- the :ref:`swh-objstorage` component,
- the :ref:`swh.storage.replay` service (part of the :ref:`swh-storage` package),
- the :ref:`swh.objstorage.replayer.replay` service (from the :ref:`swh-objstorage-replayer` package).

A `docker-swarm `_ based deployment solution is provided as a working example of the mirror stack: https://forge.softwareheritage.org/source/swh-docker

It is strongly recommended to start from there before planning a production-like deployment. See the `README `_ file of the `swh-docker `_ repository for details.

-.. _kafka: https://kafka.apache.org/
+.. _Kafka: https://kafka.apache.org/
.. _msgpack: https://msgpack.org

diff --git a/docs/tutorials/issue-debugging-monitoring.md b/docs/tutorials/issue-debugging-monitoring.md
index aa4fad9..218197a 100644
--- a/docs/tutorials/issue-debugging-monitoring.md
+++ b/docs/tutorials/issue-debugging-monitoring.md
@@ -1,143 +1,143 @@

# Issue debugging and monitoring guide

In order to debug issues happening in production, you need to get as much information as possible on the issue. It helps reproduce or directly fix the issue. In addition, you want to monitor it to see how it evolves or if it is fixed for good. The tools used at SWH to get insights on issues happening in production are Sentry and Kibana.

## Sentry overview

SWH instance URL:

The service requires a login password pair to access, but does not require SWH VPN access. To sign up, click "Request to join" and provide your SWH developer email address for the admins to create the account.

Official documentation:

Sentry is specifically geared towards debugging production issues. In the "Issues" pane, it presents issues grouped by similarity with statistics about their occurrence.

Issues can be filtered by:

- project (i.e. SWH service repository), e.g. "swh-loader-core" or "swh-vault";
- environment, e.g. "production" or "staging";
- time range.

Viewing a particular issue, you can access:

- the execution trace at the point of error, with pretty-printed local variables at each stack frame, as you would get in a post-mortem debugging session;
- contextual metadata about the running environment, which includes:
  - the first and last occurrence as detected by Sentry,
  - corresponding component versions,
  - installed packages,
  - entrypoint parameters,
  - runtime environment such as the interpreter version, the hostname, or the logging configuration.
- the breadcrumbs view, which shows several event log lines produced in the same run prior to the error. These are not the logs produced by the application, but events gathered through Sentry integrations.

## Debugging SWH services with Sentry

Here we show a specific type of issue that is characteristic of microservice architectures as implemented at SWH. One difficulty may arise in finding where an issue originates, because the execution is split between multiple services. It results in a chain of linked issues, potentially one for each service involved.

Errors of type `RemoteException` encapsulate an error occurring in the service called through an RPC mechanism. If the information encapsulated in this top-level error is not sufficient, one would search for complementary traces by filtering the "Issues" view by the linked service's project name.

Example:

Sentry issue:

The error appears as ``

A request from a vault cooker to the storage service had a network error. Thanks to Sentry, we also see which specific storage was requested: ``

Upon searching in the storage service issues, we find a corresponding `HttpResponseError`:

We skip through the error reporting logic in the trace to get to the operation that was performed. We see that this error comes in turn from an RPC call to the objstorage service:

HttpResponseError: "Download stream interrupted." at `swh/storage/objstorage.py` in `content_get` at line 41

This is a transient network error: it should not persist when retrying. So a solution might be to add a retrying mechanism somewhere in this chain of RPC calls.

## Issue monitoring with Sentry

Aggregated error traces as shown in the "Issues" pane are the primary source of information for monitoring. This includes the statistics of occurrence for a given period of time.

Sentry also comes with issue management features that notably let you silence or resolve errors. Silencing means the issue will still be recorded but not notified. Resolving means the issue will be hidden from the default view, and any new occurrence of it will specifically notify the issue owner that the issue still arises and is in fact not resolved. Make sure an owner is associated with the issue, typically through ownership rules set in the project settings.

For more info on monitoring issues, refer to:

## Kibana overview

SWH instance URL:

Access to the SWH VPN is needed, but credentials are not.

Related wiki page:

Official documentation:

-Kibana is a vizualization UI for searching through indexed logs. You can search through
+Kibana is a visualization UI for searching through indexed logs. You can search through
different sources of logs in the "Discover" pane. The sources configured include application logs for SWH services and system logs. You can also access dashboards shared by others on a particular topic or create your own from a saved search.

There are 2 query languages which are quite similar: Lucene or KQL. Whichever one you choose, you will have the same querying capabilities. A query tries to match values for specific keys, and supports many predicates and combinations of them. See the documentation for KQL: https://www.elastic.co/guide/en/kibana/current/kuery-query.html

To get logs for a particular service, you have to know the name of its systemd unit and the hostname of the production server providing this service. For a worker, switch the index pattern to "swh_workers-*", for another SWH service switch it to "systemlogs-*".

Example for getting swh-vault production logs: With the index pattern set to "systemlogs-*", enter the KQL query: `systemd_unit:"gunicorn-swh-vault.service" AND hostname:"vangogh"`

Upon expanding a log entry with the leading arrow icon, you can inspect the entry in a structured way. You can filter on particular values or fields, using the icons to the left of the desired field. Fields including "message", "hostname" or "systemd_unit" are often the most informational. You can also view the entry in context, several entries before and after chronologically.

## Issue monitoring with Kibana

You can use Kibana saved searches and dashboards to follow issues based on associated logs. Of course, we need to have logs produced that are related to the issue we want to track.

You can save a search, as opposed to only a query, to easily get back to it or include it in a dashboard. Just click "Save" in the top toolbar above the search bar. It includes the query, filters, selected columns, sorting and index pattern.

Now you may want to have a customizable view of these logs, along with graphical presentations. In the "Dashboard" pane, create a new dashboard. Click "add" in the top
-toolbar and select your saved search. It will appear in resizeable panel. Now doing a
-search will restrict the search to the dataset cinfigured for the panels.
+toolbar and select your saved search. It will appear in a resizable panel. Now doing a
+search will restrict the search to the dataset configured for the panels.
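For instance, a saved search tracking a recurring worker error could use the "swh_workers-*" index pattern with a KQL query such as `systemd_unit:"swh-worker@loader-git.service" AND message:"error"` (the unit name here is only illustrative), and then be added as a panel alongside related searches.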
-To create more complete vizualizations including graphs, refer to:
+To create more complete visualizations including graphs, refer to:

diff --git a/docs/tutorials/testing.rst b/docs/tutorials/testing.rst
index 4673482..fd368ce 100644
--- a/docs/tutorials/testing.rst
+++ b/docs/tutorials/testing.rst
@@ -1,123 +1,123 @@

.. _testing-guide:

Software testing guide
======================

Tools landscape
---------------

The testing framework we use is pytest_. It provides many facilities to write tests efficiently. It is complemented by hypothesis_, a library for property-based testing, in some of our test suites. Its usage is a more advanced topic.

We also use tox_, the automation framework, to run the tests along with other quality checks in isolated environments.

The main quality checking tools in use are:

* mypy_, a static type checker. We gradually type-annotate all additions or refactorings to the codebase;
* flake8_, a simple code style checker (aka linter);
* black_, an uncompromising code formatter.

They are run automatically through ``tox`` or as ``pre-commit`` hooks in our Git repositories.

The SWH testing framework
-------------------------

This section shows specifics about our usage of pytest and custom helpers.

The pytest fixture system makes it easy to write, share and plug setup and teardown code. Fixtures are automatically loaded from the project ``conftest`` or ``pytest_plugin`` modules into any test function by giving its name as an argument.

| Several pytest plugins have been defined across SWH projects:
| ``core``, ``core.db``, ``storage``, ``scheduler``, ``loader``, ``journal``.
| Many others, provided by the community, are in use:
| ``flask``, ``django``, ``aiohttp``, ``postgresql``, ``mock``, ``requests-mock``, ``cov``, etc.

We make use of various mocking helpers:

* ``unittest.mock``: ``Mock`` classes, ``patch`` function;
* ``mocker`` fixture from the ``mock`` plugin: adaptation of ``unittest.mock`` to the fixture system, with a bonus ``spy`` function to audit without modifying objects;
* ``monkeypatch`` builtin fixture: modify object attributes or the environment, with automatic teardown.

Other notable helpers include:

* ``datadir``: to compute the path to the current test's ``data`` directory. Available in the ``core`` plugin.
* ``requests_mock_datadir``: to load network responses from the datadir. Available in the ``core`` plugin.
* ``swh_rpc_client``: for testing SWH RPC clients and servers without incurring IO. Available in the ``core`` plugin.
* ``postgresql_fact``: for testing database-backend interactions. Available in the ``core.db`` plugin, adapted for performance from the ``postgresql`` plugin.
* ``click.testing.CliRunner``: to simplify testing of Click command-line interfaces. It allows testing commands with some level of isolation from the execution environment. https://click.palletsprojects.com/en/7.x/api/#click.testing.CliRunner

Testing guidelines
------------------

General considerations
^^^^^^^^^^^^^^^^^^^^^^

-We mostly do functional tests, and unit-testing when more ganularity is needed. By this,
+We mostly do functional tests, and unit-testing when more granularity is needed. By this,
we mean that we test each functionality and invariants of a component, without systematically isolating it from its dependencies. The goal is to strike a balance between test effectiveness and test maintenance. However, the most critical parts, like the storage service, get more extensive unit-testing.
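As an illustration of this style, here is a minimal, self-contained sketch: the function under test and the URL are made up for the example, and the ``requests_mock`` fixture comes from the ``requests-mock`` plugin listed above.

.. code:: python

   import pytest
   import requests

   def get_origin_count(base_url):
       """Hypothetical function under test: query a (fake) HTTP API."""
       response = requests.get(f"{base_url}/origins/count")
       response.raise_for_status()
       return response.json()["count"]

   def test_get_origin_count(requests_mock):
       # requests_mock intercepts the HTTP call, so no real IO happens.
       requests_mock.get("https://api.example.org/origins/count", json={"count": 42})
       assert get_origin_count("https://api.example.org") == 42

   def test_get_origin_count_error(requests_mock):
       # Also exercise the error path of the same functionality.
       requests_mock.get("https://api.example.org/origins/count", status_code=500)
       with pytest.raises(requests.HTTPError):
           get_origin_count("https://api.example.org")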
Organize tests
^^^^^^^^^^^^^^

* In order to test a component (module, class), one must start by identifying its sets of functionalities and invariants (or properties).
* One test may check multiple properties or commonly combined functionalities, if it can fit in a short descriptive name.
* Organize tests in multiple modules, one for each aspect or subcomponent tested.
-  e.g.: initialization/configuration, db/backend, service API, utils, cli, etc.
+  e.g.: initialization/configuration, db/backend, service API, utils, CLI, etc.

Test data
^^^^^^^^^

Each repository has its own ``tests`` directory; some, such as listers, even have one for each lister type.

* Put any non-trivial test data, used for setup or mocking, in (potentially compressed) files in a ``data`` directory under the local testing directory.
* Use ``datadir`` fixtures to load them.

Faking dependencies
^^^^^^^^^^^^^^^^^^^

* Make use of temporary directories for testing code relying on filesystem paths.
* Mock only already tested and expensive operations, typically IO with external services.
* Use the ``monkeypatch`` fixture when updating the environment or when mocking is overkill.
* Mock HTTP requests with ``requests_mock`` or ``requests_mock_datadir``.

Final words
^^^^^^^^^^^

If testing is difficult, the tested design may need reconsideration.

Other SWH resources on software quality
---------------------------------------

| https://wiki.softwareheritage.org/wiki/Python_style_guide
| https://wiki.softwareheritage.org/wiki/Git_style_guide
| https://wiki.softwareheritage.org/wiki/Arcanist_setup
| https://wiki.softwareheritage.org/wiki/Code_review
| https://wiki.softwareheritage.org/wiki/Jenkins
| https://wiki.softwareheritage.org/wiki/Testing_the_archive_features

.. _pytest: https://pytest.org
.. _tox: https://tox.readthedocs.io
.. _hypothesis: https://hypothesis.readthedocs.io
.. _mypy: https://mypy.readthedocs.io
.. _flake8: https://flake8.pycqa.org
.. _black: https://black.readthedocs.io