diff --git a/docs/images/infrastructure/network/carp_maintenance.png b/docs/images/infrastructure/network/carp_maintenance.png
new file mode 100644
index 0000000..13428d1
Binary files /dev/null and b/docs/images/infrastructure/network/carp_maintenance.png differ
diff --git a/docs/images/infrastructure/network/check_for_upgrade.png b/docs/images/infrastructure/network/check_for_upgrade.png
new file mode 100644
index 0000000..f0f233b
Binary files /dev/null and b/docs/images/infrastructure/network/check_for_upgrade.png differ
diff --git a/docs/images/infrastructure/network/proceed_update.png b/docs/images/infrastructure/network/proceed_update.png
new file mode 100644
index 0000000..e9b3fc0
Binary files /dev/null and b/docs/images/infrastructure/network/proceed_update.png differ
diff --git a/docs/images/infrastructure/network/reactivate_carp.png b/docs/images/infrastructure/network/reactivate_carp.png
new file mode 100644
index 0000000..18beb21
Binary files /dev/null and b/docs/images/infrastructure/network/reactivate_carp.png differ
diff --git a/docs/images/infrastructure/network/sync.png b/docs/images/infrastructure/network/sync.png
new file mode 100644
index 0000000..c210c45
Binary files /dev/null and b/docs/images/infrastructure/network/sync.png differ
diff --git a/docs/images/network.png b/docs/images/network.png
new file mode 100644
index 0000000..42fbeb7
Binary files /dev/null and b/docs/images/network.png differ
diff --git a/docs/images/network.svg b/docs/images/network.svg
new file mode 100644
index 0000000..4ec134d
--- /dev/null
+++ b/docs/images/network.svg
@@ -0,0 +1,62 @@
+
\ No newline at end of file
diff --git a/docs/images/network.uml b/docs/images/network.uml
new file mode 100644
index 0000000..10d0267
--- /dev/null
+++ b/docs/images/network.uml
@@ -0,0 +1,49 @@
+@startuml
+
+nwdiag {
+ inet [ shape = cloud ];
+ inet -- inria_gw;
+
+ network VLAN210 {
+ louvre [address = "VPN" ];
+ inria_gw [description = "INRIA GW"];
+ }
+ network VLAN1300 {
+ workers;
+ kafka;
+ inria_gw;
+ forge;
+ pergamon;
+
+ group {
+ description = "FIREWALLS";
+
+ pushkin;
+ glyptotek;
+ }
+
+ }
+ network VLAN440 {
+ workers;
+ pushkin;
+ glyptotek;
+ louvre;
+ forge;
+ kafka;
+ pergamon;
+ production_nodes [description = "Production nodes"];
+ }
+
+ network VLAN443 {
+ pushkin;
+ glyptotek;
+ staging_nodes [description = "Staging nodes"];
+ }
+
+ network VLAN442 {
+ pushkin;
+ glyptotek;
+ admin_nodes [description = "Admin nodes"];
+ }
+}
+@enduml
diff --git a/docs/index.rst b/docs/index.rst
index 567ca36..90e9e7c 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,222 +1,228 @@
.. _swh-docs:
Software Heritage - Development Documentation
=============================================
Getting started
---------------
* :ref:`getting-started` → deploy a local copy of the Software Heritage
software stack in less than 5 minutes, or
* :ref:`developer-setup` → get a working development setup that allows to hack
on the Software Heritage software stack
Contributing
------------
* :ref:`patch-submission` → learn how to submit your patches to the
Software Heritage codebase
* :ref:`code-review` → rules and guidelines to review code in
Software Heritage
* :ref:`python-style-guide` → how to format the Python code you write
Architecture
------------
* :ref:`architecture-overview` → get a glimpse of the Software Heritage software
architecture
* :ref:`mirror` → learn what a Software Heritage mirror is and how to set up
one
* :ref:`Keycloak ` → learn how to use Keycloak,
the authentication system used by |swh|'s web interface and public APIs
Data Model and Specifications
-----------------------------
* :ref:`persistent-identifiers` Specifications of the SoftWare Heritage persistent IDentifiers (SWHID).
* :ref:`data-model` Documentation of the main |swh| archive data model.
* :ref:`journal-specs` Documentation of the Kafka journal of the |swh| archive.
Tutorials
---------
* :ref:`testing-guide`
* :doc:`/tutorials/issue-debugging-monitoring`
* :ref:`Listing the content of your favorite forge `
and :ref:`running a lister in Docker `
Roadmap
-------
* :ref:`roadmap-2021`
+Engineering
+-----------
+
+* :ref:`infrastructure`
+
Components
----------
Here is brief overview of the most relevant software components in the Software
Heritage stack, in alphabetical order.
For a better introduction to the architecture, see the :ref:`architecture-overview`,
which presents each of them in a didactical order.
Each component name is linked to the development documentation
of the corresponding Python module.
:ref:`swh.auth `
low-level library used by modules needing keycloak authentication
:ref:`swh.core `
low-level utilities and helpers used by almost all other modules in the
stack
:ref:`swh.counters `
service providing efficient estimates of the number of objects in the SWH archive,
using Redis's Hyperloglog
:ref:`swh.dataset `
public datasets and periodic data dumps of the archive released by Software
Heritage
:ref:`swh.deposit `
push-based deposit of software artifacts to the archive
swh.docs
developer documentation (used to generate this doc you are reading)
:ref:`swh.fuse `
Virtual file system to browse the Software Heritage archive, based on
`FUSE `_
:ref:`swh.graph `
Fast, compressed, in-memory representation of the archive, with tooling to
generate and query it.
:ref:`swh.indexer `
tools and workers used to crawl the content of the archive and extract
derived information from any artifact stored in it
:ref:`swh.journal `
persistent logger of changes to the archive, with publish-subscribe support
:ref:`swh.lister `
collection of listers for all sorts of source code hosting and distribution
places (forges, distributions, package managers, etc.)
:ref:`swh.loader-core `
low-level loading utilities and helpers used by all other loaders
:ref:`swh.loader-git `
loader for `Git `_ repositories
:ref:`swh.loader-mercurial `
loader for `Mercurial `_ repositories
:ref:`swh.loader-svn `
loader for `Subversion `_ repositories
:ref:`swh.model `
implementation of the :ref:`data-model` to archive source code artifacts
:ref:`swh.objstorage `
content-addressable object storage
:ref:`swh.objstorage.replayer `
Object storage replication tool
:ref:`swh.scanner `
source code scanner to analyze code bases and compare them with source code
artifacts archived by Software Heritage
:ref:`swh.scheduler `
task manager for asynchronous/delayed tasks, used for recurrent (e.g.,
listing a forge, loading new stuff from a Git repository) and one-off
activities (e.g., loading a specific version of a source package)
:ref:`swh.search `
search engine for the archive
:ref:`swh.storage `
abstraction layer over the archive, allowing to access all stored source
code artifacts as well as their metadata
:ref:`swh.vault `
implementation of the vault service, allowing to retrieve parts of the
archive as self-contained bundles (e.g., individual releases, entire
repository snapshots, etc.)
:ref:`swh.web `
Web application(s) to browse the archive, for both interactive (HTML UI)
and mechanized (REST API) use
:ref:`swh.web.client `
Python client for :ref:`swh.web `
Dependencies
------------
The dependency relationships among the various modules are depicted below.
.. _py-deps-swh:
.. figure:: images/py-deps-swh.svg
:width: 1024px
:align: center
Dependencies among top-level Python modules (click to zoom).
Archive
-------
* :ref:`Archive ChangeLog `: notable changes to the archive
over time
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* `URLs index `_
* :ref:`search`
* :ref:`glossary`
.. ensure sphinx does not complain about index files not being included
.. toctree::
:maxdepth: 2
:caption: Contents:
:titlesonly:
:hidden:
getting-started/index
architecture/index
contributing/index
tutorials/index
roadmap/roadmap-2021.rst
+ infrastructure/index
swh.auth
swh.core
swh.counters
swh.dataset
swh.deposit
swh.fuse
swh.graph
swh.indexer
swh.journal
swh.lister
swh.loader
swh.model
swh.objstorage
swh.objstorage.replayer
swh.scanner
swh.scheduler
swh.search
swh.storage
swh.vault
swh.web
swh.web.client
archive-changelog
journal
Python modules autodocumentation
diff --git a/docs/infrastructure/index.rst b/docs/infrastructure/index.rst
new file mode 100644
index 0000000..422ac78
--- /dev/null
+++ b/docs/infrastructure/index.rst
@@ -0,0 +1,14 @@
+.. _infrastructure:
+
+Infrastructure
+##############
+
+.. keep this in sync with the 'sysadm' section in swh-docs/docs/index.rst
+
+This section gathers the knowledge base and procedures related to the management of the |swh| infrastructure.
+
+.. toctree::
+ :maxdepth: 2
+ :titlesonly:
+
+ network
diff --git a/docs/infrastructure/network.rst b/docs/infrastructure/network.rst
new file mode 100644
index 0000000..36b9d15
--- /dev/null
+++ b/docs/infrastructure/network.rst
@@ -0,0 +1,151 @@
+Network documentation
+#####################
+
+.. keep this in sync with the 'sysadm' section in swh-docs/docs/index.rst
+
+This section gathers the knowledge base for our network components.
+
+
+.. toctree::
+ :maxdepth: 2
+ :titlesonly:
+
+
+Network architecture
+********************
+
+The network is split into several VLANs provided by the INRIA network team:
+
+.. thumbnail:: ../images/network.png
+
+
+Firewalls
+=========
+
+The firewalls are two `OPNsense `_ VMs deployed on the PROXMOX cluster in a `High Availability `_ configuration.
+
+They share a virtual IP on each VLAN, which acts as the gateway. At any given time, only one of the two firewalls owns all the gateway IPs. This owner is called the ``PRIMARY``.
+
+.. list-table::
+ :header-rows: 1
+
+ * - Nominal Role
+ - name (link to the inventory)
+ - login page
+ * - PRIMARY
+ - `pushkin `_
+ - `https://pushkin.internal.softwareheritage.org `_
+ * - BACKUP
+ - `glyptotek `_
+ - `https://glyptotek.internal.softwareheritage.org `_
+
+
+Configuration backup
+--------------------
+
+The configuration is automatically committed to a `git repository `_.
+Each firewall regularly pushes its configuration to a dedicated branch of the repository.
+
+The configuration is visible on the `System / Configuration / Backups `_ page
+of each firewall.
+
+Upgrade procedure
+-----------------
+
+Initial status
+^^^^^^^^^^^^^^
+
+This is the nominal status of the firewalls:
+
+.. list-table::
+ :header-rows: 1
+
+ * - Firewall
+ - Status
+ * - pushkin
+ - PRIMARY
+ * - glyptotek
+ - BACKUP
+
+Preparation
+^^^^^^^^^^^
+
+* Connect to the `principal `_ (pushkin here)
+* Check the `CARP status `_ to ensure the firewall is the principal (it must have the ``MASTER`` status for all the IPs)
+* Connect to the `backup `_ (glyptotek here)
+* Check the `CARP status `_ to ensure the firewall is the backup (it must have the ``BACKUP`` status for all the IPs)
+* Ensure the 2 firewalls are in sync:
+
+ * On the principal, go to the `High availability status `_ and force a synchronization
+ * click on the button on the right of ``Synchronize config to backup``
+ .. image:: ../images/infrastructure/network/sync.png
+
+* Switch the principal/backup to prepare the upgrade of the principal
+ (the switch is transparent to users and can be done without service interruption)
+
+ * [1] On the principal, go to the `Virtual IPs status `_ page
+ * Activate the CARP maintenance mode
+ .. image:: ../images/infrastructure/network/carp_maintenance.png
+ * Check the status of the VIPs: they must be ``BACKUP`` on pushkin and ``MASTER`` on glyptotek
+
+
+* Wait a few minutes to let the monitoring detect any connection issues, and check that SSH connections work to several servers on different VLANs (staging, admin, ...)
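+
+The SSH check can be partly scripted. The following is a minimal Python sketch, not an official tool: the host names in ``HOSTS`` are hypothetical placeholders to replace with real servers from each VLAN, and the function ``ssh_port_open`` is introduced here for illustration only. It only verifies that TCP port 22 accepts connections through the firewall; it does not attempt an actual SSH login.
+
+.. code-block:: python
+
+   import socket
+
+   # Hypothetical hosts, one per VLAN; replace with real servers.
+   HOSTS = {
+       "staging": "staging-node.example.org",
+       "admin": "admin-node.example.org",
+   }
+
+
+   def ssh_port_open(host, port=22, timeout=5.0):
+       """Return True if a TCP connection to host:port succeeds in time."""
+       try:
+           with socket.create_connection((host, port), timeout=timeout):
+               return True
+       except OSError:
+           # Covers refused connections, timeouts and DNS failures.
+           return False
+
+
+   if __name__ == "__main__":
+       for vlan, host in HOSTS.items():
+           state = "ok" if ssh_port_open(host) else "UNREACHABLE"
+           print(f"{vlan:10} {host:40} {state}")
+
+Running it once before and once after the switch makes it easier to tell whether an unreachable host is related to the firewall change.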
+
+If everything is ok, proceed to the next section.
+
+
+Upgrade the first firewall
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Before starting this section, the firewall statuses should be:
+
+.. list-table::
+ :header-rows: 1
+
+ * - Firewall
+ - Status
+ * - pushkin
+ - BACKUP
+ * - glyptotek
+ - PRIMARY
+
+If not, be sure of what you are doing and adapt the links accordingly.
+
+* [2] go to the `System Firmware: status `_ page (pushkin here)
+* Click on the ``Check for upgrades`` button
+.. image:: ../images/infrastructure/network/check_for_upgrade.png
+* Follow the indications in the interface; one or several reboots may be necessary depending on the number of upgrades to apply
+.. image:: ../images/infrastructure/network/proceed_update.png
+* Repeat from the ``Check for upgrades`` operation until there are no upgrades left to apply
+* Switch the principal/backup to restore ``pushkin`` as the principal:
+
+ * on the current backup (pushkin here) go to `Virtual IPs status `_
+ * [3] click on ``Leave Persistent CARP Maintenance Mode``
+ .. image:: ../images/infrastructure/network/reactivate_carp.png
+ * refresh the page, the role should have changed from ``BACKUP`` to ``MASTER``
+ * check on the other firewall that the role is indeed ``BACKUP`` for all the IPs
+
+* Wait a few moments to ensure everything is ok with the new version
+
+Upgrade the second firewall
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Before starting this section, the firewall statuses should be:
+
+.. list-table::
+ :header-rows: 1
+
+ * - Firewall
+ - Status
+ * - pushkin
+ - PRIMARY
+ * - glyptotek
+ - BACKUP
+
+If not, be sure of what you are doing and adapt the links accordingly.
+
+* Proceed to the second firewall upgrade
+
+ * perform [1] on the backup (should be ``glyptotek`` here)
+ * perform [2] on the backup (should be ``glyptotek`` here)
+ * perform [3] on the backup (should be ``glyptotek`` here)