diff --git a/docs/images/infrastructure/network/carp_maintenance.png b/docs/images/infrastructure/network/carp_maintenance.png new file mode 100644 index 0000000..13428d1 Binary files /dev/null and b/docs/images/infrastructure/network/carp_maintenance.png differ diff --git a/docs/images/infrastructure/network/check_for_upgrade.png b/docs/images/infrastructure/network/check_for_upgrade.png new file mode 100644 index 0000000..f0f233b Binary files /dev/null and b/docs/images/infrastructure/network/check_for_upgrade.png differ diff --git a/docs/images/infrastructure/network/proceed_update.png b/docs/images/infrastructure/network/proceed_update.png new file mode 100644 index 0000000..e9b3fc0 Binary files /dev/null and b/docs/images/infrastructure/network/proceed_update.png differ diff --git a/docs/images/infrastructure/network/reactivate_carp.png b/docs/images/infrastructure/network/reactivate_carp.png new file mode 100644 index 0000000..18beb21 Binary files /dev/null and b/docs/images/infrastructure/network/reactivate_carp.png differ diff --git a/docs/images/infrastructure/network/sync.png b/docs/images/infrastructure/network/sync.png new file mode 100644 index 0000000..c210c45 Binary files /dev/null and b/docs/images/infrastructure/network/sync.png differ diff --git a/docs/images/network.png b/docs/images/network.png new file mode 100644 index 0000000..42fbeb7 Binary files /dev/null and b/docs/images/network.png differ diff --git a/docs/images/network.svg b/docs/images/network.svg new file mode 100644 index 0000000..4ec134d --- /dev/null +++ b/docs/images/network.svg @@ -0,0 +1,62 @@ +[From network.uml (line 19) ]... (skipping 15 lines) ...pergamon;group {description = "<b>FIREWALLS</b>";Syntax Error? 
\ No newline at end of file diff --git a/docs/images/network.uml b/docs/images/network.uml new file mode 100644 index 0000000..10d0267 --- /dev/null +++ b/docs/images/network.uml @@ -0,0 +1,49 @@ +@startuml + +nwdiag { + inet [ shape = cloud ]; + inet -- inria_gw; + + network VLAN210 { + louvre [address = "VPN" ]; + inria_gw [description = "INRIA GW"]; + } + network VLAN1300 { + workers; + kafka; + inria_gw; + forge; + pergamon; + + group { + description = "FIREWALLS"; + + pushkin; + glyptotek; + } + + } + network VLAN440 { + workers; + pushkin; + glyptotek; + louvre; + forge; + kafka; + pergamon; + production_nodes [description = "Production nodes"]; + } + + network VLAN443 { + pushkin; + glyptotek; + staging_nodes [description = "Staging nodes"]; + } + + network VLAN442 { + pushkin; + glyptotek; + admin_nodes [description = "Admin nodes"]; + } +} +@enduml diff --git a/docs/index.rst b/docs/index.rst index 567ca36..90e9e7c 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -1,222 +1,228 @@ .. 
_swh-docs: Software Heritage - Development Documentation ============================================= Getting started --------------- * :ref:`getting-started` → deploy a local copy of the Software Heritage software stack in less than 5 minutes, or * :ref:`developer-setup` → get a working development setup that allows to hack on the Software Heritage software stack Contributing ------------ * :ref:`patch-submission` → learn how to submit your patches to the Software Heritage codebase * :ref:`code-review` → rules and guidelines to review code in Software Heritage * :ref:`python-style-guide` → how to format the Python code you write Architecture ------------ * :ref:`architecture-overview` → get a glimpse of the Software Heritage software architecture * :ref:`mirror` → learn what a Software Heritage mirror is and how to set up one * :ref:`Keycloak ` → learn how to use Keycloak, the authentication system used by |swh|'s web interface and public APIs Data Model and Specifications ----------------------------- * :ref:`persistent-identifiers` Specifications of the SoftWare Heritage persistent IDentifiers (SWHID). * :ref:`data-model` Documentation of the main |swh| archive data model. * :ref:`journal-specs` Documentation of the Kafka journal of the |swh| archive. Tutorials --------- * :ref:`testing-guide` * :doc:`/tutorials/issue-debugging-monitoring` * :ref:`Listing the content of your favorite forge ` and :ref:`running a lister in Docker ` Roadmap ------- * :ref:`roadmap-2021` +Engineering +----------- + +* :ref:`infrastructure` + Components ---------- Here is brief overview of the most relevant software components in the Software Heritage stack, in alphabetical order. For a better introduction to the architecture, see the :ref:`architecture-overview`, which presents each of them in a didactical order. Each component name is linked to the development documentation of the corresponding Python module. 
:ref:`swh.auth ` low-level library used by modules needing keycloak authentication :ref:`swh.core ` low-level utilities and helpers used by almost all other modules in the stack :ref:`swh.counters ` service providing efficient estimates of the number of objects in the SWH archive, using Redis's Hyperloglog :ref:`swh.dataset ` public datasets and periodic data dumps of the archive released by Software Heritage :ref:`swh.deposit ` push-based deposit of software artifacts to the archive swh.docs developer documentation (used to generate this doc you are reading) :ref:`swh.fuse ` Virtual file system to browse the Software Heritage archive, based on `FUSE `_ :ref:`swh.graph ` Fast, compressed, in-memory representation of the archive, with tooling to generate and query it. :ref:`swh.indexer ` tools and workers used to crawl the content of the archive and extract derived information from any artifact stored in it :ref:`swh.journal ` persistent logger of changes to the archive, with publish-subscribe support :ref:`swh.lister ` collection of listers for all sorts of source code hosting and distribution places (forges, distributions, package managers, etc.) 
:ref:`swh.loader-core ` low-level loading utilities and helpers used by all other loaders :ref:`swh.loader-git ` loader for `Git `_ repositories :ref:`swh.loader-mercurial ` loader for `Mercurial `_ repositories :ref:`swh.loader-svn ` loader for `Subversion `_ repositories :ref:`swh.model ` implementation of the :ref:`data-model` to archive source code artifacts :ref:`swh.objstorage ` content-addressable object storage :ref:`swh.objstorage.replayer ` Object storage replication tool :ref:`swh.scanner ` source code scanner to analyze code bases and compare them with source code artifacts archived by Software Heritage :ref:`swh.scheduler ` task manager for asynchronous/delayed tasks, used for recurrent (e.g., listing a forge, loading new stuff from a Git repository) and one-off activities (e.g., loading a specific version of a source package) :ref:`swh.search ` search engine for the archive :ref:`swh.storage ` abstraction layer over the archive, allowing to access all stored source code artifacts as well as their metadata :ref:`swh.vault ` implementation of the vault service, allowing to retrieve parts of the archive as self-contained bundles (e.g., individual releases, entire repository snapshots, etc.) :ref:`swh.web ` Web application(s) to browse the archive, for both interactive (HTML UI) and mechanized (REST API) use :ref:`swh.web.client ` Python client for :ref:`swh.web ` Dependencies ------------ The dependency relationships among the various modules are depicted below. .. _py-deps-swh: .. figure:: images/py-deps-swh.svg :width: 1024px :align: center Dependencies among top-level Python modules (click to zoom). Archive ------- * :ref:`Archive ChangeLog `: notable changes to the archive over time Indices and tables ================== * :ref:`genindex` * :ref:`modindex` * `URLs index `_ * :ref:`search` * :ref:`glossary` .. ensure sphinx does not complain about index files not being included .. 
toctree:: :maxdepth: 2 :caption: Contents: :titlesonly: :hidden: getting-started/index architecture/index contributing/index tutorials/index roadmap/roadmap-2021.rst + infrastructure/index swh.auth swh.core swh.counters swh.dataset swh.deposit swh.fuse swh.graph swh.indexer swh.journal swh.lister swh.loader swh.model swh.objstorage swh.objstorage.replayer swh.scanner swh.scheduler swh.search swh.storage swh.vault swh.web swh.web.client archive-changelog journal Python modules autodocumentation diff --git a/docs/infrastructure/index.rst b/docs/infrastructure/index.rst new file mode 100644 index 0000000..422ac78 --- /dev/null +++ b/docs/infrastructure/index.rst @@ -0,0 +1,14 @@ +.. _infrastructure: + +Infrastructure +############## + +.. keep this in sync with the 'sysadm' section in swh-docs/docs/index.rst + +This section gathers the knowledge base and procedures related to the management of the |swh| infrastructure. + +.. toctree:: + :maxdepth: 2 + :titlesonly: + + network diff --git a/docs/infrastructure/network.rst b/docs/infrastructure/network.rst new file mode 100644 index 0000000..36b9d15 --- /dev/null +++ b/docs/infrastructure/network.rst @@ -0,0 +1,151 @@ +Network documentation +##################### + +.. keep this in sync with the 'sysadm' section in swh-docs/docs/index.rst + +This section gathers the knowledge base for our network components. + + +.. toctree:: + :maxdepth: 2 + :titlesonly: + + +Network architecture +******************** + +The network is split into several VLANs provided by the INRIA network team: + +.. thumbnail:: ../images/network.png + + +Firewalls +========= + +The firewalls are two `OPNsense `_ VMs deployed on the PROXMOX cluster with a `High Availability `_ configuration. + +They share a virtual IP on each VLAN, which acts as the gateway. Only one of the two firewalls owns all the gateway IPs at any given time. The owner is called the ``PRIMARY``. + +.. 
list-table:: + :header-rows: 1 + + * - Nominal role + - Name (link to the inventory) + - Login page + * - PRIMARY + - `pushkin `_ + - `https://pushkin.internal.softwareheritage.org `_ + * - BACKUP + - `glyptotek `_ + - `https://glyptotek.internal.softwareheritage.org `_ + + +Configuration backup +-------------------- + +The configuration is automatically committed to a `git repository `_. +Each firewall regularly pushes its configuration to a dedicated branch of the repository. + +The configuration is visible on the `System / Configuration / Backups `_ page +of each one. + +Upgrade procedure +----------------- + +Initial status +^^^^^^^^^^^^^^ + +This is the nominal status of the firewalls: + +.. list-table:: + :header-rows: 1 + + * - Firewall + - Status + * - pushkin + - PRIMARY + * - glyptotek + - BACKUP + +Preparation +^^^^^^^^^^^ + +* Connect to the `principal `_ (pushkin here) +* Check the `CARP status `_ to ensure the firewall is the principal (must have the status MASTER for all the IPs) +* Connect to the `backup `_ (glyptotek here) +* Check the `CARP status `_ to ensure the firewall is the backup (must have the status BACKUP for all the IPs) +* Ensure the two firewalls are in sync: + + * On the principal, go to the `High availability status `_ and force a synchronization + * Click on the button to the right of ``Synchronize config to backup`` + .. image:: ../images/infrastructure/network/sync.png + +* Switch the principal/backup to prepare the upgrade of the principal + (The switch is transparent from the user's perspective and can be done without service interruption) + + * [1] On the principal, go to the `Virtual IPS status `_ page + * Activate the CARP maintenance mode + .. 
image:: ../images/infrastructure/network/carp_maintenance.png + * Check the status of the VIPs: they must be ``BACKUP`` on pushkin and ``PRIMARY`` on glyptotek + + +* Wait a few minutes to let the monitoring detect any connection issues, and check the SSH connection to several servers on different VLANs (staging, admin, ...) + +If everything is OK, proceed to the next section. + + +Upgrade the first firewall +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Before starting this section, the firewall statuses should be: + +.. list-table:: + :header-rows: 1 + + * - Firewall + - Status + * - pushkin + - BACKUP + * - glyptotek + - PRIMARY + +If not, be sure of what you are doing and adapt the links accordingly + +* [2] Go to the `System Firmware: status `_ page (pushkin here) +* Click on the ``Check for upgrades`` button +.. image:: ../images/infrastructure/network/check_for_upgrade.png +* Follow the instructions in the interface; one or several reboots may be necessary, depending on the number of upgrades to apply +.. image:: ../images/infrastructure/network/proceed_update.png +* Repeat from the ``Check for upgrades`` operation until there are no more upgrades to apply +* Switch the principal/backup to restore ``pushkin`` as the principal: + + * On the current backup (pushkin here) go to `Virtual IPS status `_ + * [3] Click on ``Leave Persistent CARP Maintenance Mode`` + .. image:: ../images/infrastructure/network/reactivate_carp.png + * Refresh the page; the role should have changed from ``BACKUP`` to ``MASTER`` + * Check on the other firewall that the role is indeed ``BACKUP`` for all the IPs + +* Wait a few moments to ensure everything is OK with the new version + +Upgrade the second firewall +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Before starting this section, the firewall statuses should be: + +.. 
list-table:: + :header-rows: 1 + + * - Firewall + - Status + * - pushkin + - PRIMARY + * - glyptotek + - BACKUP + +If not, be sure of what you are doing and adapt the links accordingly + +* Proceed to the second firewall upgrade + + * perform [1] on the backup (should be ``glyptotek`` here) + * perform [2] on the backup (should be ``glyptotek`` here) + * perform [3] on the backup (should be ``glyptotek`` here)
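The preparation steps ask to check the ssh connection on several servers on different VLANs after each principal/backup switch. A minimal sketch of how that check could be scripted is shown here; it only tests that TCP port 22 accepts a connection, it is not part of the documented procedure, and the function names are hypothetical helpers introduced for illustration.

```python
import socket


def ssh_port_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_hosts(hosts, probe=ssh_port_reachable):
    """Map each host name to the result of the reachability probe."""
    return {host: probe(host) for host in hosts}


def unreachable(results):
    """Return the sorted list of hosts whose probe failed."""
    return sorted(host for host, ok in results.items() if not ok)
```

Calling ``unreachable(check_hosts([...]))`` with one host per VLAN (staging, admin, ...) returns the hosts that did not answer; an empty list suggests the switch went through without breaking connectivity. The host list is left to the operator, since no server names for these VLANs are given here.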