docs/architecture/mirror.rst
[... 30 lines not shown ...]
 blob objects (file content) from the |swh| :ref:`object storage <swh-objstorage>`.

 .. note:: It is not required that a mirror is deployed using the |swh| software
    stack. Other technologies, including different storage methods, can be
    used. This documentation, however, focuses on the case of mirror
    deployment using the |swh| software stack.

-.. thumbnail:: images/mirror-architecture.svg
+.. thumbnail:: ../images/mirror-architecture.svg

    General view of the |swh| mirroring architecture.

 Mirroring the Graph Storage
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

 The replication of the graph is based on a journal using Kafka_ as event
[... 21 lines not shown ...]
 In order to set up a mirror of the graph, one needs to deploy a stack capable
 of retrieving all these topics and storing their content reliably. For example, a
 Kafka cluster configured as a replica of the main Kafka broker hosted by |swh|
 would do the job (albeit not in a very useful manner by itself).

 A more useful mirror can be set up using the :ref:`storage <swh-storage>`
 component with the help of the special service named `replayer` provided by the
-:doc:`apidoc/swh.storage.replay` module.
+:mod:`swh.storage.replay` module.

 .. TODO: replace this previous link by a link to the 'swh storage replay'
    command once available, and ideally once
    https://github.com/sphinx-doc/sphinx/issues/880 is fixed

 Mirroring the Object Storage
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[... 19 lines not shown ...]
 When using the |swh| software stack to deploy a mirror, a number of |swh|
 software components must be installed (cf. architecture diagram above):

 - a database to store the graph of the |swh| archive,
 - the :ref:`swh-storage` component,
 - an object storage solution (can be cloud-based or on local filesystem like
   ZFS pools),
 - the :ref:`swh-objstorage` component,
-- the :ref:`swh.storage.replay` service (part of the :ref:`swh-storage`
+- the :mod:`swh.storage.replay` service (part of the :ref:`swh-storage`
ardumont: What's the difference between the 2?
vlorentz: `:ref:` points to an explicit target, defined with `.. _swh.storage.replay:`. https://www.sphinx-doc.org/en/master/usage/restructuredtext/roles.html#cross-referencing-arbitrary-locations `:mod:` points to a module definition, defined with `.. py:module::`. https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html
   package)
-- the :ref:`swh.objstorage.replayer.replay` service (from the
+- the :mod:`swh.objstorage.replayer.replay` service (from the
   :ref:`swh-objstorage-replayer` package).
 A `docker-swarm <https://docs.docker.com/engine/swarm/>`_ based deployment
 solution is provided as a working example of the mirror stack:
 https://forge.softwareheritage.org/source/swh-docker
 It is strongly recommended to start from there before planning a
[... 10 lines not shown ...]
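As a side note on the `:ref:` versus `:mod:` question raised in the inline comments, the distinction can be sketched with a small reStructuredText fragment. This is only an illustration under stated assumptions: the section title "Replayer service" is hypothetical, and the fragment assumes the Python domain is the default domain (so that `:mod:` is shorthand for `:py:mod:`). It is not taken from the |swh| documentation sources.

```rst
.. Explicit target (label) placed right before a section title.
.. _swh.storage.replay:

Replayer service
----------------

.. Python-domain module declaration, registered under the dotted module name.
.. py:module:: swh.storage.replay

Elsewhere in the documentation:

- :ref:`swh.storage.replay` resolves to the labelled section above
  and uses the section title as link text.
- :mod:`swh.storage.replay` resolves to the module declaration
  and renders the dotted module name as link text.
```

If only the module declaration exists (for instance in generated API pages) and no matching label is defined, the `:ref:` form has nothing to resolve against, which is consistent with the switch to `:mod:` in this diff.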