F7124834
D8854.diff
diff --git a/sysadm/data-silos/kafka/index.rst b/sysadm/data-silos/kafka/index.rst
--- a/sysadm/data-silos/kafka/index.rst
+++ b/sysadm/data-silos/kafka/index.rst
@@ -3,5 +3,10 @@
Kafka
=====
+.. toctree::
+ :titlesonly:
+
+ manage-topics
+
.. todo::
This page is a work in progress.
diff --git a/sysadm/data-silos/kafka/manage-topics.rst b/sysadm/data-silos/kafka/manage-topics.rst
new file mode 100644
--- /dev/null
+++ b/sysadm/data-silos/kafka/manage-topics.rst
@@ -0,0 +1,128 @@
+.. _manage-topics:
+
+Reference: Manage topics
+========================
+
+.. admonition:: Intended audience
+ :class: important
+
+ sysadm members
+
+
+This page describes the common topic management actions we periodically run during
+deployments.
+
+Run the commands below from one of the Kafka cluster nodes (staging or
+production).
+
+Common configuration
+--------------------
+
+Once connected to a node, set the following environment variables:
+
+::
+
+ KAFKA_TOPICS=/opt/kafka/bin/kafka-topics.sh
+ KAFKA_CONSUMER_GROUPS=/opt/kafka/bin/kafka-consumer-groups.sh
+
+
+staging
+-------
+
+The following setup is expected for staging:
+
+::
+
+ export KAFKA_SERVER=storage1.internal.staging.swh.network:9092
+ export NB_PARTITIONS=64
+ export REPLICATION_FACTOR=1
+
+production
+----------
+
+The following setup is expected for production:
+
+::
+
+ export KAFKA_SERVER=kafka1.internal.softwareheritage.org:9092
+ export NB_PARTITIONS=256
+ export REPLICATION_FACTOR=2
+
+
+Note: Refer to the :ref:`service-url` page to determine which Kafka server to use.
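The two sets of exports above can be selected by environment name. This is a minimal sketch: the `kafka_env` function name is hypothetical and not part of the original page, while the values are the ones listed above.

```shell
# Hypothetical helper (not part of the original page): export the settings
# documented above for the given environment name.
kafka_env() {
    case "$1" in
        staging)
            export KAFKA_SERVER=storage1.internal.staging.swh.network:9092
            export NB_PARTITIONS=64
            export REPLICATION_FACTOR=1
            ;;
        production)
            export KAFKA_SERVER=kafka1.internal.softwareheritage.org:9092
            export NB_PARTITIONS=256
            export REPLICATION_FACTOR=2
            ;;
        *)
            # Unknown environment: leave the variables untouched.
            echo "usage: kafka_env staging|production" >&2
            return 1
            ;;
    esac
}
```

For example, `kafka_env staging` prepares the shell for the staging commands shown in the rest of this page.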
+
+
+Topic naming
+------------
+
+
+Existing topics are prefixed with one of the following names:
+
+- swh.journal.objects: topics receiving the objects written (by loaders) in the
+ archive
+- swh.journal.indexed: topics receiving the objects written by the indexers (derived
+ from the objects in the archive)
+
+.. warning:: FIXME: Update topics with test topics and eventually the new provenance
+ ones
+
+Examples:
+
+- `swh.journal.objects.directory`
+- `swh.journal.indexed.origin_extrinsic_metadata`
+- ...
+
+
+Topic (re)creation
+------------------
+
+.. warning:: Topics must be created before any service consumes from or writes to
+ them. Otherwise, they will be created with the wrong configuration.
+
+
+If a topic was created without the proper configuration, delete it:
+
+.. code::
+
+ $KAFKA_TOPICS --bootstrap-server $KAFKA_SERVER \
+ --topic $TOPIC \
+ --delete
+
+Then recreate it with the proper setup:
+
+.. code::
+
+ $KAFKA_TOPICS --bootstrap-server $KAFKA_SERVER \
+ --create --topic $TOPIC \
+ --partitions $NB_PARTITIONS \
+ --replication-factor $REPLICATION_FACTOR
+
+Note: This assumes the `TOPIC` variable is set appropriately.
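The delete and create steps above can be chained in one function. This is a sketch under assumptions: the `recreate_topic` name and the `DRY_RUN` variable are hypothetical, and by default the function only prints the two commands so they can be reviewed first.

```shell
# Hypothetical wrapper (not part of the original page). It expects
# KAFKA_TOPICS, KAFKA_SERVER, NB_PARTITIONS and REPLICATION_FACTOR to be set
# as described earlier. Unless DRY_RUN=0, the commands are printed, not run.
recreate_topic() {
    topic="$1"
    if [ -z "$topic" ]; then
        echo "usage: recreate_topic TOPIC" >&2
        return 1
    fi
    # "echo" prefix for the dry run, empty prefix for the real run.
    if [ "${DRY_RUN:-1}" = "0" ]; then run=""; else run="echo"; fi
    # Step 1: delete the misconfigured topic.
    $run "$KAFKA_TOPICS" --bootstrap-server "$KAFKA_SERVER" \
        --topic "$topic" --delete || return 1
    # Step 2: create it again with the expected partitions and replication.
    $run "$KAFKA_TOPICS" --bootstrap-server "$KAFKA_SERVER" \
        --create --topic "$topic" \
        --partitions "$NB_PARTITIONS" \
        --replication-factor "$REPLICATION_FACTOR"
}
```

Calling `recreate_topic swh.journal.objects.directory` prints both commands; `DRY_RUN=0 recreate_topic swh.journal.objects.directory` runs them. Topic deletion may not complete immediately, so the create step may need to be retried; this is worth checking on the cluster.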
+
+
+Consumer group reset
+--------------------
+
+Once in a while, due to a change in data format or a migration, we need to reset the
+consumer group offsets so the consumers can restart from scratch.
+
+Examples of consumer groups:
+
+- `swh.indexer.journal_client.extrinsic_metadata`: anonymous consumer group
+- `swh-indexer-prod-01-swh.indexer.journal_client.extrinsic_metadata`: authenticated
+ production consumer group, prefixed by the user name
+
+.. code::
+
+ $KAFKA_CONSUMER_GROUPS \
+ --bootstrap-server $KAFKA_SERVER \
+ --group $CONSUMER_GROUP \
+ --reset-offsets --to-earliest \
+ --all-topics
+
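+Notes:
+
+- This assumes the `CONSUMER_GROUP` variable is set appropriately.
+- Without the `--execute` option, the tool only reports the offsets it would set; add
+ `--execute` to actually apply the reset.
+- To restrict the reset to specific topics, replace `--all-topics` with one
+ `--topic $TOPIC` option per topic involved.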
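Restricting the reset to a few topics means repeating `--topic` once per topic. The sketch below builds that command line; the `reset_offsets_cmd` name is hypothetical, and the function only prints the command for review instead of running it.

```shell
# Hypothetical helper (not part of the original page): print the reset command
# for a consumer group. With no topic arguments it uses --all-topics,
# otherwise one --topic option per topic given.
reset_offsets_cmd() {
    group="$1"
    if [ -z "$group" ]; then
        echo "usage: reset_offsets_cmd GROUP [TOPIC...]" >&2
        return 1
    fi
    shift
    if [ "$#" -eq 0 ]; then
        scope="--all-topics"
    else
        scope=""
        for t in "$@"; do
            scope="$scope --topic $t"
        done
        scope="${scope# }"   # drop the leading space
    fi
    echo "$KAFKA_CONSUMER_GROUPS --bootstrap-server $KAFKA_SERVER" \
        "--group $group --reset-offsets --to-earliest $scope"
}
```

For example, `reset_offsets_cmd "$CONSUMER_GROUP" swh.journal.objects.directory` prints a command limited to that single topic.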
Attached To
D8854: sysadm/data-silos/kafka: Bootstrap topics management page