
[cassandra] deploy the replaying stack
Closed, Migrated

Description

In order to ingest data into Cassandra, we need a replayer stack.

The simplest approach is probably to deploy the storage/replayer service in Kubernetes, as was done during the test on grid5000.

For the record, the configuration used for the grid5000 deployment is here:
https://forge.softwareheritage.org/source/snippets/browse/master/sysadmin/grid5000/cassandra/kubernetes/

Event Timeline

vsellier triaged this task as Normal priority. Jul 12 2022, 12:13 PM
vsellier created this task.
vsellier renamed this task from [cassandra] deploy the replying stack to [cassandra] deploy the replaying stack. Jul 12 2022, 3:29 PM
vsellier changed the task status from Open to Work in Progress. Aug 12 2022, 11:09 AM
vsellier moved this task from Backlog to in-progress on the System administration board.

Almost everything is ready to start the ingestion tests:

  • The deployment of the replayers is implemented
  • The monitoring can be deployed easily in a Kubernetes cluster that has the Prometheus operator installed [1]
  • The Grafana dashboards are also available as templates [1]

We now need to create the infrastructure to support a high number of replayers. During the test they ran on the Cassandra servers, which is not a good setup because it can affect Cassandra's behavior.
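
One possible way to keep replayer pods off the Cassandra servers is a node affinity rule on the replayer Deployment. The following is only a minimal sketch and not the configuration actually used: the label key `example.org/cassandra`, the deployment name, and the image are hypothetical placeholders, and it assumes the Cassandra nodes carry such a label.

```python
import json


def replayer_deployment(name, image, replicas,
                        cassandra_label="example.org/cassandra"):
    """Build a Deployment manifest whose pods avoid nodes labelled `cassandra_label`.

    All names and the label key are hypothetical; adapt them to the real cluster.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": {
                        "nodeAffinity": {
                            # Hard requirement: only schedule on nodes that
                            # do NOT have the (assumed) Cassandra label.
                            "requiredDuringSchedulingIgnoredDuringExecution": {
                                "nodeSelectorTerms": [
                                    {
                                        "matchExpressions": [
                                            {
                                                "key": cassandra_label,
                                                "operator": "DoesNotExist",
                                            }
                                        ]
                                    }
                                ]
                            }
                        }
                    },
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = replayer_deployment(
        "replayer-example", "example.org/replayer:latest", replicas=8
    )
    # JSON output can be saved to a file and applied with kubectl.
    print(json.dumps(manifest, indent=2))
```

With a rule like this, scaling the number of replayers only consumes capacity on nodes without the Cassandra label, so dedicated worker nodes would be needed to host them.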

[1] https://archive.softwareheritage.org/swh:1:dir:fe0c1915ac60876f06e4d76b723e04284361e7d4;origin=https://forge.softwareheritage.org/source/k8s-clusters-conf.git;visit=swh:1:snp:66a35583e901a1a5a62b4097fcd64e822316e80e;anchor=swh:1:rev:1fef5d96e81bcd55c84a759a635c6a47ec19f155;path=/production-cassandra/monitoring/