
cassandra: Refactor the former installation scripts

Authored by vsellier on Aug 12 2022, 11:01 AM.


  • Move the previous yaml base configuration to a puppet template
  • Install cassandra from the archive instead of the debian packages; this gives more flexibility for the multi-instance configuration
  • Support multiple instances per server
  • Centralize the configuration to the cassandra.yaml file

There is still some work to do:

  • Manage more cassandra configuration properties
  • Add the tcp port monitoring
  • Wire the exporter metrics to one of the prometheus instances

Related to T4373

Diff Detail

rSPSITE puppet-swh-site
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Why do you include It should be shipped with Cassandra.

eg. it's in conf/ in

I specified why on the first line of the file, but I realize it's not very visible.

The JMX_PORT variable is hardcoded in the file shipped with cassandra.
I've updated lines 229->231 to check whether JMX_PORT is already specified as an environment variable.

It's necessary to be able to start several instances on the same server.
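For illustration, the check described above could look roughly like the sketch below. This is not the actual patch: the `resolve_jmx_port` helper and the second-instance port 7200 are hypothetical, and 7199 is assumed to be the default because it is Cassandra's usual JMX port.

```shell
#!/bin/sh
# Sketch only: keep a JMX_PORT already set by the caller (e.g. a per-instance
# service unit), otherwise fall back to the default.
resolve_jmx_port() {
    # 7199 is assumed here as the default JMX port
    echo "${JMX_PORT:-7199}"
}

# Hypothetical second instance on the same server, with its own port
JMX_PORT=7200
SECOND_PORT=$(resolve_jmx_port)

# No JMX_PORT set: the default is used
unset JMX_PORT
DEFAULT_PORT=$(resolve_jmx_port)

echo "second instance: $SECOND_PORT, default: $DEFAULT_PORT"
```

With this shape, each instance only needs a distinct JMX_PORT in its environment; the rest of the file stays identical across instances.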

Setting -Dcassandra.jmx.local.port / -Dcassandra.jmx.remote.port / in the $JVM_EXTRA_OPTS env var should override what this file configured

In our case, it should be -Dcassandra.jmx.local.port= and
I tried to avoid dealing with the jmx configuration in puppet. Also, the script will still try to specify the jmx configuration, local or remote (l. 240/243).
The parameters will then be declared several times, and we will rely on how the jvm deals with the order of the parameters.

I'm really not a fan of erb templates for yaml configuration files (specifically, seed_provider: <%= @config["seed_provider"].to_yaml().delete_prefix("---") %> is pretty jarring). I agree that inlining the full default config was not a good idea, though.

For the, java seems to always parse the options left to right, and the rightmost one wins. So we should be safe to put the JMX definitions in JVM_EXTRA_OPTS. Probably.

(Of course, this was documented behavior for java 8, but it's not documented anymore)
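To make the "rightmost one wins" assumption concrete, here is a small shell model of it. It only mimics the left-to-right, last-definition-wins behaviour discussed above and proves nothing about the JVM itself. The `last_value` helper and the port values are hypothetical, and the sketch assumes the extra options end up after the ones set by the script.

```shell
#!/bin/sh
# Sketch only: model "last definition wins" for a -D property that is
# declared several times on a command line.
last_value() {
    # $1 = property name, remaining arguments = JVM options in order
    prop="$1"
    shift
    value=""
    for opt in "$@"; do
        case "$opt" in
            # Each later definition overwrites the earlier one
            "-D${prop}="*) value="${opt#*=}" ;;
        esac
    done
    echo "$value"
}

# Value set by the script itself (hypothetical)
JVM_OPTS="-Dcassandra.jmx.local.port=7199"
# Per-instance override passed through the environment (hypothetical)
JVM_EXTRA_OPTS="-Dcassandra.jmx.local.port=7200"

# Word splitting is intended here: each option becomes one argument
EFFECTIVE=$(last_value cassandra.jmx.local.port $JVM_OPTS $JVM_EXTRA_OPTS)
echo "effective jmx port: $EFFECTIVE"
```

If the order were reversed, the script's own value would win instead, which is why the position of JVM_EXTRA_OPTS on the final command line matters.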

  • Rebase
  • Override the jmx port value via the JVM_EXTRA_OPTS environment
  • Inline the cassandra.yaml properties in hiera

I think you've forgotten to remove the erb template :-)


If you use inline_yaml (which is one of our functions) you get the "Managed by puppet" header. I'm not sure where to_yaml comes from

This revision is now accepted and ready to land. Aug 17 2022, 12:48 PM

use inline_yaml instead of to_yaml
thanks for the hint, I completely forgot about it