
monitoring: gather metrics into prometheus
Closed, Public

Authored by vsellier on Dec 7 2020, 12:14 PM.

Details

Summary
  • export all the statistics for the search cluster
  • be gentle on the log cluster and don't try to get individual index statistics

Related to T2852
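
In practice, the two bullets above come down to two settings added to each node's `elasticsearch.yml`. A sketch of the resulting fragment (the per-cluster values are visible in the octocatalog-diff output in the test plan below):

```yaml
# Port the elasticsearch-prometheus-exporter plugin is scraped on
http.port: 9200
# Per-index metrics: enabled on the search cluster (search-esnode*),
# disabled on the log cluster (esnode*) to keep scrapes cheap
prometheus.indices: true
```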

Test Plan
  • octocatalog-diff esnode1:
diff origin/production/esnode1.internal.softwareheritage.org current/esnode1.internal.softwareheritage.org
*******************************************
+ Archive[prometheus-elasticsearch-exporter] =>
   parameters =>
     "cleanup": true,
     "creates": "/usr/share/elasticsearch/plugins/prometheus-exporter/plugin-desc...
     "extract": true,
     "extract_path": "/usr/share/elasticsearch/plugins/prometheus-exporter",
     "group": "root",
     "notify": [
       "Service[elasticsearch]"
     ],
     "path": "/tmp/prometheus-exporter-7.8.0.0.zip",
     "source": "https://github.com/vvanholl/elasticsearch-prometheus-exporter/rel...
     "user": "root"
*******************************************
  File[/etc/elasticsearch/elasticsearch.yml] =>
   parameters =>
     content =>
      @@ -12,4 +12,6 @@
       path.data: "/srv/elasticsearch"
       path.logs: "/var/log/elasticsearch"
      +http.port: 9200
      +prometheus.indices: false
       indices.memory.index_buffer_size: 50%
       index.store.type: hybridfs
*******************************************
+ File[/usr/share/elasticsearch/plugins/prometheus-exporter] =>
   parameters =>
     "ensure": "directory",
     "group": "elasticsearch",
     "mode": "0755",
     "owner": "elasticsearch"
*******************************************
+ Profile::Prometheus::Export_scrape_config[elasticsearch_esnode1.internal.softwareheritage.org] =>
   parameters =>
     "job": "elasticsearch",
     "labels": {
     },
     "metrics_path": "/_prometheus/metrics",
     "scheme": "http",
     "target": "esnode1.internal.softwareheritage.org:9200"
*******************************************
*** End octocatalog-diff on esnode1.internal.softwareheritage.org
  • octocatalog-diff search-esnode0:
diff origin/production/search-esnode0.internal.staging.swh.network current/search-esnode0.internal.staging.swh.network
*******************************************
+ Archive[prometheus-elasticsearch-exporter] =>
   parameters =>
     "cleanup": true,
     "creates": "/usr/share/elasticsearch/plugins/prometheus-exporter/plugin-desc...
     "extract": true,
     "extract_path": "/usr/share/elasticsearch/plugins/prometheus-exporter",
     "group": "root",
     "notify": [
       "Service[elasticsearch]"
     ],
     "path": "/tmp/prometheus-exporter-7.9.3.0.zip",
     "source": "https://github.com/vvanholl/elasticsearch-prometheus-exporter/rel...
     "user": "root"
*******************************************
  File[/etc/elasticsearch/elasticsearch.yml] =>
   parameters =>
     content =>
      @@ -8,3 +8,5 @@
       path.data: "/srv/elasticsearch"
       path.logs: "/var/log/elasticsearch"
      +http.port: 9200
      +prometheus.indices: true
       network.host: 192.168.130.80
*******************************************
+ File[/usr/share/elasticsearch/plugins/prometheus-exporter] =>
   parameters =>
     "ensure": "directory",
     "group": "elasticsearch",
     "mode": "0755",
     "owner": "elasticsearch"
*******************************************
+ Profile::Prometheus::Export_scrape_config[elasticsearch_search-esnode0.internal.staging.swh.network] =>
   parameters =>
     "job": "elasticsearch",
     "labels": {
     },
     "metrics_path": "/_prometheus/metrics",
     "scheme": "http",
     "target": "search-esnode0.internal.staging.swh.network:9200"
*******************************************
*** End octocatalog-diff on search-esnode0.internal.staging.swh.network
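
The `Profile::Prometheus::Export_scrape_config` resources above are presumably collected into the Prometheus server's configuration. Assuming a standard `scrape_configs` rendering (the actual template lives in puppet-swh-site), the result would look roughly like:

```yaml
scrape_configs:
  - job_name: elasticsearch
    metrics_path: /_prometheus/metrics
    scheme: http
    static_configs:
      - targets:
          # one target per node exporting the elasticsearch job
          - esnode1.internal.softwareheritage.org:9200
          - search-esnode0.internal.staging.swh.network:9200
```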

Diff Detail

Repository
rSPSITE puppet-swh-site
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

This revision is now accepted and ready to land. Dec 7 2020, 12:23 PM

Puppet was disabled on the production nodes to keep this diff from being applied right away. We will perform a rolling restart of the production cluster after the next scheduled kernel upgrade.