
Puppetize elasticsearch nodes
ClosedPublic

Authored by ardumont on Dec 2 2020, 4:53 PM.

Details

Summary

This allows declaring the elasticsearch nodes, which have been configured
manually so far. We reused the actual production configuration in
/etc/elasticsearch/{elasticsearch.yml,jvm.options} as defaults.

This diff configures the following for both the production and staging nodes
(a minimal sketch of the resulting profile follows the list):

  • /etc/elasticsearch/elasticsearch.yml (overriding the one from the debian package)
  • /etc/elasticsearch/jvm.options.d/jvm.options (adding some Xms/Xmx override)
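
A minimal sketch of what such a profile can look like (class and parameter
names are illustrative, not necessarily those of
site-modules/profile/manifests/elasticsearch.pp; the resource names mirror the
octocatalog-diff output in the test plan):

  class profile::elasticsearch (
    Hash   $config,      # cluster.name, network.host, path.data, ... from hiera
    String $heap_size,   # e.g. '16g' in production, smaller in staging/vagrant
  ) {
    package { 'elasticsearch':
      ensure => present,
    }

    service { 'elasticsearch':
      ensure  => running,
      enable  => true,
      require => Package['elasticsearch'],
    }

    # Override the elasticsearch.yml shipped by the debian package
    file { '/etc/elasticsearch/elasticsearch.yml':
      ensure  => file,
      owner   => 'elasticsearch',
      group   => 'elasticsearch',
      mode    => '0644',
      content => to_yaml($config),   # to_yaml comes from puppetlabs-stdlib
      require => Package['elasticsearch'],
      notify  => Service['elasticsearch'],
    }

    # Xms/Xmx override in the drop-in directory
    concat { 'es_jvm_options':
      path           => '/etc/elasticsearch/jvm.options.d/jvm.options',
      owner          => 'elasticsearch',
      group          => 'elasticsearch',
      mode           => '0644',
      ensure_newline => true,
      require        => Package['elasticsearch'],
      notify         => Service['elasticsearch'],
    }

    concat::fragment { '0_es_jvm_option':
      target  => 'es_jvm_options',
      content => "-Xms${heap_size}",
      order   => '00',
    }

    concat::fragment { '1_es_jvm_option':
      target  => 'es_jvm_options',
      content => "-Xmx${heap_size}",
      order   => '00',
    }
  }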

This also fixes a couple of current paper cuts:

  • uid/gid creation
  • fix the inter-dependency on package/service/apt-config ordering
  • remove an xpack configuration option deprecated since 7.8.0 (the production version)
  • stop managing the no-longer-required openjdk-8 dependency (es complained about it) [1]

[1] We'll need to uninstall that jdk from the production esnodes.

[2] We'll need to apply the above configuration in production one node at a
time.

Related to T2817

Test Plan

vagrant up staging-esnode0 ~> happily configures and starts elasticsearch accordingly

bin/octocatalog-diff on a production esnode (there are some diffs, but the
resulting configuration is the same as the current one):

bin/octocatalog-diff --octocatalog-diff-args --no-truncate-details --to staging_add_elasticsearch_node esnode1
Found host esnode1.internal.softwareheritage.org
WARN     -> Environment "arcpatch-D4460" contained non-word characters, correcting name to arcpatch_D4460
WARN     -> Environment "open-template1" contained non-word characters, correcting name to open_template1
WARN     -> Environment "update-writer-config" contained non-word characters, correcting name to update_writer_config
WARN     -> Environment "wip-pg-hba-rules-in-yaml" contained non-word characters, correcting name to wip_pg_hba_rules_in_yaml
Cloning into '/tmp/swh-ocd.idXBDTTy/environments/production/data/private'...
done.
Cloning into '/tmp/swh-ocd.idXBDTTy/environments/staging_add_elasticsearch_node/data/private'...
done.
*** Running octocatalog-diff on host esnode1.internal.softwareheritage.org
I, [2020-12-02T16:40:00.786190 #28052]  INFO -- : Catalogs compiled for esnode1.internal.softwareheritage.org
I, [2020-12-02T16:40:02.157247 #28052]  INFO -- : Diffs computed for esnode1.internal.softwareheritage.org
diff origin/production/esnode1.internal.softwareheritage.org current/esnode1.internal.softwareheritage.org
*******************************************
+ Concat::Fragment[0_es_jvm_option] =>
   parameters =>
      "content": "-Xms16g"
      "order": "00"
      "target": "es_jvm_options"
*******************************************
+ Concat::Fragment[1_es_jvm_option] =>
   parameters =>
      "content": "-Xmx16g"
      "order": "00"
      "target": "es_jvm_options"
*******************************************
+ Concat[es_jvm_options] =>
   parameters =>
      "backup": "puppet"
      "ensure": "present"
      "ensure_newline": true
      "force": false
      "format": "plain"
      "group": 119
      "mode": "0644"
      "notify": "Service[elasticsearch]"
      "order": "alpha"
      "owner": 114
      "path": "/etc/elasticsearch/jvm.options.d/jvm.options"
      "replace": true
      "show_diff": true
      "warn": false
*******************************************
+ Concat_file[es_jvm_options] =>
   parameters =>
      "backup": "puppet"
      "ensure_newline": true
      "force": false
      "format": "plain"
      "group": 119
      "mode": "0644"
      "order": "alpha"
      "owner": 114
      "path": "/etc/elasticsearch/jvm.options.d/jvm.options"
      "replace": true
      "show_diff": true
      "tag": "es_jvm_options"
*******************************************
+ Concat_fragment[0_es_jvm_option] =>
   parameters =>
      "content": "-Xms16g"
      "order": "00"
      "tag": "es_jvm_options"
      "target": "es_jvm_options"
*******************************************
+ Concat_fragment[1_es_jvm_option] =>
   parameters =>
      "content": "-Xmx16g"
      "order": "00"
      "tag": "es_jvm_options"
      "target": "es_jvm_options"
*******************************************
+ File[/etc/elasticsearch/elasticsearch.yml] =>
   parameters =>
      "ensure": "file"
      "group": 119
      "mode": "0644"
      "notify": "Service[elasticsearch]"
      "owner": 114
      "content": >>>
# File managed by puppet - modifications will be lost
cluster.name: swh-logging-prod
node.name: esnode1
network.host: 192.168.100.61
discovery.seed_hosts:
- esnode1.internal.softwareheritage.org
- esnode2.internal.softwareheritage.org
- esnode3.internal.softwareheritage.org
cluster.initial_master_nodes:
- esnode1
- esnode2
- esnode3
path.data: "/srv/elasticsearch"
path.logs: "/var/log/elasticsearch"
index.store.type: hybridfs
indices.memory.index_buffer_size: 50%
<<<
*******************************************
+ File[/srv/elasticsearch] =>
   parameters =>
      "ensure": "directory"
      "group": 119
      "mode": "2755"
      "owner": 114
*******************************************
- File_line[elasticsearch store type]
*******************************************
+ Group[elasticsearch] =>
   parameters =>
      "ensure": "present"
      "gid": 119
*******************************************
- Package[openjdk-8-jre-headless]
*******************************************
  Systemd::Dropin_file[elasticsearch.conf] =>
   parameters =>
     notify =>
      + Service[elasticsearch]
*******************************************
  User[elasticsearch] =>
   parameters =>
     gid =>
      - "119"
      + 119
     uid =>
      - "114"
      + 114
*******************************************
*** End octocatalog-diff on esnode1.internal.softwareheritage.org

Diff Detail

Repository: rSPSITE puppet-swh-site
Branch: staging_add_elasticsearch_node
Lint: No Linters Available
Unit: No Unit Test Coverage
Build Status: Buildable 17715 / Build 27386: arc lint + arc unit

Event Timeline

Override heap_size more "appropriately" for staging and vagrant

Looks like a good step forward overall.

Can you move the elasticsearch config to another file in the common directory, rather than the main file (you could make it an elastic.yml file with the config for kibana and logstash as well)?

I've made a bunch of other comments inline.

data/common/common.yaml
2859–2860

This should use the same listen_network/ip_for_network trick as kibana, logstash and other things like them, rather than having to override it on all hosts.

Unfortunately this means that I don't think the interpolation can happen in the yaml (but needs to happen in the puppet manifest). Overall I don't think that's a problem.
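
(For illustration only, the listen_network/ip_for_network pattern referred to
here looks roughly like this in a manifest; the hiera key names below are made
up for the example, ip_for_network being the helper the kibana and logstash
profiles already use:)

  # inside the elasticsearch profile: hiera only carries the network,
  # e.g. 192.168.100.0/24, and the manifest resolves the node's own
  # address on that network
  $listen_network = lookup('elasticsearch::listen_network')
  $listen_address = ip_for_network($listen_network)

  $config = lookup('elasticsearch::config') + {
    'network.host' => $listen_address,
  }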

2862–2866

jvm options aren't part of the elasticsearch config so you should probably move them one "level" up.

2868–2880

Some of this (at least the bits referencing hostnames) should probably be in a data/deployments/production/ file instead of here, to avoid poor defaults and a risk of "wires crossing" in other deployments.

2871

We need to check whether that's actually a proper value these days for a cluster built from scratch.

2872

The need to override that config needs to be questioned too :-)

data/deployments/staging/common.yaml
252–253

is this correct?

data/hostname/esnode1.internal.softwareheritage.org.yaml
1 (On Diff #16499)

If we use a listen_network/ip_for_network-based trick we don't need to (re) introduce these files.

site-modules/profile/manifests/elasticsearch.pp
4–5

Not sure the whole user management is needed at all; I suspect the Debian package creates the user and group itself already?

28–29

can probably be owned by root then (we don't need ES to have write access)

39–40

Same here, definitely should be owned by root

data/deployments/staging/common.yaml
252–253

good catch, we opened this too soon and indeed it's not correct.

259

neither is this; it should be search-esnode0.internal.staging.swh.network.

data/hostname/esnode1.internal.softwareheritage.org.yaml
1 (On Diff #16499)

right, we'll check.

site-modules/profile/manifests/elasticsearch.pp
4–5

with vagrant, we needed this: provisioning from scratch, we ended up with the
gid being 120 instead of 119.

We need control over the uid/gid for the folder creations below.

If we want to use the elasticsearch snapshot mechanism for the backups
(usually shared folders over nfs), we also need control over those.
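
(To illustrate the kind of pinning meant here; the 114/119 values are the
current production ids visible in the octocatalog-diff output above:)

  group { 'elasticsearch':
    ensure => present,
    gid    => 119,
  }

  user { 'elasticsearch':
    ensure  => present,
    uid     => 114,
    gid     => 119,
    require => Group['elasticsearch'],
  }

  # data directory (and, eventually, a shared snapshot mount) created with
  # the same fixed ids on every node
  file { '/srv/elasticsearch':
    ensure => directory,
    owner  => 114,
    group  => 119,
    mode   => '2755',
  }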

28–29

ack

39–40

ack

Thanks for the feedback.

Currently taking it into account.

And trying out the elasticsearch puppet module (which was already in our puppet-environment but not used...)

@olasd We have tried using the official elasticsearch puppet module (D4654).
There are several issues with using it. WDYT?

ardumont added inline comments.
data/common/common.yaml
2871

we kept it and moved it to the production configuration as an extra config, as
that is its current value there. It won't be enabled for the new staging
cluster for now.

2872

quite; it was also moved to the production configuration as an extra config
(for the same reason as above).

ardumont marked 2 inline comments as done.

Adapt according to the various feedback:

  • move dedicated prod configuration to deployments/production/common.yaml
  • move existing production configuration options to deployments/production/common.yaml
  • Fix hostname typos
  • Avoid declaring extra configuration files for the esnodes ip configuration. Use our ip_for_network function stanza instead
  • Fix the faulty (and, for now, unneeded) swh search configuration
  • ...

Status: vagrant still happily provisions the node as per configuration

Override /etc/hosts configuration for vagrant

  • staging: Add elasticsearch node
  • elasticsearch: Complete uid/gid creation
  • elasticsearch: Declare es configuration
  • elasticsearch: configure the jvm options in the dropin directory
site-modules/profile/manifests/elasticsearch.pp
4–5

We don't really need hardcoded uid/gids; we can just use the 'elasticsearch' name too.

Do you have pointers to this "shared folders over nfs" situation? That sounds quite awful, and using something like https://www.elastic.co/guide/en/elasticsearch/plugins/7.10/repository-azure.html would make more sense to me.

site-modules/profile/manifests/elasticsearch.pp
4–5

This is the case when the snapshots are stored on the filesystem [1].

I generally prefer to have unified uids/gids across the whole infra, for
reproducibility, but as that's not specified for the other applicative users,
it makes sense to use only the names instead of the ids.

It makes me think the elasticsearch user should be flagged as a system user.

[1] https://www.elastic.co/guide/en/elasticsearch/reference/7.10/snapshots-register-repository.html#snapshots-filesystem-repository
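
(If puppet were to keep managing the account, flagging it as a system user
would be a one-line change; as noted below, this ended up being left to the
debian package instead:)

  user { 'elasticsearch':
    ensure => present,
    system => true,   # pick a uid from the system range when none is given
    shell  => '/usr/sbin/nologin',
  }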

Drop puppet uid/gid management and let the elasticsearch install deal with it.

ardumont added inline comments.
site-modules/profile/manifests/elasticsearch.pp
4–5

Not sure the whole user management is needed at all; I suspect the Debian package creates the user and group itself already?

yes, in the end we removed that part.
It's now done through the debian install step ;)

Thanks for this very nice improvement!

This revision is now accepted and ready to land. Dec 3 2020, 6:31 PM
ardumont marked an inline comment as done.

Landed in 06327419

oops, wrong button ¯\_(ツ)_/¯ (i requested review...)

fake LGTM to be able to change the status

This revision is now accepted and ready to land. Dec 4 2020, 11:47 AM