diff --git a/sysadmin/grid5000/cassandra/.gitignore b/sysadmin/grid5000/cassandra/.gitignore
new file mode 100644
index 0000000..9de872b
--- /dev/null
+++ b/sysadmin/grid5000/cassandra/.gitignore
@@ -0,0 +1,4 @@
+.terraform
+terraform.tfstate*
+.vagrant
+playbook.retry
diff --git a/sysadmin/grid5000/cassandra/terraform/Readme.md b/sysadmin/grid5000/cassandra/Readme.md
similarity index 76%
rename from sysadmin/grid5000/cassandra/terraform/Readme.md
rename to sysadmin/grid5000/cassandra/Readme.md
index 16f9368..2d08649 100644
--- a/sysadmin/grid5000/cassandra/terraform/Readme.md
+++ b/sysadmin/grid5000/cassandra/Readme.md
@@ -1,121 +1,150 @@
 Grid5000 terraform provisioning
 ===============================

 Prerequisite
 ------------

 Tools
 #####

 terraform >= 13.0
+vagrant >= 2.2.3 [for local tests only]

 Credentials
 ###########

 * grid5000 credentials
 ```
 cat <<EOF > ~/.grid5000.yml
 uri: https://api.grid5000.fr
 username: username
 password: password
 EOF
 ```

 Theses credentials will be used to interact with the grid5000 api to create the jobs

 * Private/public key files (id_rsa) in the `~/.ssh` directory

 The public key will be installed on the nodes

 Run
 ---

+### Local (on vagrant)
+
+The `Vagrantfile` is configured to provision 3 nodes, install cassandra and then configure the cluster using the ansible configuration:
+
+```
+vagrant up
+vagrant ssh cassandra1
+sudo -i
+nodetool status
+```
+
+If everything is ok, the `nodetool` command line returns:
+```
+root@cassandra1:~# nodetool status
+Datacenter: datacenter1
+=======================
+Status=Up/Down
+|/ State=Normal/Leaving/Joining/Moving
+--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
+UN  10.168.180.12  15.78 KiB  256     67.9%             05d61a24-832a-4936-b0a5-39926f800d09  rack1
+UN  10.168.180.11  73.28 KiB  256     67.0%             23d855cc-37d6-43a7-886e-9446e7774f8d  rack1
+UN  10.168.180.13  15.78 KiB  256     65.0%             c6bc1eff-fa0d-4b67-bc53-fc31c6ced5bb  rack1
+```
+
+Cassandra can take some time to start, so you have to wait for the cluster to stabilize.
+
+### On Grid5000
+
 * Initialize terraform modules (first time only)
 ```
 terraform init
 ```

 * Test the plan

 It only check the status of the declared resources compared to the grid5000 status.
 It's a read only operation, no actions on grid5000 will be perform.

 ```
 terraform plan
 ```

 * Execute the plan

 ```
 terraform apply
 ```

 This action creates the job, provisions the nodes according the `main.tf` file content and install the specified linux distribution on it.

 This command will log the reserved node name in output.
 For example for a 1 node reservation:

 ```
 grid5000_job.cassandra: Creating...
 grid5000_job.cassandra: Still creating... [10s elapsed]
 grid5000_job.cassandra: Creation complete after 11s [id=1814813]
 grid5000_deployment.my_deployment: Creating...
 grid5000_deployment.my_deployment: Still creating... [10s elapsed]
 grid5000_deployment.my_deployment: Still creating... [20s elapsed]
 grid5000_deployment.my_deployment: Still creating... [30s elapsed]
 grid5000_deployment.my_deployment: Still creating... [40s elapsed]
 grid5000_deployment.my_deployment: Still creating... [50s elapsed]
 grid5000_deployment.my_deployment: Still creating... [1m0s elapsed]
 grid5000_deployment.my_deployment: Still creating... [1m10s elapsed]
 grid5000_deployment.my_deployment: Still creating... [1m20s elapsed]
 grid5000_deployment.my_deployment: Still creating... [1m30s elapsed]
 grid5000_deployment.my_deployment: Still creating... [1m40s elapsed]
 grid5000_deployment.my_deployment: Still creating... [1m50s elapsed]
 grid5000_deployment.my_deployment: Still creating... [2m0s elapsed]
 grid5000_deployment.my_deployment: Still creating... [2m10s elapsed]
 grid5000_deployment.my_deployment: Creation complete after 2m12s [id=D-0bb76036-1512-429f-be99-620afa328b26]

 Apply complete! Resources: 2 added, 0 changed, 0 destroyed.
 Outputs:

 nodes = [
   "chifflet-6.lille.grid5000.fr",
 ]
 ```

 It's now possible to connect to the nodes:

 ```
 $ ssh -A access.grid5000.fr
 $ ssh -A root@chifflet-6.lille.grid5000.fr
 Linux chifflet-6.lille.grid5000.fr 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64
 Debian10-x64-base-2021060212
  (Image based on Debian Buster for AMD64/EM64T)
  Maintained by support-staff
  Doc: https://www.grid5000.fr/w/Getting_Started#Deploying_nodes_with_Kadeploy
 root@chifflet-6:~#
 ```

 Cleanup
 -------

 To destroy the resources before the end of the job:

 ```
 terraform destroy
 ```

 If the job is stopped, simply remove the `terraform.tfstate` file:

 ```
 rm terraform.tfstate
 ```

 ## TODO

 [ ] variablization of the script
 [ ] Ansible provisionning of the nodes
 [ ] disk initialization
 [ ] support different cluster topologies (nodes / disks / ...)
 [ ] cassandra installation
 [ ] swh-storage installation
 [ ] ...
diff --git a/sysadmin/grid5000/cassandra/Vagrantfile b/sysadmin/grid5000/cassandra/Vagrantfile
new file mode 100644
index 0000000..5ff3003
--- /dev/null
+++ b/sysadmin/grid5000/cassandra/Vagrantfile
@@ -0,0 +1,60 @@
+# -*- mode: ruby -*-
+# vi: set ft=ruby :
+
+vms = {
+  "cassandra1" => {
+    :ip => "10.168.180.11",
+    :memory => 2048,
+    :cpus => 2,
+  },
+  "cassandra2" => {
+    :ip => "10.168.180.12",
+    :memory => 2048,
+    :cpus => 2,
+  },
+  "cassandra3" => {
+    :ip => "10.168.180.13",
+    :memory => 2048,
+    :cpus => 2,
+  },
+}
+
+# Images/remote configuration
+$global_debian10_box = "debian10-20210517-1348"
+$global_debian10_box_url = "https://annex.softwareheritage.org/public/isos/libvirt/debian/swh-debian-10.9-amd64-20210517-1348.qcow2"
+
+vms.each { | vm_name, vm_props |
+
+  Vagrant.configure("2") do |global_config|
+    unless Vagrant.has_plugin?("libvirt")
+      $stderr.puts <<-MSG
+        vagrant-libvirt plugin is required for this.
+        To install: `$ sudo apt install vagrant-libvirt`
+      MSG
+      exit 1
+    end
+
+    global_config.vm.define vm_name do |config|
+      config.vm.box = $global_debian10_box
+      config.vm.box_url = $global_debian10_box_url
+      config.vm.box_check_update = false
+      config.vm.hostname = vm_name
+      config.vm.network :private_network, ip: vm_props[:ip], netmask: "255.255.0.0"
+
+      config.vm.synced_folder ".", "/vagrant", type: 'nfs'
+
+      config.vm.provision :ansible do |ansible|
+        ansible.verbose = true
+        ansible.become = true
+        ansible.playbook = "ansible/playbook.yml"
+        ansible.inventory_path = "ansible/hosts.yml"
+      end
+
+      config.vm.provider :libvirt do |provider|
+        provider.memory = vm_props[:memory]
+        provider.cpus = vm_props[:cpus]
+        provider.driver = 'kvm'
+      end
+    end
+  end
+}
diff --git a/sysadmin/grid5000/cassandra/ansible/hosts.yml b/sysadmin/grid5000/cassandra/ansible/hosts.yml
new file mode 100644
index 0000000..5c4de68
--- /dev/null
+++ b/sysadmin/grid5000/cassandra/ansible/hosts.yml
@@ -0,0 +1,36 @@
+cassandra:
+  hosts:
+    dahu[1:32].grenoble.grid5000.fr:
+    # local vagrant hosts
+    cassandra[1:9]:
+  vars:
+    cassandra_config_dir: /etc/cassandra
+    cassandra_data_dir_base: /srv/cassandra
+    cassandra_data_dir_system: "{{ cassandra_data_dir_base }}/system"
+    cassandra_data_dir: "{{ cassandra_data_dir_base }}/data"
+    cassandra_commitlogs_dir: "{{ cassandra_data_dir_base }}/commitlogs"
+
+vagrant_nodes:
+  hosts:
+    cassandra1:
+      ansible_host: 10.168.180.11
+      ansible_user: vagrant
+      ansible_ssh_private_key_file: .vagrant/machines/cassandra1/libvirt/private_key
+    cassandra2:
+      ansible_host: 10.168.180.12
+      ansible_user: vagrant
+      ansible_ssh_private_key_file: .vagrant/machines/cassandra2/libvirt/private_key
+    cassandra3:
+      ansible_host: 10.168.180.13
+      ansible_user: vagrant
+      ansible_ssh_private_key_file: .vagrant/machines/cassandra3/libvirt/private_key
+  vars:
+    cassandra_listen_interface: eth1
+    # passed through --extra-vars on grid5000
+    cassandra_seed_ips: 10.168.180.11,10.168.180.12,10.168.180.13
+
+dahu_cluster_hosts:
+  hosts:
+    dahu[1:32].grenoble.grid5000.fr:
+  vars:
+    cassandra_listen_interface: enp24s0f0
diff --git a/sysadmin/grid5000/cassandra/ansible/playbook.yml b/sysadmin/grid5000/cassandra/ansible/playbook.yml
new file mode 100644
index 0000000..b5a58a7
--- /dev/null
+++ b/sysadmin/grid5000/cassandra/ansible/playbook.yml
@@ -0,0 +1,60 @@
+---
+- name: Install cassandra
+  hosts: cassandra
+
+  tasks:
+    # - name: "Get public ipv4 address"
+    #   set_fact:
+    #     cassandra_seed_ips: "{{ansible_facts[item]['ipv4']['address']}}"
+    #   with_items:
+    #     - "{{cassandra_listen_interface }}"
+
+    - name: Install cassandra signing key
+      apt_key:
+        url: https://downloads.apache.org/cassandra/KEYS
+        state: present
+
+    - name: Install cassandra apt repository
+      apt_repository:
+        repo: deb http://downloads.apache.org/cassandra/debian 40x main
+        state: present
+        filename: cassandra.sources.list
+
+    - name: Install packages
+      apt:
+        update_cache: true # force an apt update before
+        name:
+          ## TODO: check other version than jdk11
+          - openjdk-11-jdk
+          - cassandra
+          - dstat
+
+    - name: Create datadirs
+      file:
+        state: directory
+        path: "{{ item }}"
+        owner: "cassandra"
+        group: "cassandra"
+        mode: "0755"
+      with_items:
+        - "{{ cassandra_data_dir_base }}"
+        - "{{ cassandra_data_dir_system }}"
+        - "{{ cassandra_data_dir }}"
+        - "{{ cassandra_commitlogs_dir }}"
+
+    - name: Configure cassandra
+      template:
+        src: "templates/{{item}}"
+        dest: "{{cassandra_config_dir}}/{{item}}"
+      with_items: [cassandra.yaml, jvm.options]
+      register: cassandra_configuration_files
+
+    - name: Restart cassandra service
+      service:
+        name: cassandra
+        state: restarted
+      when: cassandra_configuration_files.changed
+
+
+
+    # TODO test different read ahead
diff --git a/sysadmin/grid5000/cassandra/ansible/templates/cassandra.yaml b/sysadmin/grid5000/cassandra/ansible/templates/cassandra.yaml
new file mode 100644
index 0000000..7511ccc
--- /dev/null
+++ b/sysadmin/grid5000/cassandra/ansible/templates/cassandra.yaml
@@ -0,0 +1,37 @@
+cluster_name: swh-storage # default 'Test Cluster'
+num_tokens: 256 # default 256
+allocate_tokens_for_local_replication_factor: 3
+data_file_directories:
+  - {{ cassandra_data_dir }} # TODO use several disks
+local_system_data_file_directory: {{ cassandra_data_dir_system }}
+commitlog_directory: {{ cassandra_commitlogs_dir }}
+
+disk_optimization_strategy: spinning # spinning | ssd
+
+# listen_address: 0.0.0.0 # always wrong according to the documentation
+listen_interface: {{ cassandra_listen_interface }} # set listen_address OR listen_interface, not both
+
+concurrent_compactors: 1 # should be min(nb core, nb disks)
+
+internode_compression: dc # default dc possible all|dc|none
+
+concurrent_reads: 16 # 16 x number of drives
+concurrent_writes: 32 # 8 x number of cores
+
+commitlog_sync: periodic # default periodic
+commitlog_sync_period_in_ms: 10000 # default 10000
+
+partitioner: org.apache.cassandra.dht.Murmur3Partitioner
+endpoint_snitch: SimpleSnitch
+
+seed_provider:
+- class_name: org.apache.cassandra.locator.SimpleSeedProvider
+  parameters:
+  # seeds is actually a comma-delimited list of addresses.
+  # Ex: "<ip1>,<ip2>,<ip3>"
+  - seeds: "{{ cassandra_seed_ips }}"
+
+# TODO Test the effects of these options
+# disk_failure_policy:
+# cdc_enabled
+#end
diff --git a/sysadmin/grid5000/cassandra/ansible/templates/jvm.options b/sysadmin/grid5000/cassandra/ansible/templates/jvm.options
new file mode 100644
index 0000000..7e78467
--- /dev/null
+++ b/sysadmin/grid5000/cassandra/ansible/templates/jvm.options
@@ -0,0 +1,103 @@
+###########################################################################
+#                         jvm11-server.options                            #
+#                                                                         #
+# See jvm-server.options. This file is specific for Java 11 and newer.    #
+###########################################################################
+
+#################
+#  GC SETTINGS  #
+#################
+
+
+
+### CMS Settings
+-XX:+UseConcMarkSweepGC
+-XX:+CMSParallelRemarkEnabled
+-XX:SurvivorRatio=8
+-XX:MaxTenuringThreshold=1
+-XX:CMSInitiatingOccupancyFraction=75
+-XX:+UseCMSInitiatingOccupancyOnly
+-XX:CMSWaitDuration=10000
+-XX:+CMSParallelInitialMarkEnabled
+-XX:+CMSEdenChunksRecordAlways
+## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
+-XX:+CMSClassUnloadingEnabled
+
+
+
+### G1 Settings
+## Use the Hotspot garbage-first collector.
+#-XX:+UseG1GC
+#-XX:+ParallelRefProcEnabled
+
+#
+## Have the JVM do less remembered set work during STW, instead
+## preferring concurrent GC. Reduces p99.9 latency.
+#-XX:G1RSetUpdatingPauseTimePercent=5
+#
+## Main G1GC tunable: lowering the pause target will lower throughput and vice versa.
+## 200ms is the JVM default and lowest viable setting
+## 1000ms increases throughput. Keep it smaller than the timeouts in cassandra.yaml.
+#-XX:MaxGCPauseMillis=500
+
+## Optional G1 Settings
+# Save CPU time on large (>= 16GB) heaps by delaying region scanning
+# until the heap is 70% full. The default in Hotspot 8u40 is 40%.
+#-XX:InitiatingHeapOccupancyPercent=70
+
+# For systems with > 8 cores, the default ParallelGCThreads is 5/8 the number of logical cores.
+# Otherwise equal to the number of cores when 8 or less.
+# Machines with > 10 cores should try setting these to <= full cores.
+#-XX:ParallelGCThreads=16
+# By default, ConcGCThreads is 1/4 of ParallelGCThreads.
+# Setting both to the same value can reduce STW durations.
+#-XX:ConcGCThreads=16
+
+
+### JPMS
+
+-Djdk.attach.allowAttachSelf=true
+--add-exports java.base/jdk.internal.misc=ALL-UNNAMED
+--add-exports java.base/jdk.internal.ref=ALL-UNNAMED
+--add-exports java.base/sun.nio.ch=ALL-UNNAMED
+--add-exports java.management.rmi/com.sun.jmx.remote.internal.rmi=ALL-UNNAMED
+--add-exports java.rmi/sun.rmi.registry=ALL-UNNAMED
+--add-exports java.rmi/sun.rmi.server=ALL-UNNAMED
+--add-exports java.sql/java.sql=ALL-UNNAMED
+
+--add-opens java.base/java.lang.module=ALL-UNNAMED
+--add-opens java.base/jdk.internal.loader=ALL-UNNAMED
+--add-opens java.base/jdk.internal.ref=ALL-UNNAMED
+--add-opens java.base/jdk.internal.reflect=ALL-UNNAMED
+--add-opens java.base/jdk.internal.math=ALL-UNNAMED
+--add-opens java.base/jdk.internal.module=ALL-UNNAMED
+--add-opens java.base/jdk.internal.util.jar=ALL-UNNAMED
+--add-opens jdk.management/com.sun.management.internal=ALL-UNNAMED
+
+
+### GC logging options -- uncomment to enable
+
+# Java 11 (and newer) GC logging options:
+# See description of https://bugs.openjdk.java.net/browse/JDK-8046148 for details about the syntax
+# The following is the equivalent to -XX:+PrintGCDetails -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M
+#-Xlog:gc=info,heap*=trace,age*=debug,safepoint=info,promotion*=trace:file=/var/log/cassandra/gc.log:time,uptime,pid,tid,level:filecount=10,filesize=10485760
+
+# Notes for Java 8 migration:
+#
+# -XX:+PrintGCDetails                   maps to -Xlog:gc*:... - i.e. add a '*' after "gc"
+# -XX:+PrintGCDateStamps                maps to decorator 'time'
+#
+# -XX:+PrintHeapAtGC                    maps to 'heap' with level 'trace'
+# -XX:+PrintTenuringDistribution        maps to 'age' with level 'debug'
+# -XX:+PrintGCApplicationStoppedTime    maps to 'safepoint' with level 'info'
+# -XX:+PrintPromotionFailure            maps to 'promotion' with level 'trace'
+# -XX:PrintFLSStatistics=1              maps to 'freelist' with level 'trace'
+
+### Netty Options
+
+# On Java >= 9 Netty requires the io.netty.tryReflectionSetAccessible system property to be set to true to enable
+# creation of direct buffers using Unsafe. Without it, this falls back to ByteBuffer.allocateDirect which has
+# inferior performance and risks exceeding MaxDirectMemory
+-Dio.netty.tryReflectionSetAccessible=true
+
+# The newline in the end of file is intentional
diff --git a/sysadmin/grid5000/cassandra/terraform/.gitignore b/sysadmin/grid5000/cassandra/terraform/.gitignore
deleted file mode 100644
index 88329db..0000000
--- a/sysadmin/grid5000/cassandra/terraform/.gitignore
+++ /dev/null
@@ -1,2 +0,0 @@
-.terraform
-terraform.tfstate
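
The local test described in the Readme ends with a manual `nodetool status` check, followed by a wait until every node is reported as `UN` (Up/Normal). A minimal sketch of how that check could be automated, assuming the output layout shown in the Readme; `node_states`, `all_nodes_up` and `SAMPLE` are hypothetical names for illustration and are not part of this patch:

```python
import re

# Node lines start with a two-letter code: status (U/D) then state (N/L/J/M),
# followed by the node address. Header and separator lines do not match.
NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)\s")


def node_states(status_output):
    """Map each node address to its two-letter status/state code."""
    states = {}
    for line in status_output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


def all_nodes_up(status_output, expected_nodes):
    """True when exactly `expected_nodes` nodes are listed and all are Up/Normal."""
    states = node_states(status_output)
    return len(states) == expected_nodes and all(
        code == "UN" for code in states.values()
    )


# Sample copied from the Readme (column spacing is an assumption).
SAMPLE = """\
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.168.180.12  15.78 KiB  256     67.9%             05d61a24-832a-4936-b0a5-39926f800d09  rack1
UN  10.168.180.11  73.28 KiB  256     67.0%             23d855cc-37d6-43a7-886e-9446e7774f8d  rack1
UN  10.168.180.13  15.78 KiB  256     65.0%             c6bc1eff-fa0d-4b67-bc53-fc31c6ced5bb  rack1
"""

if __name__ == "__main__":
    print(all_nodes_up(SAMPLE, expected_nodes=3))
```

In practice the text would come from running `nodetool status` on one of the nodes, and the check would be retried until it passes, since Cassandra can take some time to start.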
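
The comments in the `cassandra.yaml` template give rules of thumb for three settings: `concurrent_reads` is 16 x number of drives, `concurrent_writes` is 8 x number of cores, and `concurrent_compactors` should be min(nb core, nb disks). A small sketch of that arithmetic, assuming those comments are the intended sizing rules; the function name `cassandra_concurrency` is hypothetical and not part of this patch:

```python
def cassandra_concurrency(cores, drives):
    """Apply the sizing rules quoted in the template comments."""
    if cores < 1 or drives < 1:
        raise ValueError("cores and drives must both be at least 1")
    return {
        "concurrent_reads": 16 * drives,              # 16 x number of drives
        "concurrent_writes": 8 * cores,               # 8 x number of cores
        "concurrent_compactors": min(cores, drives),  # min(nb core, nb disks)
    }


if __name__ == "__main__":
    # One drive and four cores reproduce the values hard-coded in the template.
    print(cassandra_concurrency(cores=4, drives=1))
```

Under these rules the hard-coded values (16, 32, 1) correspond to a node with one drive and four cores, so they would need to be recomputed, or turned into inventory variables, for nodes with a different hardware layout.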