Staging first:
- [x] D7606: Reuse rancher cluster (used for our gitlab in-house experiment)
- [x] D7600: elastic worker node needs a specific role with docker prepared
- [x] D7624, P1342: Upgrade proxmox vm template
- [x] D7625, P1343: Declare new vm template with the zfs dependency ready (so the automation does not require a reboot midway)
- [x] D7607: Register vms to the rancher cluster
- [ ] vms run docker containers from the lister/loader images
- [x] Monitor services: install prometheus, grafana [2]
- [ ] Federate it to our main swh prometheus
- [ ] Push service logs to our log infrastructure
- [ ] (Optional) Rework puppet manifest to actually run the registration command [1]
End goal:
- listing and loading happen
- resulting logs are pushed to our standard kibana log infrastructure
- resulting stats are pushed to our standard grafana
Annex:
- ci builds the swh docker images (we can reuse existing ones at first)
[1] It's currently proxmox, but later we'll have to do it without proxmox, on bare-metal machines
[2] https://rancher.euwest.azure.internal.softwareheritage.org/k8s/clusters/c-t85mz/api/v1/namespaces/cattle-monitoring-system/services/http:rancher-monitoring-grafana:80/proxy/d/rancher-home-1/home?orgId=1&from=1651067146437&to=1651070746437