Thanos [1] is the Swiss Army knife for Prometheus federation, high availability and clustering.
It provides a global query view over multiple, potentially redundant, Prometheus data
stores: data from the Prometheus instances is pushed to centralised object stores, and
query frontends are then exposed on top of those data stores.
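To illustrate the "global view" part: the thanos query service speaks the Prometheus HTTP API, so a client (Grafana, a script) queries it the same way it would query a single Prometheus. The sketch below only builds the URL of an instant query. The hostname and port are placeholders, not the actual deployment values, and `build_instant_query_url` is a hypothetical helper written for this note. `dedup` and `partial_response` are query parameters accepted by thanos query; the defaults chosen here are assumptions.

```python
from urllib.parse import urlencode


def build_instant_query_url(base_url, promql, dedup=True, partial_response=False):
    """Build the URL of an instant query against a thanos query service.

    Hypothetical helper: base_url is a placeholder for the real service address.
    """
    params = {
        # PromQL expression, exactly as it would be sent to a plain Prometheus
        "query": promql,
        # Ask thanos query to merge series coming from redundant Prometheus replicas
        "dedup": "true" if dedup else "false",
        # When false, the query fails instead of silently returning incomplete data
        "partial_response": "true" if partial_response else "false",
    }
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


# Placeholder address, assumed for illustration only
url = build_instant_query_url("http://thanos-query.example.org:10902", "up")
print(url)
```

Because the API path is the same as Prometheus', switching a Grafana datasource to thanos query should mostly be a matter of changing the datasource URL.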
Plan:
- [x] Install manual thanos services in mmca (temporary provenance server)
- [x] Push historical data from mmca to a thanos datastore bucket
- [x] Push historical data from pergamon to a thanos datastore bucket
- [x] D8089: Provision thanos query dedicated node (+ inventory update)
- [x] D8092: Expose a thanos query service to read from those datastores
- [x] D8097: Expose thanos gateway service to access historical data
~~- [ ] Expose thanos gateway on mmca (historical data access)~~ -> will run on the thanos node instead
- [x] D8097: Update thanos query to read from those gateways as well
- [ ] Make communication work between thanos node and mmca node
- [ ] Switch grafana datasource from pergamon's prometheus to the thanos query service
- [ ] Drop mmca's prometheus federation from puppet
- [ ] Instantiate prometheus/thanos services in staging environment
- [ ] Federate it through thanos
- [ ] Document
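For the "make communication work between thanos node and mmca node" item, a quick way to tell a firewall/routing problem from a thanos configuration problem is a plain TCP reachability check. This is a minimal sketch, not part of the actual deployment: `can_connect` and `check_endpoints` are hypothetical helpers, and the hosts and ports must be replaced with the real ones (10901 is the default thanos gRPC port, assuming it was not changed).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and DNS resolution failures
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to its reachability status."""
    return {(host, port): can_connect(host, port, timeout) for host, port in endpoints}
```

Running `check_endpoints([("mmca.example.org", 10901)])` from the thanos node (with the real hostname in place of the placeholder) returns a dict whose value is `False` when the port is filtered or nothing listens on it. A `True` result only shows that the TCP port is open, not that the gRPC service behind it is healthy.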
Draft notes can be found in the HedgeDoc document [2].
[1] https://thanos.io/
[2] https://hedgedoc.softwareheritage.org/X1henrmkT8yL6_W9R0YpGg?both