It did not send the last two weekly mails
Description
Revisions and Commits
rSPSITE puppet-swh-site
D6134 | rSPSITE8e92a5381143 Raise alert when postfix service is not running |
Related Objects
Event Timeline
The postfix server was down. It was restarted today.
No logs were seen after 09/08/2021 [1], which is around the time of the last outage [2].
Neither the server nor the service had been restarted since then.
[1] http://kibana0.internal.softwareheritage.org:5601/goto/b6b81346f42425ff50071a356350b38f
[2] I first had to reopen the August systemlogs indices in Elasticsearch:
index_name="systemlogs"; for num in $(seq -w 1 22); do curl -X POST "$ES_SERVER/$index_name-2021.08.$num/_open"; curl -X POST "$ES_SERVER/$index_name-2021.08.$num/_unfreeze"; done
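The one-liner above can also be written as a small script that prints the requests before sending anything. This is only a sketch: index_names, reopen_indices and DRY_RUN are names introduced here for illustration, and ES_SERVER is assumed to be set as in the original command.

```shell
# Print one daily index name per line, e.g. systemlogs-2021.08.01.
# Arguments: prefix, year.month, first day, last day.
index_names() {
    _prefix=$1; _month=$2; _day=$3; _last=$4
    while [ "$_day" -le "$_last" ]; do
        printf '%s-%s.%02d\n' "$_prefix" "$_month" "$_day"
        _day=$((_day + 1))
    done
}

# Open, then unfreeze, each index. With DRY_RUN=1 (the default in this
# sketch) the curl commands are only printed, not executed.
reopen_indices() {
    index_names "$@" | while read -r _index; do
        for _action in _open _unfreeze; do
            if [ "${DRY_RUN:-1}" = 1 ]; then
                echo "curl -X POST $ES_SERVER/$_index/$_action"
            else
                curl -X POST "$ES_SERVER/$_index/$_action"
            fi
        done
    done
}
```

Running `reopen_indices systemlogs 2021.08 1 22` lists the 44 requests; setting DRY_RUN=0 sends them.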
Looking into adding an icinga alert for this.
The following plugins [1] [2] sound interesting.
Plus, it would definitely help to have alerts for other systemd services as well.
[1] https://icinga.com/docs/icinga-2/latest/doc/10-icinga-template-library/#systemd
[2] https://exchange.icinga.com/kiminen/icinga2-check_systemd_service
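To illustrate what such a check boils down to, here is a minimal sketch of a Nagios/Icinga-style check built on `systemctl is-active`. It is not the implementation of either plugin linked above; status_for_state and check_unit are hypothetical names, and the unit to watch (postfix.service in the usage note) is an assumption.

```shell
# Map a systemd unit state (as printed by `systemctl is-active`) to a
# plugin status line and exit code: 0 = OK, 2 = CRITICAL, 3 = UNKNOWN.
status_for_state() {
    _unit=$1; _state=$2
    case "$_state" in
        active)
            echo "OK - $_unit is active"
            return 0 ;;
        inactive|failed)
            echo "CRITICAL - $_unit is $_state"
            return 2 ;;
        *)
            # Any other state (activating, empty output, ...) is reported
            # as UNKNOWN in this sketch.
            echo "UNKNOWN - $_unit is in state '$_state'"
            return 3 ;;
    esac
}

# Ask systemd for the state of one unit and report on it.
check_unit() {
    _name=$1
    _current=$(systemctl is-active "$_name" 2>/dev/null)
    status_for_state "$_name" "$_current"
}
```

Calling `check_unit postfix.service` prints a single status line and returns the matching exit code, which is the contract icinga expects from a check plugin.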
Turns out the first one ([1] above) is already packaged for Debian unstable but not backported.
I'm of a mind to just do the necessary packaging glue to be able to build it and upload
it to our swh debian repository [1]
[1] P1128
After some bumps, it's finally uploaded there [1].
The packaging repository is mirrored on our forge [2], and the package was built from the debian/buster-swh branch.
Tomorrow I'll address the puppet part that actually installs the icinga alert (which will in turn install that package).
[1]
tony@yavin4 $ swh-debian-upload monitoring-plugins-systemd_2.3.1-1~swh1~bpo10+1_amd64.changes
The .dsc file is already signed.
Would you like to use the current signature? [Yn]y  # <----- detail: already signed earlier with failing upload due to me!
signfile monitoring-plugins-systemd_2.3.1-1~swh1~bpo10+1_amd64.changes 0D10C3B8
Successfully signed dsc and changes files
Exporting indices...

root@pergamon:/srv/softwareheritage/repository# reprepro ls monitoring-plugins-systemd
monitoring-plugins-systemd | 2.3.1-1~swh1~bpo10+1 | buster-swh | i386, amd64, source
root@pergamon:/srv/softwareheritage/repository# apt-cache search monitoring-plugins-systemd
root@pergamon:/srv/softwareheritage/repository# apt update
root@pergamon:/srv/softwareheritage/repository# apt-cache search monitoring-plugins-systemd
monitoring-plugins-systemd - systemd plugin for nagios compatible monitoring systems
[2] https://forge.softwareheritage.org/source/monitoring-plugins-systemd/
The postfix alert is now installed, so next time we should be notified ;)
Closing now.