We should be able to monitor the state of our ever-growing list of swh services [1]:
- swh-worker@swh_indexer_ctags.service
- swh-worker@swh_indexer_fossology_license.service
- swh-worker@swh_indexer_language.service
- swh-worker@swh_indexer_mimetype.service
- swh-worker@swh_indexer_orchestrator.service
- swh-worker@swh_indexer_orchestrator_text.service
- swh-worker@swh_indexer_rehash.service
- swh-worker@swh_lister_debian.service
- swh-worker@swh_lister_github.service
- swh-worker@swh_lister_gitlab.service
- swh-worker@swh_lister_pypi.service
- swh-worker@swh_loader_debian.service
- swh-worker@swh_loader_deposit.service
- swh-worker@swh_loader_git.service
- swh-worker@swh_loader_git_disk.service
- swh-worker@swh_loader_mercurial.service
- swh-worker@swh_loader_svn.service
- swh-worker@swh_storage_archiver.service
- swh-worker@swh_vault_cooker.service
Ideally, we should be alerted when a service is not in its expected state, e.g.:
- stopped but should be running
- running but should be stopped
- disabled but should be enabled
- ...
[1] https://forge.softwareheritage.org/source/puppet-swh-site/browse/production/site-modules/profile/manifests/swh/deploy/worker/
Note:
- This need is independent of queue consumption detection (e.g. service running but associated queue consumption cancelled).
- All of these services are systemd units.
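
The state mismatches listed above could be detected with a small check along these lines. This is only a sketch, not an existing swh tool: the names `UnitState`, `query_state` and `find_mismatches` are hypothetical, and it assumes the expected state of each unit is provided by the caller (for instance derived from the puppet manifests). It relies on `systemctl is-active` and `systemctl is-enabled`, which print the unit state on stdout.

```python
import subprocess
from typing import Dict, List, NamedTuple


class UnitState(NamedTuple):
    """State of one systemd unit (hypothetical helper type)."""
    active: bool   # True when `systemctl is-active` prints "active"
    enabled: bool  # True when `systemctl is-enabled` prints "enabled"


def query_state(unit: str) -> UnitState:
    """Ask systemd for the current state of a unit."""
    def ask(verb: str) -> str:
        # systemctl exits non-zero for inactive/disabled units, so we
        # read stdout instead of relying on the return code.
        result = subprocess.run(
            ["systemctl", verb, unit],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
        return result.stdout.strip()

    return UnitState(
        active=ask("is-active") == "active",
        enabled=ask("is-enabled") == "enabled",
    )


def find_mismatches(expected: Dict[str, UnitState],
                    actual: Dict[str, UnitState]) -> List[str]:
    """Compare expected and actual states, one message per mismatch."""
    problems = []
    for unit, want in sorted(expected.items()):
        got = actual[unit]
        if want.active and not got.active:
            problems.append(f"{unit}: stopped but should be running")
        if not want.active and got.active:
            problems.append(f"{unit}: running but should be stopped")
        if want.enabled and not got.enabled:
            problems.append(f"{unit}: disabled but should be enabled")
        # Symmetric case, not listed explicitly in the task (assumption).
        if not want.enabled and got.enabled:
            problems.append(f"{unit}: enabled but should be disabled")
    return problems
```

On a worker, the actual states could be collected with `{unit: query_state(unit) for unit in expected}` and passed to `find_mismatches`; a non-empty result would then be turned into an alert by whatever monitoring system we settle on.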