Unstick infrastructure.
What happened so far:
- IRC notifications around 25/07/2021 3:27 about socket timeouts
- The issue then escalated and most public-facing services went down
- Analysis started the next morning, 26/07
- First, status.softwareheritage.org was manually updated to announce the issue on the channels
- Around noon, Ceph was identified as the cause: it had emitted a large volume of logs, which filled the disk on /
- Fixed the storage issue and progressively restarted the nodes
- ...