
vpn (and ssh) connection to louvre.s.o failure
Closed, Migrated, Edits Locked

Description

Since around noon today, VPN connections to louvre.softwareheritage.org fail with a timeout.
I cannot ssh to that machine via its public IP either, but that might be expected, I don't know. The host responds to ping though.
I haven't noticed other disruptions in user-facing services, but this makes it impossible to access any internal machine for now.

Event Timeline

zack triaged this task as Unbreak Now! priority. Mar 13 2021, 1:48 PM
zack created this task.
zack updated the task description.
olasd claimed this task.
olasd added a subscriber: olasd.

SSH to our public IPs is not allowed, except for the forge host (and the public ssh is constrained to the git user there).

The OpenVPN CRL had expired.

As I didn't know what was happening, I accessed a random hypervisor via iDRAC. I then figured out which node louvre was on (grep -r louvre /etc/pve/nodes), sshed to that host, then zapped the machine. As the VPN didn't come back, I connected to the machine via qm terminal 108, logged in, and checked the openvpn logs.
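For next time, the lookup step could be scripted. This is only a sketch: it assumes the usual Proxmox layout /etc/pve/nodes/<node>/qemu-server/<vmid>.conf, and pve_locate is a hypothetical helper name, not an existing tool.

```sh
# Split one line of `grep -r <name> /etc/pve/nodes` output into "<node> <vmid>".
# Assumes the path looks like /etc/pve/nodes/<node>/qemu-server/<vmid>.conf
# (hypothetical helper, for illustration only).
pve_locate() {
    echo "$1" | sed -n 's|^/etc/pve/nodes/\([^/]*\)/qemu-server/\([0-9]*\)\.conf:.*|\1 \2|p'
}

# Example, on any hypervisor of the cluster:
#   pve_locate "$(grep -r louvre /etc/pve/nodes | head -n 1)"
# then ssh to the printed node and run `qm terminal <vmid>` there.
```

Lines that do not match the assumed layout produce no output, so an empty result means the VM config was not found where expected.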

To update the CRL as root:

cd /etc/openvpn/keys; ./easyrsa gen-crl; systemctl reload openvpn
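To check how long a CRL remains valid (for instance right after regenerating it), something like the following should work. It is a sketch under assumptions: the CRL path is the easy-rsa 3 default under the keys directory, which I have not verified on louvre, crl_seconds_left is a made-up helper name, and date -d requires GNU date.

```sh
# Assumed location of the CRL (easy-rsa 3 layout); adjust to the real path.
CRL_FILE="${CRL_FILE:-/etc/openvpn/keys/pki/crl.pem}"

# Print the number of seconds before the CRL expires (negative if expired).
# $1: a "nextUpdate=..." line as printed by `openssl crl -noout -nextupdate`
# $2: current time in epoch seconds (defaults to now)
crl_seconds_left() {
    expiry=$(date -d "${1#nextUpdate=}" +%s) || return 1   # GNU date only
    echo $(( expiry - ${2:-$(date +%s)} ))
}

# Example invocation against the real file:
#   crl_seconds_left "$(openssl crl -in "$CRL_FILE" -noout -nextupdate)"
```

A negative result means the CRL has already expired, which is the situation that broke the VPN here.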

We should probably run some form of that command from cron every week or so, to avoid the issue recurring. Unfortunately, it looks like an openvpn reload bounces all clients, so we should take care to do that at a time when people are able to restore their connection.
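One possible shape for such a job, as a sketch rather than a tested deployment: regenerate and reload only when the CRL is close to expiry, so clients are not bounced on every run. The CRL path, the 30-day threshold and the function names are assumptions for illustration; date -d requires GNU date, and easyrsa gen-crl will only run unattended if the CA key needs no passphrase.

```sh
#!/bin/sh
KEYS_DIR=/etc/openvpn/keys          # directory used in the command above
CRL_FILE="$KEYS_DIR/pki/crl.pem"    # assumed easy-rsa 3 default location
MIN_DAYS_LEFT=30                    # hypothetical renewal threshold

# Succeeds when fewer than $3 days remain between now ($2, epoch seconds)
# and the CRL expiry ($1, epoch seconds).
crl_should_renew() {
    [ $(( $1 - $2 )) -lt $(( $3 * 86400 )) ]
}

# Regenerate the CRL and reload OpenVPN only when the CRL is near expiry.
renew_crl_if_needed() {
    next_update=$(openssl crl -in "$CRL_FILE" -noout -nextupdate) || return 1
    expiry=$(date -d "${next_update#nextUpdate=}" +%s) || return 1
    if crl_should_renew "$expiry" "$(date +%s)" "$MIN_DAYS_LEFT"; then
        # Subshell so the caller's working directory is left unchanged.
        ( cd "$KEYS_DIR" && ./easyrsa gen-crl && systemctl reload openvpn )
    fi
}

# Call renew_crl_if_needed from a root cron job or a systemd timer,
# scheduled at a time when a client reconnect is acceptable.
```

With a threshold well below the CRL lifetime, a weekly run would actually reload openvpn only once per validity period rather than every week.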