
Install a new VPN endpoint at Rocquencourt
Open, Normal, Public

Description

There is currently only one VPN endpoint at Rocquencourt, on louvre.

Louvre recently experienced a catastrophic failure and had to be rebooted, shutting down all existing network connections.
It will also soon be moved to a different physical bay, once again shutting down connections for all users.

This can be avoided by creating a second VPN endpoint at Rocquencourt, on a different host.

Event Timeline

ftigeot created this task. Feb 12 2019, 1:12 PM
ftigeot triaged this task as Normal priority.

Louvre had previously gone down more than once. Some of these events are documented in T1173.

olasd added a subscriber: olasd. Feb 18 2019, 7:30 PM

Thanks for recording this task; I'll use the opportunity to document the reasoning behind the current internal networking setup, to try and make sure nothing is forgotten before migrating it.

There's more to the networking single point of failure than just the VPN (which one, btw?), and I think adapting the configuration to remove the single point of failure will be more work than just moving the VPN configuration to a VM.

The "internal" network is currently split across three logical networks:

  • SESI VLAN 440 (network 192.168.100.0/24) connects the "internal" interface of all the machines at Rocquencourt (bare metal, VM, containers);
  • Azure Virtual Network swh-vnet (network 192.168.200.0/22) connects our machines on Azure;
  • OpenVPN (network 192.168.101.0/24), connecting some external resources (two containers in Bologna, orangerie and orangeriedev), as well as roaming or static staff machines to one another.

The current expectation throughout the infrastructure is that those three networks can communicate with each other directly. The way we currently do that is by having a common router with a leg on each of the networks.

The machine currently acting as this router is louvre, which is:

  • the default gateway (192.168.100.1) for the VLAN 440 network (also set up to do MASQUERADEing towards the public internet; sketched just below the list)
  • the OpenVPN server
  • the local endpoint for an IPSEC tunnel allowing communication between VLAN 440, OpenVPN and the Azure Virtual Network
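
For reference, the gateway/MASQUERADE part boils down to something like the following on louvre (a minimal sketch; eth0 stands in for the public interface and is an assumption, not the actual production name):

    # allow forwarding between the internal networks and the outside world
    sysctl -w net.ipv4.ip_forward=1
    # NAT traffic from the private VLAN 440 range going out the public interface
    iptables -t nat -A POSTROUTING -s 192.168.100.0/24 -o eth0 -j MASQUERADE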

To make the network communication across all machines work seamlessly, a few classes of configurations are required.

  • Rocquencourt machines that are set up as purely internal (no public IP address), such as containers, some virtual machines, and the elasticsearch and ceph nodes, have the easiest config: a static network configuration with a default route pointing at 192.168.100.1. Since that address is the common router, this lets them reach all networks, as well as the outside world, with no extra configuration.
  • Rocquencourt machines with a public IP address (workers, VMs with external services, some hypervisors) are set up with two interfaces, one on the public VLAN (untagged on the physical network) and one on the private VLAN (id=440 on the physical network):
    1. On the main routing table:
      • The default route (reaching the outside world) is set via the public interface.
      • We add explicit routes for the two VPN networks via the internal interface gateway.
    2. By default, Linux will make packets sent from any IP address go through the default gateway; to avoid that, and to make sure bidirectional communication across all three internal networks succeeds, we do the following (sketched right after this list):
      1. Define an extra routing table private, in /etc/iproute2/rt_tables
      2. Set an ip rule to make packets outgoing on the private interface use the private routing table
      3. Set a default route on the private table to go through 192.168.100.1.
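
Roughly, and with placeholder values (the table id 100, the interface name eth1 and the address 192.168.100.42 are illustrative, not the production ones), that policy routing boils down to:

    # /etc/iproute2/rt_tables: reserve an id for the extra "private" table
    echo "100 private" >> /etc/iproute2/rt_tables

    # main table: explicit routes to the two VPN networks via the internal gateway
    ip route add 192.168.101.0/24 via 192.168.100.1 dev eth1
    ip route add 192.168.200.0/22 via 192.168.100.1 dev eth1

    # packets leaving from the private interface's address use the private table,
    # whose default route goes through the internal router
    ip rule add from 192.168.100.42 lookup private
    ip route add default via 192.168.100.1 dev eth1 table private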

Both Rocquencourt configurations, in simple cases (i.e. not for hypervisors with bridges and VLAN trunking and bonding on top of one another), are handled by the profile::network puppet manifest.

  • The Azure routing matrix is, quite frankly, black magic to my eyes. The setup is as follows:
    • We have defined a virtual network, swh-vnet, to use the range 192.168.200.0/22.
    • All virtual machines have at least one network interface:
      1. we attach it to the swh-vnet network on VM setup
      2. an "IP configuration" is added so they get a ("Dynamic" but morally never changing) IP address in the range 192.168.200.0/22 through a DHCP server (managed by Azure infra).
        1. if public IP addressing is enabled on the "IP configuration", nothing changes in the output of the DHCP server, the VM only sees the private VNet address. However, bidirectional NAT happens at the edge of the network so that traffic to the public IP arrives at the VM, and traffic exiting the VM to the internet appears as coming from the public IP address.
        2. if public IP addressing is disabled, the machine still gets to talk to the outside world through a (built-in) NAT gateway.
    • To get communication to/from swh-vnet, we have set up a "Virtual network gateway" swh-vnet-gateway:
      • https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-about-vpngateways
      • "Site-to-Site" configuration uses IPSEC / IKEv2 on the backend
      • Azure-side configuration:
        1. declare louvre.softwareheritage.org as "Local network gateway", setting up the networks that are available behind it (192.168.100.0/24 and 192.168.101.0/24)
        2. declare a "Connection" between swh-vnet-gateway and louvre.softwareheritage.org (set the shared secret)
      • louvre-side configuration:
        1. strongswan as IPSEC client
        2. Tunnel configuration in /etc/ipsec.conf (an illustrative sketch follows after this list)
        3. configuration skimmed from a random blog post, confirmed by comparing with the Cisco config provided by Azure
      • On the Azure VM side, the routing happens automatically within their routing fabric
      • On the louvre side, IPSEC routing happens directly within the kernel (the routes aren't visible through ip route; only ipsec status shows the state of the tunnels)
  • The OpenVPN clients get their routes to the three networks in two ways (also sketched below):
    • they can communicate with each other with the client-to-client directive enabled in the OpenVPN config
    • routes to the other networks are pushed explicitly (push "route xxx yyy" directives)
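
To give an idea of what the louvre side of the IPSEC tunnel looks like, here is a minimal, illustrative /etc/ipsec.conf connection block; the connection name, the placeholder addresses and the absence of explicit cipher proposals are assumptions, not the production values:

    # /etc/ipsec.conf (illustrative sketch, not the production file)
    conn azure-swh-vnet
        keyexchange=ikev2
        type=tunnel
        authby=secret
        left=%defaultroute
        leftsubnet=192.168.100.0/24,192.168.101.0/24
        right=<azure-gateway-public-ip>
        rightsubnet=192.168.200.0/22
        auto=start

    # /etc/ipsec.secrets (shared secret matching the Azure "Connection")
    <louvre-public-ip> <azure-gateway-public-ip> : PSK "shared-secret-here"

The OpenVPN side of the route pushing described above would look roughly like this (the exact set of routes pushed in production may differ):

    # OpenVPN server configuration (excerpt)
    server 192.168.101.0 255.255.255.0
    client-to-client
    # Rocquencourt VLAN 440
    push "route 192.168.100.0 255.255.255.0"
    # Azure swh-vnet (192.168.200.0/22)
    push "route 192.168.200.0 255.255.252.0"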

So, to sum up, the current SPOF is actually a nicely tangled ball of roles, listed in order of priority from (IMO) most to least important:

  • gateway to the internet for internal-only machines at Rocquencourt
  • router between the private networks
  • IPSEC network-to-network gateway for Azure
  • OpenVPN gateway for roaming clients and odd external machines that don't warrant a network-to-network connection

All these bits also warrant a bunch of firewalling rules, both on the SESI side and with the partners we reach over OpenVPN, which took quite a while to get right and can't be changed in a pinch.

I honestly have no idea from which side to start pulling this string to untangle it, but at least now the breakdown of the setup is written down somewhere.

Thanks @olasd for this piece of information.

One remark: I don't want to move the OpenVPN server onto a VM; I want (at least) two OpenVPN endpoints on two different physical machines. Indeed, we might want to have the same redundancy for other networking paths currently going through louvre. This may also raise the question (already mentioned by @ftigeot IINM) of using a dedicated network-appliance-like machine as the central router at Rocquencourt (I used Lanner-brand appliances without problems at my previous job, or it might also be a simple 1U Dell machine).

One question: why are there so many machines with a public IPv4? I understand this for louvre, since it's the current main router (doing MASQ etc.), but I don't really get why pergamon, tate, moma, banco or beaubourg have public IPs. I'd much prefer a VIP-like frontend (e.g. with varnish or nginx acting as a reverse proxy, potentially doing load balancing, caching, TLS handshakes and so on).
This frontend machine does not even need a public IP; doing DNAT on the front router(s) could be enough, I think.

Or something like:

                               +-+
           pub ip1             |p|
                 +-- swh-fw1 --+r|
                /              |i|
Internet -- SESI               |v|
                \              |a|
                 +-- swh-fw2 --+t|
           pub ip2             |e|
                               +-+
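
For what it's worth, the DNAT idea above could be as simple as a rule like the following on the front router(s) (the public and internal addresses are placeholders):

    # forward HTTPS traffic hitting a public address to the internal frontend
    iptables -t nat -A PREROUTING -d 203.0.113.10 -p tcp --dport 443 \
        -j DNAT --to-destination 192.168.100.50:443
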
olasd added a comment. Feb 19 2019, 2:58 PM

I'm slightly reordering what you wrote here, sorry!

One remark: I don't want to move the OpenVPN server onto a VM; I want (at least) two OpenVPN endpoints on two different physical machines.

Please help me understand the concrete reasoning behind OpenVPN needing such a high-availability setup. While it's an important tool for backend admin work (e.g. deployment of new versions), an OpenVPN downtime will not prevent most of the team from working. The only bit of public-facing infrastructure that really depends on OpenVPN is the vault, and it's been designed to gracefully handle unavailability, on purpose.

I think we should spend time identifying the team's day-to-day tasks that depend on OpenVPN, and thinking of alternative ways of achieving them. For instance, one thing I can think of right off the bat is access to production logs through kibana, which we should be able to put behind a (public) reverse proxy with authentication.
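
As an illustration of that last idea, a reverse proxy with basic authentication in front of kibana could look roughly like this (the hostname, certificate paths and upstream address are assumptions, not actual values):

    server {
        listen 443 ssl;
        server_name kibana.softwareheritage.org;

        ssl_certificate     /etc/ssl/certs/kibana.pem;
        ssl_certificate_key /etc/ssl/private/kibana.key;

        location / {
            # restrict access to authenticated staff
            auth_basic           "Software Heritage logs";
            auth_basic_user_file /etc/nginx/htpasswd-kibana;
            # kibana's default port on an internal machine
            proxy_pass           http://192.168.100.60:5601;
            proxy_set_header     Host $host;
        }
    }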

One question: why are there so many machines with a public IPv4? I understand this for louvre, since it's the current main router (doing MASQ etc.), but I don't really get why pergamon, tate, moma, banco or beaubourg have public IPs.

There are two distinct answers to this question:

  • workers each have a separate public IP address to avoid a potential upstream blacklist blocking all of them at once. They also have a mostly free-for-all whitelist on the SESI edge firewall.
  • machines with public services have a public IP address because it is (or, I guess, was) the solution with the least overhead to spin up the infra at the time.

Our infra grew organically around a single hypervisor (louvre) with a large disk array directly attached, and we've never really done any refactoring to get away from that. "Load balancing" across our public IP addresses could surely happen at the network border just as well as separately on each machine, as is done currently.

I'd much prefer a VIP-like frontend (e.g. with varnish or nginx acting as a reverse proxy, potentially doing load balancing, caching, TLS handshakes and so on). This frontend machine does not even need a public IP; doing DNAT on the front router(s) could be enough, I think.

Sure, I don't see any reason not to gradually move towards that. For instance, that's how Jenkins has been set up: jenkins.softwareheritage.org points to pergamon's public IP, which reverse-proxies to the service on the dedicated VM.

Or something like:

                               +-+
           pub ip1             |p|
                 +-- swh-fw1 --+r|
                /              |i|
Internet -- SESI               |v|
                \              |a|
                 +-- swh-fw2 --+t|
           pub ip2             |e|
                               +-+

Indeed, we might want to have the same redundancy for other networking paths currently going through louvre.
This may also raise the question (already mentioned by @ftigeot IINM) of using a dedicated network-appliance-like machine as the central router at Rocquencourt (I used Lanner-brand appliances without problems at my previous job, or it might also be a simple 1U Dell machine).

I'm not questioning that the status quo needs to be improved here. We just need to keep in mind that adding complexity (be that different hardware, software high availability, ...) to the network setup means more moving parts, more maintenance overhead, more things to learn, more things to monitor.

My highest priority concern on the current setup is that everything is cobbled together by hand without much consistency or monitoring, and that only my head used to contain a coherent view of the full setup (now, it's my head and this ticket, yay).

All in all, I would really like our internal networking to move towards something that is:

  • declarative: we should have a somewhat self-documenting configuration for the topology of the network.
  • generic and consistent: we should be able to route across disparate networks, e.g. from different cloud providers, as well as on-premises networks and a VPN for admin tasks.
  • reliable and resilient: there must be some sort of mesh routing between our networks, so that things keep working when a link fails; networks must support having several edge routers.