
Network refactoring - step 1
Closed, Migrated, Edits Locked

Description

[Results of the meetings done with @ardumont and @olasd]

The current network architecture has several problems:

  • any server can reach any other server, regardless of environment
  • routing management is complicated by inter-VLAN routing (staging <-> production <-> public)
  • several servers have a public interface exposed to the internet via the public VLAN, whereas for security reasons they should be protected behind the firewall / reverse proxy

New needs have also appeared that can't easily be addressed with the current network organization:

  • staging / production isolation, to avoid undesirable interactions between these two environments
  • opening the staging environment to the outside world, so that external people can test their contributions
  • allowing external people to access the staging environment via the VPN (without being able to access the production or admin servers)
  • [and probably many others]

The current network architecture is:

The goal is to progress step by step and to validate the new architecture by testing it on non-critical services (staging, new services like the inventory app).

The proposed solution is to:

  • manage inter-VLAN communication with a global firewall, to centralize the routing and filtering information
  • migrate the VPN and IPSec tunnel to this new firewall
  • manage the public entrypoints (web sites, monitoring, ...) with reverse proxies, one per environment
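
As a rough illustration of what centralizing the filtering on a single firewall could look like, here is a minimal sketch of a default-deny policy between zones. The subnets, the zone names and the list of allowed flows are hypothetical and chosen only for the example; they are not the actual addressing plan or ruleset.

```python
from ipaddress import ip_address, ip_network

# Hypothetical zones and subnets, for illustration only
# (not the real VLAN addressing plan).
ZONES = {
    "staging": ip_network("10.0.1.0/24"),
    "production": ip_network("10.0.2.0/24"),
    "admin": ip_network("10.0.3.0/24"),
    "vpn-external": ip_network("10.0.4.0/24"),
}

# Assumed set of allowed (source zone, destination zone) pairs.
# Any inter-zone flow not listed here is denied.
ALLOWED_FLOWS = {
    ("admin", "staging"),         # assumption: monitoring/logs reach staging
    ("admin", "production"),      # assumption: monitoring/logs reach production
    ("vpn-external", "staging"),  # external contributors reach staging only
}


def zone_of(address):
    """Return the name of the zone containing the address, or None."""
    ip = ip_address(address)
    for name, network in ZONES.items():
        if ip in network:
            return name
    return None


def is_allowed(src, dst):
    """Decide whether a flow from src to dst would pass the firewall."""
    src_zone, dst_zone = zone_of(src), zone_of(dst)
    if src_zone is None or dst_zone is None:
        return False  # unknown networks are denied by default
    if src_zone == dst_zone:
        return True   # intra-zone traffic is not routed through the firewall
    return (src_zone, dst_zone) in ALLOWED_FLOWS
```

With such a table, the staging / production isolation and the "VPN users see staging only" requirement become two missing entries in `ALLOWED_FLOWS` rather than routing rules scattered across servers.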

The proposal for step one:

The identified tasks are:

  • Convert the unused ceph VLAN (VLAN442) to an admin VLAN. The target is to migrate the admin / monitoring / logs / internal servers into this VLAN, to isolate them from the production of the archive
  • Choose and install a software firewall; pfsense and opnsense are good candidates. Elected: opnsense
  • Create and install the netbox server
  • Create an admin reverse proxy exposing the netbox service (internal only)
  • Convert the servers used for the ceph cluster to the new staging environment (database + proxmox, to confirm and detail)
    • storage1
    • db1
    • ceph-mon: in-progress
  • T2721: Deploy the staging services on the new staging environment (to detail on several subtasks)
  • T2747: Create the reverse proxy to expose the staging services publicly

For the record, this is a copy of the whiteboard after the meeting:

Event Timeline

zack triaged this task as Normal priority. Oct 1 2020, 1:24 PM

@olasd I looked at the swh-docs repository to store the sources of the diagrams, as you suggested, but I'm not sure it is the best place for them, as the goal is not to display them on the doc site.

For now, I will only attach them to this task.

vsellier changed the task status from Open to Work in Progress. Oct 13 2020, 9:52 AM
vsellier changed the status of subtask T2755: Monitor the firewalls from Open to Work in Progress. Nov 9 2020, 11:04 AM

The network configuration is done, and the staging archive and deposit are now exposed publicly. The main goal of the task is achieved.
The staging VMs can be moved to their dedicated hypervisor once it is available; in the end, this is not a mandatory step for this task, as we were able to use the existing hypervisors.

gitlab-migration changed the status of subtask T2721: Install and configure a firewall for the staging environment from Resolved to Migrated.
gitlab-migration changed the status of subtask T2747: Create the reverse proxy to expose the staging services publicly from Resolved to Migrated.
gitlab-migration changed the status of subtask T2755: Monitor the firewalls from Resolved to Migrated.