Impacts after migration:
- [1] is still reachable as before
- the machine is reachable over ssh at riverside.internal.admin.swh.network
Note:
Node exposing sentry service: riverside.internal.softwareheritage.org [2].
[1] https://sentry.softwareheritage.org
[2] https://inventory.internal.admin.swh.network/virtualization/virtual-machines/12/
Step-by-step plan:
- Gandi: Reduce the sentry.s.o CNAME TTL early (days before the migration starts, e.g. to ~300s)
- Inventory:
  - Reserve new ip in vlan 442
  - Deprecate the ip from vlan 440
- D7045: Puppet manifest adaptations for moving the node to the admin vlan [4]
- Firewall: Open rule to allow access from pergamon to riverside:9000
- On {pergamon, riverside, rp1} [5]:
  - Stop puppet agent
- On pergamon:
  - Deploy new puppet manifest change (last time we forgot ¯\_(ツ)_/¯)
- On riverside:
  - Update the ip to the new vlan442 ip (192.168.50.70)
    - Connect through ssh and adapt /etc/network/interfaces with the new ip
    - Modify directly through the proxmox ui (not terraform-ed yet)
    - Adapt the network hardware entry (proxmox ui) to change from vmbr0 to vmbr442
  - Update the hostname to riverside.i.a.s.n
  - Remove the puppet certificates (agent node): rm -rf /var/lib/puppet/ssl
  - Update the deployment and subnet facts in /etc/facter/facts.d/ (deployment.txt, subnet.txt) to the admin values [6]
  - Reboot machine (poweroff, start)
  - Run puppet: puppet agent --test --fqdn riverside.internal.admin.swh.network
  - Install necessary facts for cloud-init to stop tampering with /etc/hosts
- On pergamon:
  - Run puppet agent
  - Decommission riverside.i.s.o certificate
- On rp1:
  - Run puppet agent
- Gandi: Change sentry.s.o CNAME value from pergamon to swh-rproxy3.inria.fr. (to target the admin reverse proxy)
- Inventory:
  - Change the reserved ip status to active
  - Update sentry node with its new ip [1]
- Clean up no longer necessary sentry reverse proxy on pergamon
- Gandi: Bump the sentry.s.o CNAME TTL back to the "standard" value of 1800 (like the others)
- Terraform:
  - Reference riverside node in sysadm terraform admin manifest [3]: not done, as the node is diverging too much and the risk/benefit seems off.
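
The Gandi steps in the plan (lower the TTL first, switch the CNAME target, then restore the TTL) can be checked from any machine with dig. The following is only a sketch: the helper names cname_ttl, cname_target and ttl_is_low are made up for illustration, and it assumes sentry.s.o stands for sentry.softwareheritage.org as in [1]. The helpers parse the answer line printed by `dig +noall +answer`, whose fields are NAME TTL CLASS TYPE RDATA.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any existing tooling) to inspect a CNAME
# record from `dig +noall +answer` output read on stdin.

# Print the TTL of the first CNAME record found.
cname_ttl() {
    awk '$4 == "CNAME" { print $2; exit }'
}

# Print the target of the first CNAME record found.
cname_target() {
    awk '$4 == "CNAME" { print $5; exit }'
}

# Succeed when TTL ($1) is at most the threshold ($2, default 300 seconds).
ttl_is_low() {
    ttl=$1
    max=${2:-300}
    [ "$ttl" -le "$max" ]
}

# Example against live DNS (not executed by this sketch):
#   dig +noall +answer sentry.softwareheritage.org CNAME | cname_ttl
#   dig +noall +answer sentry.softwareheritage.org CNAME | cname_target
```

A recursive resolver reports the remaining cache lifetime, not the configured TTL, so querying one of the zone's authoritative name servers gives a more reliable reading before starting the migration.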
[4] Check the diff description/code for more details
[5]
$ clush -b -w pergamon -w riverside -w rp1.internal.admin.swh.network "puppet agent --disable T3891"
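
The agents disabled in [5] have to be re-enabled before the later "Run puppet agent" steps can do anything. A minimal sketch of a helper that prints the matching clush command lines, without running them, could look like this. The function name puppet_toggle_cmd and the NODES variable are illustrative assumptions; the node list and the T3891 message are taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the clush command that disables or
# re-enables the puppet agent on the nodes involved in the migration.

NODES="pergamon riverside rp1.internal.admin.swh.network"

puppet_toggle_cmd() {
    action=$1
    reason=${2:-}
    node_args=""
    # Build one "-w node" argument per target node.
    for node in $NODES; do
        node_args="$node_args -w $node"
    done
    case "$action" in
        disable) remote="puppet agent --disable $reason" ;;
        enable)  remote="puppet agent --enable" ;;
        *) echo "usage: puppet_toggle_cmd disable|enable [reason]" >&2
           return 1 ;;
    esac
    printf 'clush -b%s "%s"\n' "$node_args" "$remote"
}

# Example:
#   puppet_toggle_cmd disable T3891   # prints the command shown in [5]
#   puppet_toggle_cmd enable          # prints the matching re-enable command
```

Printing the command first leaves room to review it before pasting it into a root shell.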
[6]
root@riverside:~# cat /etc/facter/facts.d/deployment.txt
deployment=admin
root@riverside:~# cat /etc/facter/facts.d/subnet.txt
subnet=sesi_rocquencourt_admin
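
The two external facts shown in [6] are plain key=value text files, so they can be written in a repeatable way. This is only a sketch under that assumption: write_fact and set_admin_facts are hypothetical names, and the target directory is a parameter so the result can be tried in a scratch directory before touching /etc/facter/facts.d.

```shell
#!/bin/sh
# Hypothetical helpers to write the facter external facts listed in [6].

# write_fact DIR NAME VALUE: write "NAME=VALUE" to DIR/NAME.txt.
write_fact() {
    dir=$1
    name=$2
    value=$3
    mkdir -p "$dir" || return 1
    printf '%s=%s\n' "$name" "$value" > "$dir/$name.txt"
}

# set_admin_facts [DIR]: write both admin values (default: /etc/facter/facts.d).
set_admin_facts() {
    dir=${1:-/etc/facter/facts.d}
    write_fact "$dir" deployment admin &&
        write_fact "$dir" subnet sesi_rocquencourt_admin
}

# Example (as root on riverside, not executed by this sketch):
#   set_admin_facts
#   cat /etc/facter/facts.d/deployment.txt /etc/facter/facts.d/subnet.txt
```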