Louvre is no longer able to run all existing VMs on its own.
A third hypervisor has to be added to our infrastructure so that we can keep operating smoothly when one of the hypervisors is out of service.
Description
Status | Assigned | Task
---|---|---
Migrated | gitlab-migration | T1173 Huge slowdowns on louvre since 2018-08-20
Migrated | gitlab-migration | T1392 Add a new hypervisor
Migrated | gitlab-migration | T1503 Rename hypervisor3 to a museum name
Event Timeline
New hypervisor hardware has been racked in our bay at Rocquencourt.
The machine's iDRAC management interface is reachable on the management network, under the name swh7-adm.inria.fr (details on the wiki).
I've reinstalled the machine following these steps:
- Debian installed on the machine with the plain Debian installer (no bonding support => no network)
- network configured by hand with iproute2 (with the bond + vlan stack)
  - https://baturin.org/docs/iproute2/ was quite useful
- installed facter from stretch-backports and puppet from stretch
- seeded /etc/facter/facts.d
- ran puppet
- installed ifenslave, bridge-utils and vlan to get proper /etc/network/interfaces support
- wrote a proper /etc/network/interfaces, taking inspiration from louvre's
- rebooted and hoped the network would come up; rinse, repeat
- set up Proxmox as advised in https://pve.proxmox.com/wiki/Install_Proxmox_VE_on_Debian_Stretch
- rebooted on the Proxmox kernel
- updated /etc/pve/corosync.conf with the new host on one of the existing hypervisors:
  - added the node to the nodelist with a new nodeid
  - added its unicast ip address to ring0 in the totem section
  - incremented the config version number in the totem section
- restarted pve-cluster on the existing nodes to take the new corosync config into account
- followed https://pve.proxmox.com/wiki/Cluster_Manager#pvecm_join_node_to_cluster:
  - ran pvecm add 192.168.100.1
  - typed louvre's root password
  - saw the host appear in the Proxmox cluster
- rebooted one last time for good measure (and to check that everything starts ok on boot)
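For reference, the temporary iproute2 bring-up mentioned in the steps above amounts to something like the following sketch. The interface names (eno1, eno2), VLAN id (440) and addresses are illustrative assumptions, not the actual values used on this host:

```shell
# Create an LACP bond and enslave the two physical NICs
# (interface names are illustrative; check them with `ip link`)
ip link add bond0 type bond mode 802.3ad
ip link set eno1 down
ip link set eno1 master bond0
ip link set eno2 down
ip link set eno2 master bond0
ip link set bond0 up

# Stack a VLAN interface on top of the bond (vlan id is an assumption)
ip link add link bond0 name bond0.440 type vlan id 440
ip link set bond0.440 up

# Assign the host address and default route (illustrative values)
ip addr add 192.168.100.7/24 dev bond0.440
ip route add default via 192.168.100.254
```

This configuration is lost on reboot, which is why the persistent /etc/network/interfaces had to be written afterwards.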
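A persistent /etc/network/interfaces matching that bond + vlan stack (plus a bridge for the VMs, as is usual on a Proxmox host) could look like the sketch below. Interface names, VLAN id and addresses are again illustrative, not copied from louvre:

```shell
# /etc/network/interfaces (sketch, illustrative values)
auto bond0
iface bond0 inet manual
    bond-slaves eno1 eno2
    bond-mode 802.3ad
    bond-miimon 100

auto bond0.440
iface bond0.440 inet manual
    vlan-raw-device bond0

auto vmbr0
iface vmbr0 inet static
    address 192.168.100.7
    netmask 255.255.255.0
    gateway 192.168.100.254
    bridge-ports bond0.440
    bridge-stp off
    bridge-fd 0
```

This relies on the ifenslave, vlan and bridge-utils packages installed in the steps above.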
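The /etc/pve/corosync.conf edits described above (new nodelist entry, ring0 address, bumped config version) look roughly like this excerpt; the node name, nodeid, address and version number are illustrative:

```shell
# /etc/pve/corosync.conf (excerpt, illustrative values)
nodelist {
  node {
    name: hypervisor3
    nodeid: 3                    # new, unique nodeid
    quorum_votes: 1
    ring0_addr: 192.168.100.7    # unicast ip of the new host
  }
}

totem {
  config_version: 4              # incremented from the previous value
}
```

Once the file is updated and pve-cluster restarted on the existing nodes, the new host can join with `pvecm add`.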
The new hypervisor has been working without any particular issue since its installation.