The terraform apply worked: the staging gateway was removed, apparently without side effects on the other servers.
Oct 23 2020
Thanks, I will try my first terraform apply 😬
Oct 22 2020
List of the rules created:
- icinga: Floating rule: icinga server -> *:icinga port (5665)
- prometheus: Floating rule: prometheus server -> *:prometheus ports (9100/9102/9237/7071/9419)
- logstash/journal: VLAN440 rule: * -> logstash server:logstash port (5044)
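The three rules above can be pictured as data. The sketch below is a hypothetical Python model, not the OPNsense configuration format: the `Rule` class, its field names and the `allows` helper are assumptions introduced for illustration, and the interface scope (floating vs VLAN440) is ignored. Only the ports and the traffic directions come from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    source: str        # host role allowed to initiate, "*" for any
    destination: str   # host role receiving the connection, "*" for any
    ports: frozenset   # allowed destination ports


RULES = [
    Rule("icinga", "icinga", "*", frozenset({5665})),
    Rule("prometheus", "prometheus", "*", frozenset({9100, 9102, 9237, 7071, 9419})),
    Rule("logstash/journal", "*", "logstash", frozenset({5044})),
]


def allows(source: str, destination: str, port: int) -> bool:
    """Return True if at least one rule permits the connection."""
    for rule in RULES:
        source_ok = rule.source in ("*", source)
        destination_ok = rule.destination in ("*", destination)
        if source_ok and destination_ok and port in rule.ports:
            return True
    return False
```

For instance, `allows("icinga", "worker0", 5665)` is permitted by the icinga rule, while the reverse direction on the same port matches no rule.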
worker0 is migrated and reachable. The DNS and icinga rules were correctly updated after puppet ran on worker0 and pergamon.
To update the server, I had to manually change the IP configuration and reboot it: puppet was failing because it could not determine the right IP in the 192.168.130.0 network while the server was still associated with an IP in 192.168.128.0:
root@worker0:~# puppet agent --test
Info: Using configured environment 'staging'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, pick(): must receive at least one non empty value (file: /etc/puppet/code/environments/staging/site-modules/profile/manifests/prometheus/node.pp, line: 31, column: 28) on node worker0.internal.staging.swh.network
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
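The `pick()` error above is consistent with a lookup that finds no address in the expected network. The sketch below illustrates that idea in Python with the standard `ipaddress` module. The function name `address_in_network` and the host addresses `192.168.128.10` and `192.168.130.10` are hypothetical, and the actual logic of the Puppet manifest is not shown in this thread.

```python
import ipaddress
from typing import Iterable, Optional


def address_in_network(addresses: Iterable[str], network: str) -> Optional[str]:
    """Return the first address that belongs to `network`, or None."""
    net = ipaddress.ip_network(network)
    for address in addresses:
        if ipaddress.ip_address(address) in net:
            return address
    return None


# Before the manual change: the host only has an address in the old network
# (hypothetical address), so nothing matches the new network.
before = address_in_network(["192.168.128.10"], "192.168.130.0/24")

# After the manual change and reboot: an address in the new network exists
# (hypothetical address).
after = address_in_network(["192.168.130.10"], "192.168.130.0/24")
```

With no match, a caller that requires at least one non-empty value has nothing to choose from, which is the situation the error message describes.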
The new routes also have to be manually declared on pergamon to reach the new networks.
Puppet declared them in the configuration but didn't reload the network:
root@pergamon:~# puppet agent --test
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for pergamon.softwareheritage.org
Info: Applying configuration version '1603355074'
Notice: /Stage[main]/Profile::Network/Debnet::Iface[eth0]/Concat[/etc/network/interfaces]/File[/etc/network/interfaces]/content:
--- /etc/network/interfaces 2020-09-15 16:10:15.235917411 +0000
+++ /tmp/puppet-file20201022-2531741-3gl773 2020-10-22 08:25:16.977289874 +0000
@@ -18,6 +18,8 @@
  up ip route add 192.168.101.0/24 via 192.168.100.1
  up ip route add 192.168.200.0/21 via 192.168.100.1
  up ip route add 192.168.128.0/24 via 192.168.100.125
+ up ip route add 192.168.130.0/24 via 192.168.100.130
+ up ip route add 192.168.50.0/24 via 192.168.100.130
  up ip rule add from 192.168.100.29 table private
  up ip route add 192.168.100.0/24 src 192.168.100.29 dev eth1 table private
  up ip route add default via 192.168.100.1 dev eth1 table private
@@ -25,6 +27,8 @@
  down ip route del default via 192.168.100.1 dev eth1 table private
  down ip route del 192.168.100.0/24 src 192.168.100.29 dev eth1 table private
  down ip rule del from 192.168.100.29 table private
+ down ip route del 192.168.50.0/24 via 192.168.100.130
+ down ip route del 192.168.130.0/24 via 192.168.100.130
  down ip route del 192.168.128.0/24 via 192.168.100.125
  down ip route del 192.168.200.0/21 via 192.168.100.1
  down ip route del 192.168.101.0/24 via 192.168.100.1
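The four lines added by the diff follow a simple pattern: each route gets an `up ... add` line, and the matching `down ... del` lines appear in reverse order. The Python sketch below reproduces only that pattern. The helper name `interface_route_lines` is hypothetical and this is not how the Puppet module generates the file.

```python
def interface_route_lines(routes):
    """Build `up`/`down` stanza lines for (destination, gateway) pairs."""
    up = [f"up ip route add {dst} via {gw}" for dst, gw in routes]
    # Removal happens in the reverse order of creation, as in the diff.
    down = [f"down ip route del {dst} via {gw}" for dst, gw in reversed(routes)]
    return up + down


# The two routes added for the new networks, taken from the diff.
NEW_ROUTES = [
    ("192.168.130.0/24", "192.168.100.130"),
    ("192.168.50.0/24", "192.168.100.130"),
]

for line in interface_route_lines(NEW_ROUTES):
    print(line)
```

Since the interface was not reloaded, the same two destinations and gateway still had to be applied by hand on the running host.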
The staging nodes will be migrated one by one to avoid too much noise in the monitoring and to make it easier to detect missing rules in the firewall. Puppet is disabled on all the staging nodes to avoid a massive migration :
Oct 21 2020
good news! thanks for the confirmation
After a hard time configuring the initial firewall rules correctly, because the inter-VLAN traffic was seen as coming from the gateway address and was not filtered, the fw rules now allow the following :
- Route manually declared on louvre:
root@louvre:~# ip route add 192.168.130.0/24 via 192.168.100.130 dev ens18
root@louvre:~# ip route add 192.168.50.0/24 via 192.168.100.130 dev ens18
root@louvre:~# ip route
default via 128.93.193.254 dev ens19 onlink
128.93.193.0/24 dev ens19 proto kernel scope link src 128.93.193.5
192.168.50.0/24 via 192.168.100.130 dev ens18
192.168.100.0/24 dev ens18 proto kernel scope link src 192.168.100.1
192.168.101.0/24 via 192.168.101.2 dev tun0
192.168.101.2 dev tun0 proto kernel scope link src 192.168.101.1
192.168.128.0/24 via 192.168.100.125 dev ens18
192.168.130.0/24 via 192.168.100.130 dev ens18
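To see which gateway louvre uses for a given destination with this table, the lookup can be sketched as a longest-prefix match. The Python below is a simplified illustration using the standard `ipaddress` module: the `next_hop` helper is hypothetical, the host route for 192.168.101.2 is left out, and the kernel's real lookup also takes policy rules and metrics into account.

```python
import ipaddress

# Routes copied from the `ip route` output above:
# (destination, next hop or outgoing device for directly connected networks).
ROUTES = [
    ("0.0.0.0/0", "128.93.193.254"),
    ("128.93.193.0/24", "ens19"),
    ("192.168.50.0/24", "192.168.100.130"),
    ("192.168.100.0/24", "ens18"),
    ("192.168.101.0/24", "192.168.101.2"),
    ("192.168.128.0/24", "192.168.100.125"),
    ("192.168.130.0/24", "192.168.100.130"),
]


def next_hop(address: str) -> str:
    """Pick the matching route with the longest prefix."""
    ip = ipaddress.ip_address(address)
    candidates = [
        (ipaddress.ip_network(dst), hop)
        for dst, hop in ROUTES
        if ip in ipaddress.ip_network(dst)
    ]
    # The default route always matches, so `candidates` is never empty.
    _network, hop = max(candidates, key=lambda item: item[0].prefixlen)
    return hop
```

A host in the new staging network, for example the hypothetical 192.168.130.20, resolves to 192.168.100.130, while a host still in 192.168.128.0/24 keeps going through 192.168.100.125.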
The route command is not installed on louvre, as it has been replaced by ip.
The staging and staging_new properties were renamed to staging_legacy and staging as you suggested. It's better this way.
Update after the review feedback
Some rules need to be declared to be able to reach the new networks through the firewall.
Oct 20 2020
Netbox updated accordingly : https://inventory.internal.softwareheritage.org/virtualization/virtual-machines/75/
VIPs configuration
On the FW UI, go to Interfaces / Virtual IPs / Settings
Add the following Virtual IPs :
- Mode: CARP / Interface: VLAN440 / Address: 192.168.100.130/24 / Virtual IP Password: not significant / VHID Group: 1 / Description: VLAN440 gw wip
- Mode: CARP / Interface: VLAN442 / Address: 192.168.50.1/24 / Virtual IP Password: not significant / VHID Group: 2 / Description: VLAN442 fw wip
- Mode: CARP / Interface: VLAN443 / Address: 192.168.130.1/24 / Virtual IP Password: not significant / VHID Group: 3 / Description: VLAN443 fw wip
- Mode: CARP / Interface: VLAN1300 / Address: 128.93.166.2/26 / Virtual IP Password: not significant / VHID Group: 4 / Description: VLAN1300 fw wip
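Before typing the four entries into the UI, the list can be sanity-checked offline. The Python sketch below is only an assumption about what is worth checking (each VHID group used once, no overlapping networks). The `check_vips` helper is hypothetical and is not part of OPNsense.

```python
import ipaddress

# (interface, address, VHID group) copied from the list above.
VIPS = [
    ("VLAN440", "192.168.100.130/24", 1),
    ("VLAN442", "192.168.50.1/24", 2),
    ("VLAN443", "192.168.130.1/24", 3),
    ("VLAN1300", "128.93.166.2/26", 4),
]


def check_vips(vips):
    """Return a list of problems: reused VHID groups or overlapping networks."""
    problems = []
    seen_vhids = set()
    networks = []
    for interface, cidr, vhid in vips:
        if vhid in seen_vhids:
            problems.append(f"{interface}: VHID group {vhid} is already used")
        seen_vhids.add(vhid)
        network = ipaddress.ip_interface(cidr).network
        for other_interface, other_network in networks:
            if network.overlaps(other_network):
                problems.append(f"{interface}: {network} overlaps {other_interface}")
        networks.append((interface, network))
    return problems


print(check_vips(VIPS))
```

An empty list means the entries passed both checks.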
The firewall was installed from the ISO image OPNsense-20.7-OpenSSL-dvd-amd64.iso, uploaded to the ceph-proxmox storage.
Oct 19 2020
The test phase is complete. There seems to be a consensus on OPNsense, with no blocking points.
Let's start the real implementation now.
formatting (fat finger)
formatting
formatting
rollback the network configuration commit (should be a new diff)
poc network configuration in markdown
Oct 16 2020
Oct 15 2020
There is a proxmox builder [1] for packer. I will give it a try to check whether we can benefit from the work done for vagrant on puppet and have a common base between the real VMs and the local VMs used for testing.
👍 it looks synchronized
Oct 14 2020
fix the wrong status change embedded with the previous comment
A prometheus exporter is available as an additional plugin.
The OpenVPN configuration supports a certificate authority and the CSR handling that is currently managed manually on louvre.
- IPSec / Azure configuration
I was not able to test the git backup plugin as it seems it's not yet released and it doesn't appear on the installable plugin list.
The commit for the version 1.0 was done 6 days ago : https://github.com/opnsense/plugins/commit/87c4c96fe1d1dc881f72f91ee67b6a84c9dea42a
I have also tested with the development version of pfsense but it also does not appear.
The HA was quite simple to configure with the documentation [1] and an additional blog post [2], which helps with the NAT section that is not very explicit in the official documentation.
It's recommended to have a dedicated network link between the two firewalls for the synchronization. In the tests I have done, I configured the sync on the admin network (VLAN442). It works, but it's not the optimal configuration.
Oct 13 2020
Well, I am leaving this problem aside for the moment, as nothing special is configured for the interface on VLAN1300 and I have no idea what the source of the problem could be. Perhaps the "illumination" will come later...
Having the WAN gateway declared on VLAN1300 is working well.
Changing the default gateway to 128.93.166.62 forces us to declare an additional route for the VPN connections (192.168.101.0/24 => gw 192.168.100.1).
pfSense and OPNsense were tested.
Oct 12 2020
@olasd I looked at the swh-docs repository to store the sources of the diagrams as you suggested, but I'm not sure it is the best place to store them, as the goal is not to display them on the doc site.
LGTM (not tested)
Thanks, it's really great.
I have tested the qemutest VM locally and converted the staging-webapp and staging-deposit VMs; everything looks good.
The virtualbox and libvirt networks (with the same IP range) can't coexist, but after a cleanup on the virtualbox side, everything works as expected.
Oct 8 2020
Thanks, no changes are detected by terraform after this diff
Oct 7 2020
rebase
Link to a diff, not a task
fix a typo on the commit message
lgtm, with this, we will be able to update the staging environment without impacting the rest of the infra