Thanks for validating,
I haven't changed the other docker-compose files because I didn't manage to start them, and I'm not sure the storage part is still used.
As they are independent, we can do it in another diff without impacting the main docker-compose.
Nov 3 2020
Together with @ardumont, we performed several tests on the webapp, the vault, the deposit, the loaders and the listers, and everything seems to be working well.
Nov 2 2020
The puppet agent had been stopped for some time.
It was restarted and the webapp is now up to date:
Following diff D4391, the zfs datasets were reconfigured to be mounted under /srv/softwareheritage/postgres/*:
systemctl stop postgresql@12-main
zfs set mountpoint=none data/postgres-indexer-12
zfs set mountpoint=none data/postgres-secondary-12
zfs set mountpoint=none data/postgres-main-12
zfs set mountpoint=none data/postgres-misc
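Presumably the follow-up step was to set the new mountpoints and restart the cluster. A sketch of that step; the secondary and indexer paths match the dataset creation noted below, while the main and misc target paths are assumptions modeled on the /srv/softwareheritage/postgres/* layout:
zfs set mountpoint=/srv/softwareheritage/postgres/12/main data/postgres-main-12   # assumed path
zfs set mountpoint=/srv/softwareheritage/postgres/12/secondary data/postgres-secondary-12
zfs set mountpoint=/srv/softwareheritage/postgres/12/indexer data/postgres-indexer-12
zfs set mountpoint=/srv/softwareheritage/postgres/misc data/postgres-misc   # assumed path
systemctl start postgresql@12-main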
Staging
Staging is already up to date with the last tag. Only the indexer packages still need an update.
factorize the base directory declaration to avoid duplication in the puppet code
Oct 30 2020
rebase
I have landed this one as it's accepted. I will prepare other diffs for the other databases.
Oct 29 2020
Check the right database availability
Using this, we can execute the "init-admin" command at each start, which is useful when new super-user migrations are added: they will be applied at the next restart.
This is a PoC for the scheduler; the initialization of all the databases could be changed this way once the diff on swh-core lands.
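For reference, a minimal sketch of what such an entrypoint could look like for the scheduler container; the database host, the exact init-admin arguments and the way the service command is passed are assumptions, not the content of the diff:
#!/bin/bash
set -e
# wait until the scheduler database accepts connections (host/port are assumptions)
until pg_isready -h swh-scheduler-db -p 5432; do
  sleep 1
done
# run the super-user level migrations at every start, so new ones get picked up
# (the exact option name is an assumption)
swh db init-admin scheduler --db-name="$POSTGRES_DB"
# then hand over to the usual service command passed by docker-compose
exec "$@"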
The configuration backup in git is configured[3].
The configuration should be committed to the iFWCFG[1] repository by the user swhfirewall (the credentials are in the credentials repository)
Oct 28 2020
- refactor the postgresql declaration to configure the main cluster instance
Oct 27 2020
The puppetlabs-postgresql module doesn't allow managing several postgresql clusters. We made the tradeoff of using only one cluster on db1 at the beginning, so that db1 can be deployed via puppet, as that's the priority. The module will be extended or replaced by something else later.
Oct 26 2020
For the puppet part, the current staging configuration needs some adaptations, as it installs postgresql versions 11 and 13. Another point is that the different clusters are not managed by puppet, but that's also the case in production.
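To check what is actually present on the host before adapting the puppet code, the postgresql-common tooling gives a quick overview (one line per cluster: version, name, port, status, owner, data directory, log file):
pg_lsclusters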
- Create the postgresql:5434 dataset
zfs create data/postgres-secondary-12 -o mountpoint=/srv/softwareheritage/postgres/12/secondary
- Create the postgresql:5435 dataset
zfs create data/postgres-indexer-12 -o mountpoint=/srv/softwareheritage/postgres/12/indexer
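The corresponding clusters can then be created on top of these datasets; a sketch using pg_createcluster, where the ports and data directories come from the list above and the cluster names are assumptions:
pg_createcluster --datadir=/srv/softwareheritage/postgres/12/secondary --port=5434 12 secondary
pg_createcluster --datadir=/srv/softwareheritage/postgres/12/indexer --port=5435 12 indexer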
Oct 23 2020
- All the servers are migrated to the new network 192.168.130.0/24.
- Netbox is up to date.
- The provisioning code was changed accordingly and applied
Update the state file after the terraform apply
The terraform apply works; the staging gw was removed, apparently without side effects on the other servers.
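For the record, the sequence was roughly the following; this is a sketch assuming the terraform state file is tracked in the same repository as the provisioning code:
terraform plan    # review the planned removal of the staging gateway
terraform apply   # apply it
git add terraform.tfstate          # file name/location are assumptions
git commit -m 'Update the state file after the terraform apply'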
Thanks, I will try my first terraform apply 😬
Oct 22 2020
List of the rules created (a quick connectivity check is sketched after the list):
- icinga: Floating rule: icinga server -> *:icinga port (5665)
- prometheus: Floating rule: prometheus server -> *:prometheus ports (9100/9102/9237/7071/9419)
- logstash/journal: VLAN440 rule: * -> logstash server:logstash_port (5044)
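A quick way to confirm the rules let the traffic through once a node is migrated; the target host is an example, the ports come from the rules above:
nc -zv worker0.internal.staging.swh.network 5665   # from the icinga server
nc -zv worker0.internal.staging.swh.network 9100   # from the prometheus server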
worker0 is migrated and reachable. The dns and icinga rules were correctly updated after puppet ran on worker0 and pergamon.
To update the server, I had to manually change the IP configuration and reboot it, because puppet was failing: it was not able to determine the right IP in the 192.168.130.0 network, as the server was still associated with an IP in 192.168.128.0:
root@worker0:~# puppet agent --test
Info: Using configured environment 'staging'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Function Call, pick(): must receive at least one non empty value (file: /etc/puppet/code/environments/staging/site-modules/profile/manifests/prometheus/node.pp, line: 31, column: 28) on node worker0.internal.staging.swh.network
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
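The manual change was roughly of the following shape; this is only a sketch, the worker's exact address, interface name and interfaces.d layout are assumptions (192.168.130.1 is the VLAN443 gateway VIP configured further down in this log):
cat > /etc/network/interfaces.d/ens18 <<'EOF'
auto ens18
iface ens18 inet static
    address 192.168.130.100/24
    gateway 192.168.130.1
EOF
systemctl reboot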
The new routes also have to be manually declared on pergamon to reach the new networks.
Puppet declared them in the configuration but didn't reload the network:
root@pergamon:~# puppet agent --test
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for pergamon.softwareheritage.org
Info: Applying configuration version '1603355074'
Notice: /Stage[main]/Profile::Network/Debnet::Iface[eth0]/Concat[/etc/network/interfaces]/File[/etc/network/interfaces]/content:
--- /etc/network/interfaces	2020-09-15 16:10:15.235917411 +0000
+++ /tmp/puppet-file20201022-2531741-3gl773	2020-10-22 08:25:16.977289874 +0000
@@ -18,6 +18,8 @@
     up ip route add 192.168.101.0/24 via 192.168.100.1
     up ip route add 192.168.200.0/21 via 192.168.100.1
     up ip route add 192.168.128.0/24 via 192.168.100.125
+    up ip route add 192.168.130.0/24 via 192.168.100.130
+    up ip route add 192.168.50.0/24 via 192.168.100.130
     up ip rule add from 192.168.100.29 table private
     up ip route add 192.168.100.0/24 src 192.168.100.29 dev eth1 table private
     up ip route add default via 192.168.100.1 dev eth1 table private
@@ -25,6 +27,8 @@
     down ip route del default via 192.168.100.1 dev eth1 table private
     down ip route del 192.168.100.0/24 src 192.168.100.29 dev eth1 table private
     down ip rule del from 192.168.100.29 table private
+    down ip route del 192.168.50.0/24 via 192.168.100.130
+    down ip route del 192.168.130.0/24 via 192.168.100.130
     down ip route del 192.168.128.0/24 via 192.168.100.125
     down ip route del 192.168.200.0/21 via 192.168.100.1
     down ip route del 192.168.101.0/24 via 192.168.100.1
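A sketch of the manual application on pergamon, with the routes taken directly from the puppet diff above:
ip route add 192.168.130.0/24 via 192.168.100.130
ip route add 192.168.50.0/24 via 192.168.100.130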
The staging nodes will be migrated one by one to avoid too much noise in the monitoring and to make the detection of missing rules in the firewall easier. Puppet is disabled on all the staging nodes to avoid a massive migration:
Oct 21 2020
Good news! Thanks for the confirmation.
After a hard time configuring the initial firewall rules correctly, because the inter-vlan traffic is seen as coming from the gateway address and was therefore not filtered, the fw rules are now in place; some facts worth noting:
- Routes manually declared on louvre:
root@louvre:~# ip route add 192.168.130.0/24 via 192.168.100.130 dev ens18
root@louvre:~# ip route add 192.168.50.0/24 via 192.168.100.130 dev ens18
root@louvre:~# ip route
default via 128.93.193.254 dev ens19 onlink
128.93.193.0/24 dev ens19 proto kernel scope link src 128.93.193.5
192.168.50.0/24 via 192.168.100.130 dev ens18
192.168.100.0/24 dev ens18 proto kernel scope link src 192.168.100.1
192.168.101.0/24 via 192.168.101.2 dev tun0
192.168.101.2 dev tun0 proto kernel scope link src 192.168.101.1
192.168.128.0/24 via 192.168.100.125 dev ens18
192.168.130.0/24 via 192.168.100.130 dev ens18
The route command is not installed on louvre as it's now replaced by ip.
The staging and staging_new properties were changed to staging_legacy and staging as you suggested. It's better this way.
Update after the review feedback
Some rules need to be declared to be able to reach the new networks through the firewall.
Oct 20 2020
Netbox updated accordingly : https://inventory.internal.softwareheritage.org/virtualization/virtual-machines/75/
VIPs configuration
On the FW UI, go to Interfaces / Virtual IPs / Settings
Add the following Virtual IPs:
- Mode: CARP / Interface: VLAN440 / Address: 192.168.100.130/24 / Virtual IP Password: not significant / VHID Group: 1 / Description: VLAN440 gw wip
- Mode: CARP / Interface: VLAN442 / Address: 192.168.50.1/24 / Virtual IP Password: not significant / VHID Group: 2 / Description: VLAN442 fw wip
- Mode: CARP / Interface: VLAN443 / Address: 192.168.130.1/24 / Virtual IP Password: not significant / VHID Group: 3 / Description: VLAN443 fw wip
- Mode: CARP / Interface: VLAN1300 / Address: 128.93.166.2/26 / Virtual IP Password: not significant / VHID Group: 4 / Description: VLAN1300 fw wip
The firewall was installed with the iso image OPNsense-20.7-OpenSSL-dvd-amd64.iso uploaded to the ceph-proxmox storage.