Revisions and Commits
Status | Assigned | Task
---|---|---
Migrated | gitlab-migration | T2650 Network refactoring - step 1
Migrated | gitlab-migration | T2747 Create the reverse proxy to expose the staging services publicly
Event Timeline
Current state of the frontend services (webapp, deposit) is described in the following graph:
source code: P852
Reverse proxy referenced in the inventory app [1] [2]
[1] https://inventory.internal.softwareheritage.org/virtualization/virtual-machines/87/
[2] https://inventory.internal.softwareheritage.org/ipam/prefixes/8/ip-addresses/
Puppet web api roles (deposit, webapp) got reworked from [1] to [2] so their name and intent are clearer.
This now declares the following roles and deploys them accordingly (this does
not change anything in production):
- swh_rp_webapp: webapp with reverse proxy as before (deployed on node webapp0.azure)
- swh_rp_webapps: webapp and deposit with reverse proxy behavior as before (deployed on node moma)
- swh_webapp: only webapp service, no more hitch/varnish (deployed on node webapp.staging)
- swh_deposit: only deposit service, no more hitch/varnish (deployed on node deposit.staging)
This also introduces the new swh_reverse_proxy role, meant for a dedicated
reverse proxy node that still has to be created. That node will act as the
reverse proxy in front of the swh_webapp and swh_deposit roles for the staging
nodes.
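As a quick summary of the role layout listed above, here is a minimal sketch in plain Python (not Puppet code). The `ROLES` structure and the function name are illustrative assumptions; only the role names, services and nodes come from the list above.

```python
# Role layout as listed above: role -> (services, embeds reverse proxy?, node).
ROLES = {
    "swh_rp_webapp": ({"webapp"}, True, "webapp0.azure"),
    "swh_rp_webapps": ({"webapp", "deposit"}, True, "moma"),
    "swh_webapp": ({"webapp"}, False, "webapp.staging"),
    "swh_deposit": ({"deposit"}, False, "deposit.staging"),
}


def roles_needing_external_proxy():
    """Return the roles that no longer embed hitch/varnish themselves."""
    return sorted(
        name for name, (_services, has_proxy, _node) in ROLES.items()
        if not has_proxy
    )


if __name__ == "__main__":
    # These are the roles the dedicated swh_reverse_proxy node would front.
    print(roles_needing_external_proxy())
```

The roles returned by the helper are the ones that rely on the separate swh_reverse_proxy node for TLS termination and caching.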
All this is currently in the T2747_reverse_proxy git branch in the swh-site
repository (vagrant + octocatalog-diff happy so far, with no change on
production node \o/).
We need to create the new reverse proxy node now.
[1]
source code: P855
[2]
source code: P856
A step was achieved in the configuration. The staging services are now accessible from the internet at these addresses:
- webapp : https://webapp.staging.swh.network
- deposit: https://deposit.staging.swh.network
What remains is to solve some monitoring alerts, to fix the counters on the home page of the webapp, which seem to display the production numbers, and of course to document the new environment.
To solve the monitoring alerts [1], we tried to bypass the restriction between VLAN210 and VLAN1300 by adding a route between pergamon and VLAN1300 via the firewall (D4454).
The route is correctly created on pergamon, but it seems to be ignored:
root@pergamon:~# traceroute 128.93.166.2
traceroute to 128.93.166.2 (128.93.166.2), 30 hops max, 60 byte packets
 1  louvre.internal.softwareheritage.org (192.168.100.1)  0.185 ms * *
It's the same for other routes:
root@pergamon:~# traceroute 192.168.130.10
traceroute to 192.168.130.10 (192.168.130.10), 30 hops max, 60 byte packets
 1  louvre.internal.softwareheritage.org (192.168.100.1)  0.168 ms * *
 2  pushkin.internal.softwareheritage.org (192.168.100.129)  0.331 ms  0.316 ms  0.307 ms
 3  pushkin.internal.softwareheritage.org (192.168.100.129)  0.426 ms  0.414 ms  0.400 ms
We manually added a route on louvre, but it is ignored because of a masquerading rule used by the internet gateway. We stopped here to avoid breaking anything.
root@louvre:~# ip route add 128.93.166.0/26 via 192.168.100.130 dev ens18
root@louvre:~# iptables -L -n -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  192.168.100.0/24    !192.168.0.0/16
Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
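To reason about the route and the MASQUERADE rule shown above, here is a small sketch using Python's standard `ipaddress` module. It only evaluates the address criteria printed in the transcript. Note that `iptables -L -n` without `-v` does not print the in/out interface columns, so this sketch alone cannot tell whether the rule really applies to these packets. The source address `192.168.100.50` is a placeholder assumption for a host in 192.168.100.0/24, not pergamon's actual address.

```python
import ipaddress

# Values copied from the transcript above.
ROUTE = ipaddress.ip_network("128.93.166.0/26")          # route added on louvre
MASQ_SOURCE = ipaddress.ip_network("192.168.100.0/24")   # MASQUERADE source
MASQ_NOT_DEST = ipaddress.ip_network("192.168.0.0/16")   # MASQUERADE "!" destination


def route_covers(destination: str) -> bool:
    """True if the manually added route covers the destination address."""
    return ipaddress.ip_address(destination) in ROUTE


def masquerade_addresses_match(source: str, destination: str) -> bool:
    """Evaluate only the source/destination part of the POSTROUTING rule."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(destination)
    return src in MASQ_SOURCE and dst not in MASQ_NOT_DEST


if __name__ == "__main__":
    hypothetical_source = "192.168.100.50"  # placeholder, see lead-in
    for dest in ("128.93.166.2", "192.168.130.10"):
        print(dest, route_covers(dest),
              masquerade_addresses_match(hypothetical_source, dest))
```

On addresses alone, traffic towards 128.93.166.2 is covered by the new route and also satisfies the MASQUERADE source/destination criteria, while traffic towards 192.168.130.10 satisfies neither. Running `iptables -L -n -v -t nat` would additionally show any interface restriction on the rule.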
@olasd we need to discuss with you to identify the best way to move forward
[1] https://icinga.softwareheritage.org/dashboard#!/monitoring/service/show?host=rp0.internal.staging.swh.network&service=swh deposit https certificate
https://icinga.softwareheritage.org/monitoring/list/services?service_problem=1&sort=service_severity#!/monitoring/service/show?host=rp0.internal.staging.swh.network&service=swh webapp https certificate
After double-checking (at least), the route on louvre is working well (the packets are not intercepted by the IP masquerade).
The actual problem was that the DNAT rule on the firewall was not applied, because the packets were not entering through the vtnet0 interface (they were simply lost). The DNAT rule was updated to apply on both the vtnet1 (VLAN440) and vtnet0 (VLAN1300) interfaces [1]. Pergamon can now reach the reverse proxy on ports 80/443.
[1]: R228:292b3b551b6138d5815a66b3fc2541d21e69891b
NB: The routes on pergamon were not reverted because they should work directly. It's still a point to discuss
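The DNAT fix described above can be pictured with a small model. This is only a sketch under assumptions: the `DnatRule` class is hypothetical and is not the firewall's configuration format, the rule is assumed to have been bound to vtnet0 only before the change, and pergamon's packets are assumed to enter the firewall on vtnet1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnatRule:
    """Simplified DNAT rule: bound to ingress interfaces and destination ports."""
    interfaces: frozenset
    ports: frozenset

    def applies(self, in_interface: str, dport: int) -> bool:
        # A rule bound to interfaces only translates packets entering on one of them.
        return in_interface in self.interfaces and dport in self.ports


# Assumption: before the fix the rule was bound to vtnet0 only.
before = DnatRule(frozenset({"vtnet0"}), frozenset({80, 443}))
# After the fix it applies on vtnet1 (VLAN440) and vtnet0 (VLAN1300).
after = DnatRule(frozenset({"vtnet0", "vtnet1"}), frozenset({80, 443}))

if __name__ == "__main__":
    # Assumption: packets from pergamon enter the firewall on vtnet1.
    print("before:", before.applies("vtnet1", 443))
    print("after: ", after.applies("vtnet1", 443))
```

In this model a packet arriving on an interface the rule is not bound to is never translated, which matches the "simply lost" behaviour reported above.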
This schema complements the previous ones. It represents a more network-oriented view of the interaction between the servers and the firewall:
Tests from hal-preprod, plugged into our staging deposit, went fine.
Some deposits took place [1] and we can see the resulting ingested origins
in the staging webapp [2].
[1] P866
[2] https://webapp.staging.swh.network/browse/search/?q=preprod&with_visit=true&with_content=true
An email has been sent to swh-devel to announce it and to ask for feedback [1].
In the meantime, the work is done and this task can be closed \o/
[1] P867