
Create the reverse proxy to expose the staging services publicly
Closed, Migrated

Event Timeline

ardumont triaged this task as Normal priority. Nov 2 2020, 11:00 AM
ardumont created this task.

Current state of the frontend services (webapp, deposit) is described in the following graph:


source code: P852

Here is the desired staging reverse proxy schema:


source: P853

ardumont changed the task status from Open to Work in Progress. Nov 4 2020, 4:17 PM

The Puppet web API roles (deposit, webapp) were reworked from [1] to [2] so that their names and intent are clearer.

This now declares the following roles and deploys them accordingly (this does
not change anything in production):

  • swh_rp_webapp: webapp with reverse proxy as before (deployed on node webapp0.azure)
  • swh_rp_webapps: webapp and deposit with reverse proxy behavior as before (deployed on node moma)
  • swh_webapp: only webapp service, no more hitch/varnish (deployed on node webapp.staging)
  • swh_deposit: only deposit service, no more hitch/varnish (deployed on node deposit.staging)

As expressed above, this introduces the new swh_reverse_proxy role (for a
dedicated reverse proxy node, still to be created). That node will act as the
reverse proxy in front of the swh_webapp and swh_deposit roles for the staging
nodes.

All this is currently in the T2747_reverse_proxy git branch in the swh-site
repository (vagrant + octocatalog-diff happy so far, with no change on
production node \o/).

We need to create the new reverse proxy node now.

[1]



source code: P855

[2]



source code: P856

A milestone was reached in the configuration. The staging services are now accessible from the internet at these addresses:

What remains: some monitoring alerts to solve, the counters on the home page of the webapp, which seem to display the production numbers, and of course documenting the new environment.

To solve the monitoring alerts [1], we tried to bypass the restriction between VLAN210 and VLAN1300 by adding a route between pergamon and VLAN1300 via the firewall (D4454).
The route is correctly created on pergamon but it seems to be ignored:

root@pergamon:~# traceroute 128.93.166.2
traceroute to 128.93.166.2 (128.93.166.2), 30 hops max, 60 byte packets
 1  louvre.internal.softwareheritage.org (192.168.100.1)  0.185 ms * *

It's the same for other routes:

root@pergamon:~# traceroute 192.168.130.10
traceroute to 192.168.130.10 (192.168.130.10), 30 hops max, 60 byte packets
 1  louvre.internal.softwareheritage.org (192.168.100.1)  0.168 ms * *
 2  pushkin.internal.softwareheritage.org (192.168.100.129)  0.331 ms  0.316 ms  0.307 ms
 3  pushkin.internal.softwareheritage.org (192.168.100.129)  0.426 ms  0.414 ms  0.400 ms

We manually added a route on louvre, but it is ignored due to a masquerading rule used by the internet gateway. We stopped here to avoid breaking anything.

root@louvre:~# ip route add 128.93.166.0/26 via 192.168.100.130 dev ens18
root@louvre:~# iptables -L -n -t nat
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination         

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         
MASQUERADE  all  --  192.168.100.0/24    !192.168.0.0/16      

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
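To see which destinations fall under the selectors of the MASQUERADE rule listed above, the source and destination networks can be checked with Python's standard `ipaddress` module. This is only a minimal sketch and not how netfilter evaluates rules: the helper name `matches_masquerade` is hypothetical, and the source address 192.168.100.50 is an assumed host inside 192.168.100.0/24, not an address taken from this task.

```python
import ipaddress

# Selectors copied from the POSTROUTING rule shown above:
#   MASQUERADE  all  --  192.168.100.0/24    !192.168.0.0/16
SOURCE_NET = ipaddress.ip_network("192.168.100.0/24")
EXCLUDED_DEST_NET = ipaddress.ip_network("192.168.0.0/16")


def matches_masquerade(src: str, dst: str) -> bool:
    """Return True if (src, dst) satisfies the rule's source and destination selectors."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    # The rule matches sources inside SOURCE_NET and destinations
    # outside EXCLUDED_DEST_NET (the "!" negation in the listing).
    return src_ip in SOURCE_NET and dst_ip not in EXCLUDED_DEST_NET


# Hypothetical source address inside 192.168.100.0/24 (assumption, for illustration).
example_src = "192.168.100.50"

for dst in ("128.93.166.2", "192.168.130.10"):
    print(dst, matches_masquerade(example_src, dst))
```

With these selectors, traffic from 192.168.100.0/24 to 128.93.166.2 satisfies the rule while traffic to 192.168.130.10 does not. Whether the rule actually affects a given packet also depends on the path the packet takes, which this sketch does not model.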

@olasd we need to discuss with you to identify the best way to move forward

[1] https://icinga.softwareheritage.org/dashboard#!/monitoring/service/show?host=rp0.internal.staging.swh.network&service=swh deposit https certificate
https://icinga.softwareheritage.org/monitoring/list/services?service_problem=1&sort=service_severity#!/monitoring/service/show?host=rp0.internal.staging.swh.network&service=swh webapp https certificate

After double-checking (at least), the route on louvre is working well (the packets are not intercepted by the IP masquerade).
The problem was that the DNAT rule on the firewall was not applied because the packets were not entering through the vtnet0 interface (they were simply lost). The DNAT rule was updated to apply on both the vtnet1 (VLAN440) and vtnet0 (VLAN1300) interfaces [1]. Pergamon can now reach the reverse proxy on ports 80/443.

[1]: R228:292b3b551b6138d5815a66b3fc2541d21e69891b

NB: The routes on pergamon were not reverted because they should work directly. This is still a point to discuss.
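The effect of binding a DNAT rule to specific inbound interfaces can be illustrated with a small model. This is a hedged sketch, not the firewall's actual configuration: the `DnatRule` class and its `applies` method are hypothetical, and treating the rule as keyed only on inbound interface and destination port is a simplifying assumption. Only the interface names (vtnet0, vtnet1) and the ports 80/443 come from this task; that the monitoring traffic enters on vtnet1 is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnatRule:
    """Simplified model of a DNAT rule: it only looks at inbound interface and port."""

    interfaces: frozenset
    ports: frozenset

    def applies(self, in_interface: str, dst_port: int) -> bool:
        # The rule is considered only for packets arriving on a bound interface.
        return in_interface in self.interfaces and dst_port in self.ports


# Before the fix (assumed): rule bound to vtnet0 (VLAN1300) only.
rule_before = DnatRule(frozenset({"vtnet0"}), frozenset({80, 443}))
# After the fix: rule bound to both vtnet1 (VLAN440) and vtnet0 (VLAN1300).
rule_after = DnatRule(frozenset({"vtnet0", "vtnet1"}), frozenset({80, 443}))

# Assumption: the monitoring traffic enters the firewall on vtnet1.
print("before:", rule_before.applies("vtnet1", 443))
print("after:", rule_after.applies("vtnet1", 443))
```

In this model, a packet arriving on vtnet1 is never translated by a rule bound to vtnet0 alone, which matches the symptom of packets being lost until the rule covered both interfaces.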

This schema complements the previous ones. It represents a more network-oriented view of the interaction between the server and the firewall:

Did a bit of cleanup in related sentry issues which should no longer come back.

Tests from hal-preprod plugged into our staging deposit went fine.
Some deposits took place [1] and we can see the resulting ingested origins
in the staging webapp [2].

[1] P866

[2] https://webapp.staging.swh.network/browse/search/?q=preprod&with_visit=true&with_content=true

An email has been sent to swh-devel to announce it and to ask for feedback [1].

In the meantime, the work is done and this task can be closed \o/

[1] P867

ardumont claimed this task.