Nov 20 2020

vsellier created D4539: Add mandatory cloud storage configuration on bare metal storage servers.
Nov 20 2020, 10:24 AM
vsellier added a revision to T2796: 2020-11-18 Datacenter operations in Rocquencourt: D4539: Add mandatory cloud storage configuration on bare metal storage servers.
Nov 20 2020, 10:24 AM · System administration
vsellier committed rSENV1e0ad13a1dba: Update octocatalog-diff facts (authored by vsellier).
Update octocatalog-diff facts
Nov 20 2020, 10:22 AM
vsellier committed rSENVd1aef9871d78: vagrant: Add staging-journal0 host (authored by vsellier).
vagrant: Add staging-journal0 host
Nov 20 2020, 10:22 AM
vsellier closed D4496: vagrant: Add staging-journal0 host.
Nov 20 2020, 10:22 AM
vsellier added a comment to T2796: 2020-11-18 Datacenter operations in Rocquencourt.
  • The configuration was applied on moma
  • A manual import was performed on worker01:
    • the /etc/softwareheritage/loader_git.yml config was updated:
root@worker01:/etc/softwareheritage# diff -U3 /tmp/loader_git.yml loader_git.yml 
--- /tmp/loader_git.yml	2020-11-20 08:43:18.682462213 +0000
+++ loader_git.yml	2020-11-20 08:44:00.150375756 +0000
@@ -13,7 +13,7 @@
   - cls: filter
   - cls: remote
     args:
-      url: http://uffizi.internal.softwareheritage.org:5002/
+      url: http://saam.internal.softwareheritage.org:5002/
 max_content_size: 104857600
 save_data: false
 save_data_path: "/srv/storage/space/data/sharded_packfiles"
  • the import was run on the puppet-swh-site repository:
root@worker01:/etc/softwareheritage# sudo -u swhworker SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_git.yml swh loader run git https://github.com/SoftwareHeritage/puppet-swh-site

The first attempt raised this exception:

swh.core.api.RemoteException: <RemoteException 500 ValueError: ["Storage class azure-prefixed is not available: No module named 'swh.objstorage.backends.azure'"]>
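
The error above reports that a Python module cannot be found. A minimal sketch of how one might check for it, assuming the check runs in the same Python environment as the failing service (since this is a RemoteException, that is presumably the storage server rather than worker01). The helper name module_available is hypothetical; only the module name comes from the exception message.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located.

    The module itself is not imported, but for a dotted name its parent
    packages are.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised (as ModuleNotFoundError) when a parent package is missing.
        return False


if __name__ == "__main__":
    # Module name taken from the exception message.
    backend = "swh.objstorage.backends.azure"
    status = "available" if module_available(backend) else "missing"
    print(f"{backend}: {status}")
```

If the module is reported missing, the package providing it is probably not installed on that host, or is installed in a version that does not ship this backend.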
Nov 20 2020, 9:48 AM · System administration
vsellier closed D4516: Move archive storage to a new server.
Nov 20 2020, 9:24 AM
vsellier committed rSPSITE2830795f6f7a: Move archive storage to a new server (authored by vsellier).
Move archive storage to a new server
Nov 20 2020, 9:24 AM
vsellier updated the diff for D4516: Move archive storage to a new server.

rebase

Nov 20 2020, 9:24 AM
vsellier updated the diff for D4516: Move archive storage to a new server.

rebase

Nov 20 2020, 9:21 AM
vsellier added a comment to D4534: Kafka needs a jre to run.
In D4534#113059, @olasd wrote:

We use the puppet java module in a bunch of other places, maybe it makes sense to directly import that (which would mean using include ::java)?

Nov 20 2020, 9:10 AM
vsellier updated the test plan for D4534: Kafka needs a jre to run.
Nov 20 2020, 9:09 AM
vsellier updated the diff for D4534: Kafka needs a jre to run.

Use ::java instead of directly installing the jre package

Nov 20 2020, 9:08 AM

Nov 19 2020

vsellier accepted D4531: profile::mountpoints: only create directories if the mountpoint is enabled.
Nov 19 2020, 4:08 PM
vsellier accepted D4528: Carry over uffizi local storage/objstorage configs to saam.
Nov 19 2020, 4:05 PM
vsellier accepted D4532: Add mountpoints for saam.

LGTM

Nov 19 2020, 3:55 PM
vsellier added inline comments to D4531: profile::mountpoints: only create directories if the mountpoint is enabled.
Nov 19 2020, 3:55 PM
vsellier added a revision to T2790: [staging] deploy the journal infrastructure: D4534: Kafka needs a jre to run.
Nov 19 2020, 3:27 PM · System administration, Staging environment
vsellier created D4534: Kafka needs a jre to run.
Nov 19 2020, 3:27 PM
vsellier added a comment to D4497: Manage the parent directories of the kafka logdirs.

I will not land this now: there seems to be another issue with the startup of kafka when the logdir already exists but is empty (i.e. created by puppet). I need to dig further.

Nov 19 2020, 2:36 PM
vsellier accepted D4523: Multipath setup for saam.

The systemd configuration looks good.

Nov 19 2020, 2:11 PM
vsellier added a comment to D4516: Move archive storage to a new server.

Looks fine to me except for the vagrant bit.

thanks, it's fixed

Nov 19 2020, 9:52 AM
vsellier updated the diff for D4516: Move archive storage to a new server.

fix misdirected removal

Nov 19 2020, 9:50 AM
vsellier added a comment to D4517: Run 'docker-compose up' in the background instead of detached, to show logs..

up -d waits until the containers are "started" before returning control, so you can be sure the execs on line 169- can be executed.
If the command runs in the background instead, you will also miss the return code of the docker-compose up command.
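
To illustrate the point about the return code, here is a minimal sketch in Python rather than in the shell script under review. It uses a stand-in command that exits with status 3 instead of docker-compose, and the function names are hypothetical.

```python
import subprocess
import sys

# Stand-in for a failing command; the real case would be `docker-compose up`.
FAILING_CMD = [sys.executable, "-c", "import sys; sys.exit(3)"]


def run_foreground(cmd):
    """Block until the command finishes and return its exit code."""
    return subprocess.run(cmd).returncode


def run_background(cmd):
    """Start the command and return at once; its exit code is not known yet."""
    return subprocess.Popen(cmd)


if __name__ == "__main__":
    print("foreground exit code:", run_foreground(FAILING_CMD))

    proc = run_background(FAILING_CMD)
    # Nothing has checked the outcome so far: a caller that never calls
    # wait() or poll() will not notice the failure.
    print("background exit code after wait():", proc.wait())
```

A backgrounded command only reveals its status if something waits on it later, which is the trade-off raised in the comment.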

Nov 19 2020, 9:42 AM

Nov 18 2020

vsellier created D4516: Move archive storage to a new server.
Nov 18 2020, 9:32 PM
vsellier committed rSPSITEdec921518d0b: declare the new proxmox hypervisor (authored by vsellier).
declare the new proxmox hypervisor
Nov 18 2020, 4:16 PM
vsellier committed rSPSITE61c61c95a68e: Declare new storage server (authored by vsellier).
Declare new storage server
Nov 18 2020, 2:50 PM

Nov 17 2020

vsellier added a revision to T2790: [staging] deploy the journal infrastructure: D4497: Manage the parent directories of the kafka logdirs.
Nov 17 2020, 6:20 PM · System administration, Staging environment
vsellier created D4497: Manage the parent directories of the kafka logdirs.
Nov 17 2020, 6:20 PM
vsellier added a revision to T2790: [staging] deploy the journal infrastructure: D4496: vagrant: Add staging-journal0 host.
Nov 17 2020, 5:36 PM · System administration, Staging environment
vsellier created D4496: vagrant: Add staging-journal0 host.
Nov 17 2020, 5:36 PM
vsellier added a comment to T2790: [staging] deploy the journal infrastructure.

Correction: kafka is installed on the node, but its configuration seems to be incomplete.

Nov 17 2020, 5:30 PM · System administration, Staging environment
vsellier changed the status of T2790: [staging] deploy the journal infrastructure, a subtask of T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage), from Open to Work in Progress.
Nov 17 2020, 3:09 PM · Staging environment, System administration
vsellier changed the status of T2790: [staging] deploy the journal infrastructure from Open to Work in Progress.
Nov 17 2020, 3:09 PM · System administration, Staging environment
vsellier added a project to T2790: [staging] deploy the journal infrastructure: System administration.
Nov 17 2020, 2:53 PM · System administration, Staging environment
vsellier added a project to T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage): Staging environment.
Nov 17 2020, 2:53 PM · Staging environment, System administration
vsellier added a project to T2790: [staging] deploy the journal infrastructure: Staging environment.
Nov 17 2020, 2:53 PM · System administration, Staging environment
vsellier added a subtask for T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage): T2790: [staging] deploy the journal infrastructure.
Nov 17 2020, 2:53 PM · Staging environment, System administration
vsellier added a parent task for T2790: [staging] deploy the journal infrastructure: T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).
Nov 17 2020, 2:53 PM · System administration, Staging environment
vsellier triaged T2790: [staging] deploy the journal infrastructure as Normal priority.
Nov 17 2020, 2:52 PM · System administration, Staging environment
vsellier added a comment to T2733: Explore / install a varnish prometheus probe.

The varnish logs should also be ingested into elasticsearch to get fine-grained statistics.

Nov 17 2020, 2:42 PM · Metrics/monitoring, System administration
vsellier triaged T2787: Improve access_logs parsing as Normal priority.
Nov 17 2020, 12:36 PM · System administration, Metrics/monitoring
vsellier added a project to T2733: Explore / install a varnish prometheus probe: Metrics/monitoring.
Nov 17 2020, 11:54 AM · Metrics/monitoring, System administration
vsellier closed T2606: Test puppet configuration in a local vagrant environment as Resolved.
  • adapt the configuration to be able to test locally without interfering with the other environments:

The /etc/hosts files of the vagrant VMs are configured to declare local IPs for the services they use [1]. It's not strong protection, but it works for the moment.
Stronger protection will be put in place once the admin servers are moved to the admin network; the network could then be filtered to ensure such local VMs can't interact with the real production servers.
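
The /etc/hosts override can be pictured with a small sketch. It is not the actual puppet or vagrant code: the function name, the hostname-to-IP mapping, and the addresses are all made up for illustration.

```python
def render_hosts_overrides(overrides):
    """Render /etc/hosts lines mapping each hostname to a local override IP."""
    lines = [f"{ip}\t{hostname}" for hostname, ip in sorted(overrides.items())]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical mapping: service hostnames pinned to local, VM-only addresses.
    overrides = {
        "service-a.internal.example.org": "10.0.0.11",
        "service-b.internal.example.org": "10.0.0.12",
    }
    print(render_hosts_overrides(overrides), end="")
```

Because the resolver reads /etc/hosts before querying DNS in a default setup, such entries keep a local VM talking to local addresses even when it uses production hostnames.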

Nov 17 2020, 11:37 AM · System administration
vsellier updated the task description for T2606: Test puppet configuration in a local vagrant environment.
Nov 17 2020, 11:30 AM · System administration
vsellier closed T2650: Network refactoring - step 1 as Resolved.

The network configuration is done, and the staging archive and deposit are now publicly exposed. The main goal of the task is achieved.
The staging VMs can be moved to their dedicated hypervisor once it is available; in the end this is not a mandatory step for this task, since we were able to use the existing hypervisors.

Nov 17 2020, 11:28 AM · System administration
vsellier closed T2755: Monitor the firewalls, a subtask of T2650: Network refactoring - step 1, as Resolved.
Nov 17 2020, 9:54 AM · System administration
vsellier closed T2755: Monitor the firewalls as Resolved.
Nov 17 2020, 9:54 AM · System administration
vsellier added a comment to T2755: Monitor the firewalls.

The metrics are correctly ingested by prometheus, and host availability is checked by icinga.
A basic dashboard was created in grafana[1] with the following information for both firewalls:

  • uptime
  • load
  • memory stats
  • partitions stats
  • network traffic for each interface
Nov 17 2020, 9:40 AM · System administration
vsellier committed rSENV4ad1c6735cd6: vagrant: Add pergamon host (authored by vsellier).
vagrant: Add pergamon host
Nov 17 2020, 9:14 AM
vsellier closed D4486: vagrant: Add pergamon host.
Nov 17 2020, 9:14 AM

Nov 16 2020

vsellier created D4486: vagrant: Add pergamon host.
Nov 16 2020, 7:08 PM
vsellier accepted D4485: Reload the icinga2 service when a config file gets dropped by recursion.

LGTM, it works in vagrant with the firewall configuration:

==> pergamon: Notice: /Stage[main]/Profile::Icinga2::Master/File[/etc/icinga2/zones.d/master/pushkin.internal.softwareheritage.org.conf]/ensure: removed
==> pergamon: Info: /etc/icinga2/zones.d/master: Scheduling refresh of Class[Icinga2::Service]
Nov 16 2020, 7:02 PM
vsellier committed rSPSITE206f57fe9998: prometheus: Well categorize the firewall metrics (authored by vsellier).
prometheus: Well categorize the firewall metrics
Nov 16 2020, 6:39 PM
vsellier closed D4484: prometheus: Well categorize the firewall metrics.
Nov 16 2020, 6:39 PM
vsellier added a comment to D4484: prometheus: Well categorize the firewall metrics.

I have created the diff for information but will land it quickly to fix the prometheus configuration ASAP.

Nov 16 2020, 6:38 PM
vsellier added a revision to T2755: Monitor the firewalls: D4484: prometheus: Well categorize the firewall metrics.
Nov 16 2020, 6:36 PM · System administration
vsellier created D4484: prometheus: Well categorize the firewall metrics.
Nov 16 2020, 6:36 PM
vsellier committed rSPSITE0f75022e2364: Factorize the firewall properties (authored by vsellier).
Factorize the firewall properties
Nov 16 2020, 4:08 PM
vsellier closed D4482: Factorize the firewall properties.
Nov 16 2020, 4:08 PM
vsellier updated the diff for D4482: Factorize the firewall properties.

fix formatting

Nov 16 2020, 4:07 PM
vsellier updated the diff for D4482: Factorize the firewall properties.

replace the lost lookup by an alias

Nov 16 2020, 4:03 PM
vsellier added a revision to T2755: Monitor the firewalls: D4482: Factorize the firewall properties.
Nov 16 2020, 3:59 PM · System administration
vsellier created D4482: Factorize the firewall properties.
Nov 16 2020, 3:59 PM
vsellier committed rSPSITEb68b8ba48e39: Declare the mandatory icinga host (authored by vsellier).
Declare the mandatory icinga host
Nov 16 2020, 2:18 PM
vsellier closed D4477: Declare the mandatory icinga host.
Nov 16 2020, 2:18 PM
vsellier added a revision to T2755: Monitor the firewalls: D4477: Declare the mandatory icinga host.
Nov 16 2020, 1:03 PM · System administration
vsellier created D4477: Declare the mandatory icinga host.
Nov 16 2020, 1:03 PM
vsellier committed rSPSITE8b3eebe739ba: Configure firewalls monitoring (authored by vsellier).
Configure firewalls monitoring
Nov 16 2020, 12:23 PM
vsellier closed D4453: Grab firewalls metrics via prometheus.
Nov 16 2020, 12:23 PM
vsellier committed rSENV3eb4fd587aa1: Ensure the puppet code is always up-to-date on the vms (authored by vsellier).
Ensure the puppet code is always up-to-date on the vms
Nov 16 2020, 12:17 PM
vsellier updated the diff for D4453: Grab firewalls metrics via prometheus.

rebase and remove unnecessary spaces

Nov 16 2020, 11:57 AM

Nov 10 2020

vsellier updated the diff for D4460: staging: Fix internal webapp and deposit communication.

use https

Nov 10 2020, 8:23 PM
vsellier added a comment to D4460: staging: Fix internal webapp and deposit communication.

It fixes problems reaching the public IP from the internal network.
Feel free to land it if it looks good to you.

Nov 10 2020, 8:21 PM
vsellier added a revision to T2747: Create the reverse proxy to expose the staging services publicly: D4460: staging: Fix internal webapp and deposit communication.
Nov 10 2020, 8:19 PM · System administration
vsellier created D4460: staging: Fix internal webapp and deposit communication.
Nov 10 2020, 8:19 PM
vsellier updated the diff for D4453: Grab firewalls metrics via prometheus.
  • Add webui checks on icinga
  • Rename the puppet class to something more generic as it's not only dedicated to prometheus configuration
Nov 10 2020, 7:16 PM
vsellier updated the diff for D4453: Grab firewalls metrics via prometheus.

rebase

Nov 10 2020, 6:53 PM
vsellier added a comment to T2747: Create the reverse proxy to expose the staging services publicly.

This schema complements the previous ones. It represents a more network-oriented view of the interaction between the server and the firewall:

Nov 10 2020, 6:47 PM · System administration
vsellier committed rSPSITE19a95b2ca2b2: staging monitoring: Fix vhost computation to use public vhost (authored by vsellier).
staging monitoring: Fix vhost computation to use public vhost
Nov 10 2020, 4:02 PM
vsellier closed D4456: staging monitoring: Fix vhost computation to use public vhost.
Nov 10 2020, 4:02 PM
vsellier added a revision to T2747: Create the reverse proxy to expose the staging services publicly: D4456: staging monitoring: Fix vhost computation to use public vhost.
Nov 10 2020, 3:54 PM · System administration
vsellier created D4456: staging monitoring: Fix vhost computation to use public vhost.
Nov 10 2020, 3:54 PM
vsellier added a comment to T2747: Create the reverse proxy to expose the staging services publicly.

After double-checking (at least), the routing on louvre is working well (the packets are not intercepted by the IP masquerade).
The problem was that the DNAT rule on the firewall was not applied because the packets were not entering through the vtnet0 interface (they were simply lost). The DNAT rule was updated to apply on both the vtnet1 (VLAN440) and vtnet0 (VLAN1300) interfaces[1]. Pergamon can now reach the reverse proxy on ports 80/443.
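
The interface-matching behaviour described here can be modelled with a toy sketch. It is not the firewall's real configuration: the classes are hypothetical, the IP addresses are documentation-range placeholders, and the assumption that the original rule listed only vtnet0 is inferred from the description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Packet:
    in_interface: str
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class DnatRule:
    interfaces: Tuple[str, ...]
    dst_ip: str
    dst_ports: Tuple[int, ...]
    target_ip: str

    def translate(self, packet: Packet) -> Optional[str]:
        """Return the rewritten destination IP, or None if the rule does not match."""
        if (
            packet.in_interface in self.interfaces
            and packet.dst_ip == self.dst_ip
            and packet.dst_port in self.dst_ports
        ):
            return self.target_ip
        return None


# Placeholder addresses (documentation ranges), not the real ones.
PUBLIC_IP = "203.0.113.10"
REVERSE_PROXY_IP = "192.0.2.20"

if __name__ == "__main__":
    # A packet arriving from the internal side, i.e. through vtnet1.
    packet = Packet(in_interface="vtnet1", dst_ip=PUBLIC_IP, dst_port=443)

    before = DnatRule(("vtnet0",), PUBLIC_IP, (80, 443), REVERSE_PROXY_IP)
    after = DnatRule(("vtnet0", "vtnet1"), PUBLIC_IP, (80, 443), REVERSE_PROXY_IP)

    print("rule on vtnet0 only:     ", before.translate(packet))
    print("rule on vtnet0 + vtnet1: ", after.translate(packet))
```

In this model a packet entering through an interface the rule does not list is never translated, which matches the symptom of packets being lost until the rule covered both interfaces.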

Nov 10 2020, 2:50 PM · System administration
vsellier added a comment to T2747: Create the reverse proxy to expose the staging services publicly.

To solve the monitoring alerts [1], we tried to bypass the restriction between VLAN210 and VLAN1300 by adding a route between pergamon and VLAN1300 via the firewall (D4454).
The route is correctly created on pergamon, but it seems to be ignored:

root@pergamon:~# traceroute 128.93.166.2
traceroute to 128.93.166.2 (128.93.166.2), 30 hops max, 60 byte packets
 1  louvre.internal.softwareheritage.org (192.168.100.1)  0.185 ms * *

The same happens for other routes:

root@pergamon:~# traceroute 192.168.130.10
traceroute to 192.168.130.10 (192.168.130.10), 30 hops max, 60 byte packets
 1  louvre.internal.softwareheritage.org (192.168.100.1)  0.168 ms * *
 2  pushkin.internal.softwareheritage.org (192.168.100.129)  0.331 ms  0.316 ms  0.307 ms
 3  pushkin.internal.softwareheritage.org (192.168.100.129)  0.426 ms  0.414 ms  0.400 ms
Nov 10 2020, 12:30 PM · System administration
vsellier committed rSPSITEe25589e40654: network: Add an internal route to the public swh network (authored by vsellier).
network: Add an internal route to the public swh network
Nov 10 2020, 11:04 AM
vsellier closed D4454: network: Add an internal route to the public swh network.
Nov 10 2020, 11:04 AM
vsellier added a revision to T2747: Create the reverse proxy to expose the staging services publicly: D4454: network: Add an internal route to the public swh network.
Nov 10 2020, 11:01 AM · System administration
vsellier created D4454: network: Add an internal route to the public swh network.
Nov 10 2020, 11:01 AM
vsellier updated the diff for D4453: Grab firewalls metrics via prometheus.

Fix indentation

Nov 10 2020, 9:53 AM
vsellier added a comment to T2747: Create the reverse proxy to expose the staging services publicly.

A milestone was reached in the configuration. The staging services are now accessible from the internet at these addresses:

Nov 10 2020, 9:27 AM · System administration

Nov 9 2020

vsellier committed rSENV7486bc9c27ad: Update octocatalog-diff facts (authored by vsellier).
Update octocatalog-diff facts
Nov 9 2020, 6:26 PM
vsellier added a revision to T2755: Monitor the firewalls: D4453: Grab firewalls metrics via prometheus.
Nov 9 2020, 6:25 PM · System administration
vsellier created D4453: Grab firewalls metrics via prometheus.
Nov 9 2020, 6:25 PM
vsellier accepted D4449: Drop staging-rp-{webapp,deposit} which are declared in gandi.

We don't need it because pergamon does not manage the first level of swh.network, and declaring such entries prevents puppet from testing and updating the DNS configuration, as your paste P862 shows.

Nov 9 2020, 2:46 PM
vsellier accepted D4447: Declare dns records on rp0/webapp/deposit nodes.

LGTM

Nov 9 2020, 12:08 PM
vsellier changed the status of T2755: Monitor the firewalls from Open to Work in Progress.
Nov 9 2020, 11:04 AM · System administration
vsellier changed the status of T2755: Monitor the firewalls, a subtask of T2650: Network refactoring - step 1, from Open to Work in Progress.
Nov 9 2020, 11:04 AM · System administration
vsellier committed rSPSITE47d0ec201bc8: Override host ips in vagrant environment (authored by vsellier).
Override host ips in vagrant environment
Nov 9 2020, 10:28 AM
vsellier closed D4445: Override host ips in vagrant environment.
Nov 9 2020, 10:28 AM