Dec 16 2021

vsellier committed rSPREbc6effcf7b4f: staging: increase objstorage0 memory to avoid OOM on big contents (authored by vsellier).
staging: increase objstorage0 memory to avoid OOM on big contents
Dec 16 2021, 10:32 AM
vsellier accepted D6833: Clean up psql version to default to globally defined version.

LGTM thanks

Dec 16 2021, 9:54 AM

Dec 15 2021

vsellier moved T3778: The docker-dev build is often failing from in-progress to deployed/landed/monitoring on the System administration board.
Dec 15 2021, 5:09 PM · System administration
vsellier closed T3795: update terraform recipe to configure the vm's cpu type to default and support onboot option as Resolved.
Dec 15 2021, 5:08 PM · System administration
vsellier committed rSPREc2c80579a436: staging: add missing args for 2 of the staging workers (authored by vsellier).
staging: add missing args for 2 of the staging workers
Dec 15 2021, 5:08 PM
vsellier closed D6845: terraform: specify the cpu to kvm64 by default.
Dec 15 2021, 5:08 PM
vsellier committed rSPRE3c005a3610bf: terraform: specify the cpu to kvm64 by default (authored by vsellier).
terraform: specify the cpu to kvm64 by default
Dec 15 2021, 5:08 PM
vsellier accepted D6844: Add documentation for bootstrapping the Debian branches of a SWH package.

Thanks, it will be very useful ;)

Dec 15 2021, 5:06 PM
vsellier updated the test plan for D6845: terraform: specify the cpu to kvm64 by default.
Dec 15 2021, 4:49 PM
vsellier retitled D6845: terraform: specify the cpu to kvm64 by default from terraform: specigy the cpu to kvm64 by default to terraform: specify the cpu to kvm64 by default.
Dec 15 2021, 4:49 PM
vsellier updated the diff for D6845: terraform: specify the cpu to kvm64 by default.

Fix a typo in the commit message

Dec 15 2021, 4:49 PM
vsellier added a revision to T3795: update terraform recipe to configure the vm's cpu type to default and support onboot option: D6845: terraform: specify the cpu to kvm64 by default.
Dec 15 2021, 4:44 PM · System administration
vsellier requested review of D6845: terraform: specify the cpu to kvm64 by default.
Dec 15 2021, 4:44 PM
vsellier committed rSPRE6008e41c85f0: staging: configure onboot=false for workers and POCs (authored by vsellier).
staging: configure onboot=false for workers and POCs
Dec 15 2021, 4:07 PM
vsellier renamed T3795: update terraform recipe to configure the vm's cpu type to default and support onboot option from update terraform recipe to configure the vm's cpu type to default to update terraform recipe to configure the vm's cpu type to default and support onboot option.
Dec 15 2021, 4:06 PM · System administration
vsellier closed T3806: terraform: upgrade proxmox provider to last release as Resolved.

proxmox provider updated to v2.9.3

Dec 15 2021, 3:20 PM · System administration
vsellier closed D6840: terraform: update the proxmox provider.
Dec 15 2021, 3:20 PM
vsellier committed rSPRE5cf38552a3b5: terraform: update the proxmox provider (authored by vsellier).
terraform: update the proxmox provider
Dec 15 2021, 3:20 PM
vsellier updated the diff for D6840: terraform: update the proxmox provider.

Remove last references to storage_type

Dec 15 2021, 3:19 PM
vsellier added inline comments to D6840: terraform: update the proxmox provider.
Dec 15 2021, 3:10 PM
vsellier requested review of D6840: terraform: update the proxmox provider.
Dec 15 2021, 12:55 PM
vsellier added a revision to T3806: terraform: upgrade proxmox provider to last release: D6840: terraform: update the proxmox provider.
Dec 15 2021, 12:55 PM · System administration
vsellier added a comment to T3806: terraform: upgrade proxmox provider to last release.

After some adaptations, the syntax is now valid.

Dec 15 2021, 11:25 AM · System administration

Dec 14 2021

vsellier updated the task description for T3806: terraform: upgrade proxmox provider to last release.
Dec 14 2021, 6:32 PM · System administration
vsellier added a comment to T3795: update terraform recipe to configure the vm's cpu type to default and support onboot option.

There are some issues with updating the cpu type correctly.
The real value of the field on proxmox is not correctly detected, so terraform always tries to update the cpu type.

Dec 14 2021, 6:30 PM · System administration
vsellier changed the status of T3795: update terraform recipe to configure the vm's cpu type to default and support onboot option from Open to Work in Progress.
Dec 14 2021, 6:28 PM · System administration
vsellier changed the status of T3806: terraform: upgrade proxmox provider to last release from Open to Work in Progress.
Dec 14 2021, 6:28 PM · System administration
vsellier accepted D6827: sysadm: Add how to access the firewall nodes without vpn tutorial.

thanks

Dec 14 2021, 6:19 PM
vsellier closed T3805: Decommission boatbucket server as Resolved.
  • node decommissioned from puppet:
root@pergamon:~# /usr/local/sbin/swh-puppet-master-decommission boatbucket.internal.softwareheritage.org
+ puppet node deactivate boatbucket.internal.softwareheritage.org
Submitted 'deactivate node' for boatbucket.internal.softwareheritage.org with UUID bf7bd0ea-f1ae-442f-b840-5bb1adb261f3
+ puppet node clean boatbucket.internal.softwareheritage.org
Notice: Revoked certificate with serial 224
Notice: Removing file Puppet::SSL::Certificate boatbucket.internal.softwareheritage.org at '/var/lib/puppet/ssl/ca/signed/boatbucket.internal.softwareheritage.org.pem'
boatbucket.internal.softwareheritage.org
+ puppet cert clean boatbucket.internal.softwareheritage.org
Warning: `puppet cert` is deprecated and will be removed in a future release.
   (location: /usr/lib/ruby/vendor_ruby/puppet/application.rb:370:in `run')
Notice: Revoked certificate with serial 224
+ systemctl restart apache2
root@pergamon:~# puppet agent --test
  • server manually removed from proxmox / uffizi
Dec 14 2021, 4:47 PM · System administration
vsellier added a comment to T3805: Decommission boatbucket server.

Home directories backed up:

root@boatbucket:/home# ls -d alphare/* alphare/.bash_history boatbucket/* boatbucket/.bash_history | xargs tar cvjf boatbucket-backup-2021-12-14.tar.bz2
...

and saved on saam

root@boatbucket:/home# sudo -u boatbucket cp -v boatbucket-backup-2021-12-14.tar.bz /srv/boatbucket/
'boatbucket-backup-2021-12-14.tar.bz' -> '/srv/boatbucket/boatbucket-backup-2021-12-14.tar.bz'
root@boatbucket:/home# ls -al /srv/boatbucket/boatbucket-backup-2021-12-14.tar.bz 
-rw-r--r-- 1 boatbucket boatbucket 124170240 Dec 14 15:36 /srv/boatbucket/boatbucket-backup-2021-12-14.tar.bz
root@boatbucket:/home# mount | grep boatbucket
systemd-1 on /srv/boatbucket type autofs (rw,relatime,fd=54,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=13290)
saam:/srv/storage/space/mirrors/boatbucket on /srv/boatbucket type nfs4 (rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.100.107,local_lock=none,addr=192.168.100.109)
Dec 14 2021, 4:38 PM · System administration
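
The backup step in the comment above could also be scripted and checked before the server is removed. What follows is only a minimal sketch using Python's standard `tarfile` module; the function names `backup_paths` and `list_archive` and all paths in the demo are hypothetical and are not taken from the task.

```python
import tarfile
import tempfile
from pathlib import Path


def backup_paths(base_dir, relative_paths, archive):
    """Archive paths (relative to base_dir) into a bzip2-compressed tar file."""
    base_dir = Path(base_dir)
    with tarfile.open(archive, "w:bz2") as tar:
        for rel in relative_paths:
            # arcname keeps member names relative to base_dir;
            # directories are added recursively by tarfile.
            tar.add(base_dir / rel, arcname=str(rel))
    return Path(archive)


def list_archive(archive):
    """Reopen the archive and return its member names, to show it is readable."""
    with tarfile.open(archive, "r:bz2") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    # Hypothetical demo tree standing in for /home on the server.
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp) / "home"
        (home / "demo").mkdir(parents=True)
        (home / "demo" / ".bash_history").write_text("ls\n")
        archive = backup_paths(home, ["demo/.bash_history"], Path(tmp) / "backup.tar.bz2")
        print(list_archive(archive))
```

Listing the archive after writing it is a cheap way to confirm the file can be read back before the source machine is deleted.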
vsellier updated the task description for T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Dec 14 2021, 4:18 PM · System administration (Component upgrades)
vsellier changed the status of T3805: Decommission boatbucket server from Open to Work in Progress.
Dec 14 2021, 4:17 PM · System administration
vsellier updated the task description for T3801: Migrate production database servers to bullseye.
Dec 14 2021, 4:03 PM · System administration (Component upgrades)
vsellier updated the task description for T3801: Migrate production database servers to bullseye.
Dec 14 2021, 3:56 PM · System administration (Component upgrades)
vsellier accepted D6832: bojimans: Fix postgresql version.

LGTM
Not a blocker: do we need the swh::postgresql::version indirection?

Dec 14 2021, 2:26 PM
vsellier added a comment to T3801: Migrate production database servers to bullseye.

The following minor postgresql upgrades will be performed during the upgrade:

  • somerset: postgresql 13.4 -> 13.5 [1]

A dump/restore is not required for those running 13.X.

  • belvedere:
    • 11.14-0 -> 11.14-1 (indexer db)
    • 12.8-1 -> 12.9-1 [2] (other dbs)

A dump/restore is not required for those running 12.X.

  • db1:
    • 12.8-1 -> 12.9-1 [2]
Dec 14 2021, 11:30 AM · System administration (Component upgrades)
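
The reasoning in the comment above (no dump/restore when the upgrade stays within 12.X or 13.X) can be expressed as a small check. This is a sketch only: it relies on the PostgreSQL numbering used since version 10, where the first number is the major version, and the helper names `parse_pg_version` and `same_major` are hypothetical.

```python
import re


def parse_pg_version(version):
    """Split a version such as '12.8-1' or '13.4' into (major, minor) integers."""
    match = re.match(r"^(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_major(old, new):
    """Return True when both versions share the same major version.

    Assumes PostgreSQL >= 10; older releases used a different numbering scheme.
    """
    old_major, _ = parse_pg_version(old)
    new_major, _ = parse_pg_version(new)
    if old_major < 10 or new_major < 10:
        raise ValueError("this sketch only handles PostgreSQL 10 and later")
    return old_major == new_major


if __name__ == "__main__":
    # Version pairs listed in the comment above.
    for old, new in [("13.4", "13.5"), ("11.14-0", "11.14-1"), ("12.8-1", "12.9-1")]:
        print(old, "->", new, "same major:", same_major(old, new))
```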
vsellier added inline comments to D6827: sysadm: Add how to access the firewall nodes without vpn tutorial.
Dec 14 2021, 10:51 AM
vsellier added inline comments to D6830: Pin python3.7 version so venv still works after bullseye migration.
Dec 14 2021, 10:42 AM
vsellier added a comment to T3778: The docker-dev build is often failing.

It seems that since the 8th of December, no request has taken more than 1s in the builds.
I will monitor it during the current week; if it does not occur again, I will change the status to resolved.

Dec 14 2021, 9:50 AM · System administration
vsellier created P1243 check docker-dev build request durations.
Dec 14 2021, 9:46 AM

Dec 13 2021

vsellier committed rSENVbfe3ae692020: vagrant: declare somerset host (authored by vsellier).
vagrant: declare somerset host
Dec 13 2021, 7:02 PM
vsellier committed rSENV6fc0d51c5562: vagrant: declare the prod-ns0 node (authored by vsellier).
vagrant: declare the prod-ns0 node
Dec 13 2021, 7:02 PM
vsellier committed rSPSITE15ab070df6ca: vagrant: declare a couple of missing servers (authored by vsellier).
vagrant: declare a couple of missing servers
Dec 13 2021, 6:57 PM
vsellier added a watcher for Object storage: vsellier.
Dec 13 2021, 5:58 PM
vsellier updated the task description for T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Dec 13 2021, 5:57 PM · System administration (Component upgrades)
vsellier updated the task description for T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Dec 13 2021, 5:56 PM · System administration (Component upgrades)
vsellier closed T3800: migrate ns0 to bullseye, a subtask of T3579: Meta-task: upgrade infrastructure to Debian Bullseye, as Resolved.
Dec 13 2021, 5:53 PM · System administration (Component upgrades)
vsellier closed T3800: migrate ns0 to bullseye as Resolved.

upgrade done following the T3799 procedure.

Dec 13 2021, 5:53 PM · System administration (Component upgrades)
vsellier added a comment to T3800: migrate ns0 to bullseye.

After several tests in vagrant, the upgrade looks ok, even though I couldn't manage to set up a complete local DNS environment.

Dec 13 2021, 5:20 PM · System administration (Component upgrades)
vsellier accepted D6828: conftest: Fix tests hang since elasticsearch 7.16 release.

LGTM thanks

Dec 13 2021, 4:12 PM
vsellier added a comment to T3803: swh-search tests are hanging since elasticsearch 7.16 release.

If not defined, this variable is set by the elasticsearch launch script https://github.com/elastic/elasticsearch/pull/80699/files#diff-ddfc3a6ea1404997e56f2e771adede06b173f0fea37b4779d827c85d6cc52897R35
I guess that, as the fixture is not starting elasticsearch[1] through the startup script, the variable is not defined.

Dec 13 2021, 3:27 PM · Archive search
vsellier added a comment to T3803: swh-search tests are hanging since elasticsearch 7.16 release.

This link is interesting: https://www.elastic.co/guide/en/elasticsearch/reference/current/executable-jna-tmpdir.html
(from https://github.com/elastic/elasticsearch/issues/73309)

Dec 13 2021, 3:05 PM · Archive search
vsellier changed the status of T3800: migrate ns0 to bullseye, a subtask of T3579: Meta-task: upgrade infrastructure to Debian Bullseye, from Open to Work in Progress.
Dec 13 2021, 2:37 PM · System administration (Component upgrades)
vsellier changed the status of T3800: migrate ns0 to bullseye from Open to Work in Progress.
Dec 13 2021, 2:37 PM · System administration (Component upgrades)
vsellier committed rSENVd2556f0370c2: vagrantfile: fix the wrong os version of staging-search0 (authored by vsellier).
vagrantfile: fix the wrong os version of staging-search0
Dec 13 2021, 12:06 PM
vsellier triaged T3802: Migrate bojimans (netbox) to bullseye as Normal priority.
Dec 13 2021, 11:56 AM · System administration (Component upgrades)
vsellier updated the task description for T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Dec 13 2021, 11:53 AM · System administration (Component upgrades)
vsellier triaged T3801: Migrate production database servers to bullseye as Normal priority.
Dec 13 2021, 11:02 AM · System administration (Component upgrades)
vsellier triaged T3800: migrate ns0 to bullseye as Normal priority.
Dec 13 2021, 10:58 AM · System administration (Component upgrades)
vsellier reassigned T3487: Installation of the new provenance server from vsellier to olasd.

olasd: I am transferring ownership of this task to you as you are handling the subject. Feel free to close the task if the installation can be considered done.

Dec 13 2021, 9:57 AM · System administration

Dec 10 2021

vsellier updated the task description for T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Dec 10 2021, 5:23 PM · System administration (Component upgrades)
vsellier updated the task description for T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Dec 10 2021, 5:22 PM · System administration (Component upgrades)
vsellier added a parent task for T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration: T3579: Meta-task: upgrade infrastructure to Debian Bullseye.
Dec 10 2021, 5:21 PM · System administration (Component upgrades)
vsellier added a subtask for T3579: Meta-task: upgrade infrastructure to Debian Bullseye: T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration.
Dec 10 2021, 5:21 PM · System administration (Component upgrades)
vsellier renamed T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration from Upgrade proxmox hypervisor from version 6 to version 7 to Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration.
Dec 10 2021, 5:21 PM · System administration (Component upgrades)
vsellier closed T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration as Resolved.

All the hypervisors are migrated and the services restored

Dec 10 2021, 5:20 PM · System administration (Component upgrades)
vsellier updated the task description for T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration.
Dec 10 2021, 5:12 PM · System administration (Component upgrades)
vsellier created T3795: update terraform recipe to configure the vm's cpu type to default and support onboot option.
Dec 10 2021, 4:01 PM · System administration
vsellier closed T3792: decommission louvre, a subtask of T3579: Meta-task: upgrade infrastructure to Debian Bullseye, as Resolved.
Dec 10 2021, 2:05 PM · System administration (Component upgrades)
vsellier closed T3792: decommission louvre as Resolved.
root@pergamon:/usr/local/sbin# ./swh-puppet-master-decommission louvre.internal.softwareheritage.org
+ puppet node deactivate louvre.internal.softwareheritage.org
Submitted 'deactivate node' for louvre.internal.softwareheritage.org with UUID edca37d0-0976-4598-aadd-aef13a033a34
+ puppet node clean louvre.internal.softwareheritage.org
Notice: Revoked certificate with serial 156
Notice: Removing file Puppet::SSL::Certificate louvre.internal.softwareheritage.org at '/var/lib/puppet/ssl/ca/signed/louvre.internal.softwareheritage.org.pem'
louvre.internal.softwareheritage.org
+ puppet cert clean louvre.internal.softwareheritage.org
Warning: `puppet cert` is deprecated and will be removed in a future release.
   (location: /usr/lib/ruby/vendor_ruby/puppet/application.rb:370:in `run')
Notice: Revoked certificate with serial 156
+ systemctl restart apache2
  • vm 108 removed
Dec 10 2021, 2:05 PM · System administration
vsellier changed the status of T3792: decommission louvre from Open to Work in Progress.
Dec 10 2021, 1:52 PM · System administration
vsellier added a comment to T3787: upgrade ceph nodes from nautilus to octopus.

The ceph packages also need to be updated on the proxmox nodes, even if they are not in the ceph cluster (according to the output of pve6to7).

Dec 10 2021, 10:46 AM · System administration (Component upgrades)

Dec 9 2021

vsellier added a comment to T3444: 26/07/2021: Unstuck infrastructure outage then post-mortem.

It's fine with me to close it.

Dec 9 2021, 5:03 PM · System administration
vsellier updated the task description for T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration.
Dec 9 2021, 4:54 PM · System administration (Component upgrades)
vsellier renamed T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration from Migrate proxmox hypervisor nodes to bullseye to Upgrade proxmox hypervisor from version 6 to version 7.
Dec 9 2021, 4:54 PM · System administration (Component upgrades)
vsellier changed the status of T3787: upgrade ceph nodes from nautilus to octopus from Open to Work in Progress.
Dec 9 2021, 3:21 PM · System administration (Component upgrades)
vsellier added a comment to T3778: The docker-dev build is often failing.

No request took more than 1s during last night's build.
I will continue to monitor the builds and try to diagnose the problem more accurately.

Dec 9 2021, 1:40 PM · System administration
vsellier triaged T3784: swh-search / staging: transient timeouts on elasticsearch queries as Normal priority.
Dec 9 2021, 12:38 PM · Archive search, System administration
vsellier added a comment to D6796: winery: basic implementation of the backend.

A couple of remarks. Sorry in advance if it's just because it's a bootstrap and not everything is finalized yet.

Dec 9 2021, 11:58 AM
vsellier added a comment to T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration.

Output of the pve6to7 script on uffizi:

Dec 9 2021, 9:33 AM · System administration (Component upgrades)
vsellier added a comment to T3771: Upgrade proxmox hypervisor from version 6 to version 7 + debian11 migration.

Preconditions checklist from the proxmox upgrade guide:

  • Upgraded to the latest version of Proxmox VE 6.4 (check correct package repository configuration)

On all nodes:

root@pergamon:/etc/clustershell# clush -b -w @hypervisors "pveversion"
---------------
branly,pompidou,uffizi (3)
---------------
pve-manager/6.4-13/9f411e79 (running kernel: 5.4.103-1-pve)
---------------
beaubourg
---------------
pve-manager/6.4-13/9f411e79 (running kernel: 5.4.143-1-pve)
---------------
hypervisor3
---------------
pve-manager/6.4-13/9f411e79 (running kernel: 5.4.128-1-pve)
  • [TODO] Hyper-converged Ceph: upgrade the Ceph Nautilus cluster to Ceph 15.2 Octopus before you start the Proxmox VE upgrade to 7.0. Follow the guide Ceph Nautilus to Octopus
  • [No backup server] Co-installed Proxmox Backup Server: see the Proxmox Backup Server 1.1 to 2.x upgrade how-to
  • Reliable access to the node (through ssh, iKVM/IPMI or physical access)
  • A healthy cluster
  • Valid and tested backup of all VMs and CTs (in case something goes wrong)
  • At least 4 GiB free disk space on the root mount point
  • Check known upgrade issues
  • (from later in the doc) Test the pve6to7 migration checklist
Dec 9 2021, 9:16 AM · System administration (Component upgrades)
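
The first precondition above (every node on Proxmox VE 6.4) is checked by eye from the `pveversion` output. A minimal sketch of the same check in Python follows; the regular expression is inferred only from the three output lines quoted above, and the names `parse_pveversion` and `nodes_not_on` are hypothetical.

```python
import re

# Pattern inferred from lines such as:
#   pve-manager/6.4-13/9f411e79 (running kernel: 5.4.103-1-pve)
PVE_RE = re.compile(
    r"^pve-manager/(?P<version>\d+\.\d+)-(?P<pkgrel>\d+)/(?P<commit>[0-9a-f]+) "
    r"\(running kernel: (?P<kernel>[^)]+)\)$"
)


def parse_pveversion(line):
    """Return the version, package release, commit and kernel from one output line."""
    match = PVE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized pveversion output: {line!r}")
    return match.groupdict()


def nodes_not_on(outputs, expected="6.4"):
    """Given a mapping node -> pveversion line, list nodes not on the expected version."""
    return sorted(
        node
        for node, line in outputs.items()
        if parse_pveversion(line)["version"] != expected
    )


if __name__ == "__main__":
    outputs = {
        "uffizi": "pve-manager/6.4-13/9f411e79 (running kernel: 5.4.103-1-pve)",
        "beaubourg": "pve-manager/6.4-13/9f411e79 (running kernel: 5.4.143-1-pve)",
        "hypervisor3": "pve-manager/6.4-13/9f411e79 (running kernel: 5.4.128-1-pve)",
    }
    # An empty list means every node reports the expected version.
    print(nodes_not_on(outputs))
```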

Dec 8 2021

vsellier closed D6802: sysadmin/puppet: Add a puppet agent certificate renewal section.
Dec 8 2021, 6:18 PM
vsellier committed rDDOC2ca74bfbcf0f: sysadmin/puppet: Add a puppet agent certificate renewal section (authored by vsellier).
sysadmin/puppet: Add a puppet agent certificate renewal section
Dec 8 2021, 6:18 PM
vsellier accepted D6792: moma: Set the memory usage to the currently defined value.

WDYT about putting this value in production/common.yaml to also align webapp1 with it?

Dec 8 2021, 4:31 PM
vsellier closed T3769: NPM lister is failing with a database update conflict as Resolved.

The lister was fixed with the deployment of the swh-scheduler v0.22.0.

Dec 8 2021, 3:05 PM · System administration, Npm Lister
vsellier closed T3773: Deploy swh-scheduler v0.22.0, a subtask of T3769: NPM lister is failing with a database update conflict, as Resolved.
Dec 8 2021, 2:17 PM · System administration, Npm Lister
vsellier closed T3773: Deploy swh-scheduler v0.22.0 as Resolved.
Dec 8 2021, 2:17 PM · System administration, Npm Lister
vsellier added a comment to T3773: Deploy swh-scheduler v0.22.0.

deployment of version v0.22.0 in production

Dec 8 2021, 2:17 PM · System administration, Npm Lister
vsellier added a comment to T3773: Deploy swh-scheduler v0.22.0.

Deployment of the version v0.22.0 in staging

Dec 8 2021, 11:44 AM · System administration, Npm Lister
vsellier closed D6782: Increase the swh-web timeout for swh-storage requests.
Dec 8 2021, 11:08 AM
vsellier committed rDENV74f4d9969f4e: Increase the swh-web timeout for swh-storage requests (authored by vsellier).
Increase the swh-web timeout for swh-storage requests
Dec 8 2021, 11:08 AM
vsellier requested review of D6782: Increase the swh-web timeout for swh-storage requests.
Dec 8 2021, 10:50 AM
vsellier added a revision to T3778: The docker-dev build is often failing: D6782: Increase the swh-web timeout for swh-storage requests.
Dec 8 2021, 10:50 AM · System administration
vsellier added a comment to T3778: The docker-dev build is often failing.

The timeout occurs after 1s on the swh-web side on a directory/ls call.

04:11:10 nginx_1                             | 172.23.0.1 - - [08/Dec/2021:03:11:09 +0000] "GET /api/1/directory/877df54c7dda406e9ad56ca09f793799aedbb26b/ HTTP/1.1" 500 4996 "-" "curl/7.64.0" 1.013
Dec 8 2021, 10:49 AM · System administration
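
Spotting requests that cross the 1s timeout, as in the log line above, can be automated. The sketch below assumes that the last field of each nginx access log line is the request duration in seconds, which matches the quoted line but is not confirmed by the task; the name `slow_requests` is hypothetical.

```python
import re

# Matches the request, status, size, referer, user agent and a trailing duration.
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "[^"]*" (?P<duration>\d+\.\d+)\s*$'
)


def slow_requests(lines, threshold=1.0):
    """Return (path, status, duration) for requests slower than threshold seconds."""
    found = []
    for line in lines:
        match = LOG_RE.search(line)
        if match is None:
            continue  # not an access log line in the assumed format
        duration = float(match.group("duration"))
        if duration > threshold:
            found.append((match.group("path"), int(match.group("status")), duration))
    return found


if __name__ == "__main__":
    sample = (
        '04:11:10 nginx_1 | 172.23.0.1 - - [08/Dec/2021:03:11:09 +0000] '
        '"GET /api/1/directory/877df54c7dda406e9ad56ca09f793799aedbb26b/ HTTP/1.1" '
        '500 4996 "-" "curl/7.64.0" 1.013'
    )
    for path, status, duration in slow_requests([sample]):
        print(f"{duration:.3f}s {status} {path}")
```

Lines that do not match the assumed format are skipped rather than treated as errors, since build logs mix output from several containers.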

Dec 7 2021

vsellier renamed T3773: Deploy swh-scheduler v0.22.0 from Deploy swh-scheduler v0.21.0 to Deploy swh-scheduler v0.22.0.
Dec 7 2021, 6:43 PM · System administration, Npm Lister
vsellier updated the summary of D6779: jenkins: Change node name to built-in.
Dec 7 2021, 6:36 PM
vsellier accepted D6779: jenkins: Change node name to built-in.

Thanks
More info here: https://www.jenkins.io/doc/book/managing/built-in-node-migration/

Dec 7 2021, 6:34 PM
vsellier accepted D6777: Make next_visit_queue_position an integer.

Thanks!

Dec 7 2021, 5:53 PM
vsellier added a comment to T3778: The docker-dev build is often failing.

The last builds were successful and do not show any overly long response times.
Let's see tomorrow if the response times are slower at the usual build time.

Dec 7 2021, 5:33 PM · System administration
vsellier closed D6775: Ensure swh-web is started before trying to refresh the save code now statuses.
Dec 7 2021, 3:40 PM