
Upgrade belvedere/main to PostgreSQL 12
Closed, Resolved (Public)


When doing so, keep a clone of the database so that @vlorentz can handle T2513.

Event Timeline

olasd triaged this task as High priority. Sep 10 2020, 10:17 AM
olasd created this task.
olasd changed the task status from Open to Work in Progress. Sep 10 2020, 10:53 AM

Worker shutdown in progress.

olasd@pergamon:~$ sudo clush -w @swh-workers 'puppet agent --disable "Upgrade of the main database"; cd /etc/systemd/system/; for unit in swh-worker@*; do systemctl disable $unit; done; systemctl stop swh-worker@*'

Moved moma to the replica on somerset, which is outdated but works.

Completed the pg_upgrade process.

Switched barman over to a new database (moved all settings from swh-11 to swh-12) and restarted archiving to that new barman directory.

Restarted the cluster on port 15433 to perform some maintenance ops before restarting the full load.

  • analyzed all tables
  • I was doing a full vacuum on the revision table, with a larger maintenance_work_mem, but this is taking a while; I'll probably just restart the cluster on the default port now
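For reference, the two maintenance steps above could be sketched roughly as follows. Only the port (15433) and the table name (revision) come from this log; the database name, the memory value, the helper name maintenance_commands, and the reading of "full vacuum" as VACUUM FULL are assumptions. The sketch only prints the commands so they can be reviewed before running anything.

```shell
#!/bin/sh
# Sketch only, not the commands that were actually run.
PORT=15433                              # maintenance port mentioned above
DBNAME="${DBNAME:-softwareheritage}"    # hypothetical database name
WORK_MEM="${WORK_MEM:-8GB}"             # hypothetical maintenance_work_mem value

maintenance_commands() {
    # Rebuild planner statistics for all tables of the database;
    # pg_upgrade does not carry them over to the new cluster.
    echo "vacuumdb --port=$PORT --dbname=$DBNAME --analyze-only"
    # Two -c options share one session but are sent as separate requests,
    # so the SET applies to the VACUUM and VACUUM is not part of a
    # multi-command string (where it would be rejected).
    echo "psql --port=$PORT --dbname=$DBNAME -c \"SET maintenance_work_mem = '$WORK_MEM'\" -c \"VACUUM FULL revision\""
}

# Print the commands for review; pipe the output to sh to execute them.
maintenance_commands
```

Keeping the cluster on a non-default port while this runs means regular clients cannot reconnect until the maintenance is finished or abandoned.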

I've stopped the cluster, taken a snapshot of the dataset, and restarted the cluster on its default port.

I'll do T2583 before restarting the workers.

*ahem* Do not forget to restart the storage backends when doing such an upgrade, lest you end up with a slew of PostgreSQL connection errors when your clients connect again.

Restarted worker01, which seemed to be happy, so all the other workers have been restarted as well.

olasd@pergamon:~$ sudo clush -w @swh-workers 'sleep=$((RANDOM * 60 / 32768)); echo sleeping for $sleep; sleep $sleep; puppet agent --enable && puppet agent -t; puppet_status=$?; if [ $puppet_status -eq 2 ]; then systemctl default; else echo puppet exited with unexpected status code $puppet_status; exit 2; fi'
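A short sketch of the two less obvious parts of the command above: the random delay and the puppet exit-code check. The function names jitter_seconds and describe_puppet_status are made up for illustration; the arithmetic is the one used in the command, and the exit-code meanings are those of puppet's detailed exit codes, which `puppet agent -t` enables.

```shell
#!/bin/bash
# bash's $RANDOM is an integer in 0..32767, so RANDOM * 60 / 32768
# (integer division) spreads the workers over a delay of 0..59 seconds.
jitter_seconds() {
    local random_value=$1
    echo $(( random_value * 60 / 32768 ))
}

# With detailed exit codes, 2 means "run succeeded and changes were applied",
# which is what the command above expects after re-enabling the agent.
describe_puppet_status() {
    case "$1" in
        0) echo "no changes" ;;
        2) echo "changes applied" ;;
        4) echo "failures" ;;
        6) echo "changes applied, with failures" ;;
        *) echo "unexpected" ;;
    esac
}

jitter_seconds 0          # smallest possible draw: prints 0
jitter_seconds 32767      # largest possible draw: prints 59
describe_puppet_status 2  # prints "changes applied"
```

The delay keeps all workers from hitting the puppet master at the same moment, and `systemctl default` only runs once puppet has reported that it actually changed something.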