
swh-indexer db replica on azure
Closed, Resolved (Public)

Description

To fully host the web UI on Azure, we also need the softwareheritage-indexer db replicated there (along with the associated indexer-storage service).

Related T883

Event Timeline

ardumont created this task. Jun 8 2018, 12:26 PM
ardumont triaged this task as Normal priority.
ardumont updated the task description. Jun 8 2018, 12:36 PM
ardumont renamed this task from swh-indexer replica on azure to swh-indexer db replica on azure. Jun 8 2018, 1:55 PM
ardumont added a project: Indexer.
ardumont added a comment. Edited Jun 13 2018, 6:31 PM

Relatedly, I fixed and adapted the azure scripts and README to match how we now provision a new node:

  • ee02aa5 * origin/master azure/provision-vm.sh: Install facter from backports
  • 79bcec0 * azure/create_vm.sh: Respect current dashboard convention
  • f39382d * azure/provision-vm.sh: Fix hostname generation for private node
  • 017adbc * azure/README.md: Update the dns records
  • 597fd29 * azure/README.md: Fix missing sudo command
  • 9aec50b * azure/provision-vm.sh: Do not use APT variable as this does not work
  • 50d37bc * azure/provision-vm.sh: Improve docstring sentence
  • db493ba * azure/provision-vm.sh: Deal with apt-transport-https use case
  • d08d73a * azure/README.md: Update to latest actions
  • 2603908 * azure/create-vm.sh: Compute resource group according to nodes' types
  • 4fec36e * azure/create-vm.sh: Create resource group when type <> worker
  • 9d39c2d * azure/: Remove unneeded script
  • 7fedc24 * azure/README: Push the provisioning script in /tmp
  • d53f702 * azure/provision-vm.sh: Use strictly the necessary in /etc/hosts
  • e73b56d * azure/README.md: Reference how to use the cli
  • bb9368c * azure/provision-vm.sh: Update according to our latest use
  • f07be8f * azure/create-vm.sh: Update according to our latest use
  • e849279 * azure/create-vm.sh: Reference long description flag
ardumont changed the task status from Open to Work in Progress. Jun 13 2018, 6:32 PM

Replying to self here as well.

I'll edit the current wiki page [1] for that...

In the end I updated page [1], but only to add a note about us moving away from pg_logical replication, pointing to the page about the new one [2].
I added a separate page [2] rather than rewriting [1], since it is not the same replication technology.

[1] https://intranet.softwareheritage.org/index.php?title=Pglogical_replication

[2] https://intranet.softwareheritage.org/index.php?title=Streaming_Replication

"about us moving away from pg_logical replication (and targets the page about the new one [2])."

I was wrong: we still use it, so I removed that update (prado -> somerset).

Status on this: the cluster and db are running ;)

What remains:

  • a WAL replication issue to investigate and fix
  • the associated documentation update

Apparently, the issue is due to some options that need to be set before replication is triggered.

Status on this: replication is up and running.

The first attempts relied on a tool that was too high-level (pg_basebackup).
They systematically ended up with missing WAL files.

Pairing with @olasd to make it work, we:

  • archived WALs to the slave (rsync-based method)
  • started the replication using a lower-level API than pg_basebackup
  • once the replication seemed stable, activated the slave's cleanup feature to remove already-applied WALs

cf. P276 for details.
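
For readers without access to P276, the three steps above can be sketched roughly as follows. This is not the content of P276. It is a minimal illustration that assumes PostgreSQL 10 or 11 (recovery.conf on the standby, i.e. the slave, and pg_start_backup/pg_stop_backup as the "lower-level API"); every host name, path and user in it is hypothetical. The script only prints settings and commands for review and does not touch any server.

```shell
#!/bin/sh
# Sketch only: all values below are hypothetical placeholders, not taken from P276.
PRIMARY_HOST="${PRIMARY_HOST:-primary.example.org}"
STANDBY_HOST="${STANDBY_HOST:-standby.example.org}"
WAL_ARCHIVE="${WAL_ARCHIVE:-/srv/postgresql/wal-archive}"
PGDATA="${PGDATA:-/var/lib/postgresql/data}"
REPLICATION_USER="${REPLICATION_USER:-replicator}"

# Step 1: postgresql.conf settings on the primary. Each completed WAL segment
# is pushed to the standby with rsync (%p = segment path, %f = file name).
primary_settings() {
    cat <<EOF
wal_level = replica
archive_mode = on
archive_command = 'rsync -a %p ${STANDBY_HOST}:${WAL_ARCHIVE}/%f'
EOF
}

# Step 2: low-level base backup, run on the primary instead of pg_basebackup.
# Assumes PostgreSQL < 15, where pg_start_backup/pg_stop_backup exist.
base_backup_commands() {
    cat <<EOF
psql -c "SELECT pg_start_backup('standby-seed', true);"
rsync -a --exclude=postmaster.pid ${PGDATA}/ ${STANDBY_HOST}:${PGDATA}/
psql -c "SELECT pg_stop_backup();"
EOF
}

# Step 3: recovery.conf on the standby (PostgreSQL < 12). Pass "cleanup" as the
# first argument to also emit the line that removes already-applied WALs.
standby_recovery_conf() {
    cat <<EOF
standby_mode = 'on'
primary_conninfo = 'host=${PRIMARY_HOST} user=${REPLICATION_USER}'
restore_command = 'cp ${WAL_ARCHIVE}/%f %p'
EOF
    if [ "${1:-}" = "cleanup" ]; then
        echo "archive_cleanup_command = 'pg_archivecleanup ${WAL_ARCHIVE} %r'"
    fi
}
```

Calling standby_recovery_conf with no argument gives the initial standby configuration; calling it with cleanup adds the archive_cleanup_command line, mirroring the last step, where cleanup was only enabled once replication looked stable.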

ardumont closed this task as Resolved.