Page MenuHomeSoftware Heritage

set up a replica of the main DB on azure
Closed, MigratedEdits Locked

Description

As per title.

We want a fast DB access, read-only, for workers running on azure.
Use cases are: vault workers and provenance cache population.

Event Timeline

zack triaged this task as Normal priority.Dec 12 2017, 3:00 PM
zack created this task.
zack added a project: Restricted Project.Jan 3 2018, 3:10 PM
zack raised the priority of this task from Normal to High.
zack moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jan 3 2018, 3:27 PM

Current database size: ~= 3880 GB
Maximum Azure drive size: 4095 GB

  • We will need to use more than one drive to create a bigger volume
  • We will also most likely need to grow this volume at some point in the future

The sweet spot for Azure drives seems to be 2TB : 7500 IOPS and 250MB/s provisioned.
Smaller ones have less provisioned IOPS and bandwidth per second.
Bigger ones don't get faster.

Azure performance appears to be very far from reliable. Linux keeps complaining about issues similar to this one:

[Tue Jan  9 15:47:04 2018] INFO: task kworker/1:3:148 blocked for more than 120 seconds.
[Tue Jan  9 15:47:04 2018]       Not tainted 4.9.0-5-amd64 #1 Debian 4.9.65-3+deb9u2
[Tue Jan  9 15:47:04 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  • Created a 6TB /srv volume with 4x 1536GB drives in RAID0
  • 1536GB drives have the same theoritical performance limits as 2TB ones
  • Used lvm2 for the software raid layer.
  • mdadm doesn't allow to grow a RAID0 volume whereas lvm2 could (by concatenating another storage provider at the end)
  • Built-in Postgres replication can't replicate schema changes (as of PostgreSQL 10.1)
  • For this reason, it is best to use pglogical
  • it has a pglogical.replicate_ddl_command function which creates a synchronization point to pause replication, does a DDL change on the primary, then sends the ddl change to replicas within the replication stream and then resumes replication (so said @olasd)
zack moved this task from Restricted Project Column to Restricted Project Column on the Restricted Project board.Jan 16 2018, 1:31 PM

Pglogical replication requirements:

  • The replication user MUST exist on both provider and subscriber hosts.
  • The replication user MUST be able to connect without password
    • from subscriber to provider
    • from provider to subscriber
    • from subscriber to subscriber (yes, to itself)

Use a .pgpass file in the postgres Unix account directory for that.

Provider setup (was already done, instructions may be unreliable):

createuser replicator --replication
su - postgres
psql template1
\c softwareheritage postgres localhost 5433
create extension pglogical;
select pglogical.create_node(
	node_name := 'prado',
	dsn := 'host=prado.internal.softwareheritage.org port=5433 dbname=softwareheritage'
);
select pglogical.replication_set_add_table('default', 'content', true);

ALTER DEFAULT PRIVILEGES FOR ROLE swhstorage grant select on tables to replicator;
GRANT USAGE ON SCHEMA pglogical TO replicator;

Subscriber setup:

su - postgres
/usr/lib/postgresql/10/bin/initdb  -D /srv/softwareheritage/postgres/10/replica

[Start up Postgres here]

createuser swhstorage
psql template1
create user replicator SUPERUSER REPLICATION PASSWORD 'toto90';
create database softwareheritage with owner swhstorage encoding utf8 lc_collate 'C.UTF-8' lc_ctype 'C.UTF-8';
\q

git clone https://forge.softwareheritage.org/source/swh-storage.git
cd swh-storage

psql softwareheritage
create extension pglogical;
ALTER DEFAULT PRIVILEGES FOR ROLE swhstorage grant all on tables to replicator;
GRANT USAGE ON SCHEMA pglogical TO replicator;
\i sql/swh-init.sql
\i sql/swh-enums.sql
\i sql/swh-schema.sql

select pglogical.create_node(
	node_name := 'dbreplica0',
	dsn := 'host=dbreplica0.euwest.azure.internal.softwareheritage.org port=5433 dbname=softwareheritage user=replicator'
);
select pglogical.create_subscription(
	subscription_name := 'dbreplica0',
	provider_dsn := 'host=prado.internal.softwareheritage.org port=5433 dbname=softwareheritage user=replicator'
);

The replica database has to be properly initialized with the .sql scripts.
Do not try to use the synchronize_structure feature of pglogical, it will fail.

ftigeot changed the task status from Open to Work in Progress.Apr 13 2018, 10:57 AM

The existing replica using pglogical is unable to stay in sync with its master database
Replication technology changed to streaming replication (wal shipping).

Replication has been running fine since yesterday.
PostgreSQL master server is somerset.internal.softwareheritage.org:5433 .

Rough setup steps:

  1. Make sure a replication user exists on the master server
su - postgres
createuser -p 5433 replicator --replication
psql -p 5433 template1
alter user replicator with password 'my.precious.secret';
  1. Allow clients to use this replication user on the master server
# pg_hba.conf
# TYPE  DATABASE        USER            CIDR-ADDRESS            METHOD
host    replication     replicator      192.168.200.0/21        md5	# Azure
kill -HUP $pid_of_postgres_master_process
  1. Check and/or modify postgresql settings on master
# postgresql.conf
wal_level = replica      # or higher, logical is fine
wal_keep_segments = 500  # in logfile segments, 16MB each; 0 disables
wal_compression = on     # enable compression of full-page writes
  • wal_compression is an optimization which reduces I/O and network traffic at the expense of CPU time. It is not required.
  • wal_keep_segments needs to be set to a reasonable value in order to keep enough wal data between the initial database backup and its subsequent regular startup on the slave. Slave startup can fail at first, so make sure to have at least a couple hours of slack in case things go wrong.
  1. Make sure the slave can connect
su - postgres
touch .pgpass
echo "somerset.internal.softwareheritage.org:5433:replication:replicator:my.precious.secret" >> .pgpass
chmod 0600 .pgpass
  1. Make sure the slave can run
# postgresql.conf
max_connections = 128    # must be the same value as on the master
  • Postgres refuses to start standby servers if max_connections is smaller on the slave than on its master
  1. Get a database backup
pg_basebackup --progress --write-recovery-conf \
	-h somerset.internal.softwareheritage.org -p 5433 -U replicator \
	-D /srv/softwareheritage/postgres/10/somerset.backup
  1. Move the backup files to the regular pgdata location and start the slave server
cd /postgres/10
mv somerset.backup replica
service postgresql start

great, thanks!

can you consolidate the bits of docs somewhere on the intranet? they'll be easier to find than on a task in the future

echo "somerset.internal.softwareheritage.org:5433:replication:replicator:my.precious.secret" >> .pgpass

There must be a typo here.
Testing on somerset with the postgres user, it does not work (as in, does not connect and falls back to ask for a password).
As far as i understood, replication is not a db, it's a right (or something).

cli used:

psql -h somerset.internal.softwareheritage.org -p 5433 -U replicator softwareheritage

There should be the name of the db instead of replication, as in:

echo "somerset.internal.softwareheritage.org:5433:softwareheritage:replicator:my.precious.secret" >> .pgpass

Now, the previous cli works, as in, i can connect.

can you consolidate the bits of docs somewhere on the intranet? they'll be easier to find than on a task in the future

@zack As i'm working on T1094, i'm consolidating notes based on those plus what i'm currently doing.
I'll edit the current wiki page [1] for that as soon as i'm successfully done with my own replica.

[1] https://intranet.softwareheritage.org/index.php?title=Pglogical_replication

Cheers,

There must be a typo here.

Well, yes and no.

TL;DR
The way to check if the replication connection is ok is wrong.

echo "somerset.internal.softwareheritage.org:5433:replication:replicator:my.precious.secret" >> .pgpass

So indeed, actually, this is the right line for the replication ;)

But, turns out that it's the way to check if the connection is working that's not right.
Actually triggering the last step (pg_basebackup) will say if it's ok or not so no need for an extra check ;)

Cheers,

I'll edit the current wiki page [1] for that...

I finally updated the [1] link but to add a note about us moving away from pg_logical replication (and targets the page about the new one [2]).
I thus added another page [2] instead since it's not the same replication technology.

[1] https://intranet.softwareheritage.org/index.php?title=Pglogical_replication

[2] https://intranet.softwareheritage.org/index.php?title=Streaming_Replication

about us moving away from pg_logical replication (and targets the page about the new one [2]).

i was wrong, we still use this so i removed the update (prado -> somerset).