current best option for this seems to be daily pg_dump sent over to the Antelink-INRIA machine sesi-pv-lc2.inria.fr
(task moved from trello)
Status | Assigned | Task
---|---|---
Migrated | gitlab-migration | T6 backup: postgres DB
Migrated | gitlab-migration | T53 open network connectivity between sesi-pv-lc2 and swh machines
we now have this space available:
    zacchiro@sesi-pv-lc2:~$ df -h /antelink/store0/
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sdc         19T   33M   19T   1% /antelink/store0
waiting for DSI to open the firewall ports needed to connect to it from the SWH machines
daily pg_dump over the net is now set up on prado for the databases gitimport and snapshot.debian.org; see prado:/usr/local/bin/swh-postgres-backup-sesi and its configuration in /srv/softwareheritage/postgres/backup.conf
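For readers without access to prado, here is a minimal sketch of what a "pg_dump over the net" job of this kind can look like. It is not the content of swh-postgres-backup-sesi: the function names, the `BACKUP_DIR` subdirectory, the file naming scheme and the use of ssh as transport are all assumptions made for illustration.

```sh
#!/bin/bash
# Sketch only; the real script is prado:/usr/local/bin/swh-postgres-backup-sesi.
# BACKUP_HOST/BACKUP_DIR defaults and the file naming are assumptions.
BACKUP_HOST="${BACKUP_HOST:-sesi-pv-lc2.inria.fr}"
BACKUP_DIR="${BACKUP_DIR:-/antelink/store0/postgres}"  # hypothetical subdirectory

# Build the remote file name for one database and one day.
# $1 = database name, $2 = date stamp (YYYY-MM-DD)
remote_path() {
    printf '%s/%s.%s.pgdump\n' "$BACKUP_DIR" "$1" "$2"
}

# Stream a custom-format dump straight to the remote host, with no local copy.
# The body runs in a subshell so that pipefail (a failing pg_dump fails the
# whole pipeline) does not leak into the caller's shell.
backup_db() (
    set -o pipefail
    dest="$(remote_path "$1" "$(date +%F)")"
    pg_dump --format=custom "$1" | ssh "$BACKUP_HOST" "cat > '$dest'"
)

# Example run over the two databases named above (commented out on purpose):
# for db in gitimport snapshot.debian.org; do backup_db "$db"; done
```

Streaming through a pipe avoids needing scratch space on prado for the dump, at the cost of having to redo the whole dump if the connection drops.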
Note that we already have a "local" (= on the NAS) backup for the database lister-github, implemented by /usr/local/bin/swh-postgres-backup-nas. We need to decide which of the two we keep (probably -sesi).
Dumping ~8 GB to sesi took ~20 minutes, but was bound by pg_dump actually producing the dump, rather than by the network.
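As a rough sanity check on the figures above, the effective rate can be worked out as follows. The inputs are the approximate values from this note, and decimal units (1 GB = 1000 MB) are an assumption.

```sh
# Approximate figures from the note above: ~8 GB in ~20 minutes.
size_gb=8
minutes=20

# Effective throughput in MB/s, using decimal units (1 GB = 1000 MB).
rate="$(awk -v gb="$size_gb" -v min="$minutes" \
    'BEGIN { printf "%.1f", gb * 1000 / (min * 60) }')"

echo "approx. ${rate} MB/s"
```

That is under 7 MB/s, which is below what even a 100 Mbit/s link (about 12.5 MB/s) can carry. The actual link speed between prado and sesi is not stated here, but the number is consistent with pg_dump being the bottleneck.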
Overall, this is what the current cron of postgres@prado looks like:
    postgres@prado:/srv/softwareheritage/postgres$ crontab -l
    # m h  dom mon dow   command
    3 3 * * * /usr/local/bin/swh-postgres-backup-nas
    33 3 * * * /usr/local/bin/swh-postgres-backup-sesi /srv/softwareheritage/postgres/backup.conf
update, backup conf is now:
we might want to reduce the backup frequency to weekly, but let's see how it does for now