update commit message
Feb 24 2022
avoid an unnecessary update if the no-sync-snap option is not specified
Feb 23 2022
- backup01 VM created on Azure
- zfs installed (will be reported in puppet):
- add contrib repository
- install zfs
# apt install linux-headers-cloud-amd64 zfs-dkms
- configure the zfs pool (a sketch follows after this list)
root@backup01:~# fdisk /dev/sdc -l
Disk /dev/sdc: 200 GiB, 214748364800 bytes, 419430400 sectors
Disk model: Virtual Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: D0FB08C6-F046-F340-AC8B-D6C9372015D5
- Assign a static IP so as not to use an address in the middle of the workers' range
- Ensure the data disk is not deleted in case of accidental removal of the VM
- Use a supported RSA key
- fix the ssh-key provisioning
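As flagged above, a minimal sketch of what the pool configuration step could look like on the 200 GiB data disk reported by fdisk, assuming a pool named data built directly on /dev/sdc; the actual pool and dataset layout on backup01 is not captured in this entry:
# hypothetical commands, names and properties are assumptions
zpool create -o ashift=12 data /dev/sdc
zfs set compression=lz4 atime=off data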
Feb 22 2022
Update facts:
- Remove the location entry
- Add the deployment variable
- Add the subnet variable
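For reference, a minimal sketch of how such host-level facts can be exposed, assuming facter's external facts directory is used; the actual mechanism (terraform provisioning vs. puppet) and the values below are placeholders, not what is deployed:
# hypothetical external fact file with placeholder values
cat > /etc/facter/facts.d/provisioning.yaml <<'EOF'
deployment: admin
subnet: default
EOF
facter deployment subnet   # check that the facts are picked up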
After the Elasticsearch restart, there are no more GC overhead messages in the logs, but there were a couple of timeouts during the night.
Further investigation is needed.
A workaround is deployed to restart the sync if it was interrupted by a race condition.
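The workaround itself is puppet-managed; conceptually it boils down to re-launching the sync unit when the previous run did not finish, along these lines (sketch only, unit name taken from the journal excerpts in the Feb 21 entry):
systemctl is-active --quiet syncoid-storage1-objects.service \
  || systemctl start syncoid-storage1-objects.service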
Feb 21 2022
Elasticsearch was restarted and the Sentry issues closed.
Let's monitor whether the GCs come back again.
First, clean up the unused resources, even if that will not free up much:
- aliases cleanup
vsellier@search-esnode0 ~ % export ES_SERVER=192.168.130.80:9200
vsellier@search-esnode0 ~ % curl -XGET http://$ES_SERVER/_cat/aliases
origin-read         origin-v0.11  - - - -
origin-write        origin-v0.11  - - - -
origin-v0.9.0-read  origin-v0.9.0 - - - -
origin-v0.9.0-write origin-v0.9.0 - - - -
vsellier@search-esnode0 ~ % curl -XDELETE http://$ES_SERVER/origin-v0.9.0/_alias/origin-v0.9.0-read
{"acknowledged":true}%
vsellier@search-esnode0 ~ % curl -XDELETE -H "Content-Type: application/json" http://$ES_SERVER/origin-v0.9.0/_alias/origin-v0.9.0-write
{"acknowledged":true}%
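If the old origin-v0.9.0 index itself is confirmed to be unused once its aliases are gone, it could be dropped as well to actually reclaim disk space; this was not done in this entry, it is only the obvious follow-up:
# hypothetical follow-up, not executed here
curl -XDELETE http://$ES_SERVER/origin-v0.9.0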
The replication of object storage is now running correctly:
-- Journal begins at Thu 2022-02-17 04:52:45 UTC, ends at Mon 2022-02-21 07:44:15 UTC. --
Feb 17 15:41:22 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 17 15:41:23 db1 syncoid[283583]: INFO: Sending oldest full snapshot data/objects@syncoid_db1_2022-02-17:15:41:23 (~ 11811.3 GB) to new target filesystem:
Feb 19 13:41:09 db1 systemd[1]: syncoid-storage1-objects.service: Succeeded.
Feb 19 13:41:09 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 19 13:41:09 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 1d 10h 59min 6.865s CPU time.
Feb 19 13:41:09 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 19 13:41:11 db1 syncoid[3716482]: Sending incremental data/objects@syncoid_db1_2022-02-17:15:41:23 ... syncoid_db1_2022-02-19:13:41:09 (~ 130.3 GB):
Feb 19 14:29:18 db1 systemd[1]: syncoid-storage1-objects.service: Succeeded.
Feb 19 14:29:18 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 19 14:29:18 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 25min 43.311s CPU time.
Feb 19 14:29:18 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 19 14:29:25 db1 syncoid[1084137]: Sending incremental data/objects@syncoid_db1_2022-02-19:13:41:09 ... syncoid_db1_2022-02-19:14:29:18 (~ 5.3 GB):
Feb 19 14:31:12 db1 systemd[1]: syncoid-storage1-objects.service: Succeeded.
Feb 19 14:31:12 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 19 14:31:12 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 1min 7.439s CPU time.
Feb 19 14:35:03 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 19 14:35:07 db1 syncoid[1174209]: Sending incremental data/objects@syncoid_db1_2022-02-19:14:29:18 ... syncoid_db1_2022-02-19:14:35:04 (~ 710.1 MB):
Feb 19 14:35:35 db1 systemd[1]: syncoid-storage1-objects.service: Succeeded.
Feb 19 14:35:35 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 19 14:35:35 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 10.015s CPU time.
Feb 19 14:40:48 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 19 14:40:52 db1 syncoid[1223955]: Sending incremental data/objects@syncoid_db1_2022-02-19:14:35:04 ... syncoid_db1_2022-02-19:14:40:49 (~ 271.6 MB):
Feb 19 14:41:14 db1 systemd[1]: syncoid-storage1-objects.service: Succeeded.
Feb 19 14:41:14 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 19 14:41:14 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 5.701s CPU time.
Feb 19 14:46:32 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 19 14:46:37 db1 syncoid[1267267]: Sending incremental data/objects@syncoid_db1_2022-02-19:14:40:49 ... syncoid_db1_2022-02-19:14:46:33 (~ 461.8 MB):
Feb 19 14:47:05 db1 systemd[1]: syncoid-storage1-objects.service: Succeeded.
Feb 19 14:47:05 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 19 14:47:05 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 8.945s CPU time.
Feb 19 14:52:18 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 19 14:52:22 db1 syncoid[1312265]: Sending incremental data/objects@syncoid_db1_2022-02-19:14:46:33 ... syncoid_db1_2022-02-19:14:52:19 (~ 263.2 MB):
Feb 19 14:52:42 db1 systemd[1]: syncoid-storage1-objects.service: Succeeded.
Feb 19 14:52:42 db1 systemd[1]: Finished ZFS dataset synchronization of.
Feb 19 14:52:42 db1 systemd[1]: syncoid-storage1-objects.service: Consumed 6.021s CPU time.
Feb 19 14:58:04 db1 systemd[1]: Starting ZFS dataset synchronization of...
...
Feb 18 2022
Feb 17 2022
It looks like the server is short on heap:
[2022-02-17T15:26:30,847][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965188] overhead, spent [408ms] collecting in the last [1s]
[2022-02-17T15:27:08,154][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965225] overhead, spent [296ms] collecting in the last [1s]
[2022-02-17T15:29:31,383][WARN ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][young][5965368][3283] duration [1s], collections [1]/[1.1s], total [1s]/[5.8m], memory [8.2gb]->[5.4gb]/[16gb], all_pools {[young] [2.8gb]->[0b]/[0b]}{[old] [4.7gb]->[5.3gb]/[16gb]}{[survivor] [652mb]->[184mb]/[0b]}
[2022-02-17T15:29:31,384][WARN ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965368] overhead, spent [1s] collecting in the last [1.1s]
[2022-02-17T15:31:49,449][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965506] overhead, spent [260ms] collecting in the last [1s]
[2022-02-17T15:33:46,505][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965623] overhead, spent [256ms] collecting in the last [1s]
[2022-02-17T15:37:11,728][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965828] overhead, spent [372ms] collecting in the last [1s]
[2022-02-17T15:47:19,087][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5966435] overhead, spent [289ms] collecting in the last [1s]
[2022-02-17T15:49:56,439][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5966592] overhead, spent [315ms] collecting in the last [1.1s]
[2022-02-17T15:55:40,579][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5966936] overhead, spent [274ms] collecting in the last [1s]
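To keep an eye on the heap pressure, the _cat/nodes API gives a quick view (with ES_SERVER pointing at the node, as in the Feb 21 entry above):
curl "http://$ES_SERVER/_cat/nodes?v&h=name,heap.percent,heap.max,ram.percent"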
Objects replication:
- land D7180
- run puppet on db1 and storage1
- the sync automatically starts:
Feb 17 15:41:22 db1 systemd[1]: Starting ZFS dataset synchronization of...
Feb 17 15:41:23 db1 syncoid[283583]: INFO: Sending oldest full snapshot data/objects@syncoid_db1_2022-02-17:15:41:23 (~ 11811.3 GB) to new target filesystem:
It will take some time to complete.
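Progress can be followed on db1 through the syncoid unit's journal:
journalctl -f -u syncoid-storage1-objects.service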
Kafka data replication:
- prepare the dataset (ensure there are no mounts this time; see the check after this block)
root@db1:~# zfs create -o canmount=noauto -o mountpoint=none data/sync
root@db1:~# zfs create -o canmount=noauto -o mountpoint=none data/sync/storage1
root@db1:~# zfs list
NAME                         USED  AVAIL  REFER  MOUNTPOINT
data                         736G  25.7T    96K  /data
data/postgres-indexer-12      96K  25.7T    96K  /srv/softwareheritage/postgres/12/indexer
data/postgres-main-12        733G  25.7T   729G  /srv/softwareheritage/postgres/12/main
data/postgres-misc           112K  25.7T   112K  /srv/softwareheritage/postgres
data/postgres-secondary-12    96K  25.7T    96K  /srv/softwareheritage/postgres/12/secondary
data/sync                    192K  25.7T    96K  none
data/sync/storage1            96K  25.7T    96K  none
- land D7179
- run puppet on db1 and storage1
- initial synchronization started:
Feb 17 13:05:09 db1 syncoid[999999]: INFO: Sending oldest full snapshot data/kafka@syncoid_db1_2022-02-17:13:05:09 (~ 1686.6 GB) to new target filesystem:
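As referenced above, a quick check that the new parent datasets are indeed not mounted (the point of the canmount=noauto / mountpoint=none options):
zfs get -r canmount,mounted,mountpoint data/sync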
Yes, my bad, it's due to T3911.
Feb 15 2022
- The initial synchronization took 2h20
- After a stabilization period, the synchronization runs every 5 minutes and takes ~1 minute (the sizes are logged uncompressed and must be divided by ~2.5 to get the real size)
D7173 landed. It initially focuses on the db1 -> storage1 replication to avoid running several initial replications at the same time. The storage1 -> db1 replication will be configured once the initial db1 replication is done.
The replication will be done in this way (initiated by storage1):
The db1 dataset data/postgres-main-12 is replicated to /data/sync/db1/postgresql-main-12 on storage1.
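For reference, the underlying syncoid invocation for that layout, run from storage1, would look roughly like the following; the real unit and options are puppet-managed and may differ:
# pull replication from db1, per the layout described above (sketch only)
syncoid --no-sync-snap root@db1:data/postgres-main-12 data/sync/db1/postgresql-main-12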
fix the doc of the key name computation
Feb 14 2022
Feb 11 2022
Feb 10 2022
Feb 9 2022
Feb 8 2022
The first local snapshots worked:
root@dali:~# zfs list -t all
NAME                                                       USED  AVAIL  REFER  MOUNTPOINT
data                                                      66.7G   126G    24K  /data
data/postgresql                                           66.6G   126G  66.6G  /srv/postgresql/14/main
data/postgresql@autosnap_2022-02-08_19:04:44_monthly      1.47M      -  66.6G  -
data/postgresql@autosnap_2022-02-08_19:04:44_daily         194K      -  66.6G  -
data/postgresql/wal                                       31.8M   126G  14.9M  /srv/postgresql/14/main/pg_wal
data/postgresql/wal@autosnap_2022-02-08_19:04:44_monthly  16.3M      -  31.3M  -
data/postgresql/wal@autosnap_2022-02-08_19:04:44_daily      13K      -  15.0M  -
rebase
The dali database directory tree was prepared to give the WALs a dedicated dataset and mountpoint:
root@dali:~# date
Tue Feb 8 18:48:57 UTC 2022
root@dali:~# systemctl stop postgresql@14-main
● postgresql@14-main.service - PostgreSQL Cluster 14-main
     Loaded: loaded (/lib/systemd/system/postgresql@.service; enabled-runtime; vendor preset: enabled)
     Active: inactive (dead) since Tue 2022-02-08 18:48:58 UTC; 5ms ago
    Process: 2705743 ExecStop=/usr/bin/pg_ctlcluster --skip-systemctl-redirect -m fast 14-main stop (code=exited, status=0/SUCCESS)
   Main PID: 31293 (code=exited, status=0/SUCCESS)
        CPU: 1d 6h 12min 2.894s
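The remaining preparation steps are not captured in this entry; roughly, they amount to creating the dedicated WAL dataset at the pg_wal location visible in the zfs list output above and moving the existing WAL files into it before restarting the cluster, along these lines (sketch only, the exact commands run on dali may have differed):
mv /srv/postgresql/14/main/pg_wal /srv/postgresql/14/main/pg_wal.old
zfs create -o mountpoint=/srv/postgresql/14/main/pg_wal data/postgresql/wal
mv /srv/postgresql/14/main/pg_wal.old/* /srv/postgresql/14/main/pg_wal/
chown -R postgres:postgres /srv/postgresql/14/main/pg_wal
chmod 700 /srv/postgresql/14/main/pg_wal
rmdir /srv/postgresql/14/main/pg_wal.old
systemctl start postgresql@14-main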
Use a template instead of the stdlib::to_toml function, which is not compatible with puppet 5.
thanks, I will fix that.
update commit message
- add the postgresql backup management script
- ensure the snapshot of the wal is done after the postgresql snapshot
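The ordering constraint in the second point boils down to something like this (hypothetical snapshot names; the actual logic lives in the puppet-managed backup management script mentioned above):
TS=$(date -u +%Y-%m-%dT%H:%M:%S)
zfs snapshot data/postgresql@backup_$TS       # database dataset first
zfs snapshot data/postgresql/wal@backup_$TS   # WAL dataset second, after the data snapshot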
Update to only keep the local snapshot section.
The sync deployment will be implemented in another diff.
Feb 7 2022
The exporter is deployed.
The Varnish stats are available on this dashboard: https://grafana.softwareheritage.org/d/pE2xMZank/varnish
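A quick sanity check of the exporter output, assuming it listens on the usual prometheus varnish exporter port (9131) on the Varnish host:
curl -s http://localhost:9131/metrics | grep -c '^varnish_'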