Description
Now that the first batch import (github + snapshot.debian.org + gnu.org) is done and we won't be importing other sources for a while, we have started a full object store backup from uffizi to banco.
The backup is rather bare-bones, using the following script:
#!/bin/bash
SRCDIR="/srv/softwareheritage/objects"
DESTHOST="swhstorage@backup.softwareheritage.org"
if [ -z "$1" ] ; then
    echo "Usage: $0 OBJ_FIRST_DIGIT [TAR_OPTIONS]"
    echo "E.g.: $0 c"
    exit 1
fi
digit="$1"
shift 1
echo "* `date -R` considering objects starting with ${digit}"
for srcdir in ${SRCDIR}/${digit}* ; do
    test -d $srcdir || continue
    echo "* `date -R` sending $srcdir over"
    (cd $srcdir && tar caf - "$@" .) | ssh $DESTHOST "cd $srcdir && tar xaf -"
done
executed in parallel 16 times, one for each 0-9a-f digit.
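For illustration, the 16 parallel runs could be launched with a small wrapper like the one below. This is only a sketch: the `run_per_digit` helper and the `./backup-shard.sh` script name are hypothetical, not what was actually typed on uffizi.

```bash
#!/bin/bash
# Run a command once per hex digit (0-9, a-f), all in parallel, then wait.
# The digit is appended as the last argument of the command.
# Returns 1 if any of the 16 runs failed, 0 otherwise.
run_per_digit() {
    local digit pid status=0
    local pids=()
    for digit in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do
        "$@" "$digit" &          # one background job per shard prefix
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || status=1  # remember failures, keep waiting for the rest
    done
    return "$status"
}

# Example (hypothetical script name):
#   run_per_digit ./backup-shard.sh
```

Waiting on each PID individually lets the wrapper report a failure in any single shard instead of only the last one to finish.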
The processes are running in a script session of my user on uffizi.
Note that this means no integrity check is being done on individual objects; we will need to do that later on (and we need to do that periodically on all object copies anyhow).
This is back on hold now, as we discovered that the read performance from the object store on uffizi is not as good as it should be.
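A later integrity pass could look roughly like the sketch below. It rests on an assumption that is not stated here: that each object file is named after the hex SHA1 of its raw on-disk content. If objects are stored compressed or keyed by another hash, the checksum step would need to change accordingly. The `check_objects` name is hypothetical.

```bash
#!/bin/bash
# ASSUMPTION: every file under DIR is named after the hex SHA1 of its content.
# check_objects DIR
#   Prints the path of each file whose name does not match its checksum.
#   Returns 1 if at least one mismatch was found, 0 otherwise.
check_objects() {
    local dir="$1" path expected actual bad=0
    while IFS= read -r -d '' path; do
        expected=$(basename "$path")
        actual=$(sha1sum < "$path" | cut -d' ' -f1)
        if [ "$actual" != "$expected" ]; then
            echo "$path"
            bad=1
        fi
    done < <(find "$dir" -type f -print0)
    return "$bad"
}

# Example: check_objects /srv/softwareheritage/objects/c0
```

Running the same function on both the source and the backup copy would cover the periodic check on all object copies as well.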
By looking at bonnie++ output and doing some math, we have concluded that transfer slowness is essentially dominated by seek time.
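The kind of math involved can be sketched with a simple model: if reading each object costs one disk seek, the effective throughput is the object size divided by (seek time + object size / sequential bandwidth). The `est_throughput` helper below is hypothetical, and the numbers in the usage comments are illustrative placeholders, not the bonnie++ figures measured on uffizi.

```bash
#!/bin/bash
# est_throughput SEEK_MS OBJ_KB SEQ_MBPS
#   Prints the estimated effective throughput in MB/s when every object
#   of OBJ_KB kilobytes costs one seek of SEEK_MS milliseconds on a disk
#   with SEQ_MBPS MB/s of sequential read bandwidth.
est_throughput() {
    awk -v seek_ms="$1" -v obj_kb="$2" -v seq_mbps="$3" 'BEGIN {
        obj_mb = obj_kb / 1024
        per_obj_s = seek_ms / 1000 + obj_mb / seq_mbps
        printf "%.2f\n", obj_mb / per_obj_s
    }'
}

# Illustrative values only:
#   est_throughput 10 1024 100   # prints 50.00 (1 MB objects: seek halves the rate)
#   est_throughput 10 10 100     # 10 KB objects: the seek term dominates almost entirely
```

For small objects the seek term dwarfs the transfer term, which is why a sequential read of the whole block device avoids the bottleneck.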
We have therefore switched to a different "backup" strategy: *cough* | dd if=/dev/* | nc | *cough*. Transferring 4 (out of 16) shards of objects was enough to saturate the 1 Gb/s link we have between the two machines.
olasd started the first 4 dd processes in a screen session.
We now have a backup of all the contents that were stored on uffizi at the end of our first batch import.