
Error: Cron <postgres@belvedere> /usr/local/bin/swh-postgres-backup-logs-nas
Closed, Resolved. Public

Description

Getting spammed with these errors:

mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-main-2019-07-20_120000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-main-2019-07-20_130000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-main-2019-07-20_140000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_110000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_120000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_130000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_140000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_150000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_160000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_170000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_180000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_190000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_200000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_210000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_220000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-19_230000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_000000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_010000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_020000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_030000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_040000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_050000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_060000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_070000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_080000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_090000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_100000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_110000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_120000.log.xz': No space left on device
mv: cannot create regular file '/srv/remote-backups/postgres/logs/postgresql-11-secondary-2019-07-20_130000.log.xz': No space left on device

Seek and destroy the root cause.

Event Timeline

ardumont triaged this task as High priority. Jul 20 2019, 7:14 PM
ardumont created this task.
olasd added a subscriber: olasd. Jul 20 2019, 9:05 PM

The root cause is that uffizi:/srv/storage/space is full.

Ack

Thanks for confirming (I was thinking along those lines but did not check further).

I opened this task mostly as a reminder to get to it on Monday (starting by checking the disk space hypothesis).

So, as explained, there is indeed no disk space left:

root@belvedere:~# df -h /srv/remote-backups/
Filesystem                               Size  Used Avail Use% Mounted on
uffizi:/srv/storage/space/backups/prado  100T  100T   50G 100% /srv/remote-backups
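
The same check could be scripted so that a job bails out before attempting any move. A minimal sketch in Python, assuming a hypothetical helper `has_free_space` and an arbitrary 100 GiB threshold; neither comes from the actual `swh-postgres-backup-logs-nas` script:

```python
import shutil

GIB = 1024 ** 3


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has at least
    `min_free_bytes` available to the calling user."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.free >= min_free_bytes


if __name__ == "__main__":
    # Mount point taken from the df output above; the threshold is a
    # made-up value for illustration only.
    target = "/srv/remote-backups"
    try:
        ok = has_free_space(target, 100 * GIB)
    except OSError as exc:
        print(f"cannot check {target}: {exc}")
    else:
        print(f"{target}: {'enough' if ok else 'not enough'} free space")
```

On Unix, `usage.free` should correspond to the "Avail" column of `df`, so a full NFS export like the one above would be caught before the first `mv`.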

I tried to clean up /srv/storage/space a bit.
I removed 50G of intermediary dumps that are no longer needed (they were made as backups in case migrations went bad; they did not).

There remain quite a lot of files that could possibly go away, but that would need to be checked with their respective authors first.

ardumont closed this task as Resolved. Jul 22 2019, 10:01 AM
ardumont claimed this task.

Testing as postgres@belvedere, the script now runs fine (as long as there is space left, that is).
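
To avoid one error line per file if the destination fills up again, the move loop could stop at the first "No space left on device" failure. This is only a sketch under assumptions: `move_logs` and its glob pattern are hypothetical, since the contents of the actual script are not shown in this task.

```python
import errno
import shutil
from pathlib import Path


def move_logs(src_dir, dst_dir, pattern="*.log.xz"):
    """Move files matching `pattern` from src_dir to dst_dir.

    Stops at the first ENOSPC error and returns the names moved so far.
    Any other OSError is re-raised.
    """
    moved = []
    for src in sorted(Path(src_dir).glob(pattern)):
        try:
            shutil.move(str(src), str(Path(dst_dir) / src.name))
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # Destination is full: report once and leave the remaining
                # files in place. A partially copied file may remain at
                # the destination and would need cleaning up.
                print(f"destination full, stopped at {src.name}")
                break
            raise
        moved.append(src.name)
    return moved
```

The files left behind stay in the source directory, so a later run can pick them up once space has been freed.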

"Seek and destroy the root cause."

Well, the real fix would be to:

  • triage the contents of /srv/storage/space/
  • have more space (that's more or less the case, but it's not clear yet whether we will move data to the new storage or not)
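
For the triage step, a per-entry size listing is a reasonable starting point for deciding whom to ask about what. A rough sketch, assuming a hypothetical helper `dir_sizes`; it sums apparent file sizes rather than allocated blocks, so the numbers can differ from what `du` reports:

```python
import os


def dir_sizes(root):
    """Return [(size_in_bytes, name)] for each entry directly under
    `root`, largest first. Directories are summed recursively."""
    results = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                total = 0
                for dirpath, _dirnames, filenames in os.walk(entry.path):
                    for name in filenames:
                        try:
                            total += os.lstat(os.path.join(dirpath, name)).st_size
                        except OSError:
                            pass  # file vanished or is unreadable; skip it
            else:
                total = entry.stat(follow_symlinks=False).st_size
            results.append((total, entry.name))
    return sorted(results, reverse=True)
```

Calling `dir_sizes("/srv/storage/space")` on the storage host would walk the whole tree, which may take a long time on a volume of this size.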

In the meantime, I closed this task as I cannot do more.

which is T1382 ;)