Deploy swh-scrubber v0.1.1
Closed, Migrated

Description

Add checkpointing to storage_checker, so that a restarted worker does not recheck the objects at the beginning of its range again and again.
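
As an illustration of the idea only (this is not the swh-scrubber implementation), here is a minimal Python sketch of range checkpointing. `CheckpointStore`, `check_ranges` and the `check_range` callback are hypothetical names, and the store is an in-memory stand-in for whatever the 4.sql migration adds to the scrubber database. A range is recorded as done only after its check returns, so a range interrupted midway is checked again on restart.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# A range is (start, end); start is None for the very first range.
Range = Tuple[Optional[str], str]


class CheckpointStore:
    """Hypothetical in-memory stand-in for a persistent table of checked ranges."""

    def __init__(self) -> None:
        self._done: Dict[Tuple[str, Range], datetime] = {}

    def get(self, object_type: str, rng: Range) -> Optional[datetime]:
        """Return when the range was last fully checked, or None."""
        return self._done.get((object_type, rng))

    def mark_done(self, object_type: str, rng: Range) -> None:
        """Record that the range has been fully checked now."""
        self._done[(object_type, rng)] = datetime.now(timezone.utc)


def check_ranges(
    store: CheckpointStore,
    object_type: str,
    ranges: Iterable[Range],
    check_range: Callable[[str, Range], None],
) -> List[Range]:
    """Check every range that has no checkpoint yet; return those checked."""
    processed: List[Range] = []
    for rng in ranges:
        if store.get(object_type, rng) is not None:
            # Already done in a previous run: skip it.
            continue
        check_range(object_type, rng)
        # Checkpoint only after the check completed without raising.
        store.mark_done(object_type, rng)
        processed.append(rng)
    return processed
```

Calling `check_ranges` a second time with the same store only runs `check_range` on the ranges that were not completed the first time.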

Staging:

  • apply swh/scrubber/sql/upgrades/4.sql [1]
  • upgrade package on workers and stop all workers
  • start one worker with --log-level swh.scrubber.storage_checker:DEBUG [2]
  • wait for a couple of "Processing %s range %s to %s" log lines [2]
  • restart it (still with debug logs) [3]
  • check that it skips the ranges it already processed instead of processing them again [3]
  • restart all workers (without debug logs)

Production:

  • apply swh/scrubber/sql/upgrades/4.sql [4]
  • upgrade package on workers
  • restart all workers

[1]

swhworker@scrubber0:~$ swh db --config-file /etc/softwareheritage/scrubber/primary.yml upgrade scrubber --module-config-key=scrubber_db
INFO:swh.core.db.db_utils:Executing migration script '/usr/lib/python3/dist-packages/swh/scrubber/sql/upgrades/4.sql'
Migration to version 4 done

[2]

swhworker@scrubber0:~$ export SWH_CONFIG_FILENAME=/etc/softwareheritage/scrubber/primary.yml
swhworker@scrubber0:~$ swh --log-level swh.scrubber.storage_checker:DEBUG scrubber check storage --object-type directory --start-object 0000000000000000000000000000000000000000 --end-object 3fffffffffffffffffffffffffffffffffffffff
DEBUG:swh.scrubber.storage_checker:Processing directory range None to 000001
DEBUG:swh.scrubber.storage_checker:Processing directory range 000001 to 000002
DEBUG:swh.scrubber.storage_checker:Processing directory range 000002 to 000003
DEBUG:swh.scrubber.storage_checker:Processing directory range 000003 to 000004
DEBUG:swh.scrubber.storage_checker:Processing directory range 000004 to 000005

[3]

swhworker@scrubber0:~$ swh --log-level swh.scrubber.storage_checker:DEBUG scrubber check storage --object-type directory --start-object 0000000000000000000000000000000000000000 --end-object 3fffffffffffffffffffffffffffffffffffffff
DEBUG:swh.scrubber.storage_checker:Skipping processing of directory range None to 000001: already done at 2022-10-18 08:32:42.926663+00:00
DEBUG:swh.scrubber.storage_checker:Skipping processing of directory range 000001 to 000002: already done at 2022-10-18 08:32:49.098090+00:00
DEBUG:swh.scrubber.storage_checker:Skipping processing of directory range 000002 to 000003: already done at 2022-10-18 08:32:57.651759+00:00
DEBUG:swh.scrubber.storage_checker:Skipping processing of directory range 000003 to 000004: already done at 2022-10-18 08:33:11.836088+00:00
DEBUG:swh.scrubber.storage_checker:Processing directory range 000004 to 000005
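
The comparison between the two runs can also be done mechanically. The following is a small Python sketch, assuming the debug log format shown in [2] and [3]; `parse_run` and `unexpected_reprocessing` are hypothetical helpers, not part of swh-scrubber. It also assumes that the last range started in the first run was interrupted before completion, which is why that range is allowed to be processed again.

```python
import re
from typing import Iterable, List, Tuple

# (object_type, range_start, range_end), as printed in the debug logs.
LoggedRange = Tuple[str, str, str]

PROCESSING_RE = re.compile(r"Processing (\w+) range (\w+) to (\w+)$")
SKIPPING_RE = re.compile(
    r"Skipping processing of (\w+) range (\w+) to (\w+): already done at"
)


def parse_run(lines: Iterable[str]) -> Tuple[List[LoggedRange], List[LoggedRange]]:
    """Return (processed, skipped) ranges found in one run's log lines."""
    processed: List[LoggedRange] = []
    skipped: List[LoggedRange] = []
    for line in lines:
        line = line.strip()
        match = SKIPPING_RE.search(line)
        if match:
            skipped.append(match.groups())
            continue
        match = PROCESSING_RE.search(line)
        if match:
            processed.append(match.groups())
    return processed, skipped


def unexpected_reprocessing(
    first_run: Iterable[str], second_run: Iterable[str]
) -> List[LoggedRange]:
    """Ranges completed in the first run that the second run processed again."""
    first_processed, _ = parse_run(first_run)
    second_processed, _ = parse_run(second_run)
    # Assumption: the last range started in the first run was not completed.
    completed = first_processed[:-1]
    return [rng for rng in second_processed if rng in completed]
```

An empty list from `unexpected_reprocessing` means the restarted worker did not redo any range that the first run had completed.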

[4]

swhworker@scrubber1:~$ swh db --config-file /etc/softwareheritage/scrubber/primary.yml upgrade scrubber --module-config-key=scrubber_db
INFO:swh.core.db.db_utils:Executing migration script '/usr/lib/python3/dist-packages/swh/scrubber/sql/upgrades/4.sql'
Migration to version 4 done