
pgstorage: Migrate db to storage 0.13.2 (db versions 160, 161)
Closed, Migrated

Description

These scripts [1] [2] rework the revision table to correctly set the default values on the following fields:

  • date_neg_utc_offset
  • committer_date_neg_utc_offset
  • extra_headers

(The same goes for release, on other fields, but that part is already done, as
its volume is small compared to revision.)

The end goal is to enforce those default values with constraints.

Extra optimization work has been done so that the migration runs smoothly (cache
reuse, no replication problems, one vacuum trigger instead of 3) [3].

The migration is currently running on belvedere [4].
The workers are still running alongside it.

[1] https://forge.softwareheritage.org/source/swh-storage/browse/master/sql/upgrades/160.sql

[2] https://forge.softwareheritage.org/source/swh-storage/browse/master/sql/upgrades/161.sql

[3] P747

[4] those scripts can be interrupted and rerun if needed:

postgres@belvedere $ psql -p 5433 softwareheritage
> \i 160-v4.1.sql
...
postgres@belvedere $ psql -p 5433 softwareheritage
> \i 160-v4.2.sql
...
postgres@belvedere $ psql -p 5433 softwareheritage
> \i 160-v4.3.sql

Event Timeline

ardumont changed the task status from Open to Work in Progress. Aug 28 2020, 11:20 AM
ardumont triaged this task as Normal priority.
ardumont created this task.
ardumont updated the task description.
ardumont renamed this task from pgstorage: Migrate db to storage 0.13.2 (db version 160 + 161) to pgstorage: Migrate db to storage 0.13.2 (db versions 160, 161). Aug 28 2020, 11:23 AM
ardumont updated the task description.

ETA as of 28/08/2020, with 2 update processes running concurrently (alongside the workers): ~55 days. The actual speed is around 1.5M revisions per hour, which is not fast but steady and replication-compliant.
I have started a 3rd update process; I'll update the task with a revised ETA on Monday.
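
As a rough cross-check of the figures above, the sketch below derives the backlog implied by the ~55-day ETA at ~1.5M revisions per hour, and what the ETA would become with a 3rd process. It assumes the 1.5M/hour figure is the combined rate of the 2 processes and that throughput scales linearly with the number of processes; neither assumption is confirmed in this task, and the function names are purely illustrative.

```python
HOURS_PER_DAY = 24


def implied_backlog(rate_per_hour: int, eta_days: int) -> int:
    """Number of revisions left, given a steady rate and an ETA in days."""
    return rate_per_hour * HOURS_PER_DAY * eta_days


def eta_in_days(remaining: int, rate_per_hour: float) -> float:
    """Days needed to process `remaining` revisions at a steady rate."""
    return remaining / (rate_per_hour * HOURS_PER_DAY)


# Figures quoted in the comment above.
RATE_WITH_2_PROCESSES = 1_500_000  # revisions per hour (assumed combined rate)
ETA_WITH_2_PROCESSES = 55          # days

backlog = implied_backlog(RATE_WITH_2_PROCESSES, ETA_WITH_2_PROCESSES)

# Assumption: a 3rd process adds throughput proportionally (3/2 of the rate).
rate_with_3_processes = RATE_WITH_2_PROCESSES * 3 / 2
eta_with_3_processes = eta_in_days(backlog, rate_with_3_processes)

print(f"implied backlog: {backlog:,} revisions")
print(f"ETA with 3 processes: {eta_with_3_processes:.1f} days")
```

In practice the gain from a 3rd process may be smaller than linear, since the processes share the database with the workers and with replication.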

ardumont changed the task status from Work in Progress to Open. Sep 9 2020, 11:20 AM

As a heads up, here are the revisions done so far (as of yesterday, when the migration got stopped):

> select count(*) from revision where id < '\x46e2eb1c432ca66f400000000000000000000000'
  or id > '\xbc3fe5c91d14e646e00000000000000000000000'
  or ('\x51fbe76c8b43969c200000000000000000000000' <= id and id <= '\x93a92a305532637a000000000000000000000000');
   count
------------
 1510911750
(1 row)

The boundaries in the query are the ids at which the update processes were stopped.
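
To put that count in perspective, the sketch below estimates how much of the id space the three finished ranges cover, and the total number of revisions this would imply. It assumes revision ids are 20-byte sha1 values spread uniformly over the key space; the resulting total is only an estimate under that assumption, not a figure taken from the database.

```python
# Size of the key space, assuming 20-byte (160-bit) sha1 ids.
KEY_SPACE = 2 ** 160


def from_hex(value: str) -> int:
    """Convert a hex id (without the '\\x' prefix) to an integer."""
    return int(value, 16)


# Boundaries copied from the query above.
LOW_END = from_hex("46e2eb1c432ca66f400000000000000000000000")
MID_START = from_hex("51fbe76c8b43969c200000000000000000000000")
MID_END = from_hex("93a92a305532637a000000000000000000000000")
HIGH_START = from_hex("bc3fe5c91d14e646e00000000000000000000000")

# Width of the three ranges matched by the query:
# id < LOW_END, MID_START <= id <= MID_END, id > HIGH_START.
done_width = LOW_END + (MID_END - MID_START) + (KEY_SPACE - HIGH_START)
done_fraction = done_width / KEY_SPACE

# Count returned by the query above.
DONE_COUNT = 1_510_911_750

# Assumption: ids are uniformly distributed, so count scales with range width.
estimated_total = DONE_COUNT / done_fraction

print(f"fraction of key space done: {done_fraction:.1%}")
print(f"estimated total revisions: {estimated_total:,.0f}")
```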

ardumont changed the task status from Open to Work in Progress. Sep 10 2020, 4:58 PM

Restarted the migration, this time from a shared tmux session on belvedere (previously it ran under my own user):

$ tmux new -s 160-161-migration-revisions

Everything has now been run, up to and including D3936.

ardumont claimed this task.