
pgstorage: Migrate db to storage 0.13.2 (db versions 160, 161)
Closed, Resolved, Public

Description

These scripts [1] [2] rework the revision table to correctly set the default values on the following fields:

  • date_neg_utc_offset
  • committer_date_neg_utc_offset
  • extra_headers

(The same goes for release, on other fields, but that part is already done since
its volume is small compared to revision.)

The end goal is to enforce those default values with constraints.

Extra optimization work has been done so that the migration runs smoothly (cache
reuse, no replication problems, one vacuum trigger instead of 3) [3].

This is currently running on belvedere [4].
The workers are still running alongside it.

[1] https://forge.softwareheritage.org/source/swh-storage/browse/master/sql/upgrades/160.sql

[2] https://forge.softwareheritage.org/source/swh-storage/browse/master/sql/upgrades/161.sql

[3] P747

[4] These scripts can be interrupted and rerun if needed:

postgres@belvedere $ psql -p 5433 softwareheritage
> \i 160-v4.1.sql
...
postgres@belvedere $ psql -p 5433 softwareheritage
> \i 160-v4.2.sql
...
postgres@belvedere $ psql -p 5433 softwareheritage
> \i 160-v4.3.sql

Event Timeline

ardumont changed the task status from Open to Work in Progress. Aug 28 2020, 11:20 AM
ardumont triaged this task as Normal priority.
ardumont created this task.
ardumont updated the task description.
ardumont renamed this task from pgstorage: Migrate db to storage 0.13.2 (db version 160 + 161) to pgstorage: Migrate db to storage 0.13.2 (db versions 160, 161). Aug 28 2020, 11:23 AM
ardumont updated the task description.
ardumont updated the task description. Aug 28 2020, 2:13 PM
ardumont added a comment. Edited Aug 28 2020, 4:06 PM

ETA as of 28/08/2020, with 2 update processes running concurrently (alongside the workers): ~55 days. The actual speed is around 1.5M revisions per hour, which is not fast but is steady and replication compliant.
I have started a 3rd update process; I'll update the task with a new ETA on Monday.
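As a rough check of the ETA arithmetic above, here is a minimal sketch. The function name and the backlog figure are hypothetical: the backlog is only the value implied by ~55 days at ~1.5M revisions per hour, not a number measured on belvedere.

```python
def eta_days(remaining_revisions: int, revisions_per_hour: float) -> float:
    """Days needed to process the remaining rows at a constant hourly rate."""
    if revisions_per_hour <= 0:
        raise ValueError("revisions_per_hour must be positive")
    return remaining_revisions / revisions_per_hour / 24


# Assumption: backlog implied by the two figures quoted in the comment,
# i.e. 1.5M revisions/hour * 24 hours * 55 days.
implied_backlog = 1_500_000 * 24 * 55
print(eta_days(implied_backlog, 1_500_000))
```

If a third process scaled the throughput linearly (which is not guaranteed, given the workers and replication constraints), the same function could be called with a higher hourly rate to get a revised estimate.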

On stand-by, related to T2561.

ardumont updated the task description. Sep 9 2020, 11:17 AM
ardumont changed the task status from Work in Progress to Open. Sep 9 2020, 11:20 AM

As a heads-up, revisions done so far (as of yesterday, when the migration was stopped):

> select count(*) from revision where id < '\x46e2eb1c432ca66f400000000000000000000000'
  or id > '\xbc3fe5c91d14e646e00000000000000000000000'
  or ('\x51fbe76c8b43969c200000000000000000000000' <= id and id <= '\x93a92a305532637a000000000000000000000000');
   count
------------
 1510911750
(1 row)

The boundaries in the query are the ids at which the update processes were stopped.
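For illustration, one way to carve the 20-byte revision id space into contiguous ranges for concurrent update processes is sketched below. This is an assumption about the general approach, not the content of P747 or of the 160-v4.x scripts; `split_id_space` and `pg_bytea` are hypothetical helper names, and the boundaries in the query above come from where the processes were stopped, not from an even split like this one.

```python
from typing import List, Tuple

SHA1_BYTES = 20  # revision ids are 20-byte sha1 values


def split_id_space(n_ranges: int) -> List[Tuple[bytes, bytes]]:
    """Split the 20-byte id space into n contiguous, inclusive (low, high) ranges."""
    if n_ranges < 1:
        raise ValueError("n_ranges must be at least 1")
    total = 1 << (8 * SHA1_BYTES)
    bounds = [total * i // n_ranges for i in range(n_ranges + 1)]
    return [
        (lo.to_bytes(SHA1_BYTES, "big"), (hi - 1).to_bytes(SHA1_BYTES, "big"))
        for lo, hi in zip(bounds, bounds[1:])
    ]


def pg_bytea(value: bytes) -> str:
    """Render bytes as a PostgreSQL hex-format bytea literal, e.g. '\\x00ff'."""
    return "'\\x" + value.hex() + "'"


# One range per update process (3 here, as an example).
for low, high in split_id_space(3):
    print(f"id between {pg_bytea(low)} and {pg_bytea(high)}")
```

When a process is interrupted, recording the last id it handled lets the next run restart from that boundary, which is how a query like the one above can count the rows already covered.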

ardumont changed the task status from Open to Work in Progress. Sep 10 2020, 4:58 PM

Restarted the migration, this time running from a shared tmux session on belvedere (previously it ran under my user):

$ tmux new -s 160-161-migration-revisions

Everything has now run, up to and including D3936.

ardumont closed this task as Resolved. Sep 14 2020, 1:09 PM
ardumont claimed this task.