
raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
Closed, Public

Authored by vlorentz on Feb 5 2021, 2:39 PM.

Details

Summary

Uniqueness is only based on the id from now on.

Also adds the 'id' column to the Cassandra schema (it was already present in postgresql's schema)
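
A minimal sketch of the behavioural change, not the actual swh.storage code. The `Metadata` class, its fields, and the SHA-1 `id` below are illustrative assumptions (the real id comes from swh.model), and inserts are assumed to be first-write-wins. It contrasts deduplication on the 4-tuple with deduplication on the id only:

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Metadata:
    """Hypothetical stand-in for a raw_extrinsic_metadata row."""

    target: str
    authority_id: int
    discovery_date: datetime
    fetcher_id: int
    format: str
    metadata: bytes

    @property
    def id(self) -> bytes:
        # Assumption: the id hashes every field, including the payload.
        h = hashlib.sha1()
        for part in (
            self.target,
            str(self.authority_id),
            self.discovery_date.isoformat(),
            str(self.fetcher_id),
            self.format,
        ):
            h.update(part.encode() + b"\0")
        h.update(self.metadata)
        return h.digest()


def add_unique_by_tuple(store: dict, obj: Metadata) -> None:
    """Old behaviour: one row per (target, authority, date, fetcher)."""
    key = (obj.target, obj.authority_id, obj.discovery_date, obj.fetcher_id)
    store.setdefault(key, obj)


def add_unique_by_id(store: dict, obj: Metadata) -> None:
    """New behaviour: one row per id, so differing payloads coexist."""
    store.setdefault(obj.id, obj)
```

With two objects that share the 4-tuple but carry different payloads, the first store keeps one of them and the second keeps both. Re-adding an identical object is a no-op in either case.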

Depends on D5029.

The code in D5029 MUST NOT be released and deployed at the same time as this one (see the comment in the migration).

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build has FAILED

Patch application report for D5030 (id=17924)

Could not rebase; Attempt merge onto efd8815b89...

Merge made by the 'recursive' strategy.
 sql/upgrades/168.sql                               |  28 +
 sql/upgrades/169.sql                               |  25 +
 swh/storage/cassandra/common.py                    |   5 -
 swh/storage/cassandra/converters.py                |   2 +-
 swh/storage/cassandra/cql.py                       |  19 +-
 swh/storage/cassandra/model.py                     |   5 +-
 swh/storage/cassandra/schema.py                    |  31 +-
 swh/storage/cassandra/storage.py                   |  25 +-
 swh/storage/cli.py                                 |  82 ++
 swh/storage/in_memory.py                           |   9 +-
 swh/storage/postgresql/db.py                       |  16 +-
 swh/storage/postgresql/storage.py                  |   1 +
 swh/storage/sql/30-schema.sql                      |   4 +-
 swh/storage/sql/60-indexes.sql                     |   6 +-
 .../tests/data/sql-v0.18.0/10-superuser-init.sql   |  27 +
 swh/storage/tests/data/sql-v0.18.0/15-flavor.sql   |  22 +
 swh/storage/tests/data/sql-v0.18.0/20-enums.sql    |  23 +
 swh/storage/tests/data/sql-v0.18.0/30-schema.sql   | 499 +++++++++++
 swh/storage/tests/data/sql-v0.18.0/40-funcs.sql    | 960 +++++++++++++++++++++
 swh/storage/tests/data/sql-v0.18.0/60-indexes.sql  | 283 ++++++
 .../logical_replication/replication_source.sql     |  25 +
 swh/storage/tests/storage_tests.py                 |  75 +-
 swh/storage/tests/test_postgresql_migrated.py      |  63 ++
 swh/storage/tests/test_postgresql_migration.py     | 194 +++++
 swh/storage/utils.py                               |   5 +
 25 files changed, 2351 insertions(+), 83 deletions(-)
 create mode 100644 sql/upgrades/168.sql
 create mode 100644 sql/upgrades/169.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/10-superuser-init.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/15-flavor.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/20-enums.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/30-schema.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/40-funcs.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/60-indexes.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/logical_replication/replication_source.sql
 create mode 100644 swh/storage/tests/test_postgresql_migrated.py
 create mode 100644 swh/storage/tests/test_postgresql_migration.py
Changes applied before test
commit a82ae927e854daa4bc5c7eda7e078f046c517034
Merge: efd8815b 691f3af9
Author: Jenkins user <jenkins@localhost>
Date:   Fri Feb 5 13:39:38 2021 +0000

    Merge branch 'diff-target' into HEAD

commit 691f3af979adf12a0f99dbf82027cfa92808cb50
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.

commit 14bfef5ca8780291f71af7eae2e3b9c45051a101
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

commit 27440f9862b7df4e36b400a4fb74f1679d9ec6f4
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 14:36:50 2021 +0100

    Add basic migration tests for postgresql
    
    This adds two test files:
    
    * `test_postgresql_migrated.py` applies an old schema definition, runs the
      migrations, then runs all the usual tests
    * `test_postgresql_migration.py` applies an old schema definition, inserts
      data, runs the migrations, and checks the data is still available
    
    `test_postgresql_migration.py` will probably break in some releases as
    it uses the old SQL with the new Python to insert, but it should be good
    enough, and we can disable it in some releases when needed.
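
The second kind of test described in this commit message (apply an old schema, insert data, migrate, check the data) can be sketched as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database rather than PostgreSQL, and the `OLD_SCHEMA`, `MIGRATIONS`, and `metadata` table are hypothetical, not the contents of `sql/upgrades/`.

```python
import sqlite3

# Hypothetical "old release" schema, with a dbversion table to track upgrades.
OLD_SCHEMA = """
CREATE TABLE dbversion (version INTEGER PRIMARY KEY);
INSERT INTO dbversion (version) VALUES (1);
CREATE TABLE metadata (target TEXT NOT NULL, payload BLOB NOT NULL);
"""

# Hypothetical upgrade scripts, keyed by the version they produce.
MIGRATIONS = {
    2: """
    ALTER TABLE metadata ADD COLUMN id BLOB;
    INSERT INTO dbversion (version) VALUES (2);
    """,
}


def migrate(conn: sqlite3.Connection) -> None:
    """Apply, in order, every migration newer than the current version."""
    current = conn.execute("SELECT max(version) FROM dbversion").fetchone()[0]
    for version in sorted(MIGRATIONS):
        if version > current:
            conn.executescript(MIGRATIONS[version])


def check_migration_preserves_data():
    """Insert with the old schema, migrate, then read the data back."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(OLD_SCHEMA)
    conn.execute(
        "INSERT INTO metadata (target, payload) VALUES (?, ?)",
        ("swh:1:cnt:abc", b"{}"),
    )
    migrate(conn)
    rows = conn.execute("SELECT target, payload, id FROM metadata").fetchall()
    version = conn.execute("SELECT max(version) FROM dbversion").fetchone()[0]
    conn.close()
    return rows, version
```

The row inserted before the upgrade should still be readable afterwards, with the newly added column left empty.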

commit e9441fef13c11c3eb500403275ba7337ed0c77e1
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 14:29:38 2021 +0100

    postgresql: Fix dbversion() to return the max version instead of a random one.
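
As a rough illustration of this fix (a sketch against SQLite, not the actual `swh/storage/postgresql/db.py` code): without an explicit ordering, a query that takes a single row from the version table may return any of them, so the lookup has to order by version. The table layout here is an assumption.

```python
import sqlite3


def dbversion(conn: sqlite3.Connection):
    """Return the highest recorded schema version, or None if there is none."""
    row = conn.execute(
        "SELECT version FROM dbversion ORDER BY version DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```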

commit 508399ce2abf21f813acc9c56422cbbccca0ae3d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 13:59:09 2021 +0100

    storage_tests: recompute ids when evolving RawExtrinsicMetadata objects.
    
    For now this does nothing as RawExtrinsicMetadata has no 'id' field,
    but the equality assertions will become errors when the next version
    of swh.model is released.
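
To illustrate why the ids need recomputing, here is a sketch with a hypothetical `Record` dataclass. It is not the swh.model `RawExtrinsicMetadata` class, and the SHA-1-of-payload id is an assumption. Once an object carries an `id` derived from its other fields, copying it with one field changed leaves a stale id unless the id is recomputed:

```python
import dataclasses
import hashlib


@dataclasses.dataclass(frozen=True)
class Record:
    """Hypothetical object whose id is derived from its payload."""

    payload: bytes
    id: bytes = b""

    def compute_id(self) -> bytes:
        return hashlib.sha1(self.payload).digest()


def evolve(record: Record, **changes) -> Record:
    """Copy `record` with `changes` applied, then refresh its id."""
    updated = dataclasses.replace(record, **changes)
    return dataclasses.replace(updated, id=updated.compute_id())
```

A plain `dataclasses.replace(record, payload=...)` would keep the old id, so two objects with the same payload could compare unequal; going through `evolve` avoids that.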

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1136/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1136/console

Harbormaster returned this revision to the author for changes because remote builds failed. (Feb 5 2021, 2:40 PM)
Harbormaster failed remote builds in B19043: Diff 17924!
vlorentz edited the summary of this revision.
vlorentz edited the test plan for this revision.

Build has FAILED

Patch application report for D5030 (id=17972)

Could not rebase; Attempt merge onto b0383833fe...

Updating b0383833..d23ed2c2
Fast-forward
 sql/upgrades/168.sql                               |  28 +
 sql/upgrades/169.sql                               |  25 +
 swh/storage/cassandra/common.py                    |   5 -
 swh/storage/cassandra/converters.py                |   2 +-
 swh/storage/cassandra/cql.py                       |  19 +-
 swh/storage/cassandra/model.py                     |   5 +-
 swh/storage/cassandra/schema.py                    |  31 +-
 swh/storage/cassandra/storage.py                   |  25 +-
 swh/storage/cli.py                                 |  82 ++
 swh/storage/in_memory.py                           |   9 +-
 swh/storage/postgresql/db.py                       |   7 +-
 swh/storage/postgresql/storage.py                  |   1 +
 swh/storage/sql/30-schema.sql                      |   4 +-
 swh/storage/sql/60-indexes.sql                     |   6 +-
 .../tests/data/sql-v0.18.0/10-superuser-init.sql   |  27 +
 swh/storage/tests/data/sql-v0.18.0/15-flavor.sql   |  22 +
 swh/storage/tests/data/sql-v0.18.0/20-enums.sql    |  23 +
 swh/storage/tests/data/sql-v0.18.0/30-schema.sql   | 499 +++++++++++
 swh/storage/tests/data/sql-v0.18.0/40-funcs.sql    | 960 +++++++++++++++++++++
 swh/storage/tests/data/sql-v0.18.0/60-indexes.sql  | 283 ++++++
 .../logical_replication/replication_source.sql     |  25 +
 swh/storage/tests/storage_tests.py                 |  75 +-
 swh/storage/tests/test_postgresql_migrated.py      |  63 ++
 swh/storage/tests/test_postgresql_migration.py     | 194 +++++
 swh/storage/utils.py                               |   5 +
 25 files changed, 2343 insertions(+), 82 deletions(-)
 create mode 100644 sql/upgrades/168.sql
 create mode 100644 sql/upgrades/169.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/10-superuser-init.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/15-flavor.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/20-enums.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/30-schema.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/40-funcs.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/60-indexes.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/logical_replication/replication_source.sql
 create mode 100644 swh/storage/tests/test_postgresql_migrated.py
 create mode 100644 swh/storage/tests/test_postgresql_migration.py
Changes applied before test
commit d23ed2c287a8ae0be2dc8b6b45d1d3aa76bbae3c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.

commit d3872c70a5b0a431ef92a3b3c5cc9ddfc8f6251e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

commit ebe4dab07feed7eb8ae03cd407980ba7bea78b37
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 14:36:50 2021 +0100

    Add basic migration tests for postgresql
    
    This adds two test files:
    
    * `test_postgresql_migrated.py` applies an old schema definition, runs the
      migrations, then runs all the usual tests
    * `test_postgresql_migration.py` applies an old schema definition, inserts
      data, runs the migrations, and checks the data is still available
    
    `test_postgresql_migration.py` will probably break in some releases as
    it uses the old SQL with the new Python to insert, but it should be good
    enough, and we can disable it in some releases when needed.

commit 75939e4f9b02e130224a90929eb3ed2ba0730592
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 13:59:09 2021 +0100

    storage_tests: recompute ids when evolving RawExtrinsicMetadata objects.
    
    For now this does nothing as RawExtrinsicMetadata has no 'id' field,
    but the equality assertions will become errors when the next version
    of swh.model is released.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1145/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1145/console

Build is green

Patch application report for D5030 (id=18624)

Rebasing onto 88ff2c2fa0...

Current branch diff-target is up to date.
Changes applied before test
commit 7ed181fe084152ebb2ad5b1792a8d2d9e5a1c429
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1182/ for more details.

vlorentz edited the test plan for this revision.

The more it fails, the more likely it is to work.

Build is green

Patch application report for D5030 (id=18627)

Rebasing onto 88ff2c2fa0...

Current branch diff-target is up to date.
Changes applied before test
commit 9b8cefeeeb7c3a395389edd1f04da21924b21ced
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.
    
    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 7ed181fe084152ebb2ad5b1792a8d2d9e5a1c429
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1184/ for more details.

Build is green

Patch application report for D5030 (id=18628)

Could not rebase; Attempt merge onto 88ff2c2fa0...

Updating 88ff2c2f..9b8cefee
Fast-forward
 sql/upgrades/169.sql               | 30 ++++++++++++++++++++++++++++
 sql/upgrades/170.sql               | 26 +++++++++++++++++++++++++
 swh/storage/cassandra/cql.py       | 16 ++++-----------
 swh/storage/cassandra/model.py     |  5 +++--
 swh/storage/cassandra/schema.py    | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py   | 22 +++++----------------
 swh/storage/in_memory.py           |  9 ++++-----
 swh/storage/postgresql/db.py       |  7 +++++--
 swh/storage/postgresql/storage.py  |  1 +
 swh/storage/sql/30-schema.sql      |  4 +++-
 swh/storage/sql/60-indexes.sql     |  6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 126 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/169.sql
 create mode 100644 sql/upgrades/170.sql
Changes applied before test
commit 9b8cefeeeb7c3a395389edd1f04da21924b21ced
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.
    
    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 7ed181fe084152ebb2ad5b1792a8d2d9e5a1c429
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1185/ for more details.

Build is green

Patch application report for D5030 (id=18629)

Could not rebase; Attempt merge onto 88ff2c2fa0...

Updating 88ff2c2f..ff5e6250
Fast-forward
 sql/upgrades/169.sql               | 30 ++++++++++++++++++++++++++++
 sql/upgrades/170.sql               | 25 ++++++++++++++++++++++++
 swh/storage/cassandra/cql.py       | 16 ++++-----------
 swh/storage/cassandra/model.py     |  5 +++--
 swh/storage/cassandra/schema.py    | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py   | 22 +++++----------------
 swh/storage/in_memory.py           |  9 ++++-----
 swh/storage/postgresql/db.py       |  7 +++++--
 swh/storage/postgresql/storage.py  |  1 +
 swh/storage/sql/30-schema.sql      |  4 +++-
 swh/storage/sql/60-indexes.sql     |  6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 125 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/169.sql
 create mode 100644 sql/upgrades/170.sql
Changes applied before test
commit ff5e6250989e7a3300bb943e777778aa3a6e0a21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.
    
    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 7ed181fe084152ebb2ad5b1792a8d2d9e5a1c429
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1186/ for more details.

Build was aborted

Patch application report for D5030 (id=18961)

Could not rebase; Attempt merge onto 8dd9f7b635...

Updating 8dd9f7b6..85492942
Fast-forward
 sql/upgrades/171.sql               | 26 +++++++++++++++++++++++++
 sql/upgrades/172.sql               | 30 ++++++++++++++++++++++++++++
 swh/storage/cassandra/cql.py       | 16 ++++-----------
 swh/storage/cassandra/model.py     |  5 +++--
 swh/storage/cassandra/schema.py    | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py   | 22 +++++----------------
 swh/storage/in_memory.py           |  9 ++++-----
 swh/storage/postgresql/db.py       |  7 +++++--
 swh/storage/postgresql/storage.py  |  1 +
 swh/storage/sql/30-schema.sql      |  4 +++-
 swh/storage/sql/60-indexes.sql     |  6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 126 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/171.sql
 create mode 100644 sql/upgrades/172.sql
Changes applied before test
commit 854929429631bfc4270d6b538b007a887bb79091
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.
    
    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 2d540b0580cc9699bd8a593db45942b1f14d8e21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1230/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1230/console

Build is green

Patch application report for D5030 (id=18961)

Could not rebase; Attempt merge onto 8dd9f7b635...

Updating 8dd9f7b6..85492942
Fast-forward
 sql/upgrades/171.sql               | 26 +++++++++++++++++++++++++
 sql/upgrades/172.sql               | 30 ++++++++++++++++++++++++++++
 swh/storage/cassandra/cql.py       | 16 ++++-----------
 swh/storage/cassandra/model.py     |  5 +++--
 swh/storage/cassandra/schema.py    | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py   | 22 +++++----------------
 swh/storage/in_memory.py           |  9 ++++-----
 swh/storage/postgresql/db.py       |  7 +++++--
 swh/storage/postgresql/storage.py  |  1 +
 swh/storage/sql/30-schema.sql      |  4 +++-
 swh/storage/sql/60-indexes.sql     |  6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 126 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/171.sql
 create mode 100644 sql/upgrades/172.sql
Changes applied before test
commit 854929429631bfc4270d6b538b007a887bb79091
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.
    
    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 2d540b0580cc9699bd8a593db45942b1f14d8e21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1232/ for more details.

Build is green

Patch application report for D5030 (id=18963)

Could not rebase; Attempt merge onto 8dd9f7b635...

Updating 8dd9f7b6..9b473d3b
Fast-forward
 sql/upgrades/171.sql               | 26 +++++++++++++++++++++++++
 sql/upgrades/172.sql               | 25 ++++++++++++++++++++++++
 swh/storage/cassandra/cql.py       | 16 ++++-----------
 swh/storage/cassandra/model.py     |  5 +++--
 swh/storage/cassandra/schema.py    | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py   | 22 +++++----------------
 swh/storage/in_memory.py           |  9 ++++-----
 swh/storage/postgresql/db.py       |  7 +++++--
 swh/storage/postgresql/storage.py  |  1 +
 swh/storage/sql/30-schema.sql      |  4 +++-
 swh/storage/sql/60-indexes.sql     |  6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 121 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/171.sql
 create mode 100644 sql/upgrades/172.sql
Changes applied before test
commit 9b473d3bca344d234525002111647a687a244a7a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.
    
    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 2d540b0580cc9699bd8a593db45942b1f14d8e21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1233/ for more details.

olasd added inline comments.
sql/upgrades/172.sql
19–20

The comment and the index don't match ;)

swh/storage/sql/60-indexes.sql
271

But this one is correct.

Build is green

Patch application report for D5030 (id=18977)

Could not rebase; Attempt merge onto 8dd9f7b635...

Updating 8dd9f7b6..eff23837
Fast-forward
 sql/upgrades/171.sql               | 26 +++++++++++++++++++++++++
 sql/upgrades/172.sql               | 25 ++++++++++++++++++++++++
 swh/storage/cassandra/cql.py       | 16 ++++-----------
 swh/storage/cassandra/model.py     |  5 +++--
 swh/storage/cassandra/schema.py    | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py   | 22 +++++----------------
 swh/storage/in_memory.py           |  9 ++++-----
 swh/storage/postgresql/db.py       |  7 +++++--
 swh/storage/postgresql/storage.py  |  1 +
 swh/storage/sql/30-schema.sql      |  4 +++-
 swh/storage/sql/60-indexes.sql     |  6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 121 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/171.sql
 create mode 100644 sql/upgrades/172.sql
Changes applied before test
commit eff23837173913ee485ae91c670f58989ca98060
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique
    
    Uniqueness is only based on the id from now on.
    
    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 2d540b0580cc9699bd8a593db45942b1f14d8e21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.
    
    For now, this has absolutely no effect on the API users,
    as rows are already deduplicated based on a subset of the
    fields hashed by the id.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1234/ for more details.

This revision is now accepted and ready to land. (Mar 22 2021, 12:56 PM)