Details
- Reviewers
  olasd
- Group Reviewers
  Reviewers
- Maniphest Tasks
  T3020: Add an "index" for raw_extrinsic_metadata.id in swh.storage.cassandra
- Commits
  rDSTOeff238371739: raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id)…
Diff Detail
- Repository
  rDSTO Storage manager
- Branch
  metadata-id2
- Lint
  No Linters Available
- Unit
  No Unit Test Coverage
- Build Status
  Buildable 20041
  Build 31115: Phabricator diff pipeline on jenkins Jenkins console · Jenkins
  Build 31114: arc lint + arc unit
Event Timeline
Build has FAILED
Patch application report for D5030 (id=17924)
Could not rebase; Attempt merge onto efd8815b89...
Merge made by the 'recursive' strategy.
 sql/upgrades/168.sql | 28 +
 sql/upgrades/169.sql | 25 +
 swh/storage/cassandra/common.py | 5 -
 swh/storage/cassandra/converters.py | 2 +-
 swh/storage/cassandra/cql.py | 19 +-
 swh/storage/cassandra/model.py | 5 +-
 swh/storage/cassandra/schema.py | 31 +-
 swh/storage/cassandra/storage.py | 25 +-
 swh/storage/cli.py | 82 ++
 swh/storage/in_memory.py | 9 +-
 swh/storage/postgresql/db.py | 16 +-
 swh/storage/postgresql/storage.py | 1 +
 swh/storage/sql/30-schema.sql | 4 +-
 swh/storage/sql/60-indexes.sql | 6 +-
 .../tests/data/sql-v0.18.0/10-superuser-init.sql | 27 +
 swh/storage/tests/data/sql-v0.18.0/15-flavor.sql | 22 +
 swh/storage/tests/data/sql-v0.18.0/20-enums.sql | 23 +
 swh/storage/tests/data/sql-v0.18.0/30-schema.sql | 499 +++++++++++
 swh/storage/tests/data/sql-v0.18.0/40-funcs.sql | 960 +++++++++++++++++++++
 swh/storage/tests/data/sql-v0.18.0/60-indexes.sql | 283 ++++++
 .../logical_replication/replication_source.sql | 25 +
 swh/storage/tests/storage_tests.py | 75 +-
 swh/storage/tests/test_postgresql_migrated.py | 63 ++
 swh/storage/tests/test_postgresql_migration.py | 194 +++++
 swh/storage/utils.py | 5 +
 25 files changed, 2351 insertions(+), 83 deletions(-)
 create mode 100644 sql/upgrades/168.sql
 create mode 100644 sql/upgrades/169.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/10-superuser-init.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/15-flavor.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/20-enums.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/30-schema.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/40-funcs.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/60-indexes.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/logical_replication/replication_source.sql
 create mode 100644 swh/storage/tests/test_postgresql_migrated.py
 create mode 100644 swh/storage/tests/test_postgresql_migration.py
Changes applied before test
commit a82ae927e854daa4bc5c7eda7e078f046c517034
Merge: efd8815b 691f3af9
Author: Jenkins user <jenkins@localhost>
Date:   Fri Feb 5 13:39:38 2021 +0000

    Merge branch 'diff-target' into HEAD

commit 691f3af979adf12a0f99dbf82027cfa92808cb50
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

commit 14bfef5ca8780291f71af7eae2e3b9c45051a101
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.

commit 27440f9862b7df4e36b400a4fb74f1679d9ec6f4
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 14:36:50 2021 +0100

    Add basic migration tests for postgresql

    This adds two test files:
    * `test_postgresql_migrated.py` applies an old schema definition,
      runs the migrations, then runs all the usual tests
    * `test_postgresql_migration.py` applies an old schema definition,
      inserts data, runs the migrations, and checks the data is still available

    `test_postgresql_migration.py` will probably break in some releases
    as it uses the old SQL with the new Python to insert, but it should be
    good enough, and we can disable it in some releases when needed.

commit e9441fef13c11c3eb500403275ba7337ed0c77e1
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 14:29:38 2021 +0100

    postgresql: Fix dbversion() to return the max version instead of a random one.

commit 508399ce2abf21f813acc9c56422cbbccca0ae3d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 13:59:09 2021 +0100

    storage_tests: recompute ids when evolving RawExtrinsicMetadata objects.

    For now this does nothing as RawExtrinsicMetadata has no 'id' field,
    but the equality assertions will become errors when the next version
    of swh.model is released.
Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1136/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1136/console
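One of the commits in this report fixes dbversion() so that it returns the highest version rather than an arbitrary row. The following is only a rough sketch of that difference, not the swh.storage implementation: it uses sqlite3 as a stand-in for postgresql, and the `dbversion` table layout, the sample rows, and both function names are assumptions made for illustration.

```python
import sqlite3
from typing import Optional


def make_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical dbversion table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dbversion (version INTEGER, description TEXT)")
    # Rows are deliberately inserted out of order.
    conn.executemany(
        "INSERT INTO dbversion (version, description) VALUES (?, ?)",
        [(167, "older"), (169, "newest"), (168, "middle")],
    )
    return conn


def dbversion_unordered(conn: sqlite3.Connection) -> Optional[int]:
    """Without ORDER BY, the engine is free to return any row."""
    row = conn.execute("SELECT version FROM dbversion LIMIT 1").fetchone()
    return row[0] if row else None


def dbversion_max(conn: sqlite3.Connection) -> Optional[int]:
    """Sorting by version makes the result deterministic: the highest one."""
    row = conn.execute(
        "SELECT version FROM dbversion ORDER BY version DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None
```

With the sample rows above, `dbversion_max(make_db())` returns 169, while the result of `dbversion_unordered` depends on how the engine happens to scan the table.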
Build has FAILED
Patch application report for D5030 (id=17972)
Could not rebase; Attempt merge onto b0383833fe...
Updating b0383833..d23ed2c2
Fast-forward
 sql/upgrades/168.sql | 28 +
 sql/upgrades/169.sql | 25 +
 swh/storage/cassandra/common.py | 5 -
 swh/storage/cassandra/converters.py | 2 +-
 swh/storage/cassandra/cql.py | 19 +-
 swh/storage/cassandra/model.py | 5 +-
 swh/storage/cassandra/schema.py | 31 +-
 swh/storage/cassandra/storage.py | 25 +-
 swh/storage/cli.py | 82 ++
 swh/storage/in_memory.py | 9 +-
 swh/storage/postgresql/db.py | 7 +-
 swh/storage/postgresql/storage.py | 1 +
 swh/storage/sql/30-schema.sql | 4 +-
 swh/storage/sql/60-indexes.sql | 6 +-
 .../tests/data/sql-v0.18.0/10-superuser-init.sql | 27 +
 swh/storage/tests/data/sql-v0.18.0/15-flavor.sql | 22 +
 swh/storage/tests/data/sql-v0.18.0/20-enums.sql | 23 +
 swh/storage/tests/data/sql-v0.18.0/30-schema.sql | 499 +++++++++++
 swh/storage/tests/data/sql-v0.18.0/40-funcs.sql | 960 +++++++++++++++++++++
 swh/storage/tests/data/sql-v0.18.0/60-indexes.sql | 283 ++++++
 .../logical_replication/replication_source.sql | 25 +
 swh/storage/tests/storage_tests.py | 75 +-
 swh/storage/tests/test_postgresql_migrated.py | 63 ++
 swh/storage/tests/test_postgresql_migration.py | 194 +++++
 swh/storage/utils.py | 5 +
 25 files changed, 2343 insertions(+), 82 deletions(-)
 create mode 100644 sql/upgrades/168.sql
 create mode 100644 sql/upgrades/169.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/10-superuser-init.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/15-flavor.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/20-enums.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/30-schema.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/40-funcs.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/60-indexes.sql
 create mode 100644 swh/storage/tests/data/sql-v0.18.0/logical_replication/replication_source.sql
 create mode 100644 swh/storage/tests/test_postgresql_migrated.py
 create mode 100644 swh/storage/tests/test_postgresql_migration.py
Changes applied before test
commit d23ed2c287a8ae0be2dc8b6b45d1d3aa76bbae3c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

commit d3872c70a5b0a431ef92a3b3c5cc9ddfc8f6251e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.

commit ebe4dab07feed7eb8ae03cd407980ba7bea78b37
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 14:36:50 2021 +0100

    Add basic migration tests for postgresql

    This adds two test files:
    * `test_postgresql_migrated.py` applies an old schema definition,
      runs the migrations, then runs all the usual tests
    * `test_postgresql_migration.py` applies an old schema definition,
      inserts data, runs the migrations, and checks the data is still available

    `test_postgresql_migration.py` will probably break in some releases
    as it uses the old SQL with the new Python to insert, but it should be
    good enough, and we can disable it in some releases when needed.

commit 75939e4f9b02e130224a90929eb3ed2ba0730592
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Feb 4 13:59:09 2021 +0100

    storage_tests: recompute ids when evolving RawExtrinsicMetadata objects.

    For now this does nothing as RawExtrinsicMetadata has no 'id' field,
    but the equality assertions will become errors when the next version
    of swh.model is released.
Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1145/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1145/console
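The migration-test commit in this report describes a pattern: apply an old schema definition, insert data, run the migrations, and check the data is still available. A minimal sketch of that pattern is given here, with sqlite3 standing in for postgresql. The table columns, the single migration statement, and every name below are hypothetical; the real tests load the `sql-v0.18.0` files and the `sql/upgrades/*.sql` scripts, which are not reproduced.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical "old" schema, before the id column exists.
OLD_SCHEMA = """
CREATE TABLE raw_extrinsic_metadata (
    target TEXT NOT NULL,
    discovery_date TEXT NOT NULL,
    metadata BLOB NOT NULL
);
"""

# Hypothetical migration: add a nullable id column.
MIGRATIONS = ["ALTER TABLE raw_extrinsic_metadata ADD COLUMN id BLOB"]


def migrate(conn: sqlite3.Connection) -> None:
    """Apply every migration statement in order."""
    for statement in MIGRATIONS:
        conn.execute(statement)


def run_migration_check() -> Tuple[List[tuple], List[str]]:
    """Old schema -> insert -> migrate -> read back rows and column names."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(OLD_SCHEMA)
    row = ("swh:1:cnt:" + "0" * 40, "2021-02-04T14:36:50+01:00", b'{"k": "v"}')
    conn.execute("INSERT INTO raw_extrinsic_metadata VALUES (?, ?, ?)", row)
    migrate(conn)
    fetched = conn.execute(
        "SELECT target, discovery_date, metadata FROM raw_extrinsic_metadata"
    ).fetchall()
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = [info[1] for info in conn.execute(
        "PRAGMA table_info(raw_extrinsic_metadata)"
    )]
    return fetched, columns
```

A test built on this would assert that the pre-migration row is still returned and that the new column exists.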
Build has FAILED
Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1181/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1181/console
Build is green
Patch application report for D5030 (id=18624)
Rebasing onto 88ff2c2fa0...
Current branch diff-target is up to date.
Changes applied before test
commit 7ed181fe084152ebb2ad5b1792a8d2d9e5a1c429
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.
See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1182/ for more details.
Build is green
Patch application report for D5030 (id=18627)
Rebasing onto 88ff2c2fa0...
Current branch diff-target is up to date.
Changes applied before test
commit 9b8cefeeeb7c3a395389edd1f04da21924b21ced
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 7ed181fe084152ebb2ad5b1792a8d2d9e5a1c429
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.
See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1184/ for more details.
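The two commits in this report move uniqueness from the (target, authority_id, discovery_date, fetcher_id) tuple to the id alone. The sketch below illustrates what that change means for deduplication. It is an assumption-laden toy: the `Row` type, the sample values, and `compute_id` (a SHA-1 over the repr of all fields) are invented for illustration, and the real id is computed by swh.model, not like this.

```python
import hashlib
from typing import Dict, Iterable, NamedTuple, Tuple


class Row(NamedTuple):
    target: str
    authority_id: int
    discovery_date: str
    fetcher_id: int
    metadata: bytes


def compute_id(row: Row) -> bytes:
    """Hypothetical id: a hash covering every field, metadata included."""
    return hashlib.sha1(repr(tuple(row)).encode()).digest()


def dedup_by_tuple(rows: Iterable[Row]) -> Dict[Tuple, Row]:
    """Old behaviour: one row per (target, authority, date, fetcher)."""
    return {
        (r.target, r.authority_id, r.discovery_date, r.fetcher_id): r
        for r in rows
    }


def dedup_by_id(rows: Iterable[Row]) -> Dict[bytes, Row]:
    """New behaviour: one row per id, so differing metadata is kept."""
    return {compute_id(r): r for r in rows}


# Two rows share the four-field tuple but carry different metadata;
# the third is an exact duplicate of the first.
a = Row("swh:1:cnt:" + "0" * 40, 1, "2021-02-05T14:33:49+01:00", 1, b"v1")
b = a._replace(metadata=b"v2")
rows = [a, b, a]
```

Here `dedup_by_tuple(rows)` keeps a single row, while `dedup_by_id(rows)` keeps two: the exact duplicate collapses, but the row with different metadata survives.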
Build is green
Patch application report for D5030 (id=18628)
Could not rebase; Attempt merge onto 88ff2c2fa0...
Updating 88ff2c2f..9b8cefee
Fast-forward
 sql/upgrades/169.sql | 30 ++++++++++++++++++++++++++++
 sql/upgrades/170.sql | 26 +++++++++++++++++++++++++
 swh/storage/cassandra/cql.py | 16 ++++-----------
 swh/storage/cassandra/model.py | 5 +++--
 swh/storage/cassandra/schema.py | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py | 22 +++++----------------
 swh/storage/in_memory.py | 9 ++++-----
 swh/storage/postgresql/db.py | 7 +++++--
 swh/storage/postgresql/storage.py | 1 +
 swh/storage/sql/30-schema.sql | 4 +++-
 swh/storage/sql/60-indexes.sql | 6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 126 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/169.sql
 create mode 100644 sql/upgrades/170.sql
Changes applied before test
commit 9b8cefeeeb7c3a395389edd1f04da21924b21ced
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 7ed181fe084152ebb2ad5b1792a8d2d9e5a1c429
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.
See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1185/ for more details.
Build is green
Patch application report for D5030 (id=18629)
Could not rebase; Attempt merge onto 88ff2c2fa0...
Updating 88ff2c2f..ff5e6250
Fast-forward
 sql/upgrades/169.sql | 30 ++++++++++++++++++++++++++++
 sql/upgrades/170.sql | 25 ++++++++++++++++++++++++
 swh/storage/cassandra/cql.py | 16 ++++-----------
 swh/storage/cassandra/model.py | 5 +++--
 swh/storage/cassandra/schema.py | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py | 22 +++++----------------
 swh/storage/in_memory.py | 9 ++++-----
 swh/storage/postgresql/db.py | 7 +++++--
 swh/storage/postgresql/storage.py | 1 +
 swh/storage/sql/30-schema.sql | 4 +++-
 swh/storage/sql/60-indexes.sql | 6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 125 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/169.sql
 create mode 100644 sql/upgrades/170.sql
Changes applied before test
commit ff5e6250989e7a3300bb943e777778aa3a6e0a21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 7ed181fe084152ebb2ad5b1792a8d2d9e5a1c429
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.
See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1186/ for more details.
Build was aborted
Patch application report for D5030 (id=18961)
Could not rebase; Attempt merge onto 8dd9f7b635...
Updating 8dd9f7b6..85492942
Fast-forward
 sql/upgrades/171.sql | 26 +++++++++++++++++++++++++
 sql/upgrades/172.sql | 30 ++++++++++++++++++++++++++++
 swh/storage/cassandra/cql.py | 16 ++++-----------
 swh/storage/cassandra/model.py | 5 +++--
 swh/storage/cassandra/schema.py | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py | 22 +++++----------------
 swh/storage/in_memory.py | 9 ++++-----
 swh/storage/postgresql/db.py | 7 +++++--
 swh/storage/postgresql/storage.py | 1 +
 swh/storage/sql/30-schema.sql | 4 +++-
 swh/storage/sql/60-indexes.sql | 6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 126 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/171.sql
 create mode 100644 sql/upgrades/172.sql
Changes applied before test
commit 854929429631bfc4270d6b538b007a887bb79091
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 2d540b0580cc9699bd8a593db45942b1f14d8e21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.
Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1230/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1230/console
Build is green
Patch application report for D5030 (id=18961)
Could not rebase; Attempt merge onto 8dd9f7b635...
Updating 8dd9f7b6..85492942
Fast-forward
 sql/upgrades/171.sql | 26 +++++++++++++++++++++++++
 sql/upgrades/172.sql | 30 ++++++++++++++++++++++++++++
 swh/storage/cassandra/cql.py | 16 ++++-----------
 swh/storage/cassandra/model.py | 5 +++--
 swh/storage/cassandra/schema.py | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py | 22 +++++----------------
 swh/storage/in_memory.py | 9 ++++-----
 swh/storage/postgresql/db.py | 7 +++++--
 swh/storage/postgresql/storage.py | 1 +
 swh/storage/sql/30-schema.sql | 4 +++-
 swh/storage/sql/60-indexes.sql | 6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 126 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/171.sql
 create mode 100644 sql/upgrades/172.sql
Changes applied before test
commit 854929429631bfc4270d6b538b007a887bb79091
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 2d540b0580cc9699bd8a593db45942b1f14d8e21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.
See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1232/ for more details.
Build is green
Patch application report for D5030 (id=18963)
Could not rebase; Attempt merge onto 8dd9f7b635...
Updating 8dd9f7b6..9b473d3b
Fast-forward
 sql/upgrades/171.sql | 26 +++++++++++++++++++++++++
 sql/upgrades/172.sql | 25 ++++++++++++++++++++++++
 swh/storage/cassandra/cql.py | 16 ++++-----------
 swh/storage/cassandra/model.py | 5 +++--
 swh/storage/cassandra/schema.py | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py | 22 +++++----------------
 swh/storage/in_memory.py | 9 ++++-----
 swh/storage/postgresql/db.py | 7 +++++--
 swh/storage/postgresql/storage.py | 1 +
 swh/storage/sql/30-schema.sql | 4 +++-
 swh/storage/sql/60-indexes.sql | 6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 121 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/171.sql
 create mode 100644 sql/upgrades/172.sql
Changes applied before test
commit 9b473d3bca344d234525002111647a687a244a7a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 2d540b0580cc9699bd8a593db45942b1f14d8e21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.
See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1233/ for more details.
Build is green
Patch application report for D5030 (id=18977)
Could not rebase; Attempt merge onto 8dd9f7b635...
Updating 8dd9f7b6..eff23837
Fast-forward
 sql/upgrades/171.sql | 26 +++++++++++++++++++++++++
 sql/upgrades/172.sql | 25 ++++++++++++++++++++++++
 swh/storage/cassandra/cql.py | 16 ++++-----------
 swh/storage/cassandra/model.py | 5 +++--
 swh/storage/cassandra/schema.py | 31 +++++++++++++++++++++++++++--
 swh/storage/cassandra/storage.py | 22 +++++----------------
 swh/storage/in_memory.py | 9 ++++-----
 swh/storage/postgresql/db.py | 7 +++++--
 swh/storage/postgresql/storage.py | 1 +
 swh/storage/sql/30-schema.sql | 4 +++-
 swh/storage/sql/60-indexes.sql | 6 +++++-
 swh/storage/tests/storage_tests.py | 40 +++++++++++---------------------------
 12 files changed, 121 insertions(+), 71 deletions(-)
 create mode 100644 sql/upgrades/171.sql
 create mode 100644 sql/upgrades/172.sql
Changes applied before test
commit eff23837173913ee485ae91c670f58989ca98060
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 14:33:49 2021 +0100

    raw_extrinsic_metadata: Make (target, authority_id, discovery_date, fetcher_id) non-unique

    Uniqueness is only based on the id from now on.

    Also adds the 'id' column to the Cassandra schema (it was already
    present in postgresql's schema)

commit 2d540b0580cc9699bd8a593db45942b1f14d8e21
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 5 13:56:15 2021 +0100

    Add raw_extrinsic_metadata.id column in postgresql.

    For now, this has absolutely no effect on the API users, as rows are
    already deduplicated based on a subset of the fields hashed by the id.
See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1234/ for more details.