Details
- Reviewers: ardumont
- Group Reviewers: Reviewers
- Maniphest Tasks: T4297: Add support for indexing intrinsic metadata from releases
- Commits: rDCIDX986c672ed1ac: Add support for indexing from head releases
Diff Detail
- Repository: rDCIDX Metadata indexer
- Lint: Automatic diff as part of commit; lint not applicable.
- Unit: Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Comment Actions
Build is green
Patch application report for D7941 (id=28602)
Could not rebase; Attempt merge onto 1aa8bdaca1...
Updating 1aa8bda..244f073
Fast-forward
 swh/indexer/indexer.py | 48 ++++----
 swh/indexer/metadata.py | 189 ++++++++++++++++++------------
 swh/indexer/origin_head.py | 2 +-
 swh/indexer/sql/30-schema.sql | 20 ++--
 swh/indexer/sql/50-func.sql | 30 ++---
 swh/indexer/sql/60-indexes.sql | 10 +-
 swh/indexer/sql/upgrades/134.sql | 18 +++
 swh/indexer/storage/__init__.py | 38 +++---
 swh/indexer/storage/db.py | 26 ++--
 swh/indexer/storage/in_memory.py | 24 ++--
 swh/indexer/storage/interface.py | 22 ++--
 swh/indexer/storage/model.py | 6 +-
 swh/indexer/tests/conftest.py | 2 +-
 swh/indexer/tests/storage/conftest.py | 6 +-
 swh/indexer/tests/storage/test_storage.py | 134 ++++++++++-----------
 swh/indexer/tests/tasks.py | 10 +-
 swh/indexer/tests/test_cli.py | 20 ++--
 swh/indexer/tests/test_indexer.py | 16 +--
 swh/indexer/tests/test_metadata.py | 59 +++++-----
 swh/indexer/tests/test_origin_head.py | 7 +-
 swh/indexer/tests/test_origin_metadata.py | 118 +++++++++++++------
 swh/indexer/tests/utils.py | 62 +++++++++-
 22 files changed, 519 insertions(+), 348 deletions(-)
 create mode 100644 swh/indexer/sql/upgrades/134.sql
Changes applied before test
commit 244f073a9492b5e6568de455f906a6f8b8b0c3d9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Wed Jun 1 17:42:22 2022 +0200

    Add support for indexing from head releases

    Needed since package loaders now create release objects instead of revision objects.

commit 78903476df18f59030ce647392708841918dacb9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Wed Jun 1 14:42:37 2022 +0200

    Replace RevisionMetadataIndexer with DirectoryMetadataIndexer

    This will make it easier to support indexing from releases in the future, as it will remove the strong dependency on revision ids in the database and interfaces.

    The existence of the indexer/table is mostly to deduplicate work between origins with the same head revision, and we do not use it outside this context, so this should have no impact.

    The DB migration works by dropping both tables and re-indexing from scratch; which is necessary as we need to replace revision ids with directory ids.
See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/244/ for more details.
Comment Actions
lgtm
But I don't get why you use assert in the runtime code instead of raising a proper exception.
swh/indexer/metadata.py, line 360: Why not raise something more explicit?
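To make the point concrete, here is a minimal sketch of the difference. It is not the code at line 360 of swh/indexer/metadata.py (not shown in this thread); the helper names, the `UnsupportedHeadTypeError` class and the choice of checking the SWHID object type are all assumptions made for illustration.

```python
class UnsupportedHeadTypeError(ValueError):
    """Hypothetical exception for a head object the indexer cannot handle."""


def head_kind_with_assert(swhid: str) -> str:
    """Hypothetical helper guarding its input with a bare assert."""
    kind = swhid.split(":")[2]
    # This check is removed entirely when Python runs with -O,
    # and on failure it only produces a generic AssertionError.
    assert kind in ("rev", "rel"), f"unexpected head type: {kind}"
    return kind


def head_kind_with_exception(swhid: str) -> str:
    """Same check, but with an explicit exception that callers can catch."""
    kind = swhid.split(":")[2]
    if kind not in ("rev", "rel"):
        # Always executed, whatever the interpreter flags are.
        raise UnsupportedHeadTypeError(
            f"unexpected head type {kind!r} in {swhid}"
        )
    return kind
```

With the second form a caller can catch `UnsupportedHeadTypeError` specifically (for example to skip one origin and continue), whereas an assert is better kept for invariants that should never fail in production.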
Comment Actions
Build is green
Patch application report for D7941 (id=28659)
Could not rebase; Attempt merge onto ca4e91e5b7...
Updating ca4e91e..b7cb270
Fast-forward
 swh/indexer/indexer.py | 48 ++++----
 swh/indexer/metadata.py | 189 ++++++++++++++++++------------
 swh/indexer/origin_head.py | 2 +-
 swh/indexer/sql/30-schema.sql | 20 ++--
 swh/indexer/sql/50-func.sql | 30 ++---
 swh/indexer/sql/60-indexes.sql | 10 +-
 swh/indexer/sql/upgrades/134.sql | 18 +++
 swh/indexer/storage/__init__.py | 38 +++---
 swh/indexer/storage/db.py | 26 ++--
 swh/indexer/storage/in_memory.py | 24 ++--
 swh/indexer/storage/interface.py | 22 ++--
 swh/indexer/storage/model.py | 6 +-
 swh/indexer/tests/conftest.py | 2 +-
 swh/indexer/tests/storage/conftest.py | 6 +-
 swh/indexer/tests/storage/test_storage.py | 134 ++++++++++-----------
 swh/indexer/tests/tasks.py | 10 +-
 swh/indexer/tests/test_cli.py | 20 ++--
 swh/indexer/tests/test_indexer.py | 16 +--
 swh/indexer/tests/test_metadata.py | 59 +++++-----
 swh/indexer/tests/test_origin_head.py | 7 +-
 swh/indexer/tests/test_origin_metadata.py | 118 +++++++++++++------
 swh/indexer/tests/utils.py | 62 +++++++++-
 22 files changed, 519 insertions(+), 348 deletions(-)
 create mode 100644 swh/indexer/sql/upgrades/134.sql
Changes applied before test
commit b7cb270ebbfc829df48b6c5a9f36f4c6cde6f672
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Wed Jun 1 17:42:22 2022 +0200

    Add support for indexing from head releases

    Needed since package loaders now create release objects instead of revision objects.

commit 7dc09f93a7ab5bb12a80ed5d81f7ccd590752256
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Wed Jun 1 14:42:37 2022 +0200

    Replace RevisionMetadataIndexer with DirectoryMetadataIndexer

    This will make it easier to support indexing from releases in the future, as it will remove the strong dependency on revision ids in the database and interfaces.

    The existence of the indexer/table is mostly to deduplicate work between origins with the same head revision, and we do not use it outside this context, so this should have no impact.

    The DB migration works by dropping both tables and re-indexing from scratch; which is necessary as we need to replace revision ids with directory ids.
See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/249/ for more details.
Comment Actions
Build is green
Patch application report for D7941 (id=28667)
Could not rebase; Attempt merge onto ca4e91e5b7...
Updating ca4e91e..986c672
Fast-forward
 swh/indexer/indexer.py | 48 ++++----
 swh/indexer/metadata.py | 189 ++++++++++++++++++------------
 swh/indexer/origin_head.py | 2 +-
 swh/indexer/sql/30-schema.sql | 20 ++--
 swh/indexer/sql/50-func.sql | 30 ++---
 swh/indexer/sql/60-indexes.sql | 10 +-
 swh/indexer/sql/upgrades/134.sql | 145 +++++++++++++++++++++++
 swh/indexer/storage/__init__.py | 38 +++---
 swh/indexer/storage/db.py | 28 ++---
 swh/indexer/storage/in_memory.py | 24 ++--
 swh/indexer/storage/interface.py | 22 ++--
 swh/indexer/storage/model.py | 6 +-
 swh/indexer/tests/conftest.py | 2 +-
 swh/indexer/tests/storage/conftest.py | 6 +-
 swh/indexer/tests/storage/test_storage.py | 134 ++++++++++-----------
 swh/indexer/tests/tasks.py | 10 +-
 swh/indexer/tests/test_cli.py | 20 ++--
 swh/indexer/tests/test_indexer.py | 16 +--
 swh/indexer/tests/test_metadata.py | 59 +++++-----
 swh/indexer/tests/test_origin_head.py | 7 +-
 swh/indexer/tests/test_origin_metadata.py | 118 +++++++++++++------
 swh/indexer/tests/utils.py | 62 +++++++++-
 22 files changed, 647 insertions(+), 349 deletions(-)
 create mode 100644 swh/indexer/sql/upgrades/134.sql
Changes applied before test
commit 986c672ed1acf34c3f7c0e4f2d6e959b8d012278
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Wed Jun 1 17:42:22 2022 +0200

    Add support for indexing from head releases

    Needed since package loaders now create release objects instead of revision objects.

commit b88e9572f5aaee0707771f2e06b6ecb906a674c1
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Wed Jun 1 14:42:37 2022 +0200

    Replace RevisionMetadataIndexer with DirectoryMetadataIndexer

    This will make it easier to support indexing from releases in the future, as it will remove the strong dependency on revision ids in the database and interfaces.

    The existence of the indexer/table is mostly to deduplicate work between origins with the same head revision, and we do not use it outside this context, so this should have no impact.

    The DB migration works by dropping both tables and re-indexing from scratch; which is necessary as we need to replace revision ids with directory ids.
See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/252/ for more details.
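
The commit messages above describe two ideas: a head can be a release or a revision, and intermediate results are keyed on directory ids so that origins sharing the same content are indexed only once. A rough sketch of that idea follows. It does not reproduce the swh.indexer code: the in-memory `Graph` mapping and the functions `resolve_head_directory` and `index_origins` are hypothetical names, and the real indexer goes through the storage and indexer-storage APIs.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-in for the archive: (object type, id) -> target object.
Node = Tuple[str, str]
Graph = Dict[Node, Node]


def resolve_head_directory(graph: Graph, head: Node) -> str:
    """Follow release/revision targets until a directory is reached."""
    obj = head
    seen = set()
    while obj[0] != "dir":
        if obj in seen or obj not in graph:
            raise ValueError(f"cannot resolve {head} to a directory")
        seen.add(obj)
        obj = graph[obj]
    return obj[1]


def index_origins(
    heads: Dict[str, Node],
    graph: Graph,
    extract: Callable[[str], dict],
) -> Dict[str, dict]:
    """Index each origin, running `extract` once per distinct directory."""
    per_directory: Dict[str, dict] = {}
    per_origin: Dict[str, dict] = {}
    for origin, head in heads.items():
        directory_id = resolve_head_directory(graph, head)
        if directory_id not in per_directory:
            # Only the first origin pointing at this directory pays the cost.
            per_directory[directory_id] = extract(directory_id)
        per_origin[origin] = per_directory[directory_id]
    return per_origin
```

Keying the cache on the directory id rather than the revision id is what lets a head release and a head revision with the same root directory share one metadata extraction.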