Differential D5957
Remove `db_utils` in favour of `swh.core.db.BaseDb` methods
Authored by aeviso on Jul 1 2021, 5:26 PM.
Details
This allows removing some unnecessary psycopg2 dependencies in modules that
don't deal directly with SQL queries.
Depends on D5948
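A minimal sketch of the idea behind this change, not the actual `swh.core.db.BaseDb` API: a shared database base class owns the driver and the connection and transaction handling, so calling modules need neither their own `db_utils`-style helpers nor a direct driver import. Everything here is an assumption made for illustration: the class name `MiniDb`, its methods, the `origin` table and the `list_origins` function are hypothetical, and the standard-library `sqlite3` driver stands in for `psycopg2` so the snippet runs without a PostgreSQL server.

```python
import sqlite3
from contextlib import contextmanager
from typing import Iterator, List


class MiniDb:
    """Hypothetical stand-in for a shared DB base class (NOT the swh.core.db API)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    @classmethod
    def connect(cls, dsn: str) -> "MiniDb":
        # The driver is only referenced inside the wrapper class.
        return cls(sqlite3.connect(dsn))

    @contextmanager
    def transaction(self) -> Iterator[sqlite3.Cursor]:
        # Commit when the block succeeds, roll back if it raises.
        cur = self.conn.cursor()
        try:
            yield cur
            self.conn.commit()
        except Exception:
            self.conn.rollback()
            raise
        finally:
            cur.close()


def list_origins(db: MiniDb) -> List[str]:
    # Caller code: goes through the wrapper only, with no hand-rolled
    # connection helper and no direct use of the driver.
    with db.transaction() as cur:
        cur.execute("SELECT url FROM origin ORDER BY url")
        return [row[0] for row in cur.fetchall()]


db = MiniDb.connect(":memory:")
with db.transaction() as cur:
    cur.execute("CREATE TABLE origin (url TEXT)")
    cur.execute("INSERT INTO origin VALUES ('https://example.org/repo')")
origins = list_origins(db)
```

As a reviewer points out in the timeline, callers still depend on the driver transitively through the base class; what such a refactoring removes is the duplicated helper code and the direct imports.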
Diff Detail
Event Timeline

Build is green

Patch application report for D5957 (id=21406)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..061769b
Fast-forward
 swh/provenance/__init__.py                         |  57 +--
 swh/provenance/archive.py                          |  25 +-
 swh/provenance/backend.py                          | 357 ++++++++++++++++++
 swh/provenance/cli.py                              |  28 +-
 swh/provenance/graph.py                            |   4 +-
 swh/provenance/model.py                            |  53 ++-
 swh/provenance/origin.py                           |  21 +-
 swh/provenance/postgresql/archive.py               | 126 +++----
 swh/provenance/postgresql/db_utils.py              |  61 ----
 swh/provenance/postgresql/provenancedb_base.py     | 404 ++++++++++++---------
 .../postgresql/provenancedb_with_path.py           | 117 +++---
 .../postgresql/provenancedb_without_path.py        |  96 ++---
 swh/provenance/provenance.py                       | 349 +++++++++---------
 swh/provenance/revision.py                         |  13 +-
 swh/provenance/sql/30-schema.sql                   |  30 +-
 swh/provenance/storage/archive.py                  |  39 +-
 swh/provenance/tests/conftest.py                   |  32 +-
 .../tests/data/generate_storage_from_git.py        |   3 +-
 .../data/history_graphs_with-merges_visits-01.yaml |  55 +++
 swh/provenance/tests/data/with-merges.msgpack      | Bin 0 -> 7501 bytes
 ...repo_with_merges.yaml => with-merges_repo.yaml} |   0
 ...s-visits-01.yaml => with-merges_visits-01.yaml} |   0
 swh/provenance/tests/test_archive_interface.py     |  50 +++
 swh/provenance/tests/test_cli.py                   |   4 +-
 swh/provenance/tests/test_conftest.py              |   2 +-
 swh/provenance/tests/test_history_graph.py         |  62 ++++
 swh/provenance/tests/test_isochrone_graph.py       |  17 +-
 swh/provenance/tests/test_origin_iterator.py       |   8 +-
 swh/provenance/tests/test_provenance_db.py         |   4 +-
 swh/provenance/tests/test_provenance_heuristics.py |  56 +--
 30 files changed, 1259 insertions(+), 814 deletions(-)
 create mode 100644 swh/provenance/backend.py
 delete mode 100644 swh/provenance/postgresql/db_utils.py
 create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
 create mode 100644 swh/provenance/tests/data/with-merges.msgpack
 rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
 rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
 create mode 100644 swh/provenance/tests/test_archive_interface.py
 create mode 100644 swh/provenance/tests/test_history_graph.py

Changes applied before test

commit 061769b31e4310d305bb594b7d94ea401a8058a9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:22:08 2021 +0200

    Remove `db_utilis` in favour of `swh.core.db.BaseDb` methods

    This allows to remove some unncessary `psycopg2` dependencies in modules
    that don't deal directly with SQL queries.

commit 0c3c7817cfd267bf3b77ce2e594ea7feb3ec96d5
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Tue Jun 29 14:28:54 2021 +0200

    Force `snapshot_get_heads` to return revisions in chronological order

commit afb67f665ab00c03c0ca33e96b1bfc109c827c58
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 19:21:58 2021 +0200

    Add `ProvenanceStorageInterface` as discussed during backend design

    Rework backend-related classes to properly use the new interface.
    Adapt tests to the new structure as well.

commit 3672235c3258cf93fb37a82d060bf40ba1761b8b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:37:50 2021 +0200

    Move `ProvenanceBackend` implementation to a separate file

commit 6f4da6fed7e663273627ad4a46c8489ef0a0e784
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 13:47:26 2021 +0200

    Use `RealDictCursor` in `ProvenanceDBBase` to improve the way
    `ProvenanceResult`s are generated.

commit 07a30e43a76e170ab03764035da68dcf7db1fc3b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:28:32 2021 +0200

    Rework `ProvenanceInterface` as discussed during backend design

    Add `ProvenanceResult` class to be returned by `content_find_first` and
    `content_find_all` methods. Rename some methods. Improve type annotations.

commit 2fd3f56b57f8db6691ae6b8b7cb7ac557b764172
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:31:16 2021 +0200

    Add tests for history graph topology

commit d45d6ff9e9317ecfe38d584df7297c548b654d28
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:10:38 2021 +0200

    Fix database queries related to the origin-revision layer

    This required allowing null dates in the `revision` table so that
    revision can be added by the origin-revision layer algorithm but not
    recognized as already processed by the revision-content layer.
    Revision and origin entries are now inserted in the database prior to
    inserting rows to revision_in_origin and revision_before_revision
    relations, so that internal ids are properly resolved.

commit 0e2a3c64ce3c368b53c101c541e8aebcde789477
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Jun 25 13:38:26 2021 +0200

    Add test to compare both `ArchiveInterface` implementations

    Improve documentation of the interface and complete pending TODO's.

commit 98bba93cccece2b47ec4cd5887997cb5bede1e87
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:25:15 2021 +0200

    Rename test files to keep naming convension

    Also added missing .msgpack file dump for new with-merges repository.

commit fa9198afb71bcf3b8abea07d88d763a430f7358e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:05:24 2021 +0200

    Refactor `ArchiveInterface` to fit origin-revision layer needs

    Replace `revision_get` method by `revision_get_parents` returning an
    iterable of parents' ids only, instead of a swh.model.model.Revision
    object.

commit 9e0c1aa099073887206c9334e17b49ee31bbef9a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 20:00:40 2021 +0200

    Use `Sha1Git` type to explicitly state the kind of identifiers

    Previous occurrences of `bytes` and `Sha1` are now correctly using
    `Sha1Git`. Also, some bytes conversion methods were replaced by their
    counterparts in the swh.model.hashutil module.

commit a27ffff67b6b14bf37d153bb9b1d1c2ae63773fc
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 19:12:06 2021 +0200

    Add support for sha1 identifiers for origins

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/221/ for more details.

Build is green

Patch application report for D5957 (id=21408)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..7b66b9c
Fast-forward
 30 files changed, 1259 insertions(+), 814 deletions(-)

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/222/ for more details.

Build is green

Patch application report for D5957 (id=21409)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..4f0da58
Fast-forward
 29 files changed, 1245 insertions(+), 811 deletions(-)

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/223/ for more details.

Build is green

Patch application report for D5957 (id=21428)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..a4eb765
Fast-forward
 29 files changed, 1210 insertions(+), 811 deletions(-)

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/232/ for more details.

I'm not sure I understand the rationale. They still depend on psycopg2 transitively, so what is the change?

lgtm

Maybe, please check if the psycopg2 is still in requirements.txt, if so it can be

Yes, the diff description is a bit off. I'd say that removes duplicated db utils code

So, in any case, thanks for the effort ;)

Build is green

Patch application report for D5957 (id=21448)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..6617ca5
Fast-forward
 29 files changed, 1212 insertions(+), 811 deletions(-)

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/246/ for more details.