Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
Depends on D5946
Differential D5947
Add `ProvenanceStorageInterface` as discussed during backend design
Authored by aeviso on Jun 29 2021, 12:52 PM.
Event Timeline
Build is green
Patch application report for D5947 (id=21342)
Could not rebase; Attempt merge onto d892b29e40...
Updating d892b29..6afc8b3
Fast-forward
swh/provenance/__init__.py | 47 ++-
swh/provenance/archive.py | 24 +-
swh/provenance/backend.py | 357 ++++++++++++++++++
swh/provenance/graph.py | 4 +-
swh/provenance/model.py | 53 ++-
swh/provenance/origin.py | 21 +-
swh/provenance/postgresql/archive.py | 118 +++---
swh/provenance/postgresql/provenancedb_base.py | 411 +++++++++++++--------
.../postgresql/provenancedb_with_path.py | 103 ++----
.../postgresql/provenancedb_without_path.py | 82 ++--
swh/provenance/provenance.py | 257 ++++---------
swh/provenance/revision.py | 13 +-
swh/provenance/sql/30-schema.sql | 30 +-
swh/provenance/storage/archive.py | 30 +-
swh/provenance/tests/conftest.py | 32 +-
.../tests/data/generate_storage_from_git.py | 3 +-
.../data/history_graphs_with-merges_visits-01.yaml | 55 +++
swh/provenance/tests/data/with-merges.msgpack | Bin 0 -> 7501 bytes
...repo_with_merges.yaml => with-merges_repo.yaml} | 0
...s-visits-01.yaml => with-merges_visits-01.yaml} | 0
swh/provenance/tests/test_archive_interface.py | 51 +++
swh/provenance/tests/test_conftest.py | 2 +-
swh/provenance/tests/test_history_graph.py | 62 ++++
swh/provenance/tests/test_origin_iterator.py | 8 +-
swh/provenance/tests/test_provenance_db.py | 4 +-
swh/provenance/tests/test_provenance_heuristics.py | 30 +-
26 files changed, 1121 insertions(+), 676 deletions(-)
create mode 100644 swh/provenance/backend.py
create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
create mode 100644 swh/provenance/tests/data/with-merges.msgpack
rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
create mode 100644 swh/provenance/tests/test_archive_interface.py
create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit 6afc8b39601c0f93375bcf37daa1b8a3d5bf242a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 19:21:58 2021 +0200
Add ProvenanceStorageInterface
Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
commit 23184e7de91d7e60577ce730868098b91a72b1d1
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:37:50 2021 +0200
Move `ProvenanceBackend` implementation to a separate file
commit f32475952907452f3dbe3d51be9433aa854413bf
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:28:32 2021 +0200
Rework `ProvenanceInterface` as discussed during backend design
Add `ProvenanceResult` class to be returned by `content_find_first` and
`content_find_all` methods. Rename some methods. Improve type annotations.
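As a rough illustration of the commit above, a result type shared by `content_find_first` and `content_find_all` might look like the sketch below. The field names and the in-memory `find_first`/`find_all` helpers are assumptions made for illustration; they are not the definitions found in `swh/provenance`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class ProvenanceResult:
    # Hypothetical fields; the real class may carry different ones.
    content: bytes
    revision: bytes
    date: datetime
    origin: Optional[str]
    path: bytes


def find_all(
    occurrences: Iterable[ProvenanceResult], content: bytes
) -> Iterator[ProvenanceResult]:
    """Yield every known occurrence of `content`, earliest first."""
    matches = [occ for occ in occurrences if occ.content == content]
    yield from sorted(matches, key=lambda occ: occ.date)


def find_first(
    occurrences: Iterable[ProvenanceResult], content: bytes
) -> Optional[ProvenanceResult]:
    """Return the earliest occurrence of `content`, or None if unknown."""
    return next(find_all(occurrences, content), None)
```

Returning one typed object from both methods lets callers treat a single result and a stream of results the same way.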
commit ad860db9bfeff7f276b3e356c9e21cb57cafc4c2
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:31:16 2021 +0200
Add tests for history graph topology
commit 37ac81faf15a32c4471a3c4ee5140bcb9bf57178
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:10:38 2021 +0200
Fix database queries related to the origin-revision layer
This required allowing null dates in the `revision` table so that revisions can be added
by the origin-revision layer algorithm but not recognized as already processed by the
revision-content layer. Revision and origin entries are now inserted in the database
prior to inserting rows to revision_in_origin and revision_before_revision relations,
so that internal ids are properly resolved.
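The ordering described in this commit message can be sketched as follows. This is only an illustration: it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the simplified schema, the `url` column, and the helper names are assumptions rather than the contents of `30-schema.sql`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE origin (id INTEGER PRIMARY KEY, url TEXT UNIQUE NOT NULL);
    CREATE TABLE revision (
        id INTEGER PRIMARY KEY,
        sha1 BLOB UNIQUE NOT NULL,
        date TEXT  -- nullable: NULL means "seen, but not processed yet"
    );
    CREATE TABLE revision_in_origin (
        revision INTEGER NOT NULL REFERENCES revision(id),
        origin INTEGER NOT NULL REFERENCES origin(id),
        PRIMARY KEY (revision, origin)
    );
    """
)


def add_revision_in_origin(conn: sqlite3.Connection, sha1: bytes, url: str) -> None:
    # Step 1: insert the entities first so that they receive internal ids.
    conn.execute("INSERT OR IGNORE INTO origin (url) VALUES (?)", (url,))
    conn.execute(
        "INSERT OR IGNORE INTO revision (sha1, date) VALUES (?, NULL)", (sha1,)
    )
    # Step 2: insert the relation row, resolving both ids by lookup.
    conn.execute(
        """
        INSERT OR IGNORE INTO revision_in_origin (revision, origin)
        SELECT r.id, o.id FROM revision r, origin o
        WHERE r.sha1 = ? AND o.url = ?
        """,
        (sha1, url),
    )


def is_processed(conn: sqlite3.Connection, sha1: bytes) -> bool:
    # A revision counts as processed only once it has a date.
    row = conn.execute(
        "SELECT date FROM revision WHERE sha1 = ?", (sha1,)
    ).fetchone()
    return row is not None and row[0] is not None
```

With a nullable date, a revision registered by the origin-revision layer stays distinguishable from one already handled by the revision-content layer.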
commit 4eb166cc4f2aa036c932b9a5eb462454a70ee0d9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Jun 25 13:38:26 2021 +0200
Add test to compare both ArchiveInterface implementations
Improve documentation of the interface and complete pending TODO's.
commit 01ac9eea375258ac1e000389d3fd286d0dbae458
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:25:15 2021 +0200
Rename test files to keep naming convention
Also added missing .msgpack file dump for new with-merges repository.
commit 76d1560924251396c1ac63c286d8612ce0f7e9d9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:05:24 2021 +0200
Refactor ArchiveInterface to fit origin-revision layer needs
Replace `revision_get` method by `revision_get_parents` returning an iterable of
parents' ids only, instead of a swh.model.model.Revision object.
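To illustrate why returning only parent ids is enough for the origin-revision layer, here is a minimal sketch. Only the method name `revision_get_parents` comes from the commit message; the `Protocol` definition, the `InMemoryArchive` class, and the `ancestors` helper are hypothetical.

```python
from typing import Dict, Iterable, List, Protocol, Set, Tuple


class ArchiveInterface(Protocol):
    def revision_get_parents(self, id: bytes) -> Iterable[bytes]:
        """Return the ids of the parents of revision `id`."""
        ...


class InMemoryArchive:
    """Toy archive backed by a dict mapping a revision id to its parent ids."""

    def __init__(self, parents: Dict[bytes, Tuple[bytes, ...]]) -> None:
        self._parents = parents

    def revision_get_parents(self, id: bytes) -> Iterable[bytes]:
        yield from self._parents.get(id, ())


def ancestors(archive: ArchiveInterface, head: bytes) -> Set[bytes]:
    """Collect all ancestors of `head` by walking parent ids only."""
    seen: Set[bytes] = set()
    stack: List[bytes] = [head]
    while stack:
        rev = stack.pop()
        for parent in archive.revision_get_parents(rev):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

Walking the history graph needs nothing but ids, so the full revision object never has to be loaded.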
commit df69a9e57692ed9d4d870c295a21b3ac187d7b9c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 20:00:40 2021 +0200
Use `Sha1Git` type to explicitly state the kind of identifiers
Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
Also, some bytes conversion methods were replaced by their counterparts in
the swh.model.hashutil module.
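The idea of naming the identifier kind in type annotations can be sketched with the standard library alone. The `NewType` definition and the two helpers below are assumptions for illustration; the project may define `Sha1Git` differently, and the helpers only stand in for the `swh.model.hashutil` conversions mentioned above.

```python
from typing import NewType

# Hypothetical definition: a distinct name for 20-byte git-style SHA-1 ids.
Sha1Git = NewType("Sha1Git", bytes)


def to_sha1_git(hex_id: str) -> Sha1Git:
    """Convert a 40-character hex string into a Sha1Git value."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 20:
        raise ValueError(f"expected 20 bytes, got {len(raw)}")
    return Sha1Git(raw)


def to_hex(identifier: Sha1Git) -> str:
    """Convert a Sha1Git value back into its hex representation."""
    return identifier.hex()
```

At runtime the value is still plain `bytes`; the benefit is that signatures state which kind of identifier they expect.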
commit fa22dc902781e30e46823030681f003983cc6d6e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 19:12:06 2021 +0200
Add support for sha1 identifiers for origins
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/196/ for more details.
Build is green
Patch application report for D5947 (id=21353)
Could not rebase; Attempt merge onto d892b29e40...
Updating d892b29..a3da061
Fast-forward
swh/provenance/__init__.py | 47 ++-
swh/provenance/archive.py | 24 +-
swh/provenance/backend.py | 357 ++++++++++++++++++
swh/provenance/graph.py | 4 +-
swh/provenance/model.py | 53 ++-
swh/provenance/origin.py | 21 +-
swh/provenance/postgresql/archive.py | 118 +++---
swh/provenance/postgresql/provenancedb_base.py | 411 +++++++++++++--------
.../postgresql/provenancedb_with_path.py | 103 ++----
.../postgresql/provenancedb_without_path.py | 82 ++--
swh/provenance/provenance.py | 298 ++++++---------
swh/provenance/revision.py | 13 +-
swh/provenance/sql/30-schema.sql | 30 +-
swh/provenance/storage/archive.py | 30 +-
swh/provenance/tests/conftest.py | 32 +-
.../tests/data/generate_storage_from_git.py | 3 +-
.../data/history_graphs_with-merges_visits-01.yaml | 55 +++
swh/provenance/tests/data/with-merges.msgpack | Bin 0 -> 7501 bytes
...repo_with_merges.yaml => with-merges_repo.yaml} | 0
...s-visits-01.yaml => with-merges_visits-01.yaml} | 0
swh/provenance/tests/test_archive_interface.py | 51 +++
swh/provenance/tests/test_conftest.py | 2 +-
swh/provenance/tests/test_history_graph.py | 62 ++++
swh/provenance/tests/test_origin_iterator.py | 8 +-
swh/provenance/tests/test_provenance_db.py | 4 +-
swh/provenance/tests/test_provenance_heuristics.py | 30 +-
26 files changed, 1162 insertions(+), 676 deletions(-)
create mode 100644 swh/provenance/backend.py
create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
create mode 100644 swh/provenance/tests/data/with-merges.msgpack
rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
create mode 100644 swh/provenance/tests/test_archive_interface.py
create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit a3da0612eae1ded260eeafee9dc77f2bbf84a47f
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 19:21:58 2021 +0200
Add `ProvenanceStorageInterface` as discussed during backend design
Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
commit 361199109d7d5a6cb694685cb2062940abe814bb
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:37:50 2021 +0200
Move `ProvenanceBackend` implementation to a separate file
commit d058de2c080ee0c79ae57131d5c8ebdbeb6d0486
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:28:32 2021 +0200
Rework `ProvenanceInterface` as discussed during backend design
Add `ProvenanceResult` class to be returned by `content_find_first` and
`content_find_all` methods. Rename some methods. Improve type annotations.
commit ad860db9bfeff7f276b3e356c9e21cb57cafc4c2
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:31:16 2021 +0200
Add tests for history graph topology
commit 37ac81faf15a32c4471a3c4ee5140bcb9bf57178
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:10:38 2021 +0200
Fix database queries related to the origin-revision layer
This required allowing null dates in the `revision` table so that revisions can be added
by the origin-revision layer algorithm but not recognized as already processed by the
revision-content layer. Revision and origin entries are now inserted in the database
prior to inserting rows to revision_in_origin and revision_before_revision relations,
so that internal ids are properly resolved.
commit 4eb166cc4f2aa036c932b9a5eb462454a70ee0d9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Jun 25 13:38:26 2021 +0200
Add test to compare both ArchiveInterface implementations
Improve documentation of the interface and complete pending TODO's.
commit 01ac9eea375258ac1e000389d3fd286d0dbae458
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:25:15 2021 +0200
Rename test files to keep naming convention
Also added missing .msgpack file dump for new with-merges repository.
commit 76d1560924251396c1ac63c286d8612ce0f7e9d9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:05:24 2021 +0200
Refactor ArchiveInterface to fit origin-revision layer needs
Replace `revision_get` method by `revision_get_parents` returning an iterable of
parents' ids only, instead of a swh.model.model.Revision object.
commit df69a9e57692ed9d4d870c295a21b3ac187d7b9c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 20:00:40 2021 +0200
Use `Sha1Git` type to explicitly state the kind of identifiers
Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
Also, some bytes conversion methods were replaced by their counterparts in
the swh.model.hashutil module.
commit fa22dc902781e30e46823030681f003983cc6d6e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 19:12:06 2021 +0200
Add support for sha1 identifiers for origins
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/199/ for more details.
Build is green
Patch application report for D5947 (id=21364)
Could not rebase; Attempt merge onto d892b29e40...
Updating d892b29..695b498
Fast-forward
swh/provenance/__init__.py | 47 ++-
swh/provenance/archive.py | 24 +-
swh/provenance/backend.py | 357 ++++++++++++++++++
swh/provenance/graph.py | 4 +-
swh/provenance/model.py | 53 ++-
swh/provenance/origin.py | 21 +-
swh/provenance/postgresql/archive.py | 118 +++---
swh/provenance/postgresql/provenancedb_base.py | 398 ++++++++++++---------
.../postgresql/provenancedb_with_path.py | 115 +++---
.../postgresql/provenancedb_without_path.py | 94 ++---
swh/provenance/provenance.py | 341 ++++++++----------
swh/provenance/revision.py | 13 +-
swh/provenance/sql/30-schema.sql | 30 +-
swh/provenance/storage/archive.py | 30 +-
swh/provenance/tests/conftest.py | 32 +-
.../tests/data/generate_storage_from_git.py | 3 +-
.../data/history_graphs_with-merges_visits-01.yaml | 55 +++
swh/provenance/tests/data/with-merges.msgpack | Bin 0 -> 7501 bytes
...repo_with_merges.yaml => with-merges_repo.yaml} | 0
...s-visits-01.yaml => with-merges_visits-01.yaml} | 0
swh/provenance/tests/test_archive_interface.py | 51 +++
swh/provenance/tests/test_conftest.py | 2 +-
swh/provenance/tests/test_history_graph.py | 62 ++++
swh/provenance/tests/test_origin_iterator.py | 8 +-
swh/provenance/tests/test_provenance_db.py | 4 +-
swh/provenance/tests/test_provenance_heuristics.py | 30 +-
26 files changed, 1194 insertions(+), 698 deletions(-)
create mode 100644 swh/provenance/backend.py
create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
create mode 100644 swh/provenance/tests/data/with-merges.msgpack
rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
create mode 100644 swh/provenance/tests/test_archive_interface.py
create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit 695b498600682045004fdd04859ecf9e96819479
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 19:21:58 2021 +0200
Add `ProvenanceStorageInterface` as discussed during backend design
Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
commit 361199109d7d5a6cb694685cb2062940abe814bb
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:37:50 2021 +0200
Move `ProvenanceBackend` implementation to a separate file
commit d058de2c080ee0c79ae57131d5c8ebdbeb6d0486
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:28:32 2021 +0200
Rework `ProvenanceInterface` as discussed during backend design
Add `ProvenanceResult` class to be returned by `content_find_first` and
`content_find_all` methods. Rename some methods. Improve type annotations.
commit ad860db9bfeff7f276b3e356c9e21cb57cafc4c2
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:31:16 2021 +0200
Add tests for history graph topology
commit 37ac81faf15a32c4471a3c4ee5140bcb9bf57178
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:10:38 2021 +0200
Fix database queries related to the origin-revision layer
This required allowing null dates in the `revision` table so that revisions can be added
by the origin-revision layer algorithm but not recognized as already processed by the
revision-content layer. Revision and origin entries are now inserted in the database
prior to inserting rows to revision_in_origin and revision_before_revision relations,
so that internal ids are properly resolved.
commit 4eb166cc4f2aa036c932b9a5eb462454a70ee0d9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Jun 25 13:38:26 2021 +0200
Add test to compare both ArchiveInterface implementations
Improve documentation of the interface and complete pending TODO's.
commit 01ac9eea375258ac1e000389d3fd286d0dbae458
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:25:15 2021 +0200
Rename test files to keep naming convention
Also added missing .msgpack file dump for new with-merges repository.
commit 76d1560924251396c1ac63c286d8612ce0f7e9d9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:05:24 2021 +0200
Refactor ArchiveInterface to fit origin-revision layer needs
Replace `revision_get` method by `revision_get_parents` returning an iterable of
parents' ids only, instead of a swh.model.model.Revision object.
commit df69a9e57692ed9d4d870c295a21b3ac187d7b9c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 20:00:40 2021 +0200
Use `Sha1Git` type to explicitly state the kind of identifiers
Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
Also, some bytes conversion methods were replaced by their counterparts in
the swh.model.hashutil module.
commit fa22dc902781e30e46823030681f003983cc6d6e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 19:12:06 2021 +0200
Add support for sha1 identifiers for origins
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/201/ for more details.
Build is green
Patch application report for D5947 (id=21370)
Could not rebase; Attempt merge onto d892b29e40...
Updating d892b29..2304647
Fast-forward
swh/provenance/__init__.py | 47 ++-
swh/provenance/archive.py | 24 +-
swh/provenance/backend.py | 357 ++++++++++++++++++
swh/provenance/graph.py | 4 +-
swh/provenance/model.py | 53 ++-
swh/provenance/origin.py | 21 +-
swh/provenance/postgresql/archive.py | 118 +++---
swh/provenance/postgresql/provenancedb_base.py | 398 ++++++++++++---------
.../postgresql/provenancedb_with_path.py | 115 +++---
.../postgresql/provenancedb_without_path.py | 94 ++---
swh/provenance/provenance.py | 341 ++++++++----------
swh/provenance/revision.py | 13 +-
swh/provenance/sql/30-schema.sql | 30 +-
swh/provenance/storage/archive.py | 30 +-
swh/provenance/tests/conftest.py | 32 +-
.../tests/data/generate_storage_from_git.py | 3 +-
.../data/history_graphs_with-merges_visits-01.yaml | 55 +++
swh/provenance/tests/data/with-merges.msgpack | Bin 0 -> 7501 bytes
...repo_with_merges.yaml => with-merges_repo.yaml} | 0
...s-visits-01.yaml => with-merges_visits-01.yaml} | 0
swh/provenance/tests/test_archive_interface.py | 51 +++
swh/provenance/tests/test_conftest.py | 2 +-
swh/provenance/tests/test_history_graph.py | 62 ++++
swh/provenance/tests/test_origin_iterator.py | 8 +-
swh/provenance/tests/test_provenance_db.py | 4 +-
swh/provenance/tests/test_provenance_heuristics.py | 30 +-
26 files changed, 1194 insertions(+), 698 deletions(-)
create mode 100644 swh/provenance/backend.py
create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
create mode 100644 swh/provenance/tests/data/with-merges.msgpack
rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
create mode 100644 swh/provenance/tests/test_archive_interface.py
create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit 2304647745e79308b72978ba3a9141f3e6f844f8
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 19:21:58 2021 +0200
Add `ProvenanceStorageInterface` as discussed during backend design
Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
commit e25122d2e47de942a772164e9f1a60f425c87d97
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:37:50 2021 +0200
Move `ProvenanceBackend` implementation to a separate file
commit b7678a341da72587cc48848f5a72f65861f892af
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:28:32 2021 +0200
Rework `ProvenanceInterface` as discussed during backend design
Add `ProvenanceResult` class to be returned by `content_find_first` and
`content_find_all` methods. Rename some methods. Improve type annotations.
commit 7a59ff712bb8b5ae22e6f016475d03317c27b64a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:31:16 2021 +0200
Add tests for history graph topology
commit 3171ae2f129df433689fd22e32c8eeebf7af4171
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:10:38 2021 +0200
Fix database queries related to the origin-revision layer
This required allowing null dates in the `revision` table so that revisions can be added
by the origin-revision layer algorithm but not recognized as already processed by the
revision-content layer. Revision and origin entries are now inserted in the database
prior to inserting rows to revision_in_origin and revision_before_revision relations,
so that internal ids are properly resolved.
commit 6736f6068280f167df5616681dee9ad67b2b7dbd
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Jun 25 13:38:26 2021 +0200
Add test to compare both `ArchiveInterface` implementations
Improve documentation of the interface and complete pending TODO's.
commit dde867254e51dd87f4aba3cdea59da8bffc2d160
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:25:15 2021 +0200
Rename test files to keep naming convention
Also added missing .msgpack file dump for new with-merges repository.
commit 14001c1844598a3d4ebd1b5f609070f9c85dcaa9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:05:24 2021 +0200
Refactor `ArchiveInterface` to fit origin-revision layer needs
Replace `revision_get` method by `revision_get_parents` returning an iterable of
parents' ids only, instead of a swh.model.model.Revision object.
commit df69a9e57692ed9d4d870c295a21b3ac187d7b9c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 20:00:40 2021 +0200
Use `Sha1Git` type to explicitly state the kind of identifiers
Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
Also, some bytes conversion methods were replaced by their counterparts in
the swh.model.hashutil module.
commit fa22dc902781e30e46823030681f003983cc6d6e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 19:12:06 2021 +0200
Add support for sha1 identifiers for origins
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/207/ for more details.
Build is green
Patch application report for D5947 (id=21385)
Could not rebase; Attempt merge onto d892b29e40...
Updating d892b29..35dafe5
Fast-forward
swh/provenance/__init__.py | 47 ++-
swh/provenance/archive.py | 24 +-
swh/provenance/backend.py | 357 ++++++++++++++++++
swh/provenance/cli.py | 28 +-
swh/provenance/graph.py | 4 +-
swh/provenance/model.py | 53 ++-
swh/provenance/origin.py | 21 +-
swh/provenance/postgresql/archive.py | 118 +++---
swh/provenance/postgresql/provenancedb_base.py | 402 ++++++++++++---------
.../postgresql/provenancedb_with_path.py | 117 +++---
.../postgresql/provenancedb_without_path.py | 96 ++---
swh/provenance/provenance.py | 349 +++++++++---------
swh/provenance/revision.py | 13 +-
swh/provenance/sql/30-schema.sql | 30 +-
swh/provenance/storage/archive.py | 30 +-
swh/provenance/tests/conftest.py | 32 +-
.../tests/data/generate_storage_from_git.py | 3 +-
.../data/history_graphs_with-merges_visits-01.yaml | 55 +++
swh/provenance/tests/data/with-merges.msgpack | Bin 0 -> 7501 bytes
...repo_with_merges.yaml => with-merges_repo.yaml} | 0
...s-visits-01.yaml => with-merges_visits-01.yaml} | 0
swh/provenance/tests/test_archive_interface.py | 51 +++
swh/provenance/tests/test_conftest.py | 2 +-
swh/provenance/tests/test_history_graph.py | 62 ++++
swh/provenance/tests/test_origin_iterator.py | 8 +-
swh/provenance/tests/test_provenance_db.py | 4 +-
swh/provenance/tests/test_provenance_heuristics.py | 56 +--
27 files changed, 1224 insertions(+), 738 deletions(-)
create mode 100644 swh/provenance/backend.py
create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
create mode 100644 swh/provenance/tests/data/with-merges.msgpack
rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
create mode 100644 swh/provenance/tests/test_archive_interface.py
create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit 35dafe5f8b1b95a0610199c93864ad16a1659283
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jul 1 12:07:41 2021 +0200
Use `RealDictCursor` in `ProvenanceDBBase`
to improve the way `ProvenanceResult`s are generated.
Change `ProvenanceDBBase` from a `TypedDict` to a regular class.
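The benefit of dict-shaped rows can be shown with a small sketch. It deliberately swaps in the standard library's `sqlite3` with a dict row factory as a stand-in for psycopg2's `RealDictCursor`, so that it runs without a PostgreSQL server; the `Result` class, the `occurrence` table, and `fetch_results` are hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    # Hypothetical, reduced version of a provenance result.
    content: str
    revision: str
    path: str


def dict_factory(cursor: sqlite3.Cursor, row: tuple) -> dict:
    # Map each column name to its value, as a dict-returning cursor would.
    return {col[0]: value for col, value in zip(cursor.description, row)}


conn = sqlite3.connect(":memory:")
conn.row_factory = dict_factory
conn.execute("CREATE TABLE occurrence (content TEXT, revision TEXT, path TEXT)")
conn.execute("INSERT INTO occurrence VALUES ('c1', 'r1', 'src/a.py')")


def fetch_results(conn: sqlite3.Connection) -> List[Result]:
    cur = conn.execute("SELECT content, revision, path FROM occurrence")
    # Rows are keyed by column name, so they unpack straight into the class.
    return [Result(**row) for row in cur]
```

Because rows are keyed by column name, the result objects no longer depend on the column order of the query.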
commit 2304647745e79308b72978ba3a9141f3e6f844f8
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 19:21:58 2021 +0200
Add `ProvenanceStorageInterface` as discussed during backend design
Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
commit e25122d2e47de942a772164e9f1a60f425c87d97
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:37:50 2021 +0200
Move `ProvenanceBackend` implementation to a separate file
commit b7678a341da72587cc48848f5a72f65861f892af
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:28:32 2021 +0200
Rework `ProvenanceInterface` as discussed during backend design
Add `ProvenanceResult` class to be returned by `content_find_first` and
`content_find_all` methods. Rename some methods. Improve type annotations.
commit 7a59ff712bb8b5ae22e6f016475d03317c27b64a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:31:16 2021 +0200
Add tests for history graph topology
commit 3171ae2f129df433689fd22e32c8eeebf7af4171
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:10:38 2021 +0200
Fix database queries related to the origin-revision layer
This required allowing null dates in the `revision` table so that revisions can be added
by the origin-revision layer algorithm but not recognized as already processed by the
revision-content layer. Revision and origin entries are now inserted in the database
prior to inserting rows to revision_in_origin and revision_before_revision relations,
so that internal ids are properly resolved.
commit 6736f6068280f167df5616681dee9ad67b2b7dbd
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Jun 25 13:38:26 2021 +0200
Add test to compare both `ArchiveInterface` implementations
Improve documentation of the interface and complete pending TODO's.
commit dde867254e51dd87f4aba3cdea59da8bffc2d160
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:25:15 2021 +0200
Rename test files to keep naming convention
Also added missing .msgpack file dump for new with-merges repository.
commit 14001c1844598a3d4ebd1b5f609070f9c85dcaa9
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:05:24 2021 +0200
Refactor `ArchiveInterface` to fit origin-revision layer needs
Replace `revision_get` method by `revision_get_parents` returning an iterable of
parents' ids only, instead of a swh.model.model.Revision object.
commit df69a9e57692ed9d4d870c295a21b3ac187d7b9c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 20:00:40 2021 +0200
Use `Sha1Git` type to explicitly state the kind of identifiers
Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
Also, some bytes conversion methods were replaced by their counterparts in
the swh.model.hashutil module.
commit fa22dc902781e30e46823030681f003983cc6d6e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 19:12:06 2021 +0200
Add support for sha1 identifiers for origins
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/209/ for more details.
Build is green
Patch application report for D5947 (id=21397)
Could not rebase; Attempt merge onto d892b29e40...
Updating d892b29..afb67f6
Fast-forward
swh/provenance/__init__.py | 47 ++-
swh/provenance/archive.py | 24 +-
swh/provenance/backend.py | 357 ++++++++++++++++++
swh/provenance/cli.py | 28 +-
swh/provenance/graph.py | 4 +-
swh/provenance/model.py | 53 ++-
swh/provenance/origin.py | 21 +-
swh/provenance/postgresql/archive.py | 115 ++----
swh/provenance/postgresql/provenancedb_base.py | 402 ++++++++++++---------
.../postgresql/provenancedb_with_path.py | 117 +++---
.../postgresql/provenancedb_without_path.py | 96 ++---
swh/provenance/provenance.py | 349 +++++++++---------
swh/provenance/revision.py | 13 +-
swh/provenance/sql/30-schema.sql | 30 +-
swh/provenance/storage/archive.py | 30 +-
swh/provenance/tests/conftest.py | 32 +-
.../tests/data/generate_storage_from_git.py | 3 +-
.../data/history_graphs_with-merges_visits-01.yaml | 55 +++
swh/provenance/tests/data/with-merges.msgpack | Bin 0 -> 7501 bytes
...repo_with_merges.yaml => with-merges_repo.yaml} | 0
...s-visits-01.yaml => with-merges_visits-01.yaml} | 0
swh/provenance/tests/test_archive_interface.py | 51 +++
swh/provenance/tests/test_conftest.py | 2 +-
swh/provenance/tests/test_history_graph.py | 62 ++++
swh/provenance/tests/test_origin_iterator.py | 8 +-
swh/provenance/tests/test_provenance_db.py | 4 +-
swh/provenance/tests/test_provenance_heuristics.py | 56 +--
27 files changed, 1219 insertions(+), 740 deletions(-)
create mode 100644 swh/provenance/backend.py
create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
create mode 100644 swh/provenance/tests/data/with-merges.msgpack
rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
create mode 100644 swh/provenance/tests/test_archive_interface.py
create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit afb67f665ab00c03c0ca33e96b1bfc109c827c58
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 19:21:58 2021 +0200
Add `ProvenanceStorageInterface` as discussed during backend design
Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
commit 3672235c3258cf93fb37a82d060bf40ba1761b8b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:37:50 2021 +0200
Move `ProvenanceBackend` implementation to a separate file
commit 6f4da6fed7e663273627ad4a46c8489ef0a0e784
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jul 1 13:47:26 2021 +0200
Use `RealDictCursor` in `ProvenanceDBBase`
to improve the way `ProvenanceResult`s are generated.
commit 07a30e43a76e170ab03764035da68dcf7db1fc3b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:28:32 2021 +0200
Rework `ProvenanceInterface` as discussed during backend design
Add `ProvenanceResult` class to be returned by `content_find_first` and
`content_find_all` methods. Rename some methods. Improve type annotations.
commit 2fd3f56b57f8db6691ae6b8b7cb7ac557b764172
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:31:16 2021 +0200
Add tests for history graph topology
commit d45d6ff9e9317ecfe38d584df7297c548b654d28
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:10:38 2021 +0200
Fix database queries related to the origin-revision layer
This required allowing null dates in the `revision` table so that revisions can be added
by the origin-revision layer algorithm but not recognized as already processed by the
revision-content layer. Revision and origin entries are now inserted in the database
prior to inserting rows to revision_in_origin and revision_before_revision relations,
so that internal ids are properly resolved.
commit 0e2a3c64ce3c368b53c101c541e8aebcde789477
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Jun 25 13:38:26 2021 +0200
Add test to compare both `ArchiveInterface` implementations
Improve documentation of the interface and complete pending TODO's.
commit 98bba93cccece2b47ec4cd5887997cb5bede1e87
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:25:15 2021 +0200
Rename test files to keep naming convention
Also added missing .msgpack file dump for new with-merges repository.
commit fa9198afb71bcf3b8abea07d88d763a430f7358e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:05:24 2021 +0200
Refactor `ArchiveInterface` to fit origin-revision layer needs
Replace `revision_get` method by `revision_get_parents` returning an iterable of
parents' ids only, instead of a swh.model.model.Revision object.
commit 9e0c1aa099073887206c9334e17b49ee31bbef9a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 20:00:40 2021 +0200
Use `Sha1Git` type to explicitly state the kind of identifiers
Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
Also, some bytes conversion methods were replaced by their counterparts in
the swh.model.hashutil module.
commit a27ffff67b6b14bf37d153bb9b1d1c2ae63773fc
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 19:12:06 2021 +0200
Add support for sha1 identifiers for origins
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/219/ for more details.
Build is green
Patch application report for D5947 (id=21426)
Could not rebase; Attempt merge onto d892b29e40...
Updating d892b29..f819e43
Fast-forward
swh/provenance/__init__.py | 47 ++-
swh/provenance/archive.py | 24 +-
swh/provenance/backend.py | 322 +++++++++++++++++
swh/provenance/cli.py | 28 +-
swh/provenance/graph.py | 4 +-
swh/provenance/model.py | 53 ++-
swh/provenance/origin.py | 21 +-
swh/provenance/postgresql/archive.py | 115 ++----
swh/provenance/postgresql/provenancedb_base.py | 402 ++++++++++++---------
.../postgresql/provenancedb_with_path.py | 117 +++---
.../postgresql/provenancedb_without_path.py | 96 ++---
swh/provenance/provenance.py | 349 +++++++++---------
swh/provenance/revision.py | 13 +-
swh/provenance/sql/30-schema.sql | 30 +-
swh/provenance/storage/archive.py | 30 +-
swh/provenance/tests/conftest.py | 32 +-
.../tests/data/generate_storage_from_git.py | 3 +-
.../data/history_graphs_with-merges_visits-01.yaml | 55 +++
swh/provenance/tests/data/with-merges.msgpack | Bin 0 -> 7501 bytes
...repo_with_merges.yaml => with-merges_repo.yaml} | 0
...s-visits-01.yaml => with-merges_visits-01.yaml} | 0
swh/provenance/tests/test_archive_interface.py | 51 +++
swh/provenance/tests/test_conftest.py | 2 +-
swh/provenance/tests/test_history_graph.py | 62 ++++
swh/provenance/tests/test_origin_iterator.py | 8 +-
swh/provenance/tests/test_provenance_db.py | 4 +-
swh/provenance/tests/test_provenance_heuristics.py | 56 +--
27 files changed, 1184 insertions(+), 740 deletions(-)
create mode 100644 swh/provenance/backend.py
create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
create mode 100644 swh/provenance/tests/data/with-merges.msgpack
rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
create mode 100644 swh/provenance/tests/test_archive_interface.py
create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit f819e4332df40b1ef35ff737f2558de570379473
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 19:21:58 2021 +0200
Add `ProvenanceStorageInterface` as discussed during backend design
Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
commit 3672235c3258cf93fb37a82d060bf40ba1761b8b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:37:50 2021 +0200
Move `ProvenanceBackend` implementation to a separate file
commit 6f4da6fed7e663273627ad4a46c8489ef0a0e784
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jul 1 13:47:26 2021 +0200
Use `RealDictCursor` in `ProvenanceDBBase`
to improve the way `ProvenanceResult`s are generated.
commit 07a30e43a76e170ab03764035da68dcf7db1fc3b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:28:32 2021 +0200
Rework `ProvenanceInterface` as discussed during backend design
Add `ProvenanceResult` class to be returned by `content_find_first` and
`content_find_all` methods. Rename some methods. Improve type annotations.
commit 2fd3f56b57f8db6691ae6b8b7cb7ac557b764172
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:31:16 2021 +0200
Add tests for history graph topology
commit d45d6ff9e9317ecfe38d584df7297c548b654d28
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:10:38 2021 +0200
Fix database queries related to the origin-revision layer
This required allowing null dates in the `revision` table so that a revision can be added
by the origin-revision layer algorithm but not recognized as already processed by the
revision-content layer. Revision and origin entries are now inserted in the database
prior to inserting rows to revision_in_origin and revision_before_revision relations,
so that internal ids are properly resolved.
commit 0e2a3c64ce3c368b53c101c541e8aebcde789477
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Jun 25 13:38:26 2021 +0200
Add test to compare both `ArchiveInterface` implementations
Improve documentation of the interface and complete pending TODO's.
commit 98bba93cccece2b47ec4cd5887997cb5bede1e87
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:25:15 2021 +0200
Rename test files to keep naming convention
Also added missing .msgpack file dump for new with-merges repository.
commit fa9198afb71bcf3b8abea07d88d763a430f7358e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:05:24 2021 +0200
Refactor `ArchiveInterface` to fit origin-revision layer needs
Replace `revision_get` method by `revision_get_parents` returning an iterable of
parents' ids only, instead of a swh.model.model.Revision object.
commit 9e0c1aa099073887206c9334e17b49ee31bbef9a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 20:00:40 2021 +0200
Use `Sha1Git` type to explicitly state the kind of identifiers
Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
Also, some bytes conversion methods were replaced by their counterparts in
the swh.model.hashutil module.
commit a27ffff67b6b14bf37d153bb9b1d1c2ae63773fc
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 19:12:06 2021 +0200
Add support for sha1 identifiers for origins
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/230/ for more details.
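The `ArchiveInterface` refactor in the commit list above replaces `revision_get` with `revision_get_parents`, which returns only the parents' ids. The following is a minimal sketch of what such a method could look like, not the code from this diff: the `ArchiveLike` protocol, the `InMemoryArchive` class, the `ancestors` helper, and the `Sha1Git = bytes` alias are all hypothetical names introduced for illustration.

```python
from typing import Dict, Iterable, List, Protocol

# Assumption: a plain bytes alias standing in for the real Sha1Git type.
Sha1Git = bytes


class ArchiveLike(Protocol):
    """Hypothetical, reduced view of an archive interface."""

    def revision_get_parents(self, id: Sha1Git) -> Iterable[Sha1Git]:
        """Yield only the ids of the parents of revision `id`."""
        ...


class InMemoryArchive:
    """Toy implementation backed by a dict, for illustration only."""

    def __init__(self, parents: Dict[Sha1Git, List[Sha1Git]]) -> None:
        self._parents = parents

    def revision_get_parents(self, id: Sha1Git) -> Iterable[Sha1Git]:
        # Unknown revisions simply have no parents in this sketch.
        yield from self._parents.get(id, [])


def ancestors(archive: ArchiveLike, head: Sha1Git) -> List[Sha1Git]:
    """Walk the history from `head` using parent ids only."""
    seen: List[Sha1Git] = []
    stack = [head]
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.append(rev)
        stack.extend(archive.revision_get_parents(rev))
    return seen
```

Returning ids rather than full revision objects keeps a history walk like `ancestors` independent of how each archive implementation stores revisions.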
Build is green
Patch application report for D5947 (id=21446)
Could not rebase; Attempt merge onto d892b29e40...
Updating d892b29..7998391
Fast-forward
swh/provenance/__init__.py | 47 ++-
swh/provenance/archive.py | 24 +-
swh/provenance/backend.py | 324 +++++++++++++++++
swh/provenance/cli.py | 28 +-
swh/provenance/graph.py | 4 +-
swh/provenance/model.py | 53 ++-
swh/provenance/origin.py | 21 +-
swh/provenance/postgresql/archive.py | 115 ++----
swh/provenance/postgresql/provenancedb_base.py | 402 ++++++++++++---------
.../postgresql/provenancedb_with_path.py | 117 +++---
.../postgresql/provenancedb_without_path.py | 96 ++---
swh/provenance/provenance.py | 349 +++++++++---------
swh/provenance/revision.py | 13 +-
swh/provenance/sql/30-schema.sql | 30 +-
swh/provenance/storage/archive.py | 30 +-
swh/provenance/tests/conftest.py | 32 +-
.../tests/data/generate_storage_from_git.py | 3 +-
.../data/history_graphs_with-merges_visits-01.yaml | 55 +++
swh/provenance/tests/data/with-merges.msgpack | Bin 0 -> 7501 bytes
...repo_with_merges.yaml => with-merges_repo.yaml} | 0
...s-visits-01.yaml => with-merges_visits-01.yaml} | 0
swh/provenance/tests/test_archive_interface.py | 51 +++
swh/provenance/tests/test_conftest.py | 2 +-
swh/provenance/tests/test_history_graph.py | 62 ++++
swh/provenance/tests/test_origin_iterator.py | 8 +-
swh/provenance/tests/test_provenance_db.py | 4 +-
swh/provenance/tests/test_provenance_heuristics.py | 56 +--
27 files changed, 1186 insertions(+), 740 deletions(-)
create mode 100644 swh/provenance/backend.py
create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
create mode 100644 swh/provenance/tests/data/with-merges.msgpack
rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
create mode 100644 swh/provenance/tests/test_archive_interface.py
create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit 799839120cb99f22ce4272468ae0e388c335fb06
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 19:21:58 2021 +0200
Add `ProvenanceStorageInterface` as discussed during backend design
Rework backend-related classes to properly use the new interface.
Adapt tests to the new structure as well.
commit 7c0a091ce5ffbf0a02dbe9d7fc84435ddd46cde2
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:37:50 2021 +0200
Move `ProvenanceBackend` implementation to a separate file
commit 34898ad3cb18c24a7d7bef79dcfe470c3a1374ef
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jul 1 13:47:26 2021 +0200
Use `RealDictCursor` in `ProvenanceDBBase`
to improve the way `ProvenanceResult`s are generated.
commit 721354c436b5f5a861800b11e6151afa1aa634b6
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Mon Jun 28 14:28:32 2021 +0200
Rework `ProvenanceInterface` as discussed during backend design
Add `ProvenanceResult` class to be returned by `content_find_first` and
`content_find_all` methods. Rename some methods. Improve type annotations.
commit 01f8d40ffccbcab6ecec6c2cf85478364e006caa
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:31:16 2021 +0200
Add tests for history graph topology
commit b7fdcdec7ea96101d62a57d9aeed114c897df961
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:10:38 2021 +0200
Fix database queries related to the origin-revision layer
This required allowing null dates in the `revision` table so that a revision can be added
by the origin-revision layer algorithm but not recognized as already processed by the
revision-content layer. Revision and origin entries are now inserted in the database
prior to inserting rows to revision_in_origin and revision_before_revision relations,
so that internal ids are properly resolved.
commit 0e2a3c64ce3c368b53c101c541e8aebcde789477
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Fri Jun 25 13:38:26 2021 +0200
Add test to compare both `ArchiveInterface` implementations
Improve documentation of the interface and complete pending TODO's.
commit 98bba93cccece2b47ec4cd5887997cb5bede1e87
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:25:15 2021 +0200
Rename test files to keep naming convention
Also added missing .msgpack file dump for new with-merges repository.
commit fa9198afb71bcf3b8abea07d88d763a430f7358e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Thu Jun 24 16:05:24 2021 +0200
Refactor `ArchiveInterface` to fit origin-revision layer needs
Replace `revision_get` method by `revision_get_parents` returning an iterable of
parents' ids only, instead of a swh.model.model.Revision object.
commit 9e0c1aa099073887206c9334e17b49ee31bbef9a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 20:00:40 2021 +0200
Use `Sha1Git` type to explicitly state the kind of identifiers
Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
Also, some bytes conversion methods were replaced by their counterparts in
the swh.model.hashutil module.
commit a27ffff67b6b14bf37d153bb9b1d1c2ae63773fc
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date: Wed Jun 23 19:12:06 2021 +0200
Add support for sha1 identifiers for origins
See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/244/ for more details.
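The origin-revision fix in the commit list above relies on two ideas: a revision may exist with a null date, so the revision-content layer does not treat it as already processed, and entity rows are inserted before relation rows so that internal ids resolve. The following sketch illustrates both using `sqlite3` as a stand-in for PostgreSQL. The column layout and the helpers `add_revision_in_origin` and `is_processed` are assumptions for illustration, not the schema or code from this diff.

```python
import sqlite3

# Assumption: simplified tables; only the table names come from the commit message.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE revision (
        id   INTEGER PRIMARY KEY,
        sha1 BLOB UNIQUE NOT NULL,
        date TEXT              -- NULL: known, but not yet processed
    );
    CREATE TABLE origin (
        id  INTEGER PRIMARY KEY,
        url TEXT UNIQUE NOT NULL
    );
    CREATE TABLE revision_in_origin (
        revision INTEGER NOT NULL REFERENCES revision(id),
        origin   INTEGER NOT NULL REFERENCES origin(id),
        PRIMARY KEY (revision, origin)
    );
    """
)


def add_revision_in_origin(conn, sha1: bytes, url: str) -> None:
    # Insert the entities first so their internal ids exist...
    conn.execute("INSERT OR IGNORE INTO revision (sha1) VALUES (?)", (sha1,))
    conn.execute("INSERT OR IGNORE INTO origin (url) VALUES (?)", (url,))
    # ...then resolve those ids while inserting the relation row.
    conn.execute(
        """
        INSERT OR IGNORE INTO revision_in_origin (revision, origin)
        SELECT r.id, o.id FROM revision AS r, origin AS o
        WHERE r.sha1 = ? AND o.url = ?
        """,
        (sha1, url),
    )


def is_processed(conn, sha1: bytes) -> bool:
    # A revision counts as processed only once it has a date.
    row = conn.execute(
        "SELECT date FROM revision WHERE sha1 = ?", (sha1,)
    ).fetchone()
    return row is not None and row[0] is not None
```

In this sketch, a revision added through `add_revision_in_origin` stays unprocessed until some later step sets its `date`.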