Improve typing annotations for `origin` and `revision` modules
Closed, Public

Authored by aeviso on Jul 1 2021, 5:51 PM.

Details

Summary

This implies fixing `RevisionCSVIterator` and updating the CLI and related tests.

Depends on D5957
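
For readers unfamiliar with the module, here is a rough sketch of what a fully annotated CSV-backed revision iterator could look like. It is not the code in this diff: `TypedRevisionCSVIterator`, `RevisionRow`, the local `Sha1Git` alias, and the assumed column layout (revision id in hex, ISO 8601 date, root directory id in hex) are all hypothetical stand-ins chosen for illustration.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Sequence

# Stand-in alias for illustration only; swh.model provides its own Sha1Git type.
Sha1Git = bytes


@dataclass(frozen=True)
class RevisionRow:
    """Hypothetical, simplified record for one revision read from CSV."""

    id: Sha1Git
    date: datetime
    root: Sha1Git


class TypedRevisionCSVIterator:
    """Yield RevisionRow objects from rows of (hex id, ISO date, hex root).

    The column layout is an assumption made for this sketch.
    """

    def __init__(self, rows: Iterable[Sequence[str]]) -> None:
        self._rows: Iterator[Sequence[str]] = iter(rows)

    def __iter__(self) -> Iterator[RevisionRow]:
        return self

    def __next__(self) -> RevisionRow:
        # StopIteration from the underlying iterator ends this one too.
        rev_hex, date_iso, root_hex = next(self._rows)
        return RevisionRow(
            id=bytes.fromhex(rev_hex),
            date=datetime.fromisoformat(date_iso),
            root=bytes.fromhex(root_hex),
        )


# Usage: wrap a csv.reader so callers receive typed records, not raw strings.
sample = io.StringIO("00" * 20 + ",2021-07-01T17:48:12+02:00," + "ff" * 20 + "\n")
for row in TypedRevisionCSVIterator(csv.reader(sample)):
    print(row.id.hex(), row.date.isoformat(), row.root.hex())
```

Converting hex strings and dates at the iterator boundary lets a type checker verify downstream code that expects `Sha1Git` identifiers and `datetime` values.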

Event Timeline

Build is green

Patch application report for D5958 (id=21411)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..177dd64
Fast-forward
 swh/provenance/__init__.py                         |  57 +--
 swh/provenance/archive.py                          |  25 +-
 swh/provenance/backend.py                          | 357 ++++++++++++++++++
 swh/provenance/cli.py                              |  79 ++--
 swh/provenance/graph.py                            |   4 +-
 swh/provenance/model.py                            |  53 ++-
 swh/provenance/origin.py                           |  33 +-
 swh/provenance/postgresql/archive.py               | 126 +++----
 swh/provenance/postgresql/db_utils.py              |  61 ----
 swh/provenance/postgresql/provenancedb_base.py     | 404 ++++++++++++---------
 .../postgresql/provenancedb_with_path.py           | 117 +++---
 .../postgresql/provenancedb_without_path.py        |  96 ++---
 swh/provenance/provenance.py                       | 349 +++++++++---------
 swh/provenance/revision.py                         |  43 +--
 swh/provenance/sql/30-schema.sql                   |  30 +-
 swh/provenance/storage/archive.py                  |  39 +-
 swh/provenance/tests/conftest.py                   |  32 +-
 .../tests/data/generate_storage_from_git.py        |   3 +-
 .../data/history_graphs_with-merges_visits-01.yaml |  55 +++
 swh/provenance/tests/data/with-merges.msgpack      | Bin 0 -> 7501 bytes
 ...repo_with_merges.yaml => with-merges_repo.yaml} |   0
 ...s-visits-01.yaml => with-merges_visits-01.yaml} |   0
 swh/provenance/tests/test_archive_interface.py     |  50 +++
 swh/provenance/tests/test_cli.py                   |   4 +-
 swh/provenance/tests/test_conftest.py              |   2 +-
 swh/provenance/tests/test_history_graph.py         |  62 ++++
 swh/provenance/tests/test_origin_iterator.py       |   8 +-
 swh/provenance/tests/test_provenance_db.py         |   4 +-
 swh/provenance/tests/test_provenance_heuristics.py |  56 +--
 swh/provenance/tests/test_revision_iterator.py     |   3 +-
 30 files changed, 1302 insertions(+), 850 deletions(-)
 create mode 100644 swh/provenance/backend.py
 delete mode 100644 swh/provenance/postgresql/db_utils.py
 create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
 create mode 100644 swh/provenance/tests/data/with-merges.msgpack
 rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
 rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
 create mode 100644 swh/provenance/tests/test_archive_interface.py
 create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit 177dd6488e3255e65e95b6fddee56978036d3e54
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:48:12 2021 +0200

    Improve typing annotations for `origin` and `revision` modules
    
    This implie fixing `RevisionCSVIterator`, and updating cli and related tests.

commit 4f0da58cde268c8c378f9866d9325d7f3a98bd8c
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:34:08 2021 +0200

    Remove `db_utils` in favour of `swh.core.db.BaseDb` methods
    
    This allows to remove some unnecessary `psycopg2` dependencies in modules that
    don't deal directly with SQL queries.

commit 0c3c7817cfd267bf3b77ce2e594ea7feb3ec96d5
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Tue Jun 29 14:28:54 2021 +0200

    Force `snapshot_get_heads` to return revisions in chronological order

commit afb67f665ab00c03c0ca33e96b1bfc109c827c58
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 19:21:58 2021 +0200

    Add `ProvenanceStorageInterface` as discussed during backend design
    
    Rework backend-related classes to properly use the new interface.
    Adapt tests to the new structure as well.

commit 3672235c3258cf93fb37a82d060bf40ba1761b8b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:37:50 2021 +0200

    Move `ProvenanceBackend` implementation to a separate file

commit 6f4da6fed7e663273627ad4a46c8489ef0a0e784
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 13:47:26 2021 +0200

    Use `RealDictCursor` in `ProvenanceDBBase`
    
    to improve the way `ProvenanceResult`s are generated.

commit 07a30e43a76e170ab03764035da68dcf7db1fc3b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:28:32 2021 +0200

    Rework `ProvenanceInterface` as discussed during backend design
    
    Add `ProvenanceResult` class to be returned by `content_find_first` and
    `content_find_all` methods. Rename some methods. Improve type annotations.

commit 2fd3f56b57f8db6691ae6b8b7cb7ac557b764172
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:31:16 2021 +0200

    Add tests for history graph topology

commit d45d6ff9e9317ecfe38d584df7297c548b654d28
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:10:38 2021 +0200

    Fix database queries related to the origin-revision layer
    
    This required allowing null dates in the `revision` table so that revision can be added
    by the origin-revision layer algorithm but not recognized as already processed by the
    revision-content layer. Revision and origin entries are now inserted in the database
    prior to inserting rows to revision_in_origin and revision_before_revision relations,
    so that internal ids are properly resolved.

commit 0e2a3c64ce3c368b53c101c541e8aebcde789477
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Jun 25 13:38:26 2021 +0200

    Add test to compare both `ArchiveInterface` implementations
    
    Improve documentation of the interface and complete pending TODO's.

commit 98bba93cccece2b47ec4cd5887997cb5bede1e87
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:25:15 2021 +0200

    Rename test files to keep naming convension
    
    Also added missing .msgpack file dump for new with-merges repository.

commit fa9198afb71bcf3b8abea07d88d763a430f7358e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:05:24 2021 +0200

    Refactor `ArchiveInterface` to fit origin-revision layer needs
    
    Replace `revision_get` method by `revision_get_parents` returning an iterable of
    parents' ids only, instead of a swh.model.model.Revision object.

commit 9e0c1aa099073887206c9334e17b49ee31bbef9a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 20:00:40 2021 +0200

    Use `Sha1Git` type to explicitly state the kind of identifiers
    
    Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
    Also, some bytes conversion methods were replaced by their counterparts in
    the swh.model.hashutil module.

commit a27ffff67b6b14bf37d153bb9b1d1c2ae63773fc
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 19:12:06 2021 +0200

    Add support for sha1 identifiers for origins

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/224/ for more details.
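
One commit in the log above replaces `revision_get` with `revision_get_parents`, which returns only the parents' ids. The sketch below illustrates that interface shape under stated assumptions: `ArchiveLike`, `InMemoryArchive`, and `ancestors` are hypothetical names, `Sha1Git` is a local stand-in alias, and the real `ArchiveInterface` may differ in signature and behaviour.

```python
from typing import Dict, Iterable, List, Protocol, Set

# Stand-in alias for illustration only; swh.model provides its own Sha1Git type.
Sha1Git = bytes


class ArchiveLike(Protocol):
    """Hypothetical minimal interface exposing parent ids only."""

    def revision_get_parents(self, id: Sha1Git) -> Iterable[Sha1Git]:
        ...


class InMemoryArchive:
    """Toy implementation backed by a dict, for illustration and tests."""

    def __init__(self, parents: Dict[Sha1Git, List[Sha1Git]]) -> None:
        self._parents = parents

    def revision_get_parents(self, id: Sha1Git) -> Iterable[Sha1Git]:
        yield from self._parents.get(id, [])


def ancestors(archive: ArchiveLike, head: Sha1Git) -> List[Sha1Git]:
    """Return head and all its ancestors, visiting each revision once."""
    seen: Set[Sha1Git] = set()
    stack: List[Sha1Git] = [head]
    order: List[Sha1Git] = []
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue  # merges can reach the same ancestor twice
        seen.add(rev)
        order.append(rev)
        stack.extend(archive.revision_get_parents(rev))
    return order


# Usage: a small history with one merge (m has parents a and b; both fork from r).
r, a, b, m = b"r", b"a", b"b", b"m"
archive = InMemoryArchive({m: [a, b], a: [r], b: [r], r: []})
print([rev.decode() for rev in ancestors(archive, m)])
```

A history walk like this needs nothing beyond parent ids, which is consistent with the commit's choice to stop returning full `swh.model.model.Revision` objects.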

aeviso requested review of this revision. Jul 1 2021, 5:55 PM

Build is green

Patch application report for D5958 (id=21429)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..ffa4cb9
Fast-forward
 swh/provenance/__init__.py                         |  57 +--
 swh/provenance/archive.py                          |  25 +-
 swh/provenance/backend.py                          | 322 ++++++++++++++++
 swh/provenance/cli.py                              |  79 ++--
 swh/provenance/graph.py                            |   4 +-
 swh/provenance/model.py                            |  53 ++-
 swh/provenance/origin.py                           |  33 +-
 swh/provenance/postgresql/archive.py               | 126 +++----
 swh/provenance/postgresql/db_utils.py              |  61 ----
 swh/provenance/postgresql/provenancedb_base.py     | 404 ++++++++++++---------
 .../postgresql/provenancedb_with_path.py           | 117 +++---
 .../postgresql/provenancedb_without_path.py        |  96 ++---
 swh/provenance/provenance.py                       | 349 +++++++++---------
 swh/provenance/revision.py                         |  43 +--
 swh/provenance/sql/30-schema.sql                   |  30 +-
 swh/provenance/storage/archive.py                  |  39 +-
 swh/provenance/tests/conftest.py                   |  32 +-
 .../tests/data/generate_storage_from_git.py        |   3 +-
 .../data/history_graphs_with-merges_visits-01.yaml |  55 +++
 swh/provenance/tests/data/with-merges.msgpack      | Bin 0 -> 7501 bytes
 ...repo_with_merges.yaml => with-merges_repo.yaml} |   0
 ...s-visits-01.yaml => with-merges_visits-01.yaml} |   0
 swh/provenance/tests/test_archive_interface.py     |  50 +++
 swh/provenance/tests/test_cli.py                   |   4 +-
 swh/provenance/tests/test_conftest.py              |   2 +-
 swh/provenance/tests/test_history_graph.py         |  62 ++++
 swh/provenance/tests/test_origin_iterator.py       |   8 +-
 swh/provenance/tests/test_provenance_db.py         |   4 +-
 swh/provenance/tests/test_provenance_heuristics.py |  56 +--
 swh/provenance/tests/test_revision_iterator.py     |   3 +-
 30 files changed, 1267 insertions(+), 850 deletions(-)
 create mode 100644 swh/provenance/backend.py
 delete mode 100644 swh/provenance/postgresql/db_utils.py
 create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
 create mode 100644 swh/provenance/tests/data/with-merges.msgpack
 rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
 rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
 create mode 100644 swh/provenance/tests/test_archive_interface.py
 create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit ffa4cb98c6aa99fcc2e43c8dea0543c2b2379d8b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:48:12 2021 +0200

    Improve typing annotations for `origin` and `revision` modules
    
    This implie fixing `RevisionCSVIterator`, and updating cli and related tests.

commit a4eb765f92d25d45ddea178aaeebc375d69e3830
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:34:08 2021 +0200

    Remove `db_utils` in favour of `swh.core.db.BaseDb` methods
    
    This allows to remove some unnecessary `psycopg2` dependencies in modules that
    don't deal directly with SQL queries.

commit 2a1aeb023627ed31ee4eea617790607e1fa1ba68
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Tue Jun 29 14:28:54 2021 +0200

    Force `snapshot_get_heads` to return revisions in chronological order

commit f819e4332df40b1ef35ff737f2558de570379473
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 19:21:58 2021 +0200

    Add `ProvenanceStorageInterface` as discussed during backend design
    
    Rework backend-related classes to properly use the new interface.
    Adapt tests to the new structure as well.

commit 3672235c3258cf93fb37a82d060bf40ba1761b8b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:37:50 2021 +0200

    Move `ProvenanceBackend` implementation to a separate file

commit 6f4da6fed7e663273627ad4a46c8489ef0a0e784
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 13:47:26 2021 +0200

    Use `RealDictCursor` in `ProvenanceDBBase`
    
    to improve the way `ProvenanceResult`s are generated.

commit 07a30e43a76e170ab03764035da68dcf7db1fc3b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:28:32 2021 +0200

    Rework `ProvenanceInterface` as discussed during backend design
    
    Add `ProvenanceResult` class to be returned by `content_find_first` and
    `content_find_all` methods. Rename some methods. Improve type annotations.

commit 2fd3f56b57f8db6691ae6b8b7cb7ac557b764172
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:31:16 2021 +0200

    Add tests for history graph topology

commit d45d6ff9e9317ecfe38d584df7297c548b654d28
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:10:38 2021 +0200

    Fix database queries related to the origin-revision layer
    
    This required allowing null dates in the `revision` table so that revision can be added
    by the origin-revision layer algorithm but not recognized as already processed by the
    revision-content layer. Revision and origin entries are now inserted in the database
    prior to inserting rows to revision_in_origin and revision_before_revision relations,
    so that internal ids are properly resolved.

commit 0e2a3c64ce3c368b53c101c541e8aebcde789477
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Jun 25 13:38:26 2021 +0200

    Add test to compare both `ArchiveInterface` implementations
    
    Improve documentation of the interface and complete pending TODO's.

commit 98bba93cccece2b47ec4cd5887997cb5bede1e87
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:25:15 2021 +0200

    Rename test files to keep naming convension
    
    Also added missing .msgpack file dump for new with-merges repository.

commit fa9198afb71bcf3b8abea07d88d763a430f7358e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:05:24 2021 +0200

    Refactor `ArchiveInterface` to fit origin-revision layer needs
    
    Replace `revision_get` method by `revision_get_parents` returning an iterable of
    parents' ids only, instead of a swh.model.model.Revision object.

commit 9e0c1aa099073887206c9334e17b49ee31bbef9a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 20:00:40 2021 +0200

    Use `Sha1Git` type to explicitly state the kind of identifiers
    
    Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
    Also, some bytes conversion methods were replaced by their counterparts in
    the swh.model.hashutil module.

commit a27ffff67b6b14bf37d153bb9b1d1c2ae63773fc
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 19:12:06 2021 +0200

    Add support for sha1 identifiers for origins

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/233/ for more details.

Build is green

Patch application report for D5958 (id=21436)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..fec5736
Fast-forward
 swh/provenance/__init__.py                         |  57 +--
 swh/provenance/archive.py                          |  25 +-
 swh/provenance/backend.py                          | 322 ++++++++++++++++
 swh/provenance/cli.py                              |  80 ++--
 swh/provenance/graph.py                            |   4 +-
 swh/provenance/model.py                            |  53 ++-
 swh/provenance/origin.py                           |  33 +-
 swh/provenance/postgresql/archive.py               | 126 +++----
 swh/provenance/postgresql/db_utils.py              |  61 ----
 swh/provenance/postgresql/provenancedb_base.py     | 404 ++++++++++++---------
 .../postgresql/provenancedb_with_path.py           | 117 +++---
 .../postgresql/provenancedb_without_path.py        |  96 ++---
 swh/provenance/provenance.py                       | 349 +++++++++---------
 swh/provenance/revision.py                         |  40 +-
 swh/provenance/sql/30-schema.sql                   |  30 +-
 swh/provenance/storage/archive.py                  |  39 +-
 swh/provenance/tests/conftest.py                   |  32 +-
 .../tests/data/generate_storage_from_git.py        |   3 +-
 .../data/history_graphs_with-merges_visits-01.yaml |  55 +++
 swh/provenance/tests/data/with-merges.msgpack      | Bin 0 -> 7501 bytes
 ...repo_with_merges.yaml => with-merges_repo.yaml} |   0
 ...s-visits-01.yaml => with-merges_visits-01.yaml} |   0
 swh/provenance/tests/test_archive_interface.py     |  50 +++
 swh/provenance/tests/test_cli.py                   |   4 +-
 swh/provenance/tests/test_conftest.py              |   2 +-
 swh/provenance/tests/test_history_graph.py         |  62 ++++
 swh/provenance/tests/test_origin_iterator.py       |   8 +-
 swh/provenance/tests/test_provenance_db.py         |   4 +-
 swh/provenance/tests/test_provenance_heuristics.py |  56 +--
 swh/provenance/tests/test_revision_iterator.py     |   3 +-
 30 files changed, 1264 insertions(+), 851 deletions(-)
 create mode 100644 swh/provenance/backend.py
 delete mode 100644 swh/provenance/postgresql/db_utils.py
 create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
 create mode 100644 swh/provenance/tests/data/with-merges.msgpack
 rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
 rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
 create mode 100644 swh/provenance/tests/test_archive_interface.py
 create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit fec57362bf581758fb52ee1b3e6b908e5ee52a51
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:48:12 2021 +0200

    Improve typing annotations for `origin` and `revision` modules
    
    This implie fixing `RevisionCSVIterator`, and updating cli and related tests.

commit a4eb765f92d25d45ddea178aaeebc375d69e3830
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:34:08 2021 +0200

    Remove `db_utils` in favour of `swh.core.db.BaseDb` methods
    
    This allows to remove some unnecessary `psycopg2` dependencies in modules that
    don't deal directly with SQL queries.

commit 2a1aeb023627ed31ee4eea617790607e1fa1ba68
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Tue Jun 29 14:28:54 2021 +0200

    Force `snapshot_get_heads` to return revisions in chronological order

commit f819e4332df40b1ef35ff737f2558de570379473
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 19:21:58 2021 +0200

    Add `ProvenanceStorageInterface` as discussed during backend design
    
    Rework backend-related classes to properly use the new interface.
    Adapt tests to the new structure as well.

commit 3672235c3258cf93fb37a82d060bf40ba1761b8b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:37:50 2021 +0200

    Move `ProvenanceBackend` implementation to a separate file

commit 6f4da6fed7e663273627ad4a46c8489ef0a0e784
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 13:47:26 2021 +0200

    Use `RealDictCursor` in `ProvenanceDBBase`
    
    to improve the way `ProvenanceResult`s are generated.

commit 07a30e43a76e170ab03764035da68dcf7db1fc3b
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:28:32 2021 +0200

    Rework `ProvenanceInterface` as discussed during backend design
    
    Add `ProvenanceResult` class to be returned by `content_find_first` and
    `content_find_all` methods. Rename some methods. Improve type annotations.

commit 2fd3f56b57f8db6691ae6b8b7cb7ac557b764172
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:31:16 2021 +0200

    Add tests for history graph topology

commit d45d6ff9e9317ecfe38d584df7297c548b654d28
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:10:38 2021 +0200

    Fix database queries related to the origin-revision layer
    
    This required allowing null dates in the `revision` table so that revision can be added
    by the origin-revision layer algorithm but not recognized as already processed by the
    revision-content layer. Revision and origin entries are now inserted in the database
    prior to inserting rows to revision_in_origin and revision_before_revision relations,
    so that internal ids are properly resolved.

commit 0e2a3c64ce3c368b53c101c541e8aebcde789477
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Jun 25 13:38:26 2021 +0200

    Add test to compare both `ArchiveInterface` implementations
    
    Improve documentation of the interface and complete pending TODO's.

commit 98bba93cccece2b47ec4cd5887997cb5bede1e87
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:25:15 2021 +0200

    Rename test files to keep naming convension
    
    Also added missing .msgpack file dump for new with-merges repository.

commit fa9198afb71bcf3b8abea07d88d763a430f7358e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:05:24 2021 +0200

    Refactor `ArchiveInterface` to fit origin-revision layer needs
    
    Replace `revision_get` method by `revision_get_parents` returning an iterable of
    parents' ids only, instead of a swh.model.model.Revision object.

commit 9e0c1aa099073887206c9334e17b49ee31bbef9a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 20:00:40 2021 +0200

    Use `Sha1Git` type to explicitly state the kind of identifiers
    
    Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
    Also, some bytes conversion methods were replaced by their counterparts in
    the swh.model.hashutil module.

commit a27ffff67b6b14bf37d153bb9b1d1c2ae63773fc
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 19:12:06 2021 +0200

    Add support for sha1 identifiers for origins

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/236/ for more details.

This revision is now accepted and ready to land. Jul 2 2021, 3:13 PM

Build is green

Patch application report for D5958 (id=21449)

Could not rebase; Attempt merge onto d892b29e40...

Updating d892b29..2483b17
Fast-forward
 swh/provenance/__init__.py                         |  57 +--
 swh/provenance/archive.py                          |  25 +-
 swh/provenance/backend.py                          | 324 +++++++++++++++++
 swh/provenance/cli.py                              |  80 ++--
 swh/provenance/graph.py                            |   4 +-
 swh/provenance/model.py                            |  53 ++-
 swh/provenance/origin.py                           |  33 +-
 swh/provenance/postgresql/archive.py               | 126 +++----
 swh/provenance/postgresql/db_utils.py              |  61 ----
 swh/provenance/postgresql/provenancedb_base.py     | 404 ++++++++++++---------
 .../postgresql/provenancedb_with_path.py           | 117 +++---
 .../postgresql/provenancedb_without_path.py        |  96 ++---
 swh/provenance/provenance.py                       | 349 +++++++++---------
 swh/provenance/revision.py                         |  40 +-
 swh/provenance/sql/30-schema.sql                   |  30 +-
 swh/provenance/storage/archive.py                  |  39 +-
 swh/provenance/tests/conftest.py                   |  32 +-
 .../tests/data/generate_storage_from_git.py        |   3 +-
 .../data/history_graphs_with-merges_visits-01.yaml |  55 +++
 swh/provenance/tests/data/with-merges.msgpack      | Bin 0 -> 7501 bytes
 ...repo_with_merges.yaml => with-merges_repo.yaml} |   0
 ...s-visits-01.yaml => with-merges_visits-01.yaml} |   0
 swh/provenance/tests/test_archive_interface.py     |  50 +++
 swh/provenance/tests/test_cli.py                   |   4 +-
 swh/provenance/tests/test_conftest.py              |   2 +-
 swh/provenance/tests/test_history_graph.py         |  62 ++++
 swh/provenance/tests/test_origin_iterator.py       |   8 +-
 swh/provenance/tests/test_provenance_db.py         |   4 +-
 swh/provenance/tests/test_provenance_heuristics.py |  56 +--
 swh/provenance/tests/test_revision_iterator.py     |   3 +-
 30 files changed, 1266 insertions(+), 851 deletions(-)
 create mode 100644 swh/provenance/backend.py
 delete mode 100644 swh/provenance/postgresql/db_utils.py
 create mode 100644 swh/provenance/tests/data/history_graphs_with-merges_visits-01.yaml
 create mode 100644 swh/provenance/tests/data/with-merges.msgpack
 rename swh/provenance/tests/data/{repo_with_merges.yaml => with-merges_repo.yaml} (100%)
 rename swh/provenance/tests/data/{repo_with_merges-visits-01.yaml => with-merges_visits-01.yaml} (100%)
 create mode 100644 swh/provenance/tests/test_archive_interface.py
 create mode 100644 swh/provenance/tests/test_history_graph.py
Changes applied before test
commit 2483b17fb2a8328c997c068d68c65975f2d7f9fb
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:48:12 2021 +0200

    Improve typing annotations for `origin` and `revision` modules
    
    This implie fixing `RevisionCSVIterator`, and updating cli and related tests.

commit 6617ca5615ec000b90890cbf51979c0311ff1ea5
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 17:34:08 2021 +0200

    Remove `db_utils` in favour of `swh.core.db.BaseDb` methods
    
    This allows to remove some unnecessary `psycopg2` dependencies in modules that
    don't deal directly with SQL queries.

commit c72b2e2428fe410c17220648d378c505bba35350
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Tue Jun 29 14:28:54 2021 +0200

    Force `snapshot_get_heads` to return revisions in chronological order

commit 799839120cb99f22ce4272468ae0e388c335fb06
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 19:21:58 2021 +0200

    Add `ProvenanceStorageInterface` as discussed during backend design
    
    Rework backend-related classes to properly use the new interface.
    Adapt tests to the new structure as well.

commit 7c0a091ce5ffbf0a02dbe9d7fc84435ddd46cde2
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:37:50 2021 +0200

    Move `ProvenanceBackend` implementation to a separate file

commit 34898ad3cb18c24a7d7bef79dcfe470c3a1374ef
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jul 1 13:47:26 2021 +0200

    Use `RealDictCursor` in `ProvenanceDBBase`
    
    to improve the way `ProvenanceResult`s are generated.

commit 721354c436b5f5a861800b11e6151afa1aa634b6
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Jun 28 14:28:32 2021 +0200

    Rework `ProvenanceInterface` as discussed during backend design
    
    Add `ProvenanceResult` class to be returned by `content_find_first` and
    `content_find_all` methods. Rename some methods. Improve type annotations.

commit 01f8d40ffccbcab6ecec6c2cf85478364e006caa
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:31:16 2021 +0200

    Add tests for history graph topology

commit b7fdcdec7ea96101d62a57d9aeed114c897df961
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:10:38 2021 +0200

    Fix database queries related to the origin-revision layer
    
    This required allowing null dates in the `revision` table so that revision can be added
    by the origin-revision layer algorithm but not recognized as already processed by the
    revision-content layer. Revision and origin entries are now inserted in the database
    prior to inserting rows to revision_in_origin and revision_before_revision relations,
    so that internal ids are properly resolved.

commit 0e2a3c64ce3c368b53c101c541e8aebcde789477
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Jun 25 13:38:26 2021 +0200

    Add test to compare both `ArchiveInterface` implementations
    
    Improve documentation of the interface and complete pending TODO's.

commit 98bba93cccece2b47ec4cd5887997cb5bede1e87
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:25:15 2021 +0200

    Rename test files to keep naming convension
    
    Also added missing .msgpack file dump for new with-merges repository.

commit fa9198afb71bcf3b8abea07d88d763a430f7358e
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Jun 24 16:05:24 2021 +0200

    Refactor `ArchiveInterface` to fit origin-revision layer needs
    
    Replace `revision_get` method by `revision_get_parents` returning an iterable of
    parents' ids only, instead of a swh.model.model.Revision object.

commit 9e0c1aa099073887206c9334e17b49ee31bbef9a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 20:00:40 2021 +0200

    Use `Sha1Git` type to explicitly state the kind of identifiers
    
    Previous occurrences of `bytes` and `Sha1` are now correctly using `Sha1Git`.
    Also, some bytes conversion methods were replaced by their counterparts in
    the swh.model.hashutil module.

commit a27ffff67b6b14bf37d153bb9b1d1c2ae63773fc
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Jun 23 19:12:06 2021 +0200

    Add support for sha1 identifiers for origins

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/247/ for more details.