Page MenuHomeSoftware Heritage

Add `close` method to both `ProvenanceInterface` and `ProvenanceStorageInterface`
ClosedPublic

Authored by aeviso on Sep 24 2021, 11:18 AM.

Details

Summary

The idea is to have a mechanism to explicitly release resources when needed.

Depends on D6272.

Diff Detail

Event Timeline

Build is green

Patch application report for D6334 (id=23036)

Could not rebase; Attempt merge onto 4c087ea0ec...

Updating 4c087ea..18f131c
Fast-forward
 swh/provenance/__init__.py              | 21 ++++++++++++++-------
 swh/provenance/api/client.py            |  2 +-
 swh/provenance/api/server.py            |  6 +++---
 swh/provenance/cli.py                   |  8 ++++++--
 swh/provenance/interface.py             |  9 +++++++++
 swh/provenance/mongo/backend.py         |  3 +++
 swh/provenance/postgresql/provenance.py |  3 +++
 swh/provenance/provenance.py            |  3 +++
 swh/provenance/tests/conftest.py        | 27 ++++++++++++++++-----------
 9 files changed, 58 insertions(+), 24 deletions(-)
Changes applied before test
commit 18f131c8c26f999e312a7133c7c7fd380c149b52
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Sep 24 11:08:08 2021 +0200

    Add `close` method to both `ProvenanceInterface` and `ProvenanceStorageInterface`
    
    The idea is to have a mechanism to explicitly release resources when needed.

commit 6c3071493b5d3f187113493275d402a27866da95
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Sep 15 16:14:10 2021 +0200

    Rename remote storage backend classes
    
    Make names consistent with the naming convention used for other components.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/369/ for more details.

Build is green

Patch application report for D6334 (id=23038)

Could not rebase; Attempt merge onto 4c087ea0ec...

Updating 4c087ea..437c2b4
Fast-forward
 swh/provenance/__init__.py              | 21 ++++++++++++++-------
 swh/provenance/api/client.py            |  2 +-
 swh/provenance/api/server.py            |  6 +++---
 swh/provenance/cli.py                   |  8 ++++++--
 swh/provenance/interface.py             |  9 +++++++++
 swh/provenance/mongo/backend.py         |  3 +++
 swh/provenance/postgresql/provenance.py |  3 +++
 swh/provenance/provenance.py            |  3 +++
 swh/provenance/tests/conftest.py        | 33 ++++++++++++++++++++-------------
 9 files changed, 62 insertions(+), 26 deletions(-)
Changes applied before test
commit 437c2b4ad60e15c0b102988a99766b2c503c91d5
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Sep 24 11:08:08 2021 +0200

    Add `close` method to both `ProvenanceInterface` and `ProvenanceStorageInterface`
    
    The idea is to have a mechanism to explicitly release resources when needed.

commit 6c3071493b5d3f187113493275d402a27866da95
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Sep 15 16:14:10 2021 +0200

    Rename remote storage backend classes
    
    Make names consistent with the naming convention used for other components.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/370/ for more details.

Thanks!

I still think that the postgres and mongodb close methods on ProvenanceStorage instances should be shutting down their respective database connections.

I remember that you didn't want to do that because currently the database connection is passed to the class opened already, which is at least consistent.

However, would it make sense to instead have the storage classes take connection parameters and handle connecting to the database themselves (and therefore having their close methods close the database connections)?

In D6334#164535, @olasd wrote:

Thanks!

I still think that the postgres and mongodb close methods on ProvenanceStorage instances should be shutting down their respective database connections.

I remember that you didn't want to do that because currently the database connection is passed to the class opened already, which is at least consistent.

However, would it make sense to instead have the storage classes take connection parameters and handle connecting to the database themselves (and therefore having their close methods close the database connections)?

It would make sense, even maybe having an open method for that so that and storage can be closed and reopened eventually. But the main reason behind not doing so for know is that we need to refactor how the different implementations are initialized.
In particular Mongo takes the database object because for testing we create a mock database instead of using pymongo which connects to a real server.

In a nutshell, it should be done but it will require some extra effort that I wanted to avoid to prioritize the remote storage implementation

Add open method and refactor storage classes initialization

Build is green

Patch application report for D6334 (id=23169)

Could not rebase; Attempt merge onto 4c087ea0ec...

Updating 4c087ea..846b20e
Fast-forward
 requirements.txt                        |  1 +
 swh/provenance/__init__.py              | 35 ++++++++++++----------
 swh/provenance/api/client.py            |  2 +-
 swh/provenance/api/server.py            |  7 +++--
 swh/provenance/cli.py                   | 14 +++++++--
 swh/provenance/interface.py             | 20 +++++++++++++
 swh/provenance/mongo/backend.py         | 19 ++++++++++--
 swh/provenance/postgresql/provenance.py | 18 +++++++-----
 swh/provenance/provenance.py            |  6 ++++
 swh/provenance/tests/conftest.py        | 52 ++++++++++++++++++++++-----------
 10 files changed, 125 insertions(+), 49 deletions(-)
Changes applied before test
commit 846b20e0e9995a13591a1641bf92036ff3764be5
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Sep 24 11:08:08 2021 +0200

    Add `open`/`close` methods to both `ProvenanceInterface` and `ProvenanceStorageInterface`
    
    The idea is to have a mechanism to explicitly allocate/release resources when needed.

commit 6c3071493b5d3f187113493275d402a27866da95
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Sep 15 16:14:10 2021 +0200

    Rename remote storage backend classes
    
    Make names consistent with the naming convention used for other components.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/394/ for more details.

douardda requested changes to this revision.Oct 1 2021, 2:07 PM
douardda added a subscriber: douardda.

Look to me that this open/close interface really should come with a context manager.

Then you can replace any piece of code (which appears quite some times in this diff) like:

provenance.open()
do_stuff()
provenance.close9)

by:

with provenance:
  do_stuff

Not even sure keeping open/close methods part of the public API in the interface makes sense.

This revision now requires changes to proceed.Oct 1 2021, 2:07 PM

turn backend classes into context managers

Build is green

Patch application report for D6334 (id=23268)

Could not rebase; Attempt merge onto 4c087ea0ec...

Updating 4c087ea..f0210c3
Fast-forward
 requirements-test.txt                   |  1 -
 requirements.txt                        |  2 +-
 swh/provenance/__init__.py              | 24 +++------
 swh/provenance/api/client.py            | 13 -----
 swh/provenance/api/server.py            | 92 +--------------------------------
 swh/provenance/cli.py                   | 71 ++++++++++++-------------
 swh/provenance/interface.py             | 47 ++++++++++++++++-
 swh/provenance/mongo/backend.py         | 37 +++++++++++--
 swh/provenance/postgresql/provenance.py | 36 ++++++++++---
 swh/provenance/provenance.py            | 22 +++++++-
 swh/provenance/tests/conftest.py        | 61 ++++++++--------------
 11 files changed, 192 insertions(+), 214 deletions(-)
Changes applied before test
commit f0210c3753c3a4122ee3c54f7fac97d170a142fa
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Sep 24 11:08:08 2021 +0200

    Add `open`/`close` methods to both `ProvenanceInterface` and `ProvenanceStorageInterface`
    
    This allows to have an explicit mechanism to allocate/release resources when needed.
    The necessary methods for the classes implementing these interfaces to be turned in contexts
    managers are added as well (ie. `__enter__`/`__exit__`).

commit 172e327c25883bee768a9c16b850ce6aab7e2eb2
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Wed Sep 15 16:14:10 2021 +0200

    Remove remote provenance storage based on `swh.core.api.RPCClient`
    
    This implementation was a first attempt for conflict resolution that didn't worked as expected.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/419/ for more details.

looks ok to me. Just one question, why do you need __future__.annotation?

This revision is now accepted and ready to land.Oct 5 2021, 9:34 AM

looks ok to me. Just one question, why do you need __future__.annotation?

For things like this to work (lines 70-71 in interface.py):

class ProvenanceStorageInterface(Protocol):
    def __enter__(self) -> ProvenanceStorageInterface: