Page MenuHomeSoftware Heritage

Rework `ProvenanceStorageInterface` to have a single add method per entity
ClosedPublic

Authored by aeviso on Sep 9 2021, 4:45 PM.

Details

Summary

All entities now have a single <entity>_add method in the interface, with optional
associated date if applicable. A new location_add is also added to the interface.
Extend all backend implementations to support the new methods. Add pre-condition to
relation_add (relying on the new methods) which simplifies its different implementations.

Diff Detail

Repository
rDPROV Provenance database
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build has FAILED

Patch application report for D6231 (id=22543)

Rebasing onto 4c2b0907a9...

Current branch diff-target is up to date.
Changes applied before test
commit a4a41ce4ac8b21f170a509ada464cfedeb4f9d05
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Sep 9 16:40:33 2021 +0200

    Extend `ProvenanceStorageInterface` with new methods
    
    Add new methods `entity_add` and `location_add` to the interface.
    Extend all backend implementations to support the new methods.
    Add pre-condition to `relation_add` (relying on the new methods)
    which simplifies its different implementations.

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/327/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/327/console

Harbormaster returned this revision to the author for changes because remote builds failed.Sep 9 2021, 4:48 PM
Harbormaster failed remote builds in B23488: Diff 22543!

Build is green

Patch application report for D6231 (id=22544)

Rebasing onto 4c2b0907a9...

Current branch diff-target is up to date.
Changes applied before test
commit e04501ce8fad122bfb22239041b636dbb37b7e85
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Sep 9 16:40:33 2021 +0200

    Extend `ProvenanceStorageInterface` with new methods
    
    Add new methods `entity_add` and `location_add` to the interface.
    Extend all backend implementations to support the new methods.
    Add pre-condition to `relation_add` (relying on the new methods)
    which simplifies its different implementations.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/328/ for more details.

aeviso requested review of this revision.Sep 9 2021, 5:04 PM
aeviso retitled this revision from Extend `ProvenanceStorageInterface` with new methods to Rework `ProvenanceStorageInterface` to have a single add method per entity.Sep 10 2021, 4:31 PM
aeviso edited the summary of this revision. (Show Details)

Build is green

Patch application report for D6231 (id=22575)

Rebasing onto 4c2b0907a9...

Current branch diff-target is up to date.
Changes applied before test
commit 03d156bcb7fa12b544ec385bc50cad57f426a661
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Sep 9 16:40:33 2021 +0200

    Rework `ProvenanceStorageInterface` to have a single add method per entity
    
    All entities now have a single `<entity>_add` method in the interface, with optional
    associated date if applicable. A new `location_add` is also added to the interface.
    Extend all backend implementations to support the new methods. Add pre-condition to
    `relation_add` (relying on the new methods) which simplifies its different implementations.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/329/ for more details.

jayeshv added a subscriber: jayeshv.
jayeshv added inline comments.
swh/provenance/mongo/backend.py
34

This logic can be slightly improved with upsert=True, but that improvement will be implemented in the on-going mongo refactoring. So, it is fine like this here.
This is true for all entirty_add methods.
https://docs.mongodb.com/manual/reference/method/db.collection.update/#std-label-upsert-behavior

swh/provenance/postgresql/provenance.py
63

Since we are actually adding a new object, set_date name looks a bit strange here. Should we rename the entity_set_date to entity_add?
Just a suggestion, not a blocker.

111

Catching a bare exception is generally considered as a bad idea. If you don't know the exact exception, I suggest using the base exception from psycopg2.

136

bare exception. An easier alternative is to add a method like 'execute query and handle these exceptions there'

This revision now requires changes to proceed.Sep 13 2021, 12:17 PM
aeviso added inline comments.
swh/provenance/mongo/backend.py
34

Ok, but its not the aim of this refactoring.

swh/provenance/postgresql/provenance.py
63

This internal methods handles the query for inserting entities that only have an (optional) associated date (ie. content and directory).Hence its name.

111

I don't fully agree, we want to guaranty that in production the storage don't tiers down a client. If anything happens a log will be saved. This try/catch logic needs to be abstracted though, since it is used several time in this class. Anyway, it's not in the scope of this refactoring.

136

Ack

This revision is now accepted and ready to land.Sep 13 2021, 1:07 PM
aeviso marked 4 inline comments as done.

rebase

Build is green

Patch application report for D6231 (id=22597)

Rebasing onto 4c2b0907a9...

Current branch diff-target is up to date.
Changes applied before test
commit 657226a87bbe211e9e963a701a960aff58636e08
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Sep 9 16:40:33 2021 +0200

    Rework `ProvenanceStorageInterface` to have a single add method per entity
    
    All entities now have a single `<entity>_add` method in the interface, with optional
    associated date if applicable. A new `location_add` is also added to the interface.
    Extend all backend implementations to support the new methods. Add pre-condition to
    `relation_add` (relying on the new methods) which simplifies its different implementations.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/333/ for more details.