Provenance Mongo backend
Closed, Public

Authored by jayeshv on Aug 17 2021, 8:24 AM.

Details

Summary

MongoDB backend implementation for the provenance storage.

This is an initial working copy with working logic and basic tests.
It can be used for early benchmarking (with some externally added
indexes).
More modular code with simplified logic will be submitted in another diff.
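
The summary does not list the externally added indexes. As a rough sketch only, they could be created through pymongo's `Collection.create_index`. The collection names (`content`, `directory`, `revision`) and the `sha1` key field below are assumptions made for illustration, not the schema used in this diff.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical layout: collection name -> list of (field, direction) keys.
# A direction of 1 means ascending (the value of pymongo.ASCENDING).
INDEXES: Dict[str, List[Tuple[str, int]]] = {
    "content": [("sha1", 1)],
    "directory": [("sha1", 1)],
    "revision": [("sha1", 1)],
}


def ensure_indexes(db: Any) -> List[str]:
    """Create one unique index per collection and return the index names."""
    names = []
    for collection, keys in INDEXES.items():
        # create_index returns the index name; repeating the call with an
        # identical specification leaves the existing index untouched.
        names.append(db[collection].create_index(keys, unique=True))
    return names
```

With a running server this could be called as `ensure_indexes(pymongo.MongoClient()["provenance"])`, where the database name is again a placeholder.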

Related to: T3431

Diff Detail

Repository
rDPROV Provenance database
Branch
mongo-experimental
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 23020
Build 35895: Phabricator diff pipeline on jenkins
Build 35894: arc lint + arc unit

Event Timeline

Harbormaster returned this revision to the author for changes because remote builds failed. Aug 17 2021, 8:27 AM
Harbormaster failed remote builds in B23020: Diff 22065!
  • Provenance Mongo backend
  • tests
  • find all

Build has FAILED

Patch application report for D6094 (id=22115)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit 855c8e2e745f1b6d8fb029f49d3ae0f47086fdd9
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 18 13:49:07 2021 +0200

    find all

commit c4cc2d606f6046267d717a3f4829a1401bbb405a
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 18 11:35:45 2021 +0200

    tests

commit eb3fc9063a866bcbf461b264d4fce371a25fc143
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 17 08:20:09 2021 +0200

    Provenance Mongo backend
    
    Work in progress for T3431
    
    Initial version, not a working copy

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/300/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/300/console

Harbormaster returned this revision to the author for changes because remote builds failed. Aug 18 2021, 1:51 PM
Harbormaster failed remote builds in B23072: Diff 22115!
  • Filter null dates when querying psql storage
  • Parametrize ProvenanceStorageMongoBd constructor and fix typing issues
  • Add support for mongo fixture and adapt tests
  • Minor fixes
  • Fix content_find_first/all methods on MongoDB storage
  • Add support for reverse parameter to relation_get method in MongoDB storage

Build has FAILED

Patch application report for D6094 (id=22139)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit e1880d3640867418dbedf5c35d1be5481a42a6f3
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Aug 20 14:14:14 2021 +0200

    Add support for `reverse` parameter to `relation_get` method in MongoDB storage

commit eb5343b0730b99ff122a7a051e8777e27d71b594
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Aug 20 13:49:28 2021 +0200

    Fix `content_find_first/all` methods on MongoDB storage

commit c4978913a36d7be1fc07baed8d0d08da20047f56
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Fri Aug 20 13:14:05 2021 +0200

    Minor fixes

commit f133276bea2df43226f340e057ea928bab33a0f8
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Aug 19 14:06:40 2021 +0200

    Add support for mongo fixture and adapt tests

commit f48b2a2a77ce002629307cf26c58e5e36064e6eb
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Aug 19 14:04:50 2021 +0200

    Parametrize `ProvenanceStorageMongoBd` constructor and fix typing issues

commit 301d203b2e2d9866b6c09f3f5d241eae0461107f
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Thu Aug 19 13:55:51 2021 +0200

    Filter null dates when querying psql storage

commit 1076cc80dfa65797e0e2214eb213a27d7c2b4ae6
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 18 14:23:09 2021 +0200

    Provenance Mongo backend
    
    Summary:
    Work in progress for T3431
    
    Initial version, not a working copy or ready for review
    
    Reviewers: #reviewers
    
    Subscribers: aeviso
    
    Tags: #provenance_database
    
    Differential Revision: https://forge.softwareheritage.org/D6094

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/301/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/301/console

Harbormaster returned this revision to the author for changes because remote builds failed. Aug 20 2021, 2:25 PM
Harbormaster failed remote builds in B23097: Diff 22139!
mypy.ini
25 ↗(On Diff #22139)

We should remove this after we find proper stubs for MongoDB

pytest.ini
4 ↗(On Diff #22139)

I guess this is where we can load the schema?

swh/provenance/tests/conftest.py
65

I'm not sure how to avoid repeating a test for every instantiation of a fixture it does not use. For instance, if mongo is selected, we don't care about all the possible instances of provenance_postgresqldb.
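
One possible direction, sketched here under assumptions rather than taken from this diff: resolve only the selected backend's fixture at run time through pytest's `request.getfixturevalue`, so the fixture of the backend that is not selected is never requested. The helper name `storage_for`, the backend keys, and the `mongodb` fixture name are hypothetical; `provenance_postgresqldb` is the fixture named above. To my knowledge a fixture requested this way cannot itself be parametrized, so this only helps if the per-backend fixtures are plain ones.

```python
from typing import Any, Callable, Dict

# Hypothetical mapping from backend key to the fixture that provides it.
FIXTURE_FOR_BACKEND: Dict[str, str] = {
    "postgresql": "provenance_postgresqldb",
    "mongodb": "mongodb",
}


def storage_for(backend: str, get_fixture: Callable[[str], Any]) -> Any:
    """Resolve the fixture of the selected backend only."""
    return get_fixture(FIXTURE_FOR_BACKEND[backend])


# Possible wiring in conftest.py (requires pytest, shown as a comment so the
# helper above stays importable without it):
#
# @pytest.fixture(params=["postgresql", "mongodb"])
# def provenance_storage(request):
#     return storage_for(request.param, request.getfixturevalue)
```

Tests would then depend on the single `provenance_storage` fixture and run once per backend key.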

69

We should probably remove the test database before the tests
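
A minimal sketch of that cleanup, assuming a pymongo-style client is available in the test setup. The helper name `fresh_database` is hypothetical; `drop_database` is the existing `MongoClient` method.

```python
from typing import Any


def fresh_database(client: Any, name: str) -> Any:
    """Drop the database called `name`, then return a handle to it."""
    # Dropping a database that does not exist is harmless in MongoDB.
    client.drop_database(name)
    return client[name]
```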

72

The other storage classes should be enabled again

swh/provenance/tests/test_provenance_storage.py
305

This empty line should be removed

  • adding just the files needed
  • small
  • small
jayeshv added inline comments.
swh/provenance/tests/conftest.py
72

@aeviso I believe we should test all the backends all the time. I will investigate this.
Another problem: we have an in-memory implementation to test the postgres backend, but the mongo backend expects a real server, so the tests will fail without one. I will fix this with a mock for now.
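
As an illustration of the mock approach (a sketch, not the code in this diff): both pymongo and mongomock expose a `MongoClient` class, so the client can be built from whichever engine is configured. The function name `make_client` and the `engine` parameter are assumptions.

```python
import importlib
from typing import Any

SUPPORTED_ENGINES = ("pymongo", "mongomock")


def make_client(engine: str = "pymongo", **kwargs: Any) -> Any:
    """Return a MongoClient built from the selected engine module."""
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported mongo engine: {engine!r}")
    # Imported lazily so that only the selected package must be installed.
    module = importlib.import_module(engine)
    return module.MongoClient(**kwargs)
```

Tests could call `make_client("mongomock")` to get an in-memory client, while production code keeps the default.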

Build has FAILED

Patch application report for D6094 (id=22152)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit 129962e6e39598688a4624193ffcd3d36446deed
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 08:47:44 2021 +0200

    small

commit e2c63a563392751fb9118be3378b61adf2dcfc2c
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 08:37:42 2021 +0200

    small

commit 4339c9e86f41ad9b7f5d66a18c44d2096421a9f7
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 08:33:15 2021 +0200

    adding just the files needed

commit 3b64ac3a34d8679966b963880d9f8421ba9bd99a
Author: Andres Ezequiel Viso <aeviso@softwareheritage.org>
Date:   Mon Aug 23 09:40:57 2021 +0200

    Provenance Mongo backend
    
    Summary:
    Work in progress for T3431
    
    Initial version, not a working copy or ready for review
    
    Reviewers: #reviewers
    
    Subscribers: aeviso
    
    Tags: #provenance_database
    
    Differential Revision: https://forge.softwareheritage.org/D6094

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/302/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/302/console

Harbormaster returned this revision to the author for changes because remote builds failed. Aug 24 2021, 9:00 AM
Harbormaster failed remote builds in B23109: Diff 22152!

Build has FAILED

Patch application report for D6094 (id=22157)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit 9b92ac2cf7fae56a4b2944d9496d9cebd484a448
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 11:26:31 2021 +0200

    MongoDB backend implementation for provenancestorage.
    
    This is an initial working copy with working logic and basic tests.
    This can be used for early benchmarking (With some externally added
    indexs)
    More modular and logic simplified code will be submitted in another diff.

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/303/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/303/console

Harbormaster returned this revision to the author for changes because remote builds failed. Aug 24 2021, 11:31 AM
Harbormaster failed remote builds in B23114: Diff 22157!
jayeshv added a reviewer: aeviso.

Build has FAILED

Patch application report for D6094 (id=22157)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit 9b92ac2cf7fae56a4b2944d9496d9cebd484a448
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 11:26:31 2021 +0200

    MongoDB backend implementation for provenancestorage.
    
    This is an initial working copy with working logic and basic tests.
    This can be used for early benchmarking (With some externally added
    indexs)
    More modular and logic simplified code will be submitted in another diff.

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/304/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/304/console

jayeshv edited the summary of this revision.
  • mypy config changes

Build is green

Patch application report for D6094 (id=22158)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit ac779208bf34d0e0db293cc2e62553ae0e01bac7
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 12:04:00 2021 +0200

    mypy config changes

commit 9b92ac2cf7fae56a4b2944d9496d9cebd484a448
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 11:26:31 2021 +0200

    MongoDB backend implementation for provenancestorage.
    
    This is an initial working copy with working logic and basic tests.
    This can be used for early benchmarking (With some externally added
    indexs)
    More modular and logic simplified code will be submitted in another diff.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/305/ for more details.

swh/provenance/tests/conftest.py
24

I believe this import belongs in the previous group, not this one; otherwise the pre-commit hooks will fail.

72

I would rather keep the previous configuration, which gives cleaner code. Switching mongo_engine to mongomock in pytest.ini should be enough.

72

Regarding testing all backends, I agree. That's what I was saying in my comment. Please enable them again

  • Pytest is using mongomock

Build has FAILED

Patch application report for D6094 (id=22197)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit 0f6a855a19c74562454fbb933d20917596d5dbde
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 25 15:08:35 2021 +0200

    Pytest is using mongomock

commit ac779208bf34d0e0db293cc2e62553ae0e01bac7
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 12:04:00 2021 +0200

    mypy config changes

commit 9b92ac2cf7fae56a4b2944d9496d9cebd484a448
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 11:26:31 2021 +0200

    MongoDB backend implementation for provenancestorage.
    
    This is an initial working copy with working logic and basic tests.
    This can be used for early benchmarking (With some externally added
    indexs)
    More modular and logic simplified code will be submitted in another diff.

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/306/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/306/console

  • Made default provenance storage to psql

Build has FAILED

Patch application report for D6094 (id=22198)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit cbfa35983cc1acce27558d9fe4520d562edb0c7b
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 25 15:13:04 2021 +0200

    Made default provenance storage to psql

commit 0f6a855a19c74562454fbb933d20917596d5dbde
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 25 15:08:35 2021 +0200

    Pytest is using mongomock

commit ac779208bf34d0e0db293cc2e62553ae0e01bac7
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 12:04:00 2021 +0200

    mypy config changes

commit 9b92ac2cf7fae56a4b2944d9496d9cebd484a448
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Tue Aug 24 11:26:31 2021 +0200

    MongoDB backend implementation for provenancestorage.
    
    This is an initial working copy with working logic and basic tests.
    This can be used for early benchmarking (With some externally added
    indexs)
    More modular and logic simplified code will be submitted in another diff.

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/307/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/307/console

Build has FAILED

Patch application report for D6094 (id=22199)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit 342c38a785608bea385581a1ce58abc7363bc00c
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 25 15:14:48 2021 +0200

    MongoDB backend implementation for provenancestorage.
    
    This is an initial working copy with working logic and basic tests.
    This can be used for early benchmarking (With some externally added
    indexs)
    Modular and logic simplified code will be submitted in another diff.

Link to build: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/308/
See console output for more information: https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/308/console

  • Added the necessary directory for mongo pytest to work

Build is green

Patch application report for D6094 (id=22200)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit b902b06999cec9e07ccfa9179424ef41b496e555
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 25 15:38:17 2021 +0200

    Added the necessary directory for mongo pytest to work

commit 342c38a785608bea385581a1ce58abc7363bc00c
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 25 15:14:48 2021 +0200

    MongoDB backend implementation for provenancestorage.
    
    This is an initial working copy with working logic and basic tests.
    This can be used for early benchmarking (With some externally added
    indexs)
    Modular and logic simplified code will be submitted in another diff.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/309/ for more details.

Build is green

Patch application report for D6094 (id=22201)

Rebasing onto e9206efef3...

Current branch diff-target is up to date.
Changes applied before test
commit 3e009a2f77de1d4d00eb52f838537c7af327f010
Author: Jayesh Velayudhan <jayesh@softwareheritage.org>
Date:   Wed Aug 25 15:41:15 2021 +0200

    MongoDB backend implementation for provenancestorage.
    
    This is an initial working copy with working logic and basic tests.
    This can be used for early benchmarking (With some externally added
    indexs)
    Modular and logic simplified code will be submitted in another diff.

See https://jenkins.softwareheritage.org/job/DPROV/job/tests-on-diff/310/ for more details.

This revision is now accepted and ready to land. Aug 25 2021, 4:21 PM