Page MenuHomeSoftware Heritage

opam: Allow shared state between loader runs using multi-instance opam root
ClosedPublic

Authored by ardumont on Sep 22 2021, 9:45 AM.

Details

Summary

It allow the opam loader to reuse existing opam root with multiple instances. It's the
complementary code that goes with the loader adaptation [1].

As the opam show (cli) [2] version currently packaged does not support the means to
enclose the metadata extraction per opam instance (when sharing the same opam root), we
actually work around this by opening internal details to opam.

[1] D6316

[2] opam show is currently the interface we are using to extract and list information
about a package. It does work on standalone opam root folder but it comes short when
sharing multiple instances within one opam root (for now).

Related to T3590

Test Plan

tox

Diff Detail

Repository
rDLDBASE Generic VCS/Package Loader
Branch
share-instance-state
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 23892
Build 37258: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 37257: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D6318 (id=22952)

Could not rebase; Attempt merge onto fabd950d36...

Updating fabd950..826da43
Fast-forward
 swh/loader/package/opam/loader.py | 166 ++++++++++++++++++++++----------------
 1 file changed, 95 insertions(+), 71 deletions(-)
Changes applied before test
commit 826da4319f238fe962854374a529b2473285cb4a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 20:16:53 2021 +0200

    wip: Work towards state sharing between loader runs
    
    This actually needs some workaround to work since the current opam version packaged in
    debian does not support the required --repo flag yet.

commit f4556e0113fae658555b0c6f7296ac98c310c971
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 19:15:31 2021 +0200

    opam: Initialize opam root directory outside the constructor
    
    If it's required at all, this will use the network to fetch and install it.
    This should be done outside the constructor.
    
    Related to T3590

Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/561/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/561/console

Harbormaster returned this revision to the author for changes because remote builds failed.Sep 22 2021, 9:47 AM
Harbormaster failed remote builds in B23883: Diff 22952!

I completely forgot one important thing @aleo said to me yesterday.

19:13 <mclovin> and it may be faster to use `--file` than the previous way
19:17 <mclovin> for `    def get_versions(self) -> List[str]:` you'll have to do something different than --file
19:18 <mclovin> you just have to do a `ls` :-)
19:18 <mclovin> the fix has been found by rjbou (one of the main opam lead developper) btw
19:19 <+ardumont> great, thx to rjbou / Raja Boujbel? (<- man page)
19:19 <mclovin> that's her :)
19:19 <mclovin> she just told me also that this won't work on 2.1
19:20 <mclovin> on 2.1 repositories are stored as .tar.gz
19:21 <mclovin> so you'll have to decompress them but I don't think 2.1 will land soon in debian stable (which is used on swh iirc ?)
19:21 <+ardumont> buster debian is 2.0.3 and bullseye is 2.0.8 so we may have time
19:21 <+ardumont> yeah
19:21 <+ardumont> we are on oldstable now (buster)
19:21 <+ardumont> we will work on making our workers our workers
19:22 <+ardumont> more dynamic so we did not yet migrated those to stable
19:22 <+ardumont> (but yeah we usually follow stable)
19:22 <mclovin> OK, and maybe in the meantime we'll be able to add the `--repos` flag to `opam show` so that won't be a problem anymore
19:23 <+ardumont> that'd be great ;)
19:23 <+ardumont> (no more looking into the internals of opam)
19:23 <+ardumont> i mean relying on*
19:24 <+ardumont> i'll keep this discussion on the side and work on it during the week, i'll ping you with diffs ;)
19:24 <+ardumont> thx

So i'll amend my mess here ;)

Actually use --file as discussed with @aleo

swh/loader/package/opam/loader.py
168

looked sensible to add, tell me if not.

180

this actually now reads like the comment so i think it's clearer.

Build is green

Patch application report for D6318 (id=22965)

Rebasing onto f4556e0113...

Current branch diff-target is up to date.
Changes applied before test
commit d3669d7e9d4b82c40898faf2227095083f7d72b7
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 20:16:53 2021 +0200

    Allow opam loader to actually use multi-instance opam root
    
    This actually needs some workaround to work since the current opam version packaged in
    debian does not support the required --repo flag yet.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/562/ for more details.

Refactor a bit (drop --strict as suggested)

Build is green

Patch application report for D6318 (id=22966)

Rebasing onto f4556e0113...

Current branch diff-target is up to date.
Changes applied before test
commit a521336704748f15611c27733e3f98f26d3861ec
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 20:16:53 2021 +0200

    Allow opam loader to actually use multi-instance opam root
    
    This actually needs some workaround to work since the current opam version packaged in
    debian does not support the required --repo flag yet.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/563/ for more details.

ardumont edited the summary of this revision. (Show Details)

Align commit message with diffs

Build is green

Patch application report for D6318 (id=22967)

Rebasing onto f4556e0113...

Current branch diff-target is up to date.
Changes applied before test
commit 611b78f030a5823cb17bf21432167aa5c891014c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 20:16:53 2021 +0200

    Allow opam loader to actually use multi-instance opam root
    
    It allow the opam loader to reuse existing opam root with multiple instances. It's the
    complementary code that goes with the loader adaptation [1].
    
    As the `opam show` (cli) [2] version currently packaged does not support the means to
    enclose the metadata extraction per opam instance (when sharing the same opam root), we
    actually work around this by opening internal details to opam.
    
    [1] D6316
    
    [2] `opam show` is currently the interface we are using to extract and list information
    about a package. It does work on standalone opam root folder but it comes short when
    sharing multiple instances within one opam root (for now).

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/564/ for more details.

Build is green

Patch application report for D6318 (id=22968)

Rebasing onto f4556e0113...

Current branch diff-target is up to date.
Changes applied before test
commit b2d175965abed6d116a9f70e484ba24ea01f63cc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 20:16:53 2021 +0200

    Allow opam loader to actually use multi-instance opam root
    
    It allow the opam loader to reuse existing opam root with multiple instances. It's the
    complementary code that goes with the loader adaptation [1].
    
    As the `opam show` (cli) [2] version currently packaged does not support the means to
    enclose the metadata extraction per opam instance (when sharing the same opam root), we
    actually work around this by opening internal details to opam.
    
    [1] D6316
    
    [2] `opam show` is currently the interface we are using to extract and list information
    about a package. It does work on standalone opam root folder but it comes short when
    sharing multiple instances within one opam root (for now).

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/565/ for more details.

ardumont retitled this revision from loader-opam: Allow shared state between loader runs using multi-instance opam root to opam: Allow shared state between loader runs using multi-instance opam root.Sep 24 2021, 2:07 PM

Adapt according to suggestion

Build is green

Patch application report for D6318 (id=23124)

Could not rebase; Attempt merge onto f4556e0113...

Updating f4556e0..8aca257
Fast-forward
 swh/loader/package/opam/loader.py          | 182 ++++++++++++++++-------------
 swh/loader/package/opam/tests/test_opam.py |  50 +++++++-
 2 files changed, 146 insertions(+), 86 deletions(-)
Changes applied before test
commit 8aca257ae8ec6fe24da0750873528d6252adc92a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 24 13:37:40 2021 +0200

    opam: Define a initialize_opam_root parameter for opam loader
    
    When running in production, the workers should expect the opam root directory to be
    present (externally maintained). The production task should do
    nothing (initialize_opam_root's default value is False).
    
    When running with standalone loader, they should be instantiated with
    initialize_opam_root to True. They will create the opam root folder if not present so it
    can work out of the box (e.g. docker worker)
    
    Related to T3590

commit b2d175965abed6d116a9f70e484ba24ea01f63cc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 20:16:53 2021 +0200

    Allow opam loader to actually use multi-instance opam root
    
    It allow the opam loader to reuse existing opam root with multiple instances. It's the
    complementary code that goes with the loader adaptation [1].
    
    As the `opam show` (cli) [2] version currently packaged does not support the means to
    enclose the metadata extraction per opam instance (when sharing the same opam root), we
    actually work around this by opening internal details to opam.
    
    [1] D6316
    
    [2] `opam show` is currently the interface we are using to extract and list information
    about a package. It does work on standalone opam root folder but it comes short when
    sharing multiple instances within one opam root (for now).

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/574/ for more details.

Build is green

Patch application report for D6318 (id=23126)

Could not rebase; Attempt merge onto f4556e0113...

Updating f4556e0..8aca257
Fast-forward
 swh/loader/package/opam/loader.py          | 182 ++++++++++++++++-------------
 swh/loader/package/opam/tests/test_opam.py |  50 +++++++-
 2 files changed, 146 insertions(+), 86 deletions(-)
Changes applied before test
commit 8aca257ae8ec6fe24da0750873528d6252adc92a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Fri Sep 24 13:37:40 2021 +0200

    opam: Define a initialize_opam_root parameter for opam loader
    
    When running in production, the workers should expect the opam root directory to be
    present (externally maintained). The production task should do
    nothing (initialize_opam_root's default value is False).
    
    When running with standalone loader, they should be instantiated with
    initialize_opam_root to True. They will create the opam root folder if not present so it
    can work out of the box (e.g. docker worker)
    
    Related to T3590

commit b2d175965abed6d116a9f70e484ba24ea01f63cc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 20:16:53 2021 +0200

    Allow opam loader to actually use multi-instance opam root
    
    It allow the opam loader to reuse existing opam root with multiple instances. It's the
    complementary code that goes with the loader adaptation [1].
    
    As the `opam show` (cli) [2] version currently packaged does not support the means to
    enclose the metadata extraction per opam instance (when sharing the same opam root), we
    actually work around this by opening internal details to opam.
    
    [1] D6316
    
    [2] `opam show` is currently the interface we are using to extract and list information
    about a package. It does work on standalone opam root folder but it comes short when
    sharing multiple instances within one opam root (for now).

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/575/ for more details.

Build is green

Patch application report for D6318 (id=23128)

Rebasing onto f4556e0113...

Current branch diff-target is up to date.
Changes applied before test
commit b2d175965abed6d116a9f70e484ba24ea01f63cc
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Sep 21 20:16:53 2021 +0200

    Allow opam loader to actually use multi-instance opam root
    
    It allow the opam loader to reuse existing opam root with multiple instances. It's the
    complementary code that goes with the loader adaptation [1].
    
    As the `opam show` (cli) [2] version currently packaged does not support the means to
    enclose the metadata extraction per opam instance (when sharing the same opam root), we
    actually work around this by opening internal details to opam.
    
    [1] D6316
    
    [2] `opam show` is currently the interface we are using to extract and list information
    about a package. It does work on standalone opam root folder but it comes short when
    sharing multiple instances within one opam root (for now).

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/577/ for more details.

vlorentz added inline comments.
swh/loader/package/opam/loader.py
109–111

I feel it would be clearer like this

swh/loader/package/opam/loader.py
109–111

yes ;)

swh/loader/package/opam/loader.py
109–111

I'll amend in a dedicated commit.

This revision was not accepted when it landed; it landed in state Needs Review.Sep 29 2021, 9:16 AM
This revision was automatically updated to reflect the committed changes.