Details
- Reviewers
aleo vlorentz - Group Reviewers
Reviewers - Maniphest Tasks
- T3590: opam loader: Ensure required opam state is shared amongst ingestion/listing runs
- Commits
- rDLSe7716c0122e9: opam: Share opam root directory even on multiple instances
tox
Diff Detail
- Repository
- rDLS Listers
- Lint
Automatic diff as part of commit; lint not applicable. - Unit
Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build is green
Patch application report for D6316 (id=22950)
Rebasing onto 5ab6b00408...
Current branch diff-target is up to date.
Changes applied before test
commit 9af44c655ab9526854108f07a43a8f986b9d09ca Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Sep 21 18:18:49 2021 +0200 opam: Share opam root directory even on multiple instances That avoid having multiple distinct opam root directories per opam lister instance. The current opam commands used by the lister are actually listing specifically per instance. Related to P1171
See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/377/ for more details.
Can two listers run in parallel while using the same directory?
yes, but that state should really be dealt with outside the lister.
I'm working on that in D6305.
It's still there though so maybe it's actually runnable within
say the docker dev environment.
Sorry, my question was about opam itself. Are you sure it won't corrupt data, or wait on a global lock when accessing that directory?
Sorry, my question was about opam itself. Are you sure it won't corrupt data, or wait on a global lock when accessing that directory?
That's a good question which i don't know the answer to.
I do plan to actually test this in docker (i need some adaptations in the loader first), then in staging to check if everyting is running smoothly without corruption nor lock.
Ok.
What is the motivation for merging directories?
Use less disk space and less network access during the run per worker.
I think there was already a problem before, but since we have now more chance to hit it, I'd really like the opam_init process to lock the directory when running opam commands.
Following what i said in the loader diff, i'm actually closing this.
Ack on the lock folder but i won't attend to it immediately.
[1] D6318
Following what i said in the loader diff, i'm actually closing this.
Ack on the lock folder but i won't attend to it immediately.[1] D6318
As i was wrong in my implementation of the loader implementation and @aleo made me realize, i've fixed it.
So now that lister diff becomes relevant again, so claimed it back.
I think there was already a problem before, but since we have now more chance to hit it, I'd really like the opam_init process to lock the directory when running opam commands.
It's a great idea but i've no idea how to actually do that though.
Maybe adding --safe flag [1] during the command that actually list the packages would be enough instead.
I've actually added that for the loader [2] (for the command that also read information)
[1]
--safe, --readonly Make sure nothing will be automatically updated or rewritten. Useful for calling from completion scripts, for example. Will fail whenever such an operation is needed ; also avoids waiting for locks, skips interactive questions and overrides the $OPAMDEBUG variable. This is equivalent to set environment variable $OPAMSAFE.
[2] D6318
Build has FAILED
Patch application report for D6316 (id=23037)
Rebasing onto 5ab6b00408...
Current branch diff-target is up to date.
Changes applied before test
commit 1f43183537afe8f5671357dcee929b41ffc632e0 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Sep 21 18:18:49 2021 +0200 opam: Share opam root directory even on multiple instances That avoids having multiple distinct opam root directories per opam lister instance. The current opam commands used by the lister are actually listing specifically per instance. Related to P1171
Link to build: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/379/
See console output for more information: https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/379/console
Build is green
Patch application report for D6316 (id=23039)
Rebasing onto 5ab6b00408...
Current branch diff-target is up to date.
Changes applied before test
commit e7716c0122e9c4a411ed54dde821b48939eb2ebb Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Sep 21 18:18:49 2021 +0200 opam: Share opam root directory even on multiple instances That avoids having multiple distinct opam root directories per opam lister instance. The current opam commands used by the lister are actually listing specifically per instance. Related to P1171
See https://jenkins.softwareheritage.org/job/DLS/job/tests-on-diff/380/ for more details.
You may use fcntl.flock for this
I mean using an empty (lock) file in the opam_root directory.
See also 3rd party libraries like https://pypi.org/project/filelock/
@douardda I've started a work [1] on this.
But in the end, i don't think that's the quite needed.
The expected workers for loading will actually be deployed with a concurrency of 1 [2]
(and the lister will run once a day).
What do you think?
[1] D6337
[2] That was sufficiently fast that way.
As mentioned in the ^, opam should take care of the lock.
[2] That was sufficiently fast that way.