
git_bare: Fetch directories concurrently, using threads.
Closed, Public

Authored by vlorentz on Nov 17 2021, 6:22 PM.


Event Timeline

Build is green

Patch application report for D6650 (id=24178)

Rebasing onto bd649ccb97...

Current branch diff-target is up to date.
Changes applied before test
commit c26a46f689288090930e22cf0e0f8378728bbe08
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Nov 17 18:18:06 2021 +0100

    git_bare: Fetch directories concurrently, using threads.
    
    This divides the total wall time by ~2.

See https://jenkins.softwareheritage.org/job/DVAU/job/tests-on-diff/186/ for more details.

olasd added a subscriber: olasd.

In general I'm a bit uneasy about parallelism within individual workers, as it's hidden and multiplies with the "exogenous" parallelism of having multiple workers.

In this specific case, and considering the relatively small git-bare cooker workload, it's probably worth it though.

Could you consider adding this as a configuration option rather than a hardcoded constant? We may want to turn the parallelism off if we end up cooking a (large) batch of objects at once to reduce database load.
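For illustration, the request above could be sketched as follows. This is a minimal sketch, not the code from this diff: `fetch_directories`, `fetch_one`, and `thread_pool_size` are hypothetical names, and the default of 4 is an arbitrary placeholder. It assumes each directory can be fetched independently and that a pool size of 1 should mean "no threads at all".

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def fetch_directories(
    directory_ids: Iterable[str],
    fetch_one: Callable[[str], List[str]],
    thread_pool_size: int = 4,  # hypothetical option; would come from the cooker's config
) -> Dict[str, List[str]]:
    """Fetch every directory using at most `thread_pool_size` threads.

    A size of 1 or less turns the parallelism off and fetches sequentially,
    which is the escape hatch for large batches that would otherwise put
    too much load on the database.
    """
    ids = list(directory_ids)

    if thread_pool_size <= 1:
        # Sequential path: no hidden parallelism inside the worker.
        return {dir_id: fetch_one(dir_id) for dir_id in ids}

    with ThreadPoolExecutor(max_workers=thread_pool_size) as pool:
        # Executor.map returns results in the same order as its input,
        # so zipping with `ids` pairs each id with its own entries.
        return dict(zip(ids, pool.map(fetch_one, ids)))


def fake_fetch(dir_id: str) -> List[str]:
    """Stand-in for a storage call; returns made-up entry names."""
    return [f"{dir_id}/a", f"{dir_id}/b"]


if __name__ == "__main__":
    print(fetch_directories(["d1", "d2"], fake_fetch, thread_pool_size=2))
```

With this shape, the total number of concurrent storage requests is roughly the number of workers times `thread_pool_size`, so an operator can lower one when raising the other.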

This revision is now accepted and ready to land. Nov 18 2021, 1:26 PM

Yeah, I'm not a huge fan either; that's why I started with the async approach. But that's a lot of trouble, so I'd rather do this.

make parallelism level configurable

Build is green

Patch application report for D6650 (id=24220)

Rebasing onto bd649ccb97...

Current branch diff-target is up to date.
Changes applied before test
commit 5fcbeb0be2e3af4a62028d8470b4f473f232d404
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Nov 17 18:18:06 2021 +0100

    git_bare: Fetch directories concurrently, using threads.
    
    This divides the total wall time by ~2.

See https://jenkins.softwareheritage.org/job/DVAU/job/tests-on-diff/188/ for more details.