journalprocessor: re-enable subsharding per partition
Closed, Public

Authored by seirl on Apr 29 2022, 5:38 PM.

Details

Summary

Significantly improves performance by reducing the number of levels in
each DB, and thus reducing the amount of compaction.

Test Plan

Tested with a manual export.

Diff Detail

Repository
rDDATASET Datasets
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D7718 (id=27914)

Rebasing onto db331f27e8...

First, rewinding head to replay your work on top of it...
Applying: journalprocessor: re-enable subsharding per partition
Changes applied before test
commit 64bdfe68f337cb8a2185fc36ad891f4176a8531e
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Fri Apr 29 15:36:35 2022 +0000

    journalprocessor: re-enable subsharding per partition
    
    Significantly improves performance by reducing the number of levels in
    each DB, and thus reducing the amount of compaction.

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/143/ for more details.

seirl requested review of this revision. Apr 29 2022, 5:40 PM
olasd added a subscriber: olasd.

In the future this may warrant a config knob (or enabling it only above a given threshold in the target offsets) to avoid thrashing on 16 indexes all the time, even for small(er) topics.
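
A minimal sketch of what such a knob could look like, not the actual journalprocessor code. The function names, the `threshold` parameter and its default value, and the choice of hashing the key to pick a sub-shard are all assumptions made for illustration. Only the figure of 16 comes from the comment above.

```python
import hashlib

# Number of sub-shards per partition; 16 matches the "16 indexes" mentioned above.
NUM_SUBSHARDS = 16


def subshard_count(low_offset: int, high_offset: int, threshold: int = 1_000_000) -> int:
    """Decide how many sub-shards to use for one partition.

    Hypothetical knob: sub-shard only when the number of messages left to
    read (high_offset - low_offset) reaches `threshold`; otherwise keep a
    single index so small topics do not pay for 16 of them.
    """
    pending = max(high_offset - low_offset, 0)
    return NUM_SUBSHARDS if pending >= threshold else 1


def subshard_for(key: bytes, count: int) -> int:
    """Map a message key to a sub-shard using the first byte of its SHA-1 digest.

    Assumption: any stable hash of the key would do; SHA-1 is used here only
    because it is in the standard library.
    """
    return hashlib.sha1(key).digest()[0] % count
```

With these definitions, a partition with only a few pending messages gets `subshard_count(...) == 1` and every key lands in sub-shard 0, while a large partition spreads its keys over 16 smaller indexes.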

This revision is now accepted and ready to land. Apr 29 2022, 5:46 PM
This revision was landed with ongoing or failed builds. Apr 29 2022, 5:49 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D7718 (id=27915)

Rebasing onto db331f27e8...

Current branch diff-target is up to date.
Changes applied before test
commit c2c2c21e081c6276c2ef29248b562ff55bd44604
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Fri Apr 29 15:36:35 2022 +0000

    journalprocessor: re-enable subsharding per partition
    
    Significantly improves performance by reducing the number of levels in
    each DB, and thus reducing the amount of compaction.

See https://jenkins.softwareheritage.org/job/DDATASET/job/tests-on-diff/144/ for more details.