
Implement a separate kafka communication thread for journal clients
Changes Planned · Public · Draft

Authored by olasd on Mon, Oct 31, 4:18 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Summary

This communication thread is in charge of pulling messages from
Kafka and handing them off to a processing thread, as well as
regularly polling the rdkafka client (which in turn notifies the
brokers that the consumer is still alive).

This allows the Kafka communication thread to explicitly pause
consumption when processing a batch of messages takes too long.
That, in turn, can avoid a lot of rebalance traffic on the Kafka
brokers, and avoids a number of internal rdkafka timeouts.
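
As a rough illustration of the pattern described above (not the code in
this diff), here is a minimal sketch assuming a bounded queue as the
handoff between the two threads. FakeConsumer, communication_loop,
processing_loop and run_demo are hypothetical names made up for this
example. The stand-in consumer only mimics the poll/pause/resume/assignment
methods that confluent_kafka's Consumer also exposes, so the sketch runs
without a broker.

```python
import collections
import queue
import threading
import time


class FakeConsumer:
    """Hypothetical stand-in for a Kafka consumer (no broker needed)."""

    def __init__(self, messages):
        self._messages = collections.deque(messages)
        self.paused = False
        self.pause_count = 0

    def assignment(self):
        return ["demo-topic[0]"]

    def poll(self, timeout):
        # While paused (or drained), poll only "services" the client.
        if self.paused or not self._messages:
            time.sleep(timeout)
            return None
        return self._messages.popleft()

    def pause(self, partitions):
        self.paused = True
        self.pause_count += 1

    def resume(self, partitions):
        self.paused = False


def communication_loop(consumer, handoff, stop, poll_timeout=0.01):
    """Keep polling; pause fetching while the processing side is behind."""
    backlog = collections.deque()
    paused = False
    while not stop.is_set():
        msg = consumer.poll(poll_timeout)  # also acts as the keep-alive poll
        if msg is not None:
            backlog.append(msg)
        try:
            while backlog:
                handoff.put_nowait(backlog[0])
                backlog.popleft()
        except queue.Full:
            if not paused:
                consumer.pause(consumer.assignment())
                paused = True
        else:
            if paused:
                consumer.resume(consumer.assignment())
                paused = False


def processing_loop(handoff, stop, results, work_time=0.03):
    """Simulate slow message processing in a separate thread."""
    while not stop.is_set():
        try:
            msg = handoff.get(timeout=0.05)
        except queue.Empty:
            continue
        time.sleep(work_time)
        results.append(msg)


def run_demo(n_messages=5):
    consumer = FakeConsumer(range(n_messages))
    handoff = queue.Queue(maxsize=1)
    stop = threading.Event()
    results = []
    threads = [
        threading.Thread(target=communication_loop, args=(consumer, handoff, stop)),
        threading.Thread(target=processing_loop, args=(handoff, stop, results)),
    ]
    for thread in threads:
        thread.start()
    deadline = time.monotonic() + 5
    while len(results) < n_messages and time.monotonic() < deadline:
        time.sleep(0.01)
    stop.set()
    for thread in threads:
        thread.join()
    return results, consumer.pause_count
```

Calling run_demo() returns the processed messages and the number of times
consumption was paused. The point of the design is that poll keeps being
called at a steady rate even while the processing thread is busy, so a slow
batch leads to an explicit pause rather than a stalled client.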

Test Plan

We've been using a variant of this on mmca (swh.provenance revision
journal client) for a while.

Diff Detail

Repository
rDJNL Journal infrastructure
Branch
olasd/poll-thread
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 32655
Build 51156: Phabricator diff pipeline on jenkins · Jenkins console · Jenkins
Build 51155: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D8797 (id=31705)

Rebasing onto 1d879f1dd6...

Current branch diff-target is up to date.
Changes applied before test
commit a82b292ef7a5db1fdc7a0f849dd229a808a150ed
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:54:55 2022 +0200

    Implement a separate kafka communication thread for journal clients
    
    This communication thread is in charge of pulling the messages from
    kafka and handing them off to a processing thread, as well as doing
    regular polling of the rdkafka client (which in turn notifies the
    brokers that the consumer is still alive).
    
    Doing this allows the kafka communication thread to pause the kafka
    consumption explicitly when processing a batch of messages takes too
    long. This can in turn avoid a lot of rebalance traffic on the kafka
    brokers, and overall avoids a bunch of internal rdkafka timeouts.

Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/231/
See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/231/console

Harbormaster returned this revision to the author for changes because remote builds failed. Mon, Oct 31, 4:19 PM
Harbormaster failed remote builds in B32655: Diff 31705!