Unify retry/error handling for content replay
Closed, Public

Authored by olasd on Mar 6 2020, 1:57 PM.

Details

Summary

This uses a custom wrapper exception and tenacity callbacks to log exceptions
when copying a given content fails several times in a row. This makes the
consumer more robust (fewer crashes), which in turn causes fewer consumer
rebalances, which ultimately reduces the consumer's bandwidth consumption
drastically.

At this point, content replays that have "definitely" failed (that is, failed
on every attempt) are only logged; retrying them needs to be handled
separately.
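
The retry-and-log pattern described in this summary can be sketched roughly as follows. This is a minimal stdlib-only illustration, not the code from the diff: the actual change relies on tenacity's retry decorator and callbacks, which the plain loop below only stands in for. The names `ReplayError`, `copy_with_retry`, and `make_flaky_copy` are hypothetical, and a real setup would presumably also wait between attempts.

```python
import logging

logger = logging.getLogger("replay_sketch")


class ReplayError(Exception):
    """Hypothetical wrapper exception: records which object failed and why."""

    def __init__(self, operation, obj_id, exc):
        super().__init__(operation, obj_id, exc)
        self.operation = operation
        self.obj_id = obj_id
        self.exc = exc


def copy_with_retry(copy_one, obj_id, max_attempts=3):
    """Call copy_one(obj_id), retrying when it raises ReplayError.

    Returns True on success. Once max_attempts attempts have failed, the
    error is logged and False is returned instead of raising, so the
    caller (the consumer loop) keeps running.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            copy_one(obj_id)
            return True
        except ReplayError as err:
            if attempt < max_attempts:
                # Transient failure so far: log it and try again.
                logger.warning(
                    "Retrying %s of %s (attempt %d): %r",
                    err.operation, err.obj_id, attempt, err.exc,
                )
            else:
                # "Definitely" failed: log it and give up on this object.
                logger.error(
                    "Giving up on %s of %s after %d attempts: %r",
                    err.operation, err.obj_id, attempt, err.exc,
                )
    return False


def make_flaky_copy(failures):
    """Build a hypothetical copy function that fails `failures` times first."""
    calls = {"n": 0}

    def copy_one(obj_id):
        calls["n"] += 1
        if calls["n"] <= failures:
            raise ReplayError("get", obj_id, IOError("transient failure"))

    return copy_one, calls


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    flaky, _ = make_flaky_copy(failures=2)
    print(copy_with_retry(flaky, "content-1"))   # succeeds on the third attempt
    broken, _ = make_flaky_copy(failures=10)
    print(copy_with_retry(broken, "content-2"))  # logged as failed, no crash
```

Returning a status instead of re-raising is what keeps the consumer alive in this sketch; the objects that end up logged as failed are the ones that would need the separate retry mechanism mentioned above.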

Depends on D2781

Test Plan

One new tox test added for the retry behavior.

Diff Detail

Repository
rDJNL Journal infrastructure
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 10969
Build 16513: tox-on-jenkins (Jenkins)
Build 16512: arc lint + arc unit

Event Timeline

vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/journal/tests/test_cli.py
485–497

please add a comment explaining the error pattern this loop emulates

This revision is now accepted and ready to land. Mar 6 2020, 3:27 PM

Make the test failure pattern clearer

olasd requested review of this revision. Mar 6 2020, 4:03 PM
This revision is now accepted and ready to land. Mar 6 2020, 4:11 PM