Page MenuHomeSoftware Heritage

Add delivery notification handling to swh.journal.writer.kafka
ClosedPublic

Authored by olasd on Apr 9 2020, 2:13 PM.

Event Timeline

Build has FAILED

Patch application report for D2994 (id=10633)

Could not rebase; Attempt merge onto 893d285adf...

Updating 893d285..1c7b38c
Fast-forward
 swh/journal/serializers.py             |   6 +-
 swh/journal/tests/test_kafka_writer.py | 141 ++++++++++++++++++++++++++++++-
 swh/journal/writer/kafka.py            | 150 ++++++++++++++++++++++++++++++---
 3 files changed, 279 insertions(+), 18 deletions(-)
Changes applied before test
commit 1c7b38c5acb2a0bc6b0991f74490da22ea02d0ad
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 13:01:11 2020 +0200

    Add delivery notification handling to swh.journal.writer.kafka

commit e9d970ffb1594b98578e6011da0a93627be4deca
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Apr 8 15:22:37 2020 +0200

    Make the default value for the kafka writer's message.max.bytes 100MB
    
    This matches the current configuration everywhere, and the future configuration
    of brokers.

commit 7b9dc4d477b994846c7027481c96e949bbda7fd9
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Apr 8 17:00:42 2020 +0200

    Use Optional[x] instead of Union[x, None]

commit e1ab119cc51aec0440ffaf8431e05b2e60609763
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Apr 8 16:58:51 2020 +0200

    Introduce a KeyType alias in serializers to factor out its definition

commit 08348047d28bfd699a827f6a6b0fcf5dc8625aca
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Apr 8 16:50:57 2020 +0200

    Accept Sequence[ModelObject] instead of List[ModelObject]
    
    The latter is invariant (because List is a mutable type), so it doesn't allow
    passing in lists of ModelObject subtypes.

Link to build: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/25/
See console output for more information: https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/25/console

vlorentz requested changes to this revision.Apr 9 2020, 3:01 PM
vlorentz added a subscriber: vlorentz.

Missing a test for delivery timeouts (you can mock time.monotonic)

swh/journal/writer/kafka.py
79–84

Doesn't need to be an inner function

94

KafkaDeliveryErrors?

97

Iterable is enough

107

impl __repr__ instead, object.__str__ defaults to calling it

108

f"KafkaDeliveryError({self.message}, [{self.pretty_failures()}])"

122

Could you document flush_timeout?

This revision now requires changes to proceed.Apr 9 2020, 3:01 PM
olasd marked 5 inline comments as done.

Rebase, and apply some of @vlorentz's comments

Build is green

Patch application report for D2994 (id=10649)

Could not rebase; Attempt merge onto 7a61d21811...

Updating 7a61d21..1024e95
Fast-forward
 swh/journal/tests/test_kafka_writer.py | 139 +++++++++++++++++++++++++++++++-
 swh/journal/writer/kafka.py            | 143 +++++++++++++++++++++++++++++++--
 2 files changed, 273 insertions(+), 9 deletions(-)
Changes applied before test
commit 1024e95f16409ea519532886f8542d87e5f4e179
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 13:01:11 2020 +0200

    Add delivery notification handling to swh.journal.writer.kafka

commit 799069e6e8798f650fc3d23ccea490cdff9c789d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 16:25:53 2020 +0200

    Add documentation for KafkaJournalWriter arguments

See https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/31/ for more details.

swh/journal/writer/kafka.py
94

Exception names are usually singular, even if they merge multiple errors.

107

I'm not too sure I want to do that.

vlorentz added inline comments.
swh/journal/writer/kafka.py
75

-> str

107

Why not?

This revision is now accepted and ready to land.Apr 9 2020, 5:35 PM

Rebase on top of spun out changes

Build is green

Patch application report for D2994 (id=10670)

Could not rebase; Attempt merge onto 750d4c6c5b...

Updating 750d4c6..3b6e283
Fast-forward
 swh/journal/serializers.py             |  88 +++++++++++++-
 swh/journal/tests/conftest.py          |  41 +++++--
 swh/journal/tests/test_kafka_writer.py | 207 ++++++++++++++++++++++++++-------
 swh/journal/tests/test_replay.py       |  38 +++---
 swh/journal/tests/test_serializers.py  |  26 +++++
 swh/journal/writer/kafka.py            | 158 ++++++++++++++++---------
 6 files changed, 431 insertions(+), 127 deletions(-)
Changes applied before test
commit 3b6e2834328672695453868a7e11fc3ab01a1921
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 13:01:11 2020 +0200

    Add delivery notification handling to swh.journal.writer.kafka

commit 6ef3ee811cd6206f3ac50e2ef8d7cf69f087fb97
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 18:00:10 2020 +0200

    Add a key pretty printer to the serializers module

commit b7a6b52867ffb38703bdcd1082bb4ca6899e3159
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Fri Apr 10 12:45:59 2020 +0200

    Ensure all object keys are passed through unharmed by the journal
    
    The previous implementation of this test only checked keys that were plain
    bytes. It turns out that the "compounded" keys decoding was wrong.
    
    This allows us to simplify the TEST_OBJECT_DICTS structure to only contain a
    list of object dicts.

commit fb21467e8864d2e6e71601038ef0a160b1a33a4f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 17:58:40 2020 +0200

    Move _get_key from the kafka writer to serializers
    
    Use the opportunity to add some trivial smoke tests

commit 9f0cd745d07e7e379446eb82bcfd25e88e6d1d4a
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Fri Apr 10 11:41:54 2020 +0200

    Add a model object based version of the journal writer test objects
    
    This refactors the various conversions happening in the journal writer tests in
    a single place.
    
    When developing this, I noticed that the journal writer would let Content data
    go through; Make sure we do not let it do that.

commit 1c9ccb29eea3b974ea0d3613d61a65cd4e83037b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 18:50:18 2020 +0200

    Rename OBJECT_TYPE_KEYS to TEST_OBJECT_DICTS

See https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/38/ for more details.

Build is green

Patch application report for D2994 (id=10672)

Could not rebase; Attempt merge onto 750d4c6c5b...

Updating 750d4c6..646dedd
Fast-forward
 swh/journal/serializers.py             |  88 +++++++++++++-
 swh/journal/tests/conftest.py          |  41 +++++--
 swh/journal/tests/test_kafka_writer.py | 207 ++++++++++++++++++++++++++-------
 swh/journal/tests/test_replay.py       |  38 +++---
 swh/journal/tests/test_serializers.py  |  26 +++++
 swh/journal/writer/kafka.py            | 158 ++++++++++++++++---------
 6 files changed, 431 insertions(+), 127 deletions(-)
Changes applied before test
commit 646dedd2ba843f9be430146a3db16ef7b6c72e76
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 13:01:11 2020 +0200

    Add delivery notification handling to swh.journal.writer.kafka

commit 5df65c50777325ae710f51a0e93ee94af2791f43
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 18:00:10 2020 +0200

    Add a key pretty printer to the serializers module

commit b7a6b52867ffb38703bdcd1082bb4ca6899e3159
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Fri Apr 10 12:45:59 2020 +0200

    Ensure all object keys are passed through unharmed by the journal
    
    The previous implementation of this test only checked keys that were plain
    bytes. It turns out that the "compounded" keys decoding was wrong.
    
    This allows us to simplify the TEST_OBJECT_DICTS structure to only contain a
    list of object dicts.

commit fb21467e8864d2e6e71601038ef0a160b1a33a4f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 17:58:40 2020 +0200

    Move _get_key from the kafka writer to serializers
    
    Use the opportunity to add some trivial smoke tests

commit 9f0cd745d07e7e379446eb82bcfd25e88e6d1d4a
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Fri Apr 10 11:41:54 2020 +0200

    Add a model object based version of the journal writer test objects
    
    This refactors the various conversions happening in the journal writer tests in
    a single place.
    
    When developing this, I noticed that the journal writer would let Content data
    go through; Make sure we do not let it do that.

commit 1c9ccb29eea3b974ea0d3613d61a65cd4e83037b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 18:50:18 2020 +0200

    Rename OBJECT_TYPE_KEYS to TEST_OBJECT_DICTS

See https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/40/ for more details.

Add test for flush_timeout

This revision is now accepted and ready to land.Apr 10 2020, 6:40 PM

Build is green

Patch application report for D2994 (id=10696)

Could not rebase; Attempt merge onto 63bd07af49...

Updating 63bd07a..7ff372a
Fast-forward
 swh/journal/serializers.py             |  22 +++++
 swh/journal/tests/test_kafka_writer.py | 173 ++++++++++++++++++++++++++++++++-
 swh/journal/tests/test_serializers.py  |  17 ++++
 swh/journal/writer/kafka.py            | 106 ++++++++++++++++++--
 4 files changed, 308 insertions(+), 10 deletions(-)
Changes applied before test
commit 7ff372a02de486b8305c42ed76dac209a6de71b5
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 13:01:11 2020 +0200

    Add delivery notification handling to swh.journal.writer.kafka

commit 944bc0bac58672c9b0f08908965881f72d9305c3
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Fri Apr 10 18:36:56 2020 +0200

    Allow overriding the Producer class in the kafka journal writer

commit a20e557a824c35ee4ebdb03b28504be138ba0baf
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Apr 9 18:00:10 2020 +0200

    Add a key pretty printer to the serializers module

See https://jenkins.softwareheritage.org/job/DJNL/job/tests-on-diff/45/ for more details.

swh/journal/writer/kafka.py
157–159

Jenkins says this is not tested

swh/journal/writer/kafka.py
157–159

It's not a regression from the previous state though is it?