Page MenuHomeSoftware Heritage

add a content replayer service
ClosedPublic

Authored by douardda on Jun 5 2019, 12:19 PM.

Details

Summary
  • service that listen to the 'swh.journal.objects.content' topic and replicate object blobs from a source object storage to a destination one.
  • refactor replayer tests; implement the whole kafka mocking logic in the Mocked* classes and delete the test_write_replay_same_order which was not testing anything more than what the orther test do.
  • client: simplify a bit the JournalClient class no need for those poll() and commit() proxy methods.

Depends on D1540

Diff Detail

Repository
rDJNL Journal infrastructure
Branch
master
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 6023
Build 8278: tox-on-jenkinsJenkins
Build 8277: arc lint + arc unit

Event Timeline

vlorentz added a subscriber: vlorentz.
vlorentz added inline comments.
swh/journal/cli.py
165–177

I think these error messages should tell the exact name of the missing key, so it's easier to look for them in an example config file.

(And we should also have an example config file)

179–183

We should raise an error if there is no prefix or group id.

swh/journal/client.py
81–86

These methods are mocked in the tests.

swh/journal/tests/test_write_replay.py
44–49

not needed in a writer.

You should put that in a new class, eg. MockConsumer.

55

A writer is not a consumer. You would want to use MockConsumer instead.

107

same

This revision now requires changes to proceed.Jun 5 2019, 12:27 PM
swh/journal/cli.py
179–183

not for the prefix, since the default value is fine (and is the one that will be used in most of the cases... TBH, I don't really see the reason this is configurable).

group_id is indeed a mandatory config option and should not have a default value...

swh/journal/client.py
81–86

not any more :-)

swh/journal/tests/test_write_replay.py
44–49

this mock is both a writer and a consumer in this test. I mostly just merged several sparse functions in this class. I was not very pleased with the way these tests were written before, so I did this, but the result does not please me too much neither...

55

as said before, this mockup is not a nice mockup of a kafka consumer nor a kafka producer, but some sort of frankenstein of both. It does the job, but is not so pretty.

Also, your new CLI needs tests :)

swh/journal/cli.py
179–183

if prefix is None and there is no prefix in the config, then prefix will be set to None, not to DEFAULT_PREFIX. And we don't want that.

swh/journal/client.py
81–86

I forgot to remove that comment after actually reading the diff, sorry

swh/journal/tests/test_write_replay.py
44–49

this mock is both a writer and a consumer in this test

these are two different things, so they should be different classes.

but the result does not please me too much neither...

Did you try using unittest.mock.patch?

swh/journal/cli.py
179–183

indeed we do not want that. I did write the correct version of this stuff at some point but I guess I messed during a rebase or smthg

swh/journal/tests/test_write_replay.py
44–49

please note I "just" refactored existing tests...

should address vlortenz's comments

swh/journal/replay.py
45–48

inconsistent with the behavior of _insert_objects (which asserts False).

I have a preference to using the existing behavior (crashing) for both instead of just logging, but that's up to you

swh/journal/replay.py
45–48

Thinking about it, it's probably better to log.

Could you raise the log level to warning (so it doesn't go unnoticed), and make _insert_objects do that too?

swh/journal/replay.py
45–48

I think I also prefer this second solution, i will

Also, your new CLI needs tests :)

This one is actually trickier to write...

replay: log warnings in case of unexpected object types

as suggested by vlorentz.

add a TODO in test_cli about content-replay

add a TODO in test_cli about content-replay

You're cheating!

  • Add a content replayer service
  • cli: extract the JournalClient instanciation in a dedicated function
  • replay: make unexpectedly received object_type log a warning
This revision is now accepted and ready to land.Jun 7 2019, 3:10 PM
This revision was landed with ongoing or failed builds.Jun 7 2019, 4:07 PM
This revision was automatically updated to reflect the committed changes.