Page MenuHomeSoftware Heritage

Factorize StorageReplayer and JournalClient.
ClosedPublic

Authored by vlorentz on May 22 2019, 2:26 PM.

Diff Detail

Repository
rDJNL Journal infrastructure
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Finish handling polled message after max_message is reached,
so we don't commit un-handled messages.

I am not very happy with the API of the JournalClient (it's not the result of this diff, but since we are discussing refactorings in there...)

I don't like to be obliged to use inheritance to, in fact, only define a callback. Why not pass this as a callable argument to JournalClient.process()?

Doing so, the StorageReplayer class is not needed anymore and insert_objects becomes a simple function (and JournalClient must not be abstract any more either).

douardda added inline comments.
swh/journal/replay.py
31–52

This code could also be simplified. At least there is no need for keeping the 'content' in the list of checked object type in the first if statement since the first statement in the block is if object_type == 'content':

I mean this should be enough:

if object_type == 'content':
  # do stuff
elif object_type == 'origin_visit':
  # do stuff
elif content_type in (remaining_types):
  # do stuff
else:
  raise ValueError()

This could also be implemeted as a visitor, but meh.

This revision now requires changes to proceed.May 22 2019, 3:47 PM
swh/journal/client.py
87–91

In fact I don't really see the advantage of implementing the max_messages behavior in there.

The process should be a process_one_poll (or so) and let the caller handle the max messages logic.

  • Make JournalClient take a worker function, instead of using inheritence.
  • Make JournalClient take a worker function, instead of using inheritence.

thanks!

This revision is now accepted and ready to land.May 24 2019, 2:45 PM
This revision was automatically updated to reflect the committed changes.