
docker-compose: Add journal stack
ClosedPublic

Authored by ardumont on Dec 19 2018, 8:42 PM.

Details

Reviewers
olasd
douardda
vlorentz
Group Reviewers
Reviewers
Commits
rDENV187730ea9d73: publisher: Fix default configuration, directory does not exist yet
rDENV47cc9067f1ed: swh-journal-client: Fix logging and configuration
rDENVb2f3c73311ef: storage-listener: Fix .pgpass rights
rDENVe07dda8d4bbb: docker-compose: Remove unused volume
rDENV5bc984e146f7: Ignore docker-compose.override.yml
rDENVccb3294e4bbe: publisher: Fix default configuration
rDENV9e2c20ed0024: swh-journal-publisher: Run in debug mode
rDENV7dc5690ebfda: swh-journal-publisher: Simplify default configuration
rDENVf8d5ab294024: kafka.env: Simplify default configuration
rDENV6e51e31e1016: swh-journal-publisher: Update default configuration
rDENVc9cba911600d: journal/listener: Fix default configuration
rDENV24a84e493fe5: storage-listener: Add --verbose flag
rDENV4416afad1fa2: storage-listener: Add the missing runtime dependency
rDENV81cf4e4c625f: swh-journal-client: Add a journal sample client
rDENV64f92fa0f21e: kafka: Update the default expected swh topics
rDENV43202f74d3e6: Add swh-journal's publisher service
rDENV15b3403b9039: swh-storage: Fix mix tab/spaces by removing all tabs
rDENVef0f4cbfdf78: swh-storage-listener: Simplify default configuration
rDENV95b98a181e01: docker-compose: Use kafka.env file
rDENVc90ea18154a1: docker-compose: Add swh-storage-listener
rDENV666c2225a635: docker-compose: Make kafka start as a single node
rDENVe38a2967b894: docker-compose: Add kafka stack
rCDFDb2f3c73311ef: storage-listener: Fix .pgpass rights
rCDFD187730ea9d73: publisher: Fix default configuration, directory does not exist yet
rCDFDe07dda8d4bbb: docker-compose: Remove unused volume
rCDFD47cc9067f1ed: swh-journal-client: Fix logging and configuration
rCDFD5bc984e146f7: Ignore docker-compose.override.yml
rCDFD7dc5690ebfda: swh-journal-publisher: Simplify default configuration
rCDFDccb3294e4bbe: publisher: Fix default configuration
rCDFD9e2c20ed0024: swh-journal-publisher: Run in debug mode
rCDFDf8d5ab294024: kafka.env: Simplify default configuration
rCDFDc9cba911600d: journal/listener: Fix default configuration
rCDFD6e51e31e1016: swh-journal-publisher: Update default configuration
rCDFD64f92fa0f21e: kafka: Update the default expected swh topics
rCDFD4416afad1fa2: storage-listener: Add the missing runtime dependency
rCDFD24a84e493fe5: storage-listener: Add --verbose flag
rCDFD81cf4e4c625f: swh-journal-client: Add a journal sample client
rCDFD15b3403b9039: swh-storage: Fix mix tab/spaces by removing all tabs
rCDFDef0f4cbfdf78: swh-storage-listener: Simplify default configuration
rCDFD43202f74d3e6: Add swh-journal's publisher service
rCDFDc90ea18154a1: docker-compose: Add swh-storage-listener
rCDFD666c2225a635: docker-compose: Make kafka start as a single node
rCDFD95b98a181e01: docker-compose: Use kafka.env file
rCDFDe38a2967b894: docker-compose: Add kafka stack
Summary

Add:

  • journal (through kafka/zookeeper instance)
  • swh-storage-listener
  • swh-journal-publisher
  • swh-journal-client (dedicated client for the occasion)

The scenario is:

  • SQL triggers on the swh db emit an event for each newly inserted object
  • the listener picks up those events and publishes them to the journal on temporary topics
  • the publisher, subscribed to the listener's topics, picks up those events, reifies the associated objects, and publishes them to the journal on new public topics
  • the client, subscribed to the origin/origin-visit public topics, picks those up and logs them
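
For reviewers who want to reason about the flow without starting the whole stack, the scenario above can be sketched with an in-memory stand-in for Kafka. This is only an illustration under assumptions, not the swh-journal code: the `Journal` class, the `listener`/`publisher`/`client` functions, the `storage` dict and the temporary topic prefix are all hypothetical names; only the public prefix `swh.journal.objects` is taken from the client logs in the test plan.

```python
from collections import defaultdict

# Hypothetical prefix for the listener's temporary topics (an assumption).
TMP_PREFIX = "swh.tmp_journal.new"
# Public prefix as it appears in the client logs of the test plan.
PUBLIC_PREFIX = "swh.journal.objects"


class Journal:
    """In-memory stand-in for the Kafka journal: topic name -> messages."""

    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        self.topics[topic].append(message)

    def consume(self, topic):
        # Return every pending message on the topic and empty the topic.
        messages, self.topics[topic] = self.topics[topic], []
        return messages


def listener(journal, db_events):
    """Forward (object_type, object_id) trigger events to temporary topics."""
    for object_type, object_id in db_events:
        journal.publish(f"{TMP_PREFIX}.{object_type}", object_id)


def publisher(journal, storage, object_types):
    """Reify each id into a full object and publish it on a public topic."""
    for object_type in object_types:
        for object_id in journal.consume(f"{TMP_PREFIX}.{object_type}"):
            obj = storage[object_type][object_id]
            journal.publish(f"{PUBLIC_PREFIX}.{object_type}", obj)


def client(journal, object_types):
    """Collect the messages from the public topics, grouped by object type."""
    received = defaultdict(list)
    for object_type in object_types:
        messages = journal.consume(f"{PUBLIC_PREFIX}.{object_type}")
        if messages:
            received[object_type].extend(messages)
    return received


# Usage: one origin and one visit of it, keyed by made-up ids.
storage = {
    "origin": {2: {"id": 2, "type": "git"}},
    "origin_visit": {(2, 4): {"origin": 2, "visit": 4, "status": "ongoing"}},
}
journal = Journal()
listener(journal, [("origin", 2), ("origin_visit", (2, 4))])
publisher(journal, storage, ["origin", "origin_visit"])
received = client(journal, ["origin", "origin_visit"])
print(dict(received))
```

The temporary topics carry only identifiers, while the public topics carry full objects; that split is why the publisher needs access to the storage and the client does not.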

Depends on D859
Depends on D863

Related T1443

Test Plan
docker-compose up

Load a repository; the swh-journal-client logs should then show the events received (origin, origin-visit):

2018-12-20 13:20:35,344 1 INFO Setting newly assigned partitions {TopicPartition(topic='swh.journal.objects.origin_visit', partition=0), TopicPartition(topic='swh.journal.objects.origin', partition=0)} for group swh.journal.client
2018-12-20 13:20:35,404 1 INFO client received the following messages: defaultdict(<class 'list'>, {'origin_visit': [{b'date': b'2018-12-20 13:03:46.600807+00:00', b'metadata': None, b'origin': 2, b'snapshot': None, b'status': b'ongoing', b'visit': 4}, {b'date': b'2018-12-20 13:03:46.600807+00:00', b'metadata': None, b'origin': 2, b'snapshot': b'|4\x99n(\x8d\x98\xae\xa4\xed\xb2\x9eg\xaeT\x1a\xb2\x8b\xac\xd8', b'status': b'full', b'visit': 4}]})
2018-12-20 13:20:49,681 1 INFO client received the following messages: defaultdict(<class 'list'>, {'origin_visit': [{b'date': b'2018-12-20 13:20:33.234323+00:00', b'metadata': None, b'origin': 2, b'snapshot': None, b'status': b'ongoing', b'visit': 5}, {b'date': b'2018-12-20 13:20:33.234323+00:00', b'metadata': None, b'origin': 2, b'snapshot': b'|4\x99n(\x8d\x98\xae\xa4\xed\xb2\x9eg\xaeT\x1a\xb2\x8b\xac\xd8', b'status': b'full', b'visit': 5}]})
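
The logged messages use bytes for both keys and values, and the snapshot identifier is 20 raw bytes, which makes the output hard to compare by eye. A small helper along these lines could make a record readable when checking the test plan. It is a sketch only: `readable` is a hypothetical name, and showing the snapshot as hex is a display choice made here, not something the client does.

```python
def readable(record):
    """Return a copy of a journal record with str keys and printable values."""
    out = {}
    for key, value in record.items():
        if isinstance(key, bytes):
            key = key.decode()
        if isinstance(value, bytes):
            # Identifiers are raw bytes: show them as hex. Other bytes
            # values in these records are plain text.
            value = value.hex() if key == "snapshot" else value.decode()
        out[key] = value
    return out


# Second origin_visit message from the first log line above.
visit = {
    b'date': b'2018-12-20 13:03:46.600807+00:00',
    b'metadata': None,
    b'origin': 2,
    b'snapshot': b'|4\x99n(\x8d\x98\xae\xa4\xed\xb2\x9eg\xaeT\x1a\xb2\x8b\xac\xd8',
    b'status': b'full',
    b'visit': 4,
}
print(readable(visit))
```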

Diff Detail

Repository
rCDFD Dockerfiles for developers
Branch
add-journal
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 3199
Build 4101: arc lint + arc unit

Event Timeline

  • journal/listener: Fix default configuration
  • kafka.env: Simplify default configuration
  • swh-journal-publisher: Update default configuration
  • swh-journal-publisher: Run in debug mode
  • swh-journal-publisher: Simplify default configuration
vlorentz added inline comments.
docker-compose.yml
20

why?

dockerfiles/swh-storage-listener/entrypoint.sh
30

2>&1 to redirect stderr to stdout.

ardumont added inline comments.
docker-compose.yml
20

i don't know, i'm planning on checking if it's useful or not.

dockerfiles/swh-storage-listener/entrypoint.sh
30

lol
Right you are!
thanks

It's not completely ready yet. For example:

  • The client does nothing for the moment.
  • The listener and publisher are working, but not completely.
  • The publisher somehow does not seem to do anything with regard to origin/origin-visits.
docker-compose.yml
20

It's useful only if Kafka needs to spawn other containers. We don't do that, do we?

ardumont added inline comments.
docker-compose.yml
20

No, well not intentionally at least ;)

  • publisher: Fix default configuration
  • Ignore docker-compose.override.yml
  • docker-compose: Remove unused volume
  • publisher: Fix default configuration, directory does not exist yet
  • swh-journal-client: Fix logging and configuration
This revision is now accepted and ready to land. Dec 20 2018, 2:04 PM
ardumont edited the test plan for this revision.
ardumont edited the test plan for this revision.
  • storage-listener: Fix .pgpass rights
ardumont edited the test plan for this revision.
This revision was automatically updated to reflect the committed changes.