Details
- Reviewers
vlorentz - Group Reviewers
Reviewers - Maniphest Tasks
- T2821: indexer: Improve tests
- Commits
- rDCIDXe0f1199c2102: test_journal_client_cli: Send production objects to journal
tox
Diff Detail
- Repository
- rDCIDX Metadata indexer
- Branch
- master
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 17745 Build 27431: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 27430: arc lint + arc unit
Event Timeline
Build has FAILED
Patch application report for D4641 (id=16471)
Could not rebase; Attempt merge onto 28ae49da34...
Updating 28ae49d..23a036a Fast-forward conftest.py | 6 +- requirements-swh.txt | 2 +- swh/indexer/storage/db.py | 4 +- swh/indexer/tests/conftest.py | 76 ++++++--- swh/indexer/tests/test_cli.py | 275 ++++++++++++++++++++---------- swh/indexer/tests/test_journal_client.py | 172 +++++++++---------- swh/indexer/tests/test_origin_head.py | 34 ++-- swh/indexer/tests/test_origin_metadata.py | 89 +++++++--- 8 files changed, 395 insertions(+), 263 deletions(-)
Changes applied before test
commit 23a036ad20bd88b1f15e469e06fb01d0523424dd Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Wed Dec 2 09:21:51 2020 +0100 wip: test_journal_client: Send production objects to journal for testing It's broken for now as somehow only 1 object pass through the pipe and I don't get why so far. I also tried to use our real journal writer instead of the plain Producer so it match even more the reality of this test but that just hangs. So I reverted because, one problem at a time... Related to T2821 commit 305b265359e503459b78ec6de43b64fb0b31a511 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Dec 1 17:55:36 2020 +0100 test_journal_client: Migrate away from mocks Related to D4638 commit 394576b95688b867a86a5bcb4da30534640a0577 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Tue Dec 1 15:40:01 2020 +0100 tests: Use production backends within the indexer tests This detected some paper cuts within cli tests for example. The main goal is to decrease friction when actually deploying indexer related services (backend, indexers, ...). The pg backends tests should still be reasonably fast as it's using the swh.core.db.pytest_plugin (which truncate tables in between tests). Related to T2821
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/133/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/133/console
When I suggested this, I figured this would be configuring swh_storage to use a journal writer, rather than reimplementing that yourself.
producer.produce is asynchronous; You need to wait for the delivery callback before being able to check the messages at the other end of the pipeline. There's a good chance the client exhausts the messages available before they're actually all sent.
When I suggested this, I figured this would be configuring swh_storage to use a journal writer, rather than reimplementing that yourself.
That did not occur to me. Interesting.
I'll check that.
Thanks.
When I suggested this, I figured this would be configuring swh_storage to use a journal writer, rather than reimplementing that yourself.
That's actually annoying... create an origin-visit-status requires to create the origin visits first which requires to create the origins...
I actually used a journal writer nonetheless (independently from a storage), it no longer hangs [1] as initially said in the diff description.
That's still finding only 1 message as well though...
so...
producer.produce is asynchronous; You need to wait for the delivery callback before being able to check the messages at the other end of the pipeline. There's a good chance the client exhausts the messages available before they're actually all sent.
I guess this holds true for the journal writer as well.
I'll update the diff with that checkout so it's actually demonstrated.
[1] hang solved: maybe the input in the get_journal_writer was incorrect (broker expected a list of brokers, not a single string...).
Build has FAILED
Patch application report for D4641 (id=16519)
Rebasing onto f775fea12b...
Current branch diff-target is up to date.
Changes applied before test
commit 133de321ead3e33c9cdfd45cc2ceca5ca7702c97 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Thu Dec 3 15:56:43 2020 +0100 test_cli: Use a journal_writer instead of a plain Producer It's able to deal with swh.model.model objects. commit 84e2ade7612d974aff977a77bdc7f8cbe5cf8f8d Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Wed Dec 2 09:21:51 2020 +0100 wip: test_journal_client: Send production objects to journal for testing It's broken for now as somehow only 1 object pass through the pipe and I don't get why so far. I also tried to use our real journal writer instead of the plain Producer so it match even more the reality of this test but that just hangs. So I reverted because, one problem at a time... Related to T2821
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/136/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/136/console
I'll check more properly how those are tested in the actual journal... There might be hints there.
Thanks @vlorentz which pointed out that there is one or more tasks (undeterministically).
The scheduled task arguments contains the origin-visit-status which are grouped together...
So the test need to be reworked a tad.
swh/indexer/tests/test_cli.py | ||
---|---|---|
477 | this makes the test succeed even if there are no tasks at all. Why not an equality? |
swh/indexer/tests/test_cli.py | ||
---|---|---|
477 | ah i forgot to deal with the non-deterministic part... ... note that currently, given the celery stuff issue, tests will timeout... |
Build was aborted
Patch application report for D4641 (id=16554)
Rebasing onto f775fea12b...
Current branch diff-target is up to date.
Changes applied before test
commit 449ce394bb3866a9093b4be8f1451550f14c7210 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Fri Dec 4 13:26:18 2020 +0100 test_cli_journal_client: Rework assertions commit 133de321ead3e33c9cdfd45cc2ceca5ca7702c97 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Thu Dec 3 15:56:43 2020 +0100 test_cli: Use a journal_writer instead of a plain Producer It's able to deal with swh.model.model objects. commit 84e2ade7612d974aff977a77bdc7f8cbe5cf8f8d Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Wed Dec 2 09:21:51 2020 +0100 wip: test_journal_client: Send production objects to journal for testing It's broken for now as somehow only 1 object pass through the pipe and I don't get why so far. I also tried to use our real journal writer instead of the plain Producer so it match even more the reality of this test but that just hangs. So I reverted because, one problem at a time... Related to T2821
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/137/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/137/console
Build was aborted
Patch application report for D4641 (id=16557)
Rebasing onto f775fea12b...
Current branch diff-target is up to date.
Changes applied before test
commit 5644ba47267266600927905d94057804ee978a50 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Fri Dec 4 13:26:18 2020 +0100 test_cli_journal_client: Rework assertions commit 133de321ead3e33c9cdfd45cc2ceca5ca7702c97 Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Thu Dec 3 15:56:43 2020 +0100 test_cli: Use a journal_writer instead of a plain Producer It's able to deal with swh.model.model objects. commit 84e2ade7612d974aff977a77bdc7f8cbe5cf8f8d Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Wed Dec 2 09:21:51 2020 +0100 wip: test_journal_client: Send production objects to journal for testing It's broken for now as somehow only 1 object pass through the pipe and I don't get why so far. I also tried to use our real journal writer instead of the plain Producer so it match even more the reality of this test but that just hangs. So I reverted because, one problem at a time... Related to T2821
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/138/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/138/console
swh/indexer/tests/test_cli.py | ||
---|---|---|
478–481 | remove assert origin in set([vs.origin for vs in visit_statuses_full]) and replace assert len(actual_origins) == len(visit_statuses_full) with assert actual_origins == {vs.origin for vs in visit_statuses_full} it's equivalent but simpler. |
- Adapt according to last suggestion (neat :)
- Rework commit messages (squash and adapt message)
swh/indexer/tests/test_cli.py | ||
---|---|---|
478–481 | neat ;) |
Build was aborted
Patch application report for D4641 (id=16562)
Rebasing onto f775fea12b...
Current branch diff-target is up to date.
Changes applied before test
commit e0f1199c210245fab00e12136a49295c902b3efc Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Wed Dec 2 09:21:51 2020 +0100 test_journal_client_cli: Send production objects to journal The reason for this is to avoid surprises like the indexer journal client stuck in limbo for a while. Related to T2821 Related to T2814
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/139/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/139/console
Build was aborted
Patch application report for D4641 (id=16562)
Rebasing onto f775fea12b...
Current branch diff-target is up to date.
Changes applied before test
commit e0f1199c210245fab00e12136a49295c902b3efc Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Wed Dec 2 09:21:51 2020 +0100 test_journal_client_cli: Send production objects to journal The reason for this is to avoid surprises like the indexer journal client stuck in limbo for a while. Related to T2821 Related to T2814
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/140/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/140/console
Build is green
Patch application report for D4641 (id=16562)
Rebasing onto f775fea12b...
Current branch diff-target is up to date.
Changes applied before test
commit e0f1199c210245fab00e12136a49295c902b3efc Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org> Date: Wed Dec 2 09:21:51 2020 +0100 test_journal_client_cli: Send production objects to journal The reason for this is to avoid surprises like the indexer journal client stuck in limbo for a while. Related to T2821 Related to T2814
See https://jenkins.softwareheritage.org/job/DCIDX/job/tests-on-diff/141/ for more details.