Details
- Reviewers
olasd - Group Reviewers
Reviewers - Maniphest Tasks
- T3603: Replace stateful hypothesis strategies by pytest fixtures
- Commits
- rDWAPPSc2d10c90c3bd: tests: Turn origin* hypothesis strategies into pytest fixtures
Diff Detail
- Repository
- rDWAPPS Web applications
- Branch
- origin-strategies-to-pytest-fixtures
- Lint
No Linters Available - Unit
No Unit Test Coverage - Build Status
Buildable 24383 Build 38056: Phabricator diff pipeline on jenkins Jenkins console · Jenkins Build 38055: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D6458 (id=23466)
Could not rebase; Attempt merge onto d7c16735a4...
Updating d7c16735..c2d10c90 Fast-forward swh/web/tests/api/test_apiresponse.py | 4 +- swh/web/tests/api/test_utils.py | 17 +- swh/web/tests/api/views/test_content.py | 12 - swh/web/tests/api/views/test_directory.py | 7 - swh/web/tests/api/views/test_graph.py | 6 +- swh/web/tests/api/views/test_identifiers.py | 10 +- swh/web/tests/api/views/test_origin.py | 5 +- swh/web/tests/api/views/test_release.py | 4 +- swh/web/tests/api/views/test_revision.py | 4 +- swh/web/tests/api/views/test_snapshot.py | 13 +- swh/web/tests/api/views/test_vault.py | 17 +- swh/web/tests/browse/test_snapshot_context.py | 37 ++- swh/web/tests/browse/views/test_content.py | 190 +++++++-------- swh/web/tests/browse/views/test_directory.py | 67 +++--- swh/web/tests/browse/views/test_identifiers.py | 14 +- swh/web/tests/browse/views/test_origin.py | 86 +++---- swh/web/tests/browse/views/test_release.py | 14 +- swh/web/tests/browse/views/test_revision.py | 5 +- swh/web/tests/common/test_archive.py | 47 +--- swh/web/tests/common/test_identifiers.py | 113 +++++---- swh/web/tests/common/test_origin_visits.py | 21 +- swh/web/tests/conftest.py | 307 ++++++++++++++++++++++++- swh/web/tests/misc/test_badges.py | 10 +- swh/web/tests/misc/test_iframe.py | 12 +- swh/web/tests/strategies.py | 265 +-------------------- 25 files changed, 601 insertions(+), 686 deletions(-)
Changes applied before test
commit c2d10c90c3bd6f5fa3634fa456475c3fa92ae194 Author: Antoine Lambert <anlambert@softwareheritage.org> Date: Tue Oct 12 13:48:57 2021 +0200 tests: Turn origin* hypothesis strategies into pytest fixtures Related to T3603 commit 81486f517526f7cb94e0ca2ed49433ebd7c3d1b3 Author: Antoine Lambert <anlambert@softwareheritage.org> Date: Tue Oct 12 11:45:43 2021 +0200 tests: Turn directory* hypothesis strategies into pytest fixtures Related to T3603 commit 23a8dc33453fdcf3fb0037ba1c0c76752d40c1c4 Author: Antoine Lambert <anlambert@softwareheritage.org> Date: Mon Oct 11 15:17:58 2021 +0200 tests: Turn content* hypothesis strategies into pytest fixtures Using such hypothesis strategies have several drawbacks so let's turn them into pytest fixtures as it feels a better way to do so. Related to T3603
See https://jenkins.softwareheritage.org/job/DWAPPS/job/tests-on-diff/1164/ for more details.
Thanks for working on reducing the number of hypothesis fixtures!
I'm a bit concerned about the reproducibility of test results, given fixtures that pull random list elements, with no control on the sequence of test executions and on the seed of the python random module when the fixture is called. (Now that I've looked at swh.web.tests.data, I'm even more concerned :-))
I don't have an answer about "what to use?", unfortunately, except just going for exhaustive tests (i.e. running the test functions for *all* values of the origins in the test data set), which doesn't sound very compelling unless the size of the sample dataset is small, which doesn't look to be the case.
https://github.com/pytest-dev/pytest/issues/5463 has some background about concerns with respect to random seeding in tests.
Apart from that, I see that some of the function-level fixtures are doing ""heavy"" querying on the test data for information that is, in effect, static (e.g. the list of origins with more than two visits, etc.). I wonder if it would be possible to extract this logic to only run it once on initialization of the test data?
I initially wrote: we may want to initialize a single, module scoped seed_storage fixture with all data inserted, and make the storage fixture used by tests a function-scoped fixture which would clone this seed storage instance - I assume some tests have to *write* to the storage, so you can't just have one global read only storage fixture - but I now see that's what swh.web.tests.data does. Maybe _init_tests_data could be turned into that seed_storage module-scoped pytest fixture, instead of the current ad-hoc logic? This would also help us control the random seed used for generating the test data (allowing us to override it to reproduce test results)?
@olasd Could you open a task, so anlambert can land this stack of diffs now before we discuss the next step?
Thanks for pointing that out. I was also wondering how to make failing tests reproducible when random input data are involved.
I think one possible solution would be to:
- define a function scoped fixture that set the random seed to current time before test execution and reset random state on test teardown
- implement custom pytest reporting for displaying the seed used for each failing test
- add a new pytest option to explicitely set the random seed from CLI in order to reproduce any failing test
Regarding static tests data, we could use functools.lru_cache to avoid executing code returning the same result multiple times.
I will handle these improvements in other diffs, meanwhile we can land this pile of diffs as the previous tests behavior remain unchanged.
(That is, I don't have a problem with these changes landing first, as long as we make sure that eventually we have a proper way of reproducing test failures that have come out of "random" fixtures.)
I might have a proper solution to reproduce test failures involving random fixtures. I am going to land that pile of diffs and submit a new one for reproducibility afterwards.