Page MenuHomeSoftware Heritage

tests: refactor main storage tests
ClosedPublic

Authored by douardda on Tue, Oct 8, 4:52 PM.

Details

Summary
  • use pytest instead of unittest.TestCase plumbing
  • extract data from the TestStorageData into a data storage_data module; this module also provide a simple helper StorageData class that mimics the original class (access by attributes),
  • implement a series of pytest fixtures for these storage specific tests,
  • get rid of most hypothesis-based tests.

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
vlorentz added inline comments.Tue, Oct 8, 5:03 PM
swh/storage/tests/conftest.py
60

and by sha1_git

92–96

swh.storage.storage does not support psycopg2cffi.

vlorentz added inline comments.Tue, Oct 8, 5:08 PM
swh/storage/tests/conftest.py
22

nvm

ardumont added inline comments.
swh/storage/tests/conftest.py
42

'memory' (we fixed it together ;)

also olasd migrated the existing entries in storage recently.

douardda updated this revision to Diff 7137.Fri, Oct 11, 3:16 PM

mark test_in_memory and test_api_client as xfail

until D2095 and D2119

douardda updated this revision to Diff 7140.Fri, Oct 11, 3:44 PM

rebase + fix tests in tox (update mypy.ini and requirements-test.txt)

douardda updated this revision to Diff 7141.Fri, Oct 11, 3:56 PM

also xfail test_snapshot (until it has been ported to pytest/pytest_postgresql)

in D2117

douardda updated this revision to Diff 7142.Fri, Oct 11, 4:07 PM

xfail does not mean xerror...

douardda updated this revision to Diff 7153.Fri, Oct 11, 4:19 PM

also fix test_db to not depends on running tests via pifpaf

douardda updated this revision to Diff 7156.Fri, Oct 11, 4:42 PM

use memory cls for the journal_writer instead of deprecated inmemory

vlorentz requested changes to this revision.Fri, Oct 11, 4:45 PM

I think the xfails should have a comment for future readers stumbling upon this commit.

swh/storage/tests/conftest.py
50–61

not done

60

not done

92–96

not done

tox.ini
28–35

Why?

This revision now requires changes to proceed.Fri, Oct 11, 4:45 PM
douardda added inline comments.Fri, Oct 11, 4:54 PM
swh/storage/tests/conftest.py
92–96

well, I would have expected a comment about the fact there is no comment in this code stating that this function is stolen from pytest-postgresql ;-), and explaining why it is needed.

tox.ini
28–35

this chunk could be moved in D2095 I guess. I removed it since I gradually drop support for ENABLE_ORIGIN_IDS in tests.

douardda updated this revision to Diff 7159.Fri, Oct 11, 5:06 PM

Reactivate [testenv:py3-no-origin-ids] in tox

vlorentz added inline comments.Fri, Oct 11, 5:07 PM
tox.ini
28–35

py3-no-origin-ids runs tests for the new behavior. Now tox only runs tests for the old behavior.

douardda updated this revision to Diff 7160.Fri, Oct 11, 5:23 PM

add some comment in conftest to explain a bit the postgres stuff

and add sha1_git as unicity key for the hypothesis based gen_contents
helper function.

douardda marked 4 inline comments as done.Fri, Oct 11, 5:26 PM
douardda added inline comments.
swh/storage/tests/conftest.py
50–61

I do prefer gen_xxx since it's a function that generates these objects. xxx_lists seems less clear to me.

vlorentz requested changes to this revision.Fri, Oct 11, 5:32 PM
vlorentz added inline comments.
swh/storage/tests/conftest.py
50–61

but it doesn't follow Hypothesis' naming convention.

64–77

If we're going with the gen_ prefix, then these functions should have it too.

And swh_origins is way too different from swh_contents to have such a close name.

74

why 20?

using these many origins even for tests that won't use it wil make Hypothesis explore much more space than needed.

This revision now requires changes to proceed.Fri, Oct 11, 5:32 PM
douardda marked an inline comment as done.Fri, Oct 11, 5:57 PM
douardda added inline comments.
swh/storage/tests/conftest.py
74

Have you noticed I do not use hypothesis for exploring anything here? I'm using it as an easy way of generating (fixed-size) datasets.

But, it would make sense to use the gen_origins() in this fixture, instead of reimplementing it.

douardda added inline comments.Fri, Oct 11, 5:59 PM
swh/storage/tests/conftest.py
64–77

I agree with you, the names for swh_origins and swh_contents are confusing here.

74

20: because I do not ask hypothesis to explore anything, I just generate ONE list of 20 (by default) origin/content objects.

ardumont added inline comments.Fri, Oct 11, 6:38 PM
swh/storage/tests/conftest.py
18

Don't we want a fixture sqldir similar to the datadir we introduced in swh-core for the sql files?

38

Isn't a memory objstorage enough?

74

But, it would make sense to use the gen_origins() in this fixture, instead of reimplementing it.

Yes, i don't get why you did not do it since it's the same code ;)

ardumont added inline comments.Fri, Oct 11, 6:43 PM
swh/storage/tests/conftest.py
82

I don't even know what's the DBJanitor you are talking about ;)

ardumont added inline comments.Fri, Oct 11, 6:46 PM
swh/storage/tests/data/storage.yml
10

?
If we use a memory objstorage, we can skip that (i guess there is some stuff replacing this somewhere ;)

14

memory but that might be fixed in other commits...

Looks good to me.

Got a couple of questions/remarks but nothing too blocking.

ardumont accepted this revision.Fri, Oct 11, 6:46 PM
douardda updated this revision to Diff 7174.Mon, Oct 14, 10:53 AM

fix definition and usage of swh_contents and swh_origins fixtures

also add a revision that use an in memory obj storage in swh_storage fixture

ardumont accepted this revision.Mon, Oct 14, 11:06 AM
douardda updated this revision to Diff 7184.Mon, Oct 14, 12:11 PM

revert usage for self._test_origin_ids and use_url to fix py3-no-origin-ids

also replace boolean hypothesis strategy by pytest.mark.parametrize, so we do
not need to reset the storage in these tests (since the test is truly
executed twice).

also replace boolean hypothesis strategy by pytest.mark.parametrize, so we do
not need to reset the storage in these tests (since the test is truly
executed twice).

Nice!

douardda edited the summary of this revision. (Show Details)Mon, Oct 14, 1:04 PM
douardda edited the test plan for this revision. (Show Details)
ardumont accepted this revision.Mon, Oct 14, 1:13 PM
vlorentz accepted this revision.Mon, Oct 14, 1:19 PM
This revision is now accepted and ready to land.Mon, Oct 14, 1:19 PM
douardda updated this revision to Diff 7202.Mon, Oct 14, 2:23 PM
douardda edited the summary of this revision. (Show Details)

smalle rework related to skipping a few tests due to origin id handling

This revision was automatically updated to reflect the committed changes.