Page MenuHomeSoftware Heritage

Migrate loader tests to use pytest
ClosedPublic

Authored by ardumont on Jul 6 2020, 1:49 PM.

Details

Summary

This simplifies the base classes indirection introduced to make the tests run.

It keeps the unittest scaffolding to allow declaring tests once and reuse
amongst the different loader instances (GitLoader, GitLoaderFromDisk,
GitLoaderFromArchive).

This uses the pg-storage and no longer the in-memory storage.

The code coverage increased from 82% to 85%.

This duplicates a bit helper function from the loader-core (equivalent code
exists in loader-svn). I'll attend to this to remove that duplication soon (the
diff could land without ;)

Related to T2482

Test Plan

tox

Diff Detail

Repository
rDLDG Git loader
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D3428 (id=12142)

Rebasing onto 55ff84b1a6...

Current branch diff-target is up to date.
Changes applied before test
commit 07419a649ec0c02f98d2a0e8ee62c5b99dacf1d8
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Jul 6 12:41:57 2020 +0200

    Migrate loader tests to use pytest
    
    This simplifies the base classes indirection introduced to make the tests run.
    
    It keeps the unittest scaffolding to allow declaring tests once and reuse
    amongst the different loader instances (GitLoader, GitLoaderFromDisk,
    GitLoaderFromArchive).
    
    The code coverage increased from 82% to 85%.

See https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/27/ for more details.

ardumont added inline comments.
swh/loader/git/tests/__init__.py
22

I will definitely move that to the loader-core now.
Also, note that the uncompress_archive option is no longer needed (i dropped its use).

swh/loader/git/tests/conftest.py
49

That should probably go up in the loader-core now.

swh/loader/git/tests/__init__.py
22

That's pretty nice, thanks.

Could we make the check_snapshot function check object references recursively? I think checking the statistics output by the loader is already pretty good, but we've had, in the past, loaders generate buggy edges. Making a recursive check of all objects pointed at by the snapshot would alleviate these concerns completely (and for all loaders).

MANIFEST.in
5

No need for both * and *.xz.

requirements-test.txt
2

I don't see you instantiating a postgresql anywhere. I guess this should read swh.storage[testing].

swh/loader/git/tests/conftest.py
13

I understand why this is going on, but instead of generalizing this (somewhat clunky) pattern I think the relevant fixtures should be properly documented and moved to a dedicated pytest plugin in the source modules. I've ended up doing it for swh.scheduler in D3430, it's not too bad.

(No need to do it right now: you can keep this import *, do the refactoring of swh.storage test fixtures in parallel, and drop the import * when that's done. On the release that moves the pytest fixtures, they'll get auto-imported instead of being pulled in by the import *, so the switch over is backwards compatible.)

Could we make the check_snapshot function check object references recursively?

Yes! Good idea.

I'll add this to my todo list on refactoring then [1] ;)

[1] T2483

I think checking the statistics output by the loader is already pretty good, but we've had, in the past, loaders generate buggy edges. Making a recursive check of all objects pointed at by the snapshot would alleviate these concerns completely (and for all loaders).

Indeed.

swh/loader/git/tests/conftest.py
13

right!

requirements-test.txt
2

oh yeah indeed, i did not realize i could do it that way.
thanks.

swh/loader/git/tests/conftest.py
13

T2484

(I took my old approach back from the origin-visit immutable. This time, create dedicated subtask with a dedicated perimeter instead of letting a task opened too long with an almost ever growing perimeter ;)

Build has FAILED

Patch application report for D3428 (id=12154)

Rebasing onto 55ff84b1a6...

Current branch diff-target is up to date.
Changes applied before test
commit f78bb8acce06baee86a974e68d9cac6ccbe0f077
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Jul 6 12:41:57 2020 +0200

    Migrate loader tests to use pytest
    
    This simplifies the base classes indirection introduced to make the tests run.
    
    It keeps the unittest scaffolding to allow declaring tests once and reuse
    amongst the different loader instances (GitLoader, GitLoaderFromDisk,
    GitLoaderFromArchive).
    
    The code coverage increased from 82% to 85%.
    
    Related to T2482

Link to build: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/28/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/28/console

Build has FAILED

I guess swh.storage[testing] also needs to be added explicitly to tox.ini dependencies. Or pytest-postgresql. *grmbl*

Add swh.storage[testing] to tox.ini, tox you happy?

Build is green

Patch application report for D3428 (id=12155)

Rebasing onto 55ff84b1a6...

Current branch diff-target is up to date.
Changes applied before test
commit f0e48ff7810a0bb3daa5c0e4ded92601e2547540
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Mon Jul 6 12:41:57 2020 +0200

    Migrate loader tests to use pytest
    
    This simplifies the base classes indirection introduced to make the tests run.
    
    It keeps the unittest scaffolding to allow declaring tests once and reuse
    amongst the different loader instances (GitLoader, GitLoaderFromDisk,
    GitLoaderFromArchive).
    
    The code coverage increased from 82% to 85%.
    
    Related to T2482

See https://jenkins.softwareheritage.org/job/DLDG/job/tests-on-diff/29/ for more details.

This revision was not accepted when it landed; it landed in state Needs Review.Jul 6 2020, 4:30 PM
This revision was automatically updated to reflect the committed changes.