
Add module to generate relevant data as tests input
Closed, Public

Authored by anlambert on Dec 14 2018, 6:15 PM.

Details

Summary

First diff exposing the work I have done so far on improving swh-web tests.
This one is about generating test data by populating an in-memory archive.

To avoid hardcoding test input data and to get closer to real-world data,
populate a test archive by loading a couple of lightweight git repositories into it.

The ids of the objects in this test archive (contents, directories, revisions, ...)
will then be provided as test input in order to retrieve their associated data
from the in-memory storages. Proceeding this way will allow us to remove a
lot of mocks from the test implementations.
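
The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: InMemoryArchive, populate_test_archive and the sample file contents are hypothetical names invented for this sketch, not the swh.storage or swh-web API, and the real module loads git repositories rather than raw byte strings.

```python
import hashlib


class InMemoryArchive:
    """Hypothetical stand-in for an in-memory storage (not the swh.storage API)."""

    def __init__(self):
        self._contents = {}

    def add_content(self, data: bytes) -> str:
        # Objects are keyed by a checksum of their bytes.
        obj_id = hashlib.sha1(data).hexdigest()
        self._contents[obj_id] = data
        return obj_id

    def get_content(self, obj_id: str) -> bytes:
        return self._contents[obj_id]


def populate_test_archive(files):
    """Load sample data once; return the archive and the generated ids."""
    archive = InMemoryArchive()
    ids = [archive.add_content(data) for data in files]
    return archive, ids


archive, content_ids = populate_test_archive([b"print('hello')\n", b"# README\n"])

# A test receives an id as input and reads the real data back, with no mock.
for content_id in content_ids:
    data = archive.get_content(content_id)
    assert hashlib.sha1(data).hexdigest() == content_id
```

The point of the design is that tests only need ids as input: whatever they look up comes from the same populated archive, so there is nothing to mock.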

Related T1271

Diff Detail

Repository
rDWAPPS Web applications
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vlorentz added a subscriber: vlorentz.

The two zips you add are about 200KB in size. Could you find smaller ones?

swh/web/tests/data.py
155

storage is an instance of swh.storage.in_memory.Storage; it shouldn't be in a config dict.

156

That's a pretty tricky side-effect. It should be documented and explained, both here and in swh.web.common.service.

198–222

This can be factorized:

indexers = {}
indexer_classes = (
    ('mimetype', _MimetypeIndexer),
    ('language', _LanguageIndexer),
    ('license', _FossologyLicenseIndexer),
    ('ctags', _CtagsIndexer),
)
for idx_name, idx_class in indexer_classes:
    idx = idx_class()
    idx.storage = storage
    idx.objstorage = storage.objstorage
    idx.idx_storage = idx_storage
    indexers[idx_name] = idx

Then use **indexers in the returned dict

This revision now requires changes to proceed. Dec 14 2018, 6:57 PM

The two zips you add are about 200KB in size. Could you find smaller ones?

That's not really big either ... Those repos enable me to capture a lot of test cases (notably ctags, releases, and non-linear revision history),
so I would prefer to continue using them.

That sounds pretty neat for the next steps!
\m/

swh/web/tests/data.py
156

That's the only way to share the in-memory storage consistently, I think.
So yes, explaining why we do that would be great.

swh/web/tests/data.py
156

That's the only way to share the in-memory storage consistently, I think.

You can use unittest.mock.patch('swh.storage.in_memory.Storage'): https://forge.softwareheritage.org/source/swh-indexer/browse/master/swh/indexer/tests/test_origin_metadata.py$98
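
As a rough sketch of this suggestion: the comment proposes unittest.mock.patch with a dotted path, whereas the example below uses mock.patch.object on a stand-in namespace so that it runs without swh installed. The Storage class, storage_module and service_lookup are all hypothetical placeholders, not code from swh-web or swh-indexer.

```python
import types
from unittest import mock


class Storage:
    """Placeholder storage class; each instance has its own state."""

    def __init__(self):
        self.objects = {}


# Stand-in for the module a dotted path such as
# 'swh.storage.in_memory.Storage' would point into.
storage_module = types.SimpleNamespace(Storage=Storage)


def service_lookup(obj_id):
    """Hypothetical service code that instantiates its own storage."""
    return storage_module.Storage().objects.get(obj_id)


shared_storage = Storage()
shared_storage.objects["abc"] = b"data"

# While the patch is active, every call to storage_module.Storage() returns
# the shared instance instead of building a fresh, empty one.
with mock.patch.object(storage_module, "Storage", return_value=shared_storage):
    patched_result = service_lookup("abc")

# Outside the context manager the original class is restored.
unpatched_result = service_lookup("abc")
```

The patch is scoped to the `with` block, so the side effect is explicit at the call site rather than hidden in the data module.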

anlambert retitled this revision from swh-web: Add module to generate relevant data as tests input to Add module to generate relevant data as tests input. Dec 17 2018, 10:42 AM
anlambert added inline comments.
swh/web/tests/data.py
156

The idea here is to patch the storage instances globally, so that tests do not each need a decorator to do so.

But I agree this operation should not be performed in that module, which should be dedicated solely to the generation
of test data. I will move it to the swh.web.tests.testcase module.
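
One possible shape for such a test case module, sketched under assumptions: the service namespace, TEST_STORAGE and ArchiveTestCase are hypothetical names, and this is not the actual content of swh.web.tests.testcase. The base class applies the patch in setUp, so individual tests need no decorator.

```python
import types
import unittest
from unittest import mock

# Hypothetical stand-ins: a service module holding a storage reference,
# and the shared test storage produced by the data-generation module.
service = types.SimpleNamespace(storage=None)
TEST_STORAGE = {"abc": b"data"}


class ArchiveTestCase(unittest.TestCase):
    """Base class that swaps in the shared test storage for every test."""

    def setUp(self):
        patcher = mock.patch.object(service, "storage", TEST_STORAGE)
        patcher.start()
        # Undo the patch after each test, even if the test fails.
        self.addCleanup(patcher.stop)


class LookupTest(ArchiveTestCase):
    def test_lookup(self):
        # No decorator needed: the base class already applied the patch.
        self.assertEqual(service.storage["abc"], b"data")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(LookupTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the patching in the base test case leaves the data module free of side effects, which is the separation agreed on above.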

198–222

Indeed, thanks.

Update:

  • address vlorentz comments
  • move storage patching out of this module
  • bump swh-loader-git version
This revision is now accepted and ready to land. Dec 17 2018, 5:02 PM
This revision was automatically updated to reflect the committed changes.