Page MenuHomeSoftware Heritage

cran.lister: Fix cran lister and add proper integration test
ClosedPublic

Authored by ardumont on Oct 10 2019, 9:03 PM.

Details

Summary

Which checks the cran lister tasks written in the scheduler.

Related d30d574dbe9acc8722cac02986485a9dbd3dd098
Related 5ea9d5ed392a12dc5558fe165b1cbf2c0dfbcbf0

Related T2032

Test Plan

tox

Diff Detail

Repository
rDLS Listers
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

  • Update slightly docstrings
  • Revert code that did not needed to change
  • Improve the MANIFEST.in instruction
swh/lister/cran/lister.py
84

This was a tryout at keeping metadata within the lister.
We now know it's the loader's concern so we can drop this.

94

We do not have any 'load-cran' task-type.
I don't know if we want some so for now...

swh/lister/cran/tests/data/list-r-packages.json
39

This is a sample extracted from a sample run of the lister through the docker-dev.
And then removing a lot of the sample.

swh/lister/cran/tests/test_lister.py
13

This tested nothing...
The cran lister broke and it did not trigger any issues.

Update commit message to include the task reference

Integrate fix from D2125 about decoding R lister out prior to json loading

vlorentz added inline comments.
swh/lister/cran/lister.py
54–59

It's not Python code.

"""Return task format dict. This creates tasks with args and kwargs set, for
example::

    args: ['package', 'https://cran.r-project.org/...', 'version']
    kwargs: {}
"""
swh/lister/cran/tests/test_lister.py
41

What about mocking the scheduler, like the indexer CLI tests do?

This revision now requires changes to proceed.Oct 11 2019, 10:27 AM
zack requested changes to this revision.Oct 11 2019, 10:35 AM
zack added a subscriber: zack.
zack added inline comments.
swh/lister/cran/lister.py
21

First, thanks of all for starting to add type annotations, it's great! :-)

Here, I think you want Mapping rather than Dict (duck typing and all).

Also, you should specify which key/value types you expect to have, or fallback to Mapping[Any, Any] if you don't want to do that yet.

115

as above: probably s/Dict/Mapping/, and having key/value types would be nice

swh/lister/cran/lister.py
54–59

thx.

swh/lister/cran/tests/test_lister.py
41

no, the point is to check the lister's output inside the scheduler.
That's an integration test, i don't want mocks involved.

ardumont added inline comments.
swh/lister/cran/lister.py
21

right, i'll check what's the type here...
Also, it's wrong, it's a list... List[Mapping[str, str]], i think.

ardumont marked an inline comment as done.

Adapt according to reviews

This revision is now accepted and ready to land.Oct 11 2019, 1:28 PM