Page MenuHomeSoftware Heritage

gnu: Separate listing and parsing logic and add integration test
ClosedPublic

Authored by ardumont on Oct 5 2019, 6:05 PM.

Details

Summary

Also:

  • drop the unused package name loader side (also it collides between gnu/old-gnu listing).
  • Add more tests on the parsing side
  • clean up docstrings
  • Improve slightly the parsing code

Related T2023

Depends on D2077

Test Plan

tox

Diff Detail

Repository
rDLS Listers
Branch
master
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 8166
Build 11771: tox-on-jenkinsJenkins
Build 11770: arc lint + arc unit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
swh/lister/gnu/tests/test_tree.py
95–101

Yes, i'm currently trying to finalize and propose a fixture for that in swh-core.
I'd like to refactor this part when that other cog lands though ;)

Add missing docstring code block formatting and some more dead code clean up

vlorentz requested changes to this revision.Oct 7 2019, 2:39 PM

You forgot to fix the false negative

swh/lister/gnu/lister.py
35–39

inside of code block should be indented

72–82

should be indented to be in the Returns block.

swh/lister/gnu/tree.py
110–123

should be indented to be in the Returns block.

This revision now requires changes to proceed.Oct 7 2019, 2:39 PM

You forgot to fix the false negative

Yes, i was not really set on fixing it... since that does not happen right now.

python ./analysis.py --dataset ./tree.json.gz | grep -c "\.tar.'"

Note: i have committed the snippets/ardumont/gnu/analysis.py script and referenced it for your benefit ;)

oh well.

grep '\.tar"' /tmp/tree.json
            {"type":"file","name":"rms-french-interview-2001-11-20.tar","size":23418880,"time":"1017450803"},
        {"type":"file","name":"gzip-1.2.4.tar","size":798720,"time":"745830000"},
        {"type":"file","name":"gzip-1.2.4a.tar","size":798720,"time":"936571444"},
        {"type":"file","name":"gzip-1.3.12.tar","size":1822720,"time":"1176517974"},
        {"type":"file","name":"gzip-1.3.9.tar","size":1648640,"time":"1166172244"},
          {"type":"file","name":"sather-doc-000328-src.sgml.tar","size":289548,"time":"954345818"},
        {"type":"file","name":"tar-1.13.tar","size":3942400,"time":"936572499"},
            {"type":"file","name":".tar","size":327680,"time":"1011303620"},
            {"type":"file","name":"ctrl2cap.vxd.tar","size":30720,"time":"844973673"},
            {"type":"file","name":"keyswap.tar","size":30720,"time":"844973673"},
                {"type":"file","name":"ctrl2cap.vxd.tar","size":30720,"time":"974996877"},
                {"type":"file","name":"keyswap.tar","size":30720,"time":"974996877"},

heh righty right, I was on the .tar. use case not .tar indeed.

.tar. still returns false now, .tar is ok.

Fix docstring indentation block and fix ".tar" filtering bug

swh/lister/gnu/lister.py
72–82

still not indented

swh/lister/gnu/tests/test_tree.py
107–110

consistency is still not done

swh/lister/gnu/tree.py
48

://

110–123

still not indented

157

bool: Whether filename is an archive or not

This revision now requires changes to proceed.Oct 8 2019, 11:29 AM
swh/lister/gnu/tree.py
157

I mentioned the returned type annotation so that's not needed.

Fix indentation and assertion consistency

Add some more adaptations according to review

Build has FAILED

/me grunts
investigating the issue...

Fix build for python3.5 (for real this time)

swh/lister/gnu/tests/__init__.py
25

D2082 defines the generic fixture for this btw.

swh/lister/gnu/tests/test_tree.py
95–101

D2082 defines a "datadir" fixture for that.

vlorentz requested changes to this revision.Oct 8 2019, 1:15 PM
This revision now requires changes to proceed.Oct 8 2019, 1:15 PM
swh/lister/gnu/tree.py
110–123

still not :/

Fix last mis-indentation block

Reuse swh.core.pytest_plugin.mock_requests_datadir* fixture

ardumont retitled this revision from gnu: Separate listing and parsing logic to gnu: Separate listing and parsing logic and add integration test.Oct 9 2019, 6:06 PM
ardumont edited the summary of this revision. (Show Details)

Build has FAILED

Because the parent revision changed to D2077

Plug to master branch (really trigger a new ci build)

BUILD has failed

mmm

[tox] $ arc patch --diff 7060 --nocommit --nobranch --force --conduit-uri=https://forge.softwareheritage.org/ ********
Checking patch tox.ini...
error: while searching for:
  .[testing]
  pytest-cov
commands =
  pytest --cov=swh --cov-branch {posargs}

well, yes, jenkins, please try and load the right tox.ini, it will work better after that...

well, yes, jenkins, please try and load the right tox.ini, it will work better after that...

I've gotten back to D2077 (which is now the parent task).
I've removed the parent revision from D2077 (since everything it depended on landed on master).
And since the diff dependencies changed... (rebase and all that)

I've restarted the diff's build after that...
Let's see if that works better now.

Remove dead code

swh/lister/gnu/tests/__init__.py
25

This needs to go away, on it.

This revision is now accepted and ready to land.Oct 10 2019, 10:15 AM