Page MenuHomeSoftware Heritage

gnu: Separate listing and parsing logic and add integration test
ClosedPublic

Authored by ardumont on Sat, Oct 5, 6:05 PM.

Details

Summary

Also:

  • drop the unused package name loader side (also it collides between gnu/old-gnu listing).
  • Add more tests on the parsing side
  • clean up docstrings
  • Improve slightly the parsing code

Related T2023

Depends on D2077

Test Plan

tox

Diff Detail

Repository
rDLS Listers
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ardumont added inline comments.Mon, Oct 7, 2:24 PM
swh/lister/gnu/tests/test_tree.py
96–102

Yes, i'm currently trying to finalize and propose a fixture for that in swh-core.
I'd like to refactor this part when that other cog lands though ;)

ardumont updated this revision to Diff 6979.Mon, Oct 7, 2:25 PM

Add missing docstring code block formatting and some more dead code clean up

ardumont edited the summary of this revision. (Show Details)Mon, Oct 7, 2:27 PM
vlorentz requested changes to this revision.Mon, Oct 7, 2:39 PM

You forgot to fix the false negative

swh/lister/gnu/lister.py
36–40

inside of code block should be indented

73–83

should be indented to be in the Returns block.

swh/lister/gnu/tree.py
111–124

should be indented to be in the Returns block.

This revision now requires changes to proceed.Mon, Oct 7, 2:39 PM

You forgot to fix the false negative

Yes, i was not really set on fixing it... since that does not happen right now.

python ./analysis.py --dataset ./tree.json.gz | grep -c "\.tar.'"

Note: i have committed the snippets/ardumont/gnu/analysis.py script and referenced it for your benefit ;)

oh well.

grep '\.tar"' /tmp/tree.json
            {"type":"file","name":"rms-french-interview-2001-11-20.tar","size":23418880,"time":"1017450803"},
        {"type":"file","name":"gzip-1.2.4.tar","size":798720,"time":"745830000"},
        {"type":"file","name":"gzip-1.2.4a.tar","size":798720,"time":"936571444"},
        {"type":"file","name":"gzip-1.3.12.tar","size":1822720,"time":"1176517974"},
        {"type":"file","name":"gzip-1.3.9.tar","size":1648640,"time":"1166172244"},
          {"type":"file","name":"sather-doc-000328-src.sgml.tar","size":289548,"time":"954345818"},
        {"type":"file","name":"tar-1.13.tar","size":3942400,"time":"936572499"},
            {"type":"file","name":".tar","size":327680,"time":"1011303620"},
            {"type":"file","name":"ctrl2cap.vxd.tar","size":30720,"time":"844973673"},
            {"type":"file","name":"keyswap.tar","size":30720,"time":"844973673"},
                {"type":"file","name":"ctrl2cap.vxd.tar","size":30720,"time":"974996877"},
                {"type":"file","name":"keyswap.tar","size":30720,"time":"974996877"},

heh righty right, I was on the .tar. use case not .tar indeed.

.tar. still returns false now, .tar is ok.

ardumont updated this revision to Diff 6983.Mon, Oct 7, 3:14 PM

Fix docstring indentation block and fix ".tar" filtering bug

vlorentz added inline comments.Tue, Oct 8, 11:29 AM
swh/lister/gnu/lister.py
73–83

still not indented

swh/lister/gnu/tests/test_tree.py
108–111

consistency is still not done

swh/lister/gnu/tree.py
49

://

111–124

still not indented

158

bool: Whether filename is an archive or not

vlorentz requested changes to this revision.Tue, Oct 8, 11:29 AM
This revision now requires changes to proceed.Tue, Oct 8, 11:29 AM
ardumont added inline comments.Tue, Oct 8, 11:39 AM
swh/lister/gnu/tree.py
158

I mentioned the returned type annotation so that's not needed.

ardumont updated this revision to Diff 6994.Tue, Oct 8, 11:46 AM

Fix indentation and assertion consistency

ardumont updated this revision to Diff 6995.Tue, Oct 8, 12:10 PM

Add some more adaptations according to review

ardumont updated this revision to Diff 6996.Tue, Oct 8, 12:54 PM

Fix for python3.5 build

Build has FAILED

/me grunts
investigating the issue...

ardumont updated this revision to Diff 6997.Tue, Oct 8, 1:06 PM

Fix build for python3.5 (for real this time)

ardumont added inline comments.Tue, Oct 8, 1:10 PM
swh/lister/gnu/tests/__init__.py
25 ↗(On Diff #6997)

D2082 defines the generic fixture for this btw.

swh/lister/gnu/tests/test_tree.py
96–102

D2082 defines a "datadir" fixture for that.

vlorentz requested changes to this revision.Tue, Oct 8, 1:15 PM
This revision now requires changes to proceed.Tue, Oct 8, 1:15 PM
vlorentz added inline comments.Tue, Oct 8, 1:19 PM
swh/lister/gnu/tree.py
111–124

still not :/

ardumont updated this revision to Diff 6999.Tue, Oct 8, 1:28 PM

Fix last mis-indentation block

vlorentz accepted this revision.Tue, Oct 8, 1:31 PM
ardumont updated this revision to Diff 7059.Wed, Oct 9, 6:04 PM

Reuse swh.core.pytest_plugin.mock_requests_datadir* fixture

ardumont retitled this revision from gnu: Separate listing and parsing logic to gnu: Separate listing and parsing logic and add integration test.Wed, Oct 9, 6:06 PM
ardumont edited the summary of this revision. (Show Details)

Build has FAILED

Because the parent revision changed to D2077

ardumont updated this revision to Diff 7060.Wed, Oct 9, 6:13 PM

Plug to master branch (really trigger a new ci build)

BUILD has failed

mmm

[tox] $ arc patch --diff 7060 --nocommit --nobranch --force --conduit-uri=https://forge.softwareheritage.org/ ********
Checking patch tox.ini...
error: while searching for:
  .[testing]
  pytest-cov
commands =
  pytest --cov=swh --cov-branch {posargs}

well, yes, jenkins, please try and load the right tox.ini, it will work better after that...

ardumont added a comment.EditedWed, Oct 9, 6:20 PM

well, yes, jenkins, please try and load the right tox.ini, it will work better after that...

I've gotten back to D2077 (which is now the parent task).
I've removed the parent revision from D2077 (since everything it depended on landed on master).
And since the diff dependencies changed... (rebase and all that)

I've restarted the diff's build after that...
Let's see if that works better now.

Build is green

\m/

ardumont updated this revision to Diff 7061.Wed, Oct 9, 6:24 PM

Remove dead code

swh/lister/gnu/tests/__init__.py
25 ↗(On Diff #6997)

This needs to go away, on it.

ardumont updated this revision to Diff 7066.Thu, Oct 10, 9:48 AM

Plug to latest master

douardda accepted this revision.Thu, Oct 10, 10:15 AM
This revision is now accepted and ready to land.Thu, Oct 10, 10:15 AM
This revision was automatically updated to reflect the committed changes.