Page MenuHomeSoftware Heritage

Fix tox execution and move dataset folder
ClosedPublic

Authored by anlambert on Oct 24 2019, 3:37 PM.

Details

Summary

CI builds for swh-graph were failing since a couple of weeks, likely due to some recent changes in upstream pytest.

That diff fixes pytest invocation in tox.ini.

The tests/dataset folder is also moved in swh/graph/tests as it is required to run the tests.
Previously it was installed in <python-site-packages>/tests which is not a proper install location.

Related D2100 (not fixed either)

Diff Detail

Repository
rDGRPH Compressed graph representation
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Hum, arc patch failed ...

Surely related to this warning message when I created the diff:

arc diff origin/master
Linting...
No lint engine configured for this project.
Running unit tests...
No unit test engine is configured for this project.
Invalid Content Encoding (Non-UTF8)
This diff includes files which are not valid UTF-8 (they contain invalid byte
sequences). You can either stop this workflow and fix these files, or
continue. If you continue, these files will be marked as binary.

You can learn more about how Phabricator handles character encodings (and how
to configure encoding settings and detect and correct encoding problems) by
reading 'User Guide: UTF-8 and Character Encoding' in the Phabricator
documentation.

    AFFECTED FILES
    swh/graph/tests/dataset/output/example-transposed.graph
    tests/dataset/output/example-transposed.graph
    swh/graph/tests/dataset/output/example-transposed.offsets
    tests/dataset/output/example-transposed.offsets
    swh/graph/tests/dataset/output/example.graph
    tests/dataset/output/example.graph
    swh/graph/tests/dataset/output/example.offsets
    tests/dataset/output/example.offsets

    Do you want to mark these files as binary and continue? [Y/n] Y

Looks like the issue has already been reported to Phabricator (https://secure.phabricator.com/T9069) but has not been fixed yet.

Let's try to submit the diff again to avoid the arc patch failing.

Reopening it as I can not figure how to workaround the arcanist issue. It looks like some files are not detected as binaries so arcanist expects them to be UTF-8 encoded.

Anyway, even if the Jenkins job fails due to arc patch failing, that diff effectively fixes tests execution.

ardumont edited the summary of this revision. (Show Details)

Beware that dataset is also used by some java code.
And as far as i understood it, its generated by a docker container.
So if you move this dataset around, you might either break it or stop some form of synchronization ;)

I hinted D2100 because i tried to unstuck it and explain some trail of thoughts there.
To be unstuck i think a discussion with @seirl (or @zack) is needed.

Cheers,

@ardumont, whoopsie I did not see you already created a diff on the subject.

You are right about moving the dataset folder, I missed some paths to fix, notably here.

I will update that diff accordingly.

Update: Fix dataset folder path in GraphTest.java

Rebase to master, fix example path in test_cli.py and ensure to use Pathlib to manipulate paths

This revision was not accepted when it landed; it landed in state Needs Review.Nov 4 2019, 12:06 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.