Page MenuHomeSoftware Heritage

fix the semantic of max_edges parameter
Needs RevisionPublic

Authored by Hakimb on May 6 2021, 8:36 PM.

Details

Reviewers
vlorentz
Summary

the test on test_api_client about visit/edges using max_edges did not work, because the semantic of max_edges was not totally correct.

Diff Detail

Repository
rDGRPH Compressed graph representation
Branch
fix-max-edges
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 21350
Build 33157: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 33156: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D5707 (id=20370)

Rebasing onto 3e0da94777...

Current branch diff-target is up to date.
Changes applied before test
commit 01e8b92f1deb53266cd8262522156e9f35028cab
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:34:01 2021 +0200

    fix the semantic of max_edges parameter

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/122/ for more details.

Hakimb requested review of this revision.May 6 2021, 8:38 PM
  • add all possible values for the request

Build is green

Patch application report for D5707 (id=20371)

Rebasing onto 3e0da94777...

Current branch diff-target is up to date.
Changes applied before test
commit 2000b1e8953c7d3520211ae157661dc8a24fbdf2
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:44:13 2021 +0200

    fix the semantic of max_edges parameter

commit 01e8b92f1deb53266cd8262522156e9f35028cab
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:34:01 2021 +0200

    fix the semantic of max_edges parameter

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/123/ for more details.

Build is green

Patch application report for D5707 (id=20372)

Rebasing onto 3e0da94777...

Current branch diff-target is up to date.
Changes applied before test
commit 45140887dc8ecc86488359ed0fad789834216ac0
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:34:01 2021 +0200

    fix the semantic of max_edges parameter
    
    fix the semantic of max_edges parameter
    
    fix the semantic of max_edges parameter

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/124/ for more details.

Build is green

Patch application report for D5707 (id=20373)

Rebasing onto 3e0da94777...

Current branch diff-target is up to date.
Changes applied before test
commit 9338346effcb0c4767d041d3db7475ebb5bf4866
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:34:01 2021 +0200

    fix the semantic of max_edges parameter

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/125/ for more details.

What is the difference between nbEdgesAccessed and currentEdgeAccessed? It looks like the latter is not used at all (only incremented, but never read), but I may be missing something

I see that in multiple place, you set currentEdgeAccessed = 0; before breaking/returning. What is it for? Is the same instance of Traversal used to serve multiple queries?

java/src/main/java/org/softwareheritage/graph/Traversal.java
168–169

Why is it defined both here and as an object attribute?

swh/graph/naive_client.py
177–178

Remove these two lines

swh/graph/tests/test_api_client.py
147

It would be nice to test both with and without the edges argument, to test whether it's counter before or after the filtering. (whichever you think makes the most sense)

java/src/main/java/org/softwareheritage/graph/Traversal.java
168–169

an error, my bad

What is the difference between nbEdgesAccessed and currentEdgeAccessed? It looks like the latter is not used at all (only incremented, but never read), but I may be missing something

I see that in multiple place, you set currentEdgeAccessed = 0; before breaking/returning. What is it for? Is the same instance of Traversal used to serve multiple queries?

all lines about 'nbEdgesAccessed' was here before i add max_edges parameter etc... And yes it was not used, it is only used to be incremented by the outdegree of the current processed node, I can't use it to limit the number of edges traversed, for example, if max_edges = 3 and I test if nbEdgesAccessed >= max_edges, the traversal will stop and I wouldn't have actually processed any edges, so I do this job with currentEdgeAccessed

swh/graph/tests/test_api_client.py
147

I don't know if it makes sense to test both at the same time, because the edges are filtered before counting them, so if the number of edges returned is less than max_edges, because of the edges filter, the test will not pass.

Build has FAILED

Patch application report for D5707 (id=20374)

Rebasing onto 3e0da94777...

Current branch diff-target is up to date.
Changes applied before test
commit 93fc6483253723f89e667d09f26df72cabb6a007
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:34:01 2021 +0200

    fix the semantic of max_edges parameter

Link to build: https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/126/
See console output for more information: https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/126/console

remove unecessered line on test_api_client

Build has FAILED

Patch application report for D5707 (id=20376)

Rebasing onto 3e0da94777...

Current branch diff-target is up to date.
Changes applied before test
commit 36748b5c12acaa5576f743445e42924a1138b739
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:34:01 2021 +0200

    fix the semantic of max_edges parameter

Link to build: https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/127/
See console output for more information: https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/127/console

Build is green

Patch application report for D5707 (id=20377)

Rebasing onto 3e0da94777...

Current branch diff-target is up to date.
Changes applied before test
commit 8713f079180d98750bbbb708071437dae786f692
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:34:01 2021 +0200

    fix the semantic of max_edges parameter

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/128/ for more details.

swh/graph/tests/test_api_client.py
147

Ok. You can test with an inequality then. (assert 1 <= len(actual) <= max_edges)

Build is green

Patch application report for D5707 (id=21250)

Rebasing onto f885bdb009...

First, rewinding head to replay your work on top of it...
Applying: Fix the semantic of max_edges parameter
Changes applied before test
commit b56ab91089cfd3e68e8b4bfe40c2012b5ae90c1d
Author: Hakim Baaloudj <hbaaloud@po461-pro.paris.inria.fr>
Date:   Thu May 6 20:34:01 2021 +0200

    Fix the semantic of max_edges parameter
    
    fix the semantic of max_edges parameter
    
    fix the semantic of max_edges parameter

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/129/ for more details.

This revision is now accepted and ready to land.Jun 25 2021, 12:06 PM

it looks like there are some failures now. Could you rebase on master and take a look?

This revision now requires changes to proceed.Jul 9 2021, 12:52 PM