Page MenuHomeSoftware Heritage

Add visit/edges endpoint
ClosedPublic

Authored by seirl on Jul 23 2020, 1:18 PM.

Details

Reviewers
douardda
Group Reviewers
Reviewers
Summary

Allows us to export subdatasets easily

Test Plan

pytest swh/

Diff Detail

Event Timeline

Build is green

Patch application report for D3600 (id=12681)

Rebasing onto 3943007422...

Current branch diff-target is up to date.
Changes applied before test
commit d987735e40d74adfc13e6f88956f0d6be26720cb
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 13:15:08 2020 +0200

    doc: add visit/edges endpoint documentation

commit 1e9b322ea34dbb67dbd205283c0891790bb56e89
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 13:05:06 2020 +0200

    pid: use SWHID functions instead of deprecated PID ones

commit dd9ecf6e6fc70489c882d9518b301a774188fc0f
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 11:26:15 2020 +0200

    Add visit_edges endpoint

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/11/ for more details.

zack added inline comments.
java/src/main/java/org/softwareheritage/graph/algo/Traversal.java
170–173

Consider the graph A→B, C→B and assume A and B are visited first.
I think this implementation of visit_edges will only return one of the two edges, because once you've marked B as visited for one of the two (say, A→B), you will never emit the other edge leading to B (say, C→B).
Or am I missing something?

seirl added inline comments.
java/src/main/java/org/softwareheritage/graph/algo/Traversal.java
170–173

You're right, I don't know how I missed that. This call should be outside the if. Nice catch, thanks.

  • visit/edges: fix incorrect handling of diamond pattern

Fixed and added a test for that problem.

Build is green

Patch application report for D3600 (id=12692)

Rebasing onto 3943007422...

Current branch diff-target is up to date.
Changes applied before test
commit 7e97538b7ffa27984f633b429779fdd471880c61
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 15:44:20 2020 +0200

    visit/edges: fix incorrect handling of diamond pattern

commit d987735e40d74adfc13e6f88956f0d6be26720cb
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 13:15:08 2020 +0200

    doc: add visit/edges endpoint documentation

commit 1e9b322ea34dbb67dbd205283c0891790bb56e89
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 13:05:06 2020 +0200

    pid: use SWHID functions instead of deprecated PID ones

commit dd9ecf6e6fc70489c882d9518b301a774188fc0f
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 11:26:15 2020 +0200

    Add visit_edges endpoint

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/13/ for more details.

douardda added a subscriber: douardda.

looks overall good to me (see my comments), not sure I'm 100% confident there are enough corner case tests (for complex multi diamond shaped graphs), but I let zack judge this.

Also, the "visit/edges: fix incorrect handling of diamond pattern" revision should be squashed in the first one of this branch.

swh/graph/client.py
55

no fstring?

58

is this print statement supposed to live here?

swh/graph/server/app.py
213

same "print" question

218

same fstring question :-)

This revision now requires changes to proceed.Jul 27 2020, 5:07 PM
seirl marked 5 inline comments as done.
  • Remove debug prints

Build is green

Patch application report for D3600 (id=12763)

Rebasing onto 3943007422...

Current branch diff-target is up to date.
Changes applied before test
commit c5802bbe595673102ff593b8fb91ea600748cd1b
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Mon Jul 27 18:12:53 2020 +0200

    Remove debug prints

commit 7e97538b7ffa27984f633b429779fdd471880c61
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 15:44:20 2020 +0200

    visit/edges: fix incorrect handling of diamond pattern

commit d987735e40d74adfc13e6f88956f0d6be26720cb
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 13:15:08 2020 +0200

    doc: add visit/edges endpoint documentation

commit 1e9b322ea34dbb67dbd205283c0891790bb56e89
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 13:05:06 2020 +0200

    pid: use SWHID functions instead of deprecated PID ones

commit dd9ecf6e6fc70489c882d9518b301a774188fc0f
Author: Antoine Pietri <antoine.pietri1@gmail.com>
Date:   Thu Jul 23 11:26:15 2020 +0200

    Add visit_edges endpoint

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/14/ for more details.

This revision is now accepted and ready to land.Jul 29 2020, 5:06 PM

Already landed

swh/graph/client.py
55

Yes, it's consistent with the rest of the file