[WIP] Add the option to traverse using a DFS instead of BFS
Changes Planned, Public, Draft

Authored by vlorentz on Jan 6 2023, 12:52 PM.

Details

Reviewers: None
Group Reviewers: Reviewers
Summary

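This would make the generation of blobs-origins.csv.zst (which only looks for
one origin leaf) much faster, since a DFS can stop at the first leaf it reaches
instead of exploring every intermediate level first.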
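
To illustrate why the traversal order matters when only one leaf is wanted, here is a minimal Python sketch. It is not the code from Traversal.java: the function name find_first_leaf, the toy graph, and the node labels are all hypothetical, and the depth-first variant is a simple stack-based approximation of a DFS. Both variants share one loop and differ only in which end of the frontier they pop from.

```python
from collections import deque


def find_first_leaf(graph, start, is_target, use_dfs=False):
    """Return (node, visited_count) for the first node matching is_target.

    Hypothetical helper for illustration only. With use_dfs=False the
    frontier is a FIFO queue (BFS); with use_dfs=True it is a LIFO stack.
    """
    frontier = deque([start])
    seen = {start}
    visited = 0
    while frontier:
        node = frontier.pop() if use_dfs else frontier.popleft()
        visited += 1
        if is_target(node):
            return node, visited
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                frontier.append(succ)
    return None, visited


# Toy graph (made up): one content node reachable from three origins,
# each through its own directory -> revision -> snapshot chain.
toy_graph = {
    "cnt": ["dir1", "dir2", "dir3"],
    "dir1": ["rev1"], "dir2": ["rev2"], "dir3": ["rev3"],
    "rev1": ["snp1"], "rev2": ["snp2"], "rev3": ["snp3"],
    "snp1": ["ori1"], "snp2": ["ori2"], "snp3": ["ori3"],
}


def is_origin(node):
    return node.startswith("ori")


bfs_result = find_first_leaf(toy_graph, "cnt", is_origin)
dfs_result = find_first_leaf(toy_graph, "cnt", is_origin, use_dfs=True)
print(bfs_result)  # ('ori1', 11)
print(dfs_result)  # ('ori3', 5)
```

On this toy graph the BFS pops every directory, revision, and snapshot before reaching an origin, while the stack-based variant follows a single chain. This is only the favourable case; on other graph shapes a DFS may wander down a long branch with no origin and visit more nodes than a BFS would.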

Event Timeline

Build has FAILED

Patch application report for D9004 (id=32468)

Could not rebase; Attempt merge onto f87e0a3c3c...

Updating f87e0a3..04009e1
Fast-forward
 .../graph/rpc/NodePropertyBuilder.java             | 15 +++-
 .../org/softwareheritage/graph/rpc/Traversal.java  | 75 ++++++++++++--------
 .../java/org/softwareheritage/graph/GraphTest.java |  1 +
 .../org/softwareheritage/graph/SubgraphTest.java   | 13 ++--
 .../softwareheritage/graph/rpc/CountEdgesTest.java |  6 +-
 .../softwareheritage/graph/rpc/CountNodesTest.java |  8 +--
 .../graph/rpc/FindPathBetweenTest.java             |  2 +-
 .../softwareheritage/graph/rpc/FindPathToTest.java | 21 ++++--
 .../softwareheritage/graph/rpc/GetNodeTest.java    | 82 +++++++++++++---------
 .../org/softwareheritage/graph/rpc/StatsTest.java  |  6 +-
 .../graph/rpc/TraverseLeavesTest.java              |  2 +
 .../graph/rpc/TraverseNeighborsTest.java           |  2 +
 .../graph/rpc/TraverseNodesTest.java               | 20 +++---
 proto/swhgraph.proto                               | 10 +++
 swh/graph/cli.py                                   | 32 ++++++++-
 swh/graph/grpc/swhgraph_pb2.py                     | 74 ++++++++++---------
 swh/graph/grpc/swhgraph_pb2.pyi                    | 35 ++++++++-
 swh/graph/luigi/__init__.py                        |  9 ++-
 swh/graph/luigi/compressed_graph.py                |  3 +
 swh/graph/webgraph.py                              |  6 ++
 20 files changed, 282 insertions(+), 140 deletions(-)
Changes applied before test
commit 04009e12e5a062b7a5e9f05ebcc9ee2fa82ee7ad
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 6 12:52:04 2023 +0100

    [WIP] Add the option to traverse using a DFS instead of BFS
    
    This would make the generation of blobs-origins.csv.zst (which looks for one
    origin leaf) much faster

commit 8818995ac9ee63f35e019f90c3cc8de42db39088
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 6 11:05:56 2023 +0100

    compression: Force log level to be either DEBUG or INFO

commit f179609ba1f694d78a535ad1a96948585128c7a5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 6 11:05:11 2023 +0100

    luigi: Add an option to define the maximum RAM used by graph compression

commit 99107b2f2178985e07aa0cb8bb1e7f1002156f62
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 6 11:04:05 2023 +0100

    cli: Add more useful defaults
    
    two paths + reimport tasks at the package level so they are automatically
    picked up by luigi without passing all module names on the CLI

commit b4a18be9460314403cd8ce6bf91e8e2d9a7ffb76
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 6 10:55:26 2023 +0100

    cli: Add flag --s3-athena-output-location to configure all Luigi tasks at once

commit 29bd614631282287e6ef9617ea749d1ebe32e049
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Jan 5 18:37:43 2023 +0100

    Fix Java tests broken by 559d4068bfe1dd50d57062192c0e22664ada03c8

Link to build: https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/349/
See console output for more information: https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/349/console

Harbormaster returned this revision to the author for changes because remote builds failed (Jan 6 2023, 12:53 PM).
Harbormaster failed remote builds in B33436: Diff 32468!