Page MenuHomeSoftware Heritage

Add parameter to load a single graph direction in memory.
AbandonedPublic

Authored by andrey-star on Nov 2 2021, 8:41 AM.

Details

Summary

Currently, the server loads both the forward and the transposed graphs into memory.

Added direction server option to specify which graph directions should be loaded into memory (forward|backward|both) defaulting to both of not provided.
If a single direction was loaded, the direction in the query defaults to the loaded one.

If the user queries the direction, that was not loaded, the server response reflects that.

Diff Detail

Repository
rDGRPH Compressed graph representation
Branch
swh-graph-single-dir
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 24835
Build 38785: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 38784: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D6594 (id=23956)

Rebasing onto c29ee2787e...

Current branch diff-target is up to date.
Changes applied before test
commit 812544d5da7f91f78c811d64c09b6cf25ba724b0
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 22:58:32 2021 +0300

    Add tests for forward/backward loaded directions.

commit 67ab1092079bcb7c6b358a3d89c1265961f463ed
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 15:26:27 2021 +0300

    Throw exception, if the loaded direction doesn't match query, or transpose/symmetrize is called on a single loaded graph.
    If only single direction is loaded default the query param to chosen direction.

commit 475bf9b18b1dd578f7fabd4a503ae0ecd637d457
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 12:53:53 2021 +0300

    Pass 'both' direction wherever a graph was constructed.
    Not adding an overload which defaults to 'both', because having to explicitly set the value makes you asses which option fits best.

commit c18a2fb4dfdb33df6ee427e4476d716e8e304a85
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 12:50:12 2021 +0300

    Pass direction parameter to Graph constructor and load the directions into memory conditionally.

commit f2458613e2a681d5dfc05c46d7636d773b154ded
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 10:20:11 2021 +0300

    Add 'direction' server option to specify which graph directions should be loaded into memory (forward|backward|both) defaulting to 'both' of not provided.

commit 4e831cec8441d090bf4c0e93f605f6640b22ca49
Author: Andrey <andreystar2403@gmail.com>
Date:   Sun Oct 31 19:00:41 2021 +0300

    Add IntelliJ and MacOS files to .gitignore

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/137/ for more details.

andrey-star retitled this revision from Add IntelliJ and MacOS files to .gitignore to Add parameter to load a single graph direction in memory..Nov 2 2021, 8:43 AM
andrey-star edited the summary of this revision. (Show Details)

Interesting, thanks!

This is an interesting coincidence, we recently discussed doing something like this. May I ask what your use case is?

The code looks fine to me, but I'd like a review from @seirl (this may take some time, he is busy this week)

Interesting, thanks!

This is an interesting coincidence, we recently discussed doing something like this. May I ask what your use case is?

The code looks fine to me, but I'd like a review from @seirl (this may take some time, he is busy this week)

I was discussing an internship opportunity with @zack and he gave me this small task as a test.
Regarding the use case, this flag can be added to rpc-serve and passed to the server.

olasd added inline comments.
.gitignore
13 ↗(On Diff #23956)

Please consider adding this to a global gitignore file that would be local to your machine (https://sebastiandedeyne.com/setting-up-a-global-gitignore-file/)

java/.gitignore
10–12 ↗(On Diff #23956)

I believe this should go to a global gitignore file as well.

Remove local gitignore

Build is green

Patch application report for D6594 (id=23958)

Rebasing onto c29ee2787e...

Current branch diff-target is up to date.
Changes applied before test
commit c8e2f64e012a5fa10bfaa5c770300b20e3db49f9
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 22:58:32 2021 +0300

    Add tests for forward/backward loaded directions.

commit 6653cf77b3ecab35aa0e58d76a40e2c9091ae75c
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 15:26:27 2021 +0300

    Throw exception, if the loaded direction doesn't match query, or transpose/symmetrize is called on a single loaded graph.
    If only single direction is loaded default the query param to chosen direction.

commit 7e40071c2e76a05ca4639577a3bc3bf406effe28
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 12:53:53 2021 +0300

    Pass 'both' direction wherever a graph was constructed.
    Not adding an overload which defaults to 'both', because having to explicitly set the value makes you asses which option fits best.

commit 198eb8256bdca4b2d64726e378619df4e87685d5
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 12:50:12 2021 +0300

    Pass direction parameter to Graph constructor and load the directions into memory conditionally.

commit 4a60870a71d751b7559a1bc2c733e3340b05e5da
Author: Andrey <andreystar2403@gmail.com>
Date:   Mon Nov 1 10:20:11 2021 +0300

    Add 'direction' server option to specify which graph directions should be loaded into memory (forward|backward|both) defaulting to 'both' of not provided.

See https://jenkins.softwareheritage.org/job/DGRPH/job/tests-on-diff/138/ for more details.

Hey! Thanks for this initial diff.

I think we're adding too much behavior to this class, it's starting to look like spaghetti code.

The design I had in mind was to split the SwhGraph into two different layers:

  • a SwhGraph interface that will contain the swh-specific metadata handling (node types, SWHIDs etc)
  • a BidirectionalGraph interface on top of ImmutableGraph that will take a graph and its transposed version, and define indegree(), predecessors() and transpose().

Using this, we could:

  • Have a generic bidirectional graph by using a SwhGraph(BidirectionalGraph(graph, graph_transposed))
  • Have a swh-specific unidirectional graph, either forward or backward, by using a SwhGraph(graph) or a SwhGraph(graph_transposed)
  • Have a swh-specific bidirectional graph by using a SwhGraph(BidirectionalGraph(graph, graph_transposed))

Note: I don't know yet what is the correct data model to implement this in Java as I'm not too proficient with interfaces/mixins.

@andrey-star if you want to have a shot at it you can go ahead, otherwise I'll try to implement that in the coming weeks. It's not a mandatory assignment, your initial proposal is already great! I'd just rather directly merge the correct design if it's not an immediate blocker for other things.