Graph API: add (node type) result filters
Closed, Migrated

Description

In various use cases we want to perform a graph traversal and return only some of the nodes encountered, rather than all nodes visited. That can be done downstream, but it wastes transferred data, possibly a significant amount. We want to add a "result filter" that does this filtering server side.

The expressivity of the filter is up for discussion. At a minimum, we want a "node type" filter that applies to visits that return nodes one by one and makes them return only nodes of a given type (or set of types).
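To make the intended behavior concrete, here is a minimal sketch of a node type filter over a stream of visited nodes. It assumes nodes are reported as SWHIDs of the form `swh:1:<type>:<hash>`; the helper names `node_type` and `filter_by_type` are hypothetical and not part of any existing API.

```python
from typing import Iterable, Iterator, Set


def node_type(swhid: str) -> str:
    """Return the type field of a SWHID such as 'swh:1:rev:<hash>'."""
    return swhid.split(":")[2]


def filter_by_type(nodes: Iterable[str], wanted: Set[str]) -> Iterator[str]:
    """Yield only the nodes whose type is in `wanted`, preserving visit order."""
    for swhid in nodes:
        if node_type(swhid) in wanted:
            yield swhid


# Made-up identifiers standing in for the output of a visit.
visited = [
    "swh:1:rev:" + "a" * 40,
    "swh:1:dir:" + "b" * 40,
    "swh:1:cnt:" + "c" * 40,
]

# Keep only content nodes; the revision and directory are dropped.
only_contents = list(filter_by_type(visited, {"cnt"}))
```

Because the filter is a generator, it could in principle be applied to a streamed response without buffering the whole visit.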

Result filters for traversals that return data other than individual nodes are more complicated and need further discussion.
Ditto for filters that discriminate nodes based on properties other than node type (e.g., does it have an edge of a given type?).

Event Timeline

zack triaged this task as Normal priority. Jan 20 2021, 3:25 PM
zack created this task.

questions:

1/ So for the "filter that applies to visits that return nodes one by one" part, we are talking about: neighbors, walk, visit/nodes only?
2/ the filter is a query parameter I guess?

In T2981#63164, @Hakimb wrote:

questions:

1/ So for the "filter that applies to visits that return nodes one by one" part, we are talking about: neighbors, walk, visit/nodes only?

there's also /leaves

with "walk" you probably meant /randomwalk, which is indeed another one

2/ the filter is a query parameter I guess?

yes. Regarding naming, I propose something like "return_nodes=:node_types". (In theory "nodes" alone would be unambiguous, because filtering during traversal is currently only possible on edges, but that might change in the future, so it is better to be explicit and keep "return" in the parameter name.)
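For illustration, a client could build a request carrying the proposed parameter roughly as follows. This is only a sketch of the proposal above, not a confirmed API: the base URL, the `/visit/nodes/<src>` path layout, and the comma-separated encoding of multiple node types are all assumptions.

```python
from urllib.parse import quote, urlencode

# Hypothetical address of a graph service instance.
BASE_URL = "http://localhost:5009/graph"


def visit_nodes_url(src, return_nodes=None, edges=None):
    """Build a visit/nodes URL with optional traversal and result filters.

    `edges` restricts how the graph is traversed; `return_nodes` (the name
    proposed in this task) restricts which visited nodes are returned.
    """
    params = {}
    if edges:
        params["edges"] = edges
    if return_nodes:
        # Assumption: several node types are joined with commas.
        params["return_nodes"] = ",".join(return_nodes)
    url = f"{BASE_URL}/visit/nodes/{quote(src, safe=':')}"
    # Keep ',' and ':' readable in the query string.
    return f"{url}?{urlencode(params, safe=',:')}" if params else url


example_url = visit_nodes_url("swh:1:rev:" + "a" * 40, return_nodes=["cnt", "dir"])
```

Keeping `edges` and `return_nodes` as separate parameters mirrors the distinction between filtering the traversal and filtering the result.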

I just want to note something here that may not be clear from the initial task description: this filtering must happen *after* the visit, not during. We can already change *how* the graph is visited using the edges parameter; the goal of this task is to filter the result post-visit.
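The difference between the two kinds of filtering can be sketched on a toy graph. Everything below is illustrative: the node labels, the `GRAPH` adjacency map, and the `visit` helper are made up for the example and do not reflect the service's implementation.

```python
from collections import deque

# Toy graph: a revision points to a directory, which points to two contents.
GRAPH = {
    "rev:1": ["dir:1"],
    "dir:1": ["cnt:1", "cnt:2"],
    "cnt:1": [],
    "cnt:2": [],
}


def bfs(src):
    """Visit every node reachable from `src`, in breadth-first order."""
    seen, queue, order = {src}, deque([src]), []
    while queue:
        node = queue.popleft()
        order.append(node)
        for succ in GRAPH[node]:
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return order


def visit(src, return_types=None):
    """Run the full visit first, then filter the result by node type."""
    visited = bfs(src)
    if return_types is None:
        return visited
    return [n for n in visited if n.split(":")[0] in return_types]


# The contents are still found, because the traversal itself went through
# the revision and the directory; only the returned list is filtered.
contents = visit("rev:1", return_types={"cnt"})
```

Had the type restriction been applied during the traversal instead, the visit would have stopped at the revision and never reached the contents.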

seirl changed the task status from Open to Work in Progress. Jan 15 2022, 12:04 AM
seirl moved this task from Backlog to In progress on the Compressed graph service board.