In my latest benchmarks I noticed that RAM usage is huge when calling endpoint methods. After some investigation, the cause turns out to be two large arrays inside Traversal, visited and nodeParent, which are used to keep track of information during traversals:
public class Traversal {
    /** Graph used in the traversal */
    Graph graph;
    /** Boolean to specify the use of the transposed graph */
    boolean useTransposed;
    /** Graph edge restriction */
    AllowedEdges edges;

    /** Bit array storing if we have visited a node */
    LongArrayBitVector visited;
    /** LongBigArray storing parent node id for each node during a traversal */
    long[][] nodeParent;
    /** Number of edges accessed during traversal */
    long nbEdgesAccessed;

    [...]
}
These arrays are allocated every time we create a new Endpoint. In the benchmark this happens at most twice, but the two arrays combined take up to ~100GB of RAM.
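To get a feel for where the memory goes, here is a rough back-of-the-envelope sketch. It assumes visited costs one bit per node (rounded up to whole 64-bit words) and nodeParent costs one 8-byte long per node. The class name, method names and the node count in main are hypothetical and only there for illustration; they are not taken from the actual code base or benchmark.

```java
public class TraversalMemoryEstimate {
    /** Bytes for a bit vector holding one bit per node, rounded up to 64-bit words. */
    static long visitedBytes(long numNodes) {
        long words = (numNodes + 63) / 64;
        return words * Long.BYTES;
    }

    /** Bytes for a parent array holding one long per node. */
    static long nodeParentBytes(long numNodes) {
        return numNodes * Long.BYTES;
    }

    /** Combined footprint of both arrays for a single Traversal instance. */
    static long totalBytes(long numNodes) {
        return visitedBytes(numNodes) + nodeParentBytes(numNodes);
    }

    public static void main(String[] args) {
        // Hypothetical node count, chosen only to illustrate the order of magnitude.
        long numNodes = 10_000_000_000L;
        System.out.printf("visited:    %d bytes%n", visitedBytes(numNodes));
        System.out.printf("nodeParent: %d bytes%n", nodeParentBytes(numNodes));
        System.out.printf("total:      %d bytes%n", totalBytes(numNodes));
    }
}
```

Under these assumptions nodeParent dominates by a factor of about 64, so it is the array worth looking at first.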
One solution would be to move these arrays up to the higher-level Graph class, which is only instantiated once. However, this is problematic if we want to use multiple threads to query the graph, since concurrent traversals would then share the same visited and nodeParent state.
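One possible middle ground, sketched below purely as an illustration and not as what the code does today, would be to keep one set of scratch arrays per worker thread and reset them between traversals. TraversalScratch and perThread are hypothetical names, and java.util.BitSet and a plain long[] stand in for LongArrayBitVector and the long[][] big array so that the example stays self-contained.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical holder for per-traversal scratch state, reused across traversals. */
public class TraversalScratch {
    /** Stand-in for the visited bit vector. */
    final BitSet visited;
    /** Stand-in for the nodeParent big array; -1 means "no parent recorded". */
    final long[] nodeParent;

    TraversalScratch(int numNodes) {
        visited = new BitSet(numNodes);
        nodeParent = new long[numNodes];
        Arrays.fill(nodeParent, -1L);
    }

    /** Clears the state so the same arrays can serve the next traversal. */
    void reset() {
        visited.clear();
        Arrays.fill(nodeParent, -1L);
    }

    /** Lazily allocates one scratch instance per thread that asks for it. */
    static ThreadLocal<TraversalScratch> perThread(int numNodes) {
        return ThreadLocal.withInitial(() -> new TraversalScratch(numNodes));
    }

    public static void main(String[] args) {
        ThreadLocal<TraversalScratch> scratch = perThread(8);

        TraversalScratch first = scratch.get();
        first.visited.set(3);
        first.nodeParent[3] = 0L;
        first.reset();

        // The same thread gets the same instance back, with no new allocation.
        TraversalScratch second = scratch.get();
        System.out.println("reused: " + (first == second));
    }
}
```

This would bound allocations to one pair of arrays per thread rather than one per Endpoint, but the total footprint still grows with the number of threads, and each reset costs time linear in the number of nodes.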