Page MenuHomeSoftware Heritage

Graph property access is not thread-safe
Open, HighPublic

Description

The SwhGraphProperties class contains MappedBigLists, which aren't thread-safe by default. They should all be copy()-ed when a flyweight copy of the graph is created with copy().

This has a very big impact, considering that right now the production GRPC service might return completely incorrect value when it is run with multiple threads (which it is).

Unfortunately there is currently no way of knowing at runtime whether a given BigList has a flyweight copy() method: https://github.com/vigna/fastutil/issues/277

This could be circumvented with a runtime check like this:

private <T> T copyBigListThreadsafe(T list) {
    if (list instanceof ByteMappedBigList) {
        return (T) ((ByteMappedBigList) list).copy();
    }
    else if (list instanceof ShortMappedBigList) {
        return (T) ((ShortMappedBigList) list).copy();
    }
    else if (list instanceof IntMappedBigList) {
        return (T) ((IntMappedBigList) list).copy();
    }
    else if (list instanceof LongMappedBigList) {
        return (T) ((LongMappedBigList) list).copy();
    }
    else if (list instanceof MappedFrontCodedStringBigList) {
        return (T) ((MappedFrontCodedStringBigList) list).copy();
    }
    [...]
    return list;
}

However, there's something even more problematic. The MappedFrontCodedStringBigList class doesn't provide a way to flyweight copy() its attributes at all: https://github.com/vigna/dsiutils/issues/5 so right now there is no way to make label access threadsafe (except with a synchronize() which ruins performance).