
Run swh-graph with gunicorn to support multiple/parallel requests
Closed, Migrated

Description

Currently it is run on granet from the CLI: /opt/swhgraph_venv/bin/python3 /opt/swhgraph_venv/bin/swh graph rpc-serve -g /dev/shm/swh-graph/default/graph

This means it uses a single process, which is currently the bottleneck when processing a dozen queries in parallel.

The app path is a bit different from other packages: it's swh.graph.server.app:make_app_from_configfile()
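For illustration, a gunicorn configuration file pointing at that app path might look roughly like the sketch below. Only the app path comes from this task. The worker count, bind address and worker class are assumptions; in particular, the worker class assumes the app is built on aiohttp, which is not stated here. Each gunicorn worker is a separate process.

```python
# gunicorn.conf.py: hypothetical settings, not the deployed configuration.
import multiprocessing

# App factory path quoted in the task description.
wsgi_app = "swh.graph.server.app:make_app_from_configfile()"

# Assumption: the app is aiohttp-based, so it needs aiohttp's gunicorn worker.
worker_class = "aiohttp.GunicornWebWorker"

# Assumption: a handful of worker processes, capped by the number of CPUs.
workers = min(4, multiprocessing.cpu_count())

# Placeholder address; adjust to whatever the service should listen on.
bind = "127.0.0.1:5009"
```

Such a file would be loaded with `gunicorn -c gunicorn.conf.py`.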

Event Timeline

vlorentz triaged this task as Normal priority. Sep 30 2021, 2:27 PM
vlorentz created this task.
vlorentz updated the task description.
vlorentz lowered the priority of this task from Normal to Low. Edited Sep 30 2021, 3:04 PM

Hmm, actually this might be harder than just using gunicorn, because the Java subprocess needs to be shared between workers...

I see there are some unapplied performance improvements in swh-graph, let's try that first.

zack renamed this task from Run swh-graph with gunicorn to Run swh-graph with gunicorn to support multiple/parallel requests. Oct 2 2021, 8:01 AM
zack raised the priority of this task from Low to Normal.
zack raised the priority of this task from Normal to High. Oct 2 2021, 8:06 AM

Actually, before it starts leaking RAM, the Java process only uses 10% of the memory, so there is room to start a few more.

(Even if, ideally, it would be shared)
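As a back-of-the-envelope check of "a few more", here is a minimal sketch. The 10% per-process figure is from the comment above; the share of RAM kept free is a made-up parameter, and the estimate ignores the leak, so it only holds for freshly started processes.

```python
def max_java_processes(per_process_pct: int = 10, reserved_pct: int = 20) -> int:
    """Estimate how many unshared Java processes fit in memory.

    per_process_pct: share of RAM one process uses (10% per the comment).
    reserved_pct: hypothetical share of RAM kept free for everything else.
    """
    if per_process_pct <= 0:
        raise ValueError("per_process_pct must be positive")
    # Integer percentages avoid floating-point rounding surprises.
    return max(0, (100 - reserved_pct) // per_process_pct)


print(max_java_processes())                 # 8 with the assumed 20% reserve
print(max_java_processes(reserved_pct=50))  # 5 with a more cautious reserve
```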

seirl claimed this task.
seirl added a subscriber: seirl.

Obsoleted by the migration to gRPC. We now use gRPC's threading model, with a thread pool whose size is configurable by passing --threads to the Java service. By default, nproc is used.
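The idea of a single process serving parallel requests from a pool sized by --threads can be sketched in Python as follows. This is an illustration only, not the Java service's code: `handle` is a hypothetical stand-in for a graph query, and `os.cpu_count()` is used as a rough equivalent of nproc.

```python
import argparse
import os
from concurrent.futures import ThreadPoolExecutor


def parse_threads(argv):
    """Parse a --threads option, falling back to the CPU count (roughly nproc)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--threads", type=int, default=os.cpu_count() or 1)
    return parser.parse_args(argv).threads


def handle(query):
    # Hypothetical stand-in for a graph traversal.
    return f"result for {query}"


def serve(queries, threads):
    # One pool inside one process: requests run in parallel on worker threads
    # and share the same in-memory state, so nothing is duplicated per worker.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(handle, queries))


if __name__ == "__main__":
    threads = parse_threads(["--threads", "4"])
    print(serve([f"query-{i}" for i in range(12)], threads))
```

Compared with several gunicorn workers, this design does not need one Java process per worker.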