
Winery backend server
Open, Normal, Public

Description

The winery backend (T3533) can be used as a library from the objstorage and storage servers, which helps reduce latency and increase bandwidth. However, it has drawbacks:

  • Increasing the attack surface
  • Requiring the machines running objstorage / storage to have elevated access to the PostgreSQL server and the Ceph cluster
  • No way to separate read workers from write workers or to control how many of each there are, which may be necessary for I/O throttling

To address these problems, the winery backend could run on a standalone server (the winery server) exposing an API identical to that of the objstorage server. The objstorage server and the storage servers could then be configured to use the RemoteObjStorage backend and delegate requests to the winery server. This would provide the required isolation and configuration flexibility.
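
As a rough illustration of the proposal above, here is a minimal in-process sketch. Everything in it is hypothetical: `WineryServer`, `RemoteFrontend`, the `handle_add` / `handle_get` names and the pool sizes are invented for the example, a plain dict stands in for PostgreSQL and Ceph, and a direct method call stands in for the network hop. It only shows the two properties being asked for: the front end holds no backend credentials, and read and write workers are sized independently.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


class WineryServer:
    """Hypothetical standalone server: the only component with backend access."""

    def __init__(self, read_workers, write_workers):
        # Stand-in for the PostgreSQL + Ceph backend (assumption for the sketch).
        self._store = {}
        # Separate pools so reads and writes can be throttled independently.
        self._readers = ThreadPoolExecutor(max_workers=read_workers)
        self._writers = ThreadPoolExecutor(max_workers=write_workers)

    def _add(self, content):
        obj_id = hashlib.sha256(content).digest()
        self._store[obj_id] = content
        return obj_id

    def handle_add(self, content):
        # Writes only ever run on the writer pool.
        return self._writers.submit(self._add, content).result()

    def handle_get(self, obj_id):
        # Reads only ever run on the reader pool.
        return self._readers.submit(self._store.__getitem__, obj_id).result()


class RemoteFrontend:
    """Stand-in for an objstorage/storage server delegating to the winery server.

    It knows how to reach the server and nothing about the backend behind it.
    """

    def __init__(self, server):
        self._server = server  # a real deployment would hold a URL instead

    def add(self, content):
        return self._server.handle_add(content)

    def get(self, obj_id):
        return self._server.handle_get(obj_id)


server = WineryServer(read_workers=4, write_workers=1)
frontend = RemoteFrontend(server)
obj_id = frontend.add(b"some content")
fetched = frontend.get(obj_id)
```

In a real deployment the pool sizes would be server-side configuration, which is the knob the library-only setup lacks.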

Related Objects

Status            Assigned  Task
Work in Progress  dachary
Open              None

Event Timeline

dachary triaged this task as Normal priority. Dec 13 2021, 5:54 PM
dachary created this task.
dachary created this object in space S1 Public.
dachary added a parent task: T3432: Add winery backend.
dachary updated the task description.

In practical terms, the two winery objstorage database servers and Ceph itself will be hosted at CEA, while the main ingestion storage / graph storage / ... will remain in Rocquencourt (separate sites, with fairly high-bandwidth networking between them).

The current colocation of the main objstorage and the main ingestion storage will no longer exist. In that setup we probably won't have direct Ceph or PostgreSQL access from the main ingestion storage, so we will want to deploy some sort of RPC component to handle objstorage queries back and forth.

To integrate this, we need the Python interface of the Winery RPC client to quack like a swh.objstorage.objstorage.ObjStorage.
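
To make the "quack like" requirement concrete, here is a small sketch using a structural `typing.Protocol`. The method set (`add`, `get`, `__contains__`) and their signatures are an assumed subset chosen for illustration, not a statement of the actual `swh.objstorage.objstorage.ObjStorage` interface; `ObjStorageLike`, `WineryRPCClient` and the dict-backed transport are all hypothetical names.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ObjStorageLike(Protocol):
    """Assumed minimal interface; the real ObjStorage class has more methods."""

    def add(self, content: bytes, obj_id: bytes) -> None: ...
    def get(self, obj_id: bytes) -> bytes: ...
    def __contains__(self, obj_id: bytes) -> bool: ...


def make_fake_transport():
    """Return a callable that mimics an RPC round trip using a local dict."""
    store = {}

    def call(method, **kwargs):
        if method == "add":
            store[kwargs["obj_id"]] = kwargs["content"]
            return None
        if method == "get":
            return store[kwargs["obj_id"]]
        if method == "contains":
            return kwargs["obj_id"] in store
        raise ValueError(f"unknown method: {method}")

    return call


class WineryRPCClient:
    """Hypothetical client: same method names, any wire format underneath."""

    def __init__(self, transport):
        self._call = transport

    def add(self, content, obj_id):
        self._call("add", content=content, obj_id=obj_id)

    def get(self, obj_id):
        return self._call("get", obj_id=obj_id)

    def __contains__(self, obj_id):
        return self._call("contains", obj_id=obj_id)


client = WineryRPCClient(make_fake_transport())
client.add(b"hello", obj_id=b"id-1")
```

Because the check is structural, the client does not need to inherit from anything in the existing client/server framework; `isinstance(client, ObjStorageLike)` only verifies that the methods are present, not that their behavior matches.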

Maybe this can be done within the current objstorage swh.objstorage."api" (which is really RPC) client/server framework, but we shouldn't feel *bound* to it. The winery "server" doesn't need to be compatible with the current msgpack-over-HTTP objstorage server interface.

dachary renamed this task from Winery backend proxy to Winery backend server. Dec 13 2021, 6:22 PM
dachary updated the task description.
dachary updated the task description.