
Add statsd probes in swh-storage
Closed, Migrated, Edits Locked

Event Timeline

douardda triaged this task as Normal priority. Mar 25 2019, 11:26 AM
douardda created this task.
ardumont changed the task status from Open to Work in Progress. Mar 26 2019, 2:02 PM
ardumont claimed this task.
ardumont moved this task from Backlog to in progress on the Sprint 2019 03 board.

The current state of these probes has been deployed in production (and the workers restarted).

https://grafana.softwareheritage.org/d/8ywqc76mk/storage-backend-statistics?orgId=1

The response times show quite a wide spread because we bundle an arbitrary number of objects per RPC request; we'll likely want to normalize by the number of objects requested per query to get more meaningful results.
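One way the normalization could look, as a minimal sketch rather than the actual swh-storage code: a decorator that reports both the total time and the time per object in the batch. The names `timed_per_object`, `format_timing` and the `emit` hook are hypothetical; only the `name:value|ms` line format comes from the statsd protocol.

```python
import time
from functools import wraps


def format_timing(name, millis):
    """Render one statsd timer line, e.g. 'content_add.total:12.500|ms'."""
    return f"{name}:{millis:.3f}|ms"


def timed_per_object(metric, emit, clock=time.monotonic):
    """Time a call that takes a batch of objects as its first argument.

    `emit` is a hypothetical hook: any callable accepting one statsd line.
    In production it would hand the line to a statsd client.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(objects, *args, **kwargs):
            objects = list(objects)  # materialize so the batch can be counted
            start = clock()
            try:
                return func(objects, *args, **kwargs)
            finally:
                elapsed_ms = (clock() - start) * 1000.0
                emit(format_timing(f"{metric}.total", elapsed_ms))
                if objects:  # skip the ratio for empty batches
                    per_object = elapsed_ms / len(objects)
                    emit(format_timing(f"{metric}.per_object", per_object))
        return wrapper
    return decorator
```

Graphing the `per_object` series next to the `total` one should show whether the spread comes from batch size or from the backend itself.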

Some more granular probes inside the main storage implementation would be helpful, e.g. counting the actual number of objects inserted by each request.

Finally, as the main storage uses the object storage in "direct mode" (i.e. not going through an objstorage backend but directly writing to the pathslicer), we should consider adding some "byte counting" probes there as well.
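A rough sketch of what such byte-counting probes might look like, assuming a backend exposing `add(content, obj_id)` and `get(obj_id)`. `DictBackend`, `ByteCountingObjStorage`, the metric names and the `emit` hook are all hypothetical and stand in for the real pathslicer and statsd client.

```python
class DictBackend:
    """Hypothetical in-memory backend standing in for the pathslicer."""

    def __init__(self):
        self._objects = {}

    def add(self, content, obj_id):
        self._objects[obj_id] = content
        return obj_id

    def get(self, obj_id):
        return self._objects[obj_id]


class ByteCountingObjStorage:
    """Wrap a backend and emit a statsd counter line per read and write."""

    def __init__(self, backend, emit):
        self._backend = backend
        self._emit = emit  # any callable accepting one statsd line

    def add(self, content, obj_id):
        result = self._backend.add(content, obj_id)
        # Count bytes only after the write succeeded.
        self._emit(f"objstorage.add.bytes:{len(content)}|c")
        return result

    def get(self, obj_id):
        content = self._backend.get(obj_id)
        self._emit(f"objstorage.get.bytes:{len(content)}|c")
        return content
```

Wrapping the backend keeps the probes out of the storage logic, so they can be dropped without touching the read and write paths.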

Some more granular probes inside the main storage implementation would be helpful, e.g. counting the actual number of objects inserted by each request.

Right, @douardda and I discussed it, and we wanted to check first whether it was needed ;)
There you go

Thanks.

As an implementation detail, I'm first going to alter the _add* methods so that they return a summary of what they did (right now they don't return anything).
That way, we can report that summary through statsd, and clients (loaders, listers, ...) could use it too.
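To illustrate the idea, here is a minimal sketch of an add method that returns a summary and forwards it as statsd counters. `InMemoryStorage`, the summary keys and the `emit` hook are hypothetical; the real _add* methods and their return format may differ.

```python
from typing import Callable, Dict, Iterable, Set


class InMemoryStorage:
    """Hypothetical stand-in for the storage, keyed on the 'sha1' field."""

    def __init__(self, emit: Callable[[str], None]):
        self._contents: Set[str] = set()
        self._emit = emit  # any callable accepting one statsd line

    def content_add(self, contents: Iterable[dict]) -> Dict[str, int]:
        added = 0
        skipped = 0
        for content in contents:
            key = content["sha1"]
            if key in self._contents:
                skipped += 1  # already known, nothing inserted
            else:
                self._contents.add(key)
                added += 1
        summary = {"content.added": added, "content.skipped": skipped}
        # Report each summary entry as a statsd counter: 'name:value|c'.
        for name, value in summary.items():
            self._emit(f"{name}:{value}|c")
        return summary
```

A loader calling `content_add` gets the same numbers back as the dashboard sees, so both sides can be cross-checked.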

Finally, as the main storage uses the object storage in "direct mode" (i.e. not going through an objstorage backend but directly writing to the pathslicer), we should consider adding some "byte counting" probes there as well.

Right, I'll do that after the storage part.

(Implementation-wise, I'll do that in the swh-objstorage module.)

Btw, I'll remove the @timed on get_storage, as it just adds noise to the graph ;)
(That matches what you said orally.)