swh-web should use a proper elasticsearch library to do its requests
Closed, Migrated

Description

swh-web fetches the "save code now" logs from elasticsearch when it is available.

However, the current implementation depends on the availability of one given member of the elasticsearch cluster. There are a few issues with that, most notably:

  • requests depend on that member being available
  • requests are done synchronously and make the (gunicorn) workers hang until an unspecified timeout
  • there's no provision for failover.

It'd be nicer if these requests used an elasticsearch library which would wrap the failover mechanism (and possibly allow asynchronous requests as well?)
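
To make concrete what such a library would take care of, here is a minimal sketch of the failover part only, written with the standard library. It is not how swh-web does it, and it is not a replacement for a real client. The node URLs, `http_get_json` and `search_with_failover` are hypothetical names made up for this illustration.

```python
import json
import urllib.request

# Hypothetical node addresses: the actual cluster members are not named in this task.
ES_NODES = [
    "http://es-node1.example.org:9200",
    "http://es-node2.example.org:9200",
]


def http_get_json(url, timeout):
    """Fetch a URL and decode its JSON body, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def search_with_failover(path, nodes=ES_NODES, timeout=5.0, fetch=http_get_json):
    """Try each node in turn and return the first successful response.

    A node that is down or too slow is skipped instead of blocking the worker.
    """
    errors = []
    for node in nodes:
        try:
            return fetch(node + path, timeout)
        except OSError as exc:
            # URLError, timeouts and connection errors are all OSError subclasses.
            errors.append((node, exc))
    raise ConnectionError(f"all elasticsearch nodes failed: {errors}")
```

With this shape, losing one node only costs one bounded timeout before the next node is tried. A dedicated client library would typically add connection pooling and node discovery on top of that.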

Event Timeline

olasd triaged this task as Normal priority. Oct 1 2019, 3:44 PM
olasd created this task.

Indeed, I forgot that the requests module does not set any default timeout value, so a call will hang until the connection is closed.

As a first step, I propose to set a timeout value for all calls made through requests in swh-web (it is also used to get the deposits list and the history counters).
This will prevent possible hangs when requesting data from external services.
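
One possible way to apply this everywhere without touching each call site is a small wrapper that fills in a timeout when the caller gives none. This is only a sketch: `DEFAULT_TIMEOUT` and `with_default_timeout` are hypothetical names, and the values are placeholders rather than tuned settings.

```python
import functools

# (connect timeout, read timeout) in seconds; requests accepts such a tuple
# for its `timeout` argument. The values here are placeholders.
DEFAULT_TIMEOUT = (3.05, 10)


def with_default_timeout(request_func, timeout=DEFAULT_TIMEOUT):
    """Wrap a requests-style function so that a timeout is always passed."""

    @functools.wraps(request_func)
    def wrapper(*args, **kwargs):
        # Keep an explicit timeout from the caller, otherwise use the default.
        kwargs.setdefault("timeout", timeout)
        return request_func(*args, **kwargs)

    return wrapper


# Intended usage (not executed here, it needs the requests package):
#   import requests
#   get = with_default_timeout(requests.get)
#   response = get(url)  # raises requests.exceptions.Timeout instead of hanging
```

Callers that need a longer or shorter limit can still pass `timeout=` explicitly, since the wrapper only sets the value when it is missing.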