swh-web should use a proper elasticsearch library to do its requests
swh-web fetches the "save code now" logs from elasticsearch when they are available.

However, the current implementation depends on the availability of one specific member of the elasticsearch cluster. There are a few issues with that, most notably:

  • requests depend on that member being available
  • requests are done synchronously and make the (gunicorn) workers hang until an unspecified timeout
  • there's no provision for failover.

It'd be nicer if these requests used an elasticsearch library that wraps the failover mechanism (and possibly allows asynchronous requests as well).
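To make the failover idea concrete, here is a minimal sketch of what such a library would wrap: try each cluster member in turn, with a bounded timeout per attempt. It uses only the Python standard library. The node URLs and the names `ES_NODES`, `http_get_json` and `get_with_failover` are hypothetical and do not come from swh-web; a real client library would typically add more on top (connection pooling, tracking of dead nodes).

```python
import json
import urllib.request
from typing import Callable, Sequence

# Hypothetical node URLs; the real cluster members are not named in this task.
ES_NODES = [
    "http://es-node1.example.org:9200",
    "http://es-node2.example.org:9200",
]


def http_get_json(url: str, timeout: float) -> dict:
    """Fetch `url` and decode its JSON body, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def get_with_failover(
    path: str,
    nodes: Sequence[str] = ES_NODES,
    timeout: float = 5.0,
    fetch: Callable[[str, float], dict] = http_get_json,
) -> dict:
    """Try each node in order and return the first successful response.

    Raises ConnectionError if every node fails.
    """
    errors = []
    for node in nodes:
        try:
            return fetch(node.rstrip("/") + path, timeout)
        except OSError as exc:
            # Covers URLError (including HTTP error statuses), socket
            # timeouts and refused connections: move on to the next node.
            errors.append(f"{node}: {exc}")
    raise ConnectionError("all elasticsearch nodes failed: " + "; ".join(errors))
```

With this shape, a worker waits at most `timeout` seconds per node instead of hanging, and a single unavailable member no longer makes the whole lookup fail. The `fetch` parameter exists only so the transport can be swapped out, for example in tests.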

Indeed, I forgot that the requests module does not set any default timeout, so a call can hang until the connection is closed.

As a first step, I propose setting a timeout on every requests call in swh-web (requests is also used to fetch the deposits list and the history counters).
This will prevent possible hangs when requesting data from external services.
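One possible way to apply a timeout everywhere without touching each call site is a `requests.Session` subclass that fills in a default. This is only a sketch: the class name `TimeoutSession` and the `DEFAULT_TIMEOUT` values are assumptions, not existing swh-web code.

```python
import requests

# Assumed values: (connect timeout, read timeout) in seconds.
DEFAULT_TIMEOUT = (3.05, 10)


class TimeoutSession(requests.Session):
    """A Session that applies a default timeout unless the caller passes one."""

    def request(self, method, url, **kwargs):
        # Only fill in the timeout when the caller did not specify it.
        kwargs.setdefault("timeout", DEFAULT_TIMEOUT)
        return super().request(method, url, **kwargs)
```

Calls such as `TimeoutSession().get(url)` then raise `requests.exceptions.Timeout` instead of blocking the worker indefinitely. Note that passing `timeout=None` explicitly still disables the timeout, since `setdefault` leaves an existing key alone.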