
web client: add async API
Open, Wishlist, Public


We want to add an async API to the web client, in order to avoid blocking in async contexts.

One of these contexts is going to be the FUSE filesystem (T1926), but more generally an async API is a useful way to increase throughput when the web client is used from client code.

We do not want to break the current sync API, nor force all users of the web client to go down the asyncio rabbit hole.
So we should migrate the current functionality to low-level async code, which will be exposed as a new API (maybe a new swh.web.client.async module?), and then forward-port the current sync API to a set of wrappers sitting on top of the async API.

Event Timeline

zack triaged this task as Normal priority. Sep 23 2020, 9:29 PM
zack created this task.

As @olasd pointed out, async is a keyword, so let's not use it as a module name :-)

My concrete proposal is then to add a new swh.web.client.aclient (a for async) module, which will be a sibling of the current swh.web.client.client.
aclient will reimplement and expose the current public client API, all as coroutines.
client will keep the current API, but all its implementations will become thin wrappers around the aclient coroutines.
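
To make the proposed layout concrete, here is a minimal sketch of the two layers. It is only an illustration under stated assumptions: the class names AsyncWebAPIClient and WebAPIClient and the get() method are hypothetical placeholders, and the network call is simulated with asyncio.sleep rather than a real aiohttp request.

```python
import asyncio


class AsyncWebAPIClient:
    """Hypothetical stand-in for what swh.web.client.aclient could expose."""

    async def get(self, swhid: str) -> dict:
        # Placeholder for real network I/O (e.g. an aiohttp request).
        await asyncio.sleep(0)
        return {"swhid": swhid}


class WebAPIClient:
    """Hypothetical sync facade, as swh.web.client.client could provide."""

    def __init__(self) -> None:
        self._aclient = AsyncWebAPIClient()

    def get(self, swhid: str) -> dict:
        # Run the coroutine to completion on a fresh event loop.
        # asyncio.run() raises RuntimeError if a loop is already running
        # in this thread, so async callers should use the async client.
        return asyncio.run(self._aclient.get(swhid))


result = WebAPIClient().get("swh:1:cnt:example")
```

One consequence of this design is that the sync wrappers cannot be called from inside a running event loop; code that is already async would have to await the aclient coroutines directly.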

Note that this means that we will change the HTTP client implementation of the entire module, no matter which sync/async API is used, from requests to aiohttp. (Because there is no async support in requests AFAICT.) This is fine by me, but we should all be aware of this (cc: @anlambert for info).
It seems better than having two different implementations, one using requests for the sync part and one using aiohttp for the async one.

(As discussed with @seirl, I'm not considering the option of continuing to use requests and running it in a dedicated thread, because starting threads behind the back of client code seems like a bad design pattern.)
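
For reference, the rejected thread-based option would look roughly like the sketch below. It is an assumption-laden illustration, not a proposal: blocking_fetch is a hypothetical stand-in for a blocking requests call, and the event loop's default thread pool is used here instead of a single dedicated thread.

```python
import asyncio
import time


def blocking_fetch(url: str) -> str:
    # Hypothetical stand-in for a blocking call such as requests.get(url).
    time.sleep(0.01)
    return f"response for {url}"


async def fetch(url: str) -> str:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor, so worker
    # threads are started implicitly, without the caller asking for them.
    return await loop.run_in_executor(None, blocking_fetch, url)


async def main() -> list:
    # gather() returns the results in the order the awaitables were passed.
    return await asyncio.gather(fetch("a"), fetch("b"))


results = asyncio.run(main())
```

The implicit thread creation inside fetch() is exactly the behaviour objected to above: client code that only awaits a coroutine ends up with extra threads it never requested.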

zack lowered the priority of this task from Normal to Wishlist. Oct 20 2020, 6:19 PM

lowering priority, as it's not needed for swh-fuse for now