|Open||None||T885 Vault: use objstorage streaming to store and fetch bundles|
|Open||None||T805 objstorage: allow use of file-like objects for streaming methods|
|Resolved||seirl||T928 Rewrite the Vault Cookers I/O pipeline with file objects|
See T1964 for a concrete example where the lack of streaming causes problems (after cooking, once the bundle is ready):
$ wget https://archive.softwareheritage.org/api/1/vault/revision/85678b0d6c52d6fd0af50c8e493c74dd15a7115d/gitfast/raw/
--2019-09-19 11:43:50-- https://archive.softwareheritage.org/api/1/vault/revision/85678b0d6c52d6fd0af50c8e493c74dd15a7115d/gitfast/raw/
Resolving archive.softwareheritage.org (archive.softwareheritage.org)... 184.108.40.206
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|220.127.116.11|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 539845226 (515M) [application/gzip]
Saving to: ‘index.html’

index.html 31%[=============> ] 162,18M 2,66MB/s in 66s

2019-09-19 11:46:13 (2,46 MB/s) - Connection closed at byte 170059557. Retrying.

--2019-09-19 11:46:14-- (try: 2) https://archive.softwareheritage.org/api/1/vault/revision/85678b0d6c52d6fd0af50c8e493c74dd15a7115d/gitfast/raw/
Connecting to archive.softwareheritage.org (archive.softwareheritage.org)|18.104.22.168|:443... connected.
HTTP request sent, awaiting response... 400 Bad Request
2019-09-19 11:50:29 ERROR 400: Bad Request.
I wonder whether the best solution wouldn't be to just generate a redirect to a direct download URL from the Azure bucket, using a temporary shared access signature.
e.g. in the rocrail case: https://swhvaultstorage.blob.core.windows.net/contents/36489f4afbc3d2d3a43bf00d79f03deb4e9ed5f7?sp=r&st=2019-09-19T11:14:55Z&se=2019-09-19T19:14:55Z&spr=https&sv=2018-03-28&sig=IlIioroy1rkUxCRxLirH7newNos4AQbigrioxIpXpWA%3D&sr=b (expiry today at 19:15 UTC)
Pluggable compression has been implemented for all objstorage backends, which means we could:
- store the (compressed) bundles in an uncompressed objstorage on Azure
- when a user requests the bundle:
  - generate a temporary URL (using BlobSharedAccessSignature.generate_blob)
  - redirect to that temporary URL
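
A minimal sketch of the redirect flow above, to make the idea concrete. It is not the Vault's actual code: the function names, the 8-hour validity and the `fake_sign` stand-in are all assumptions for illustration. The account and container names are the ones visible in the example URL earlier in this thread. A real deployment would pass in a signer that wraps the Azure SDK (for instance BlobSharedAccessSignature.generate_blob) instead of the placeholder used here.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import quote

# Taken from the example URL in this thread; treat as illustrative values.
ACCOUNT = "swhvaultstorage"
CONTAINER = "contents"

# Hypothetical signer interface: (container, blob name, expiry) -> SAS query string.
SasSigner = Callable[[str, str, datetime], str]


def temporary_url(blob_name: str, sign: SasSigner, ttl: timedelta, now: datetime) -> str:
    """Build a direct-download URL that is valid until now + ttl."""
    token = sign(CONTAINER, blob_name, now + ttl)
    return (
        f"https://{ACCOUNT}.blob.core.windows.net/"
        f"{quote(CONTAINER)}/{quote(blob_name)}?{token}"
    )


def redirect_response(
    blob_name: str,
    sign: SasSigner,
    ttl: timedelta = timedelta(hours=8),
    now: Optional[datetime] = None,
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for an HTTP redirect to the temporary URL."""
    now = now or datetime.now(timezone.utc)
    return 302, {"Location": temporary_url(blob_name, sign, ttl, now)}


def fake_sign(container: str, blob_name: str, expiry: datetime) -> str:
    # Stand-in only: produces an unsigned, read-only token with a placeholder
    # signature. A real signer would compute the signature with the account key.
    return "sp=r&se=" + expiry.strftime("%Y-%m-%dT%H:%M:%SZ") + "&sig=PLACEHOLDER"
```

With this shape, the API endpoint never streams the bundle itself: the client follows the redirect and downloads straight from blob storage, so retries and resumed downloads are handled by the storage service rather than by the Vault.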