Details
- Reviewers
olasd - Group Reviewers
Reviewers - Maniphest Tasks
- Restricted Maniphest Task
- Commits
- rDOBJS32cfda0406aa: Use flask instead of aiohttp as RPC server.
will fail because it depends on two diffs in swh-core
Diff Detail
- Repository
- rDOBJS Object storage
- Lint
Automatic diff as part of commit; lint not applicable. - Unit
Automatic diff as part of commit; unit tests not applicable.
Event Timeline
Build has FAILED
Patch application report for D7272 (id=26332)
Could not rebase; Attempt merge onto 76a5f36b41...
Updating 76a5f36..d29a584 Fast-forward requirements.txt | 1 - swh/objstorage/api/client.py | 70 ++------- swh/objstorage/api/server.py | 215 +++++++++----------------- swh/objstorage/backends/pathslicing.py | 16 ++ swh/objstorage/backends/seaweedfs/http.py | 4 +- swh/objstorage/cli.py | 20 ++- swh/objstorage/interface.py | 232 ++++++++++++++++++++++++++++ swh/objstorage/objstorage.py | 209 ------------------------- swh/objstorage/tests/test_objstorage_api.py | 10 +- 9 files changed, 353 insertions(+), 424 deletions(-) create mode 100644 swh/objstorage/interface.py
Changes applied before test
commit d29a584369b2802037da65a91c140801b3700e47 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:49:53 2022 +0100 Use flask instead of aiohttp as RPC server. The azure backend cannot be run in the same thread as an asyncio event loop: https://forge.softwareheritage.org/T3981 Additionally, this simplifies the code, by reusing the same method auto-generation as other SWH components. commit 43e7d45961ef5241f4157d864cf4218dba86b55e Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:17:42 2022 +0100 server: Group multiple object ids in the same HTTP chunk This reduces the size of the response from 26 bytes per id to an average of 20.06 bytes (rounded up to the closest multiple of 26). commit ff319b60b2221333074d16659f8222a0d19bfe66 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:13:43 2022 +0100 client: Do not depend on how the server chunks the response Currently, the server uses HTTP chunked encoding to send object ids one by one, each in its own HTTP frame. I think this is a mistake to rely on such a detail of the HTTP protocol in a high-level API like this. Additionally, a future commit will rewrite the server to use Flask instead of aiohttp, which does not allow this kind of fine-grained control about HTTP chunks. commit b159f04b8d591947003cf3a4cd15fbcf0f3cb31e Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 13:36:02 2022 +0100 Remove method add_stream from the RPC API. Rationale: 1. Only the pathslicing backend implements it 2. No other package uses it (besides a dead code path in the vault) 3. A future commit will rewrite the RPC server to use Flask instead of aiohttp, and rewriting this view correctly is going to be hard (though possible)
Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/122/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/122/console
Build has FAILED
Patch application report for D7272 (id=26332)
Could not rebase; Attempt merge onto 76a5f36b41...
Updating 76a5f36..d29a584 Fast-forward requirements.txt | 1 - swh/objstorage/api/client.py | 70 ++------- swh/objstorage/api/server.py | 215 +++++++++----------------- swh/objstorage/backends/pathslicing.py | 16 ++ swh/objstorage/backends/seaweedfs/http.py | 4 +- swh/objstorage/cli.py | 20 ++- swh/objstorage/interface.py | 232 ++++++++++++++++++++++++++++ swh/objstorage/objstorage.py | 209 ------------------------- swh/objstorage/tests/test_objstorage_api.py | 10 +- 9 files changed, 353 insertions(+), 424 deletions(-) create mode 100644 swh/objstorage/interface.py
Changes applied before test
commit d29a584369b2802037da65a91c140801b3700e47 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:49:53 2022 +0100 Use flask instead of aiohttp as RPC server. The azure backend cannot be run in the same thread as an asyncio event loop: https://forge.softwareheritage.org/T3981 Additionally, this simplifies the code, by reusing the same method auto-generation as other SWH components. commit 43e7d45961ef5241f4157d864cf4218dba86b55e Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:17:42 2022 +0100 server: Group multiple object ids in the same HTTP chunk This reduces the size of the response from 26 bytes per id to an average of 20.06 bytes (rounded up to the closest multiple of 26). commit ff319b60b2221333074d16659f8222a0d19bfe66 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:13:43 2022 +0100 client: Do not depend on how the server chunks the response Currently, the server uses HTTP chunked encoding to send object ids one by one, each in its own HTTP frame. I think this is a mistake to rely on such a detail of the HTTP protocol in a high-level API like this. Additionally, a future commit will rewrite the server to use Flask instead of aiohttp, which does not allow this kind of fine-grained control about HTTP chunks. commit b159f04b8d591947003cf3a4cd15fbcf0f3cb31e Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 13:36:02 2022 +0100 Remove method add_stream from the RPC API. Rationale: 1. Only the pathslicing backend implements it 2. No other package uses it (besides a dead code path in the vault) 3. A future commit will rewrite the RPC server to use Flask instead of aiohttp, and rewriting this view correctly is going to be hard (though possible)
Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/123/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/123/console
Build is green
Patch application report for D7272 (id=26336)
Could not rebase; Attempt merge onto 76a5f36b41...
Updating 76a5f36..f578d1e Fast-forward requirements.txt | 1 - swh/objstorage/api/client.py | 70 ++------- swh/objstorage/api/server.py | 214 +++++++++---------------- swh/objstorage/backends/pathslicing.py | 16 ++ swh/objstorage/backends/seaweedfs/http.py | 4 +- swh/objstorage/cli.py | 20 ++- swh/objstorage/interface.py | 232 ++++++++++++++++++++++++++++ swh/objstorage/objstorage.py | 206 ------------------------ swh/objstorage/tests/test_objstorage_api.py | 10 +- 9 files changed, 351 insertions(+), 422 deletions(-) create mode 100644 swh/objstorage/interface.py
Changes applied before test
commit f578d1ef703064e20fdafc70d4be7469eb554790 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:49:53 2022 +0100 Use flask instead of aiohttp as RPC server. The azure backend cannot be run in the same thread as an asyncio event loop: https://forge.softwareheritage.org/T3981 Additionally, this simplifies the code, by reusing the same method auto-generation as other SWH components. commit 43e7d45961ef5241f4157d864cf4218dba86b55e Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:17:42 2022 +0100 server: Group multiple object ids in the same HTTP chunk This reduces the size of the response from 26 bytes per id to an average of 20.06 bytes (rounded up to the closest multiple of 26). commit ff319b60b2221333074d16659f8222a0d19bfe66 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:13:43 2022 +0100 client: Do not depend on how the server chunks the response Currently, the server uses HTTP chunked encoding to send object ids one by one, each in its own HTTP frame. I think this is a mistake to rely on such a detail of the HTTP protocol in a high-level API like this. Additionally, a future commit will rewrite the server to use Flask instead of aiohttp, which does not allow this kind of fine-grained control about HTTP chunks. commit b159f04b8d591947003cf3a4cd15fbcf0f3cb31e Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 13:36:02 2022 +0100 Remove method add_stream from the RPC API. Rationale: 1. Only the pathslicing backend implements it 2. No other package uses it (besides a dead code path in the vault) 3. A future commit will rewrite the RPC server to use Flask instead of aiohttp, and rewriting this view correctly is going to be hard (though possible)
See https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/124/ for more details.
Build is green
Patch application report for D7272 (id=26349)
Could not rebase; Attempt merge onto 76a5f36b41...
Updating 76a5f36..32cfda0 Fast-forward requirements.txt | 1 - swh/objstorage/api/client.py | 70 ++------- swh/objstorage/api/server.py | 214 +++++++++---------------- swh/objstorage/backends/pathslicing.py | 16 ++ swh/objstorage/backends/seaweedfs/http.py | 4 +- swh/objstorage/cli.py | 20 ++- swh/objstorage/interface.py | 232 ++++++++++++++++++++++++++++ swh/objstorage/objstorage.py | 206 ------------------------ swh/objstorage/tests/test_objstorage_api.py | 10 +- 9 files changed, 351 insertions(+), 422 deletions(-) create mode 100644 swh/objstorage/interface.py
Changes applied before test
commit 32cfda0406aa2a8b9f4727abf2ea56b7563460ea Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:49:53 2022 +0100 Use flask instead of aiohttp as RPC server. The azure backend cannot be run in the same thread as an asyncio event loop: https://forge.softwareheritage.org/T3981 Additionally, this simplifies the code, by reusing the same method auto-generation as other SWH components. commit 9be723dcbc9813a4a66e0068de22e36dbaee09c2 Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:17:42 2022 +0100 server: Group multiple object ids in the same HTTP chunk This reduces the size of the response from 26 bytes per id to an average of 20.06 bytes (rounded up to the closest multiple of 26). commit a2424b7a13e9a696329a9e9efa6c86f9864dd60b Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 15:13:43 2022 +0100 client: Do not depend on how the server chunks the response Currently, the server uses HTTP chunked encoding to send object ids one by one, each in its own HTTP frame. I think this is a mistake to rely on such a detail of the HTTP protocol in a high-level API like this. Additionally, a future commit will rewrite the server to use Flask instead of aiohttp, which does not allow this kind of fine-grained control about HTTP chunks. commit b159f04b8d591947003cf3a4cd15fbcf0f3cb31e Author: Valentin Lorentz <vlorentz@softwareheritage.org> Date: Tue Mar 1 13:36:02 2022 +0100 Remove method add_stream from the RPC API. Rationale: 1. Only the pathslicing backend implements it 2. No other package uses it (besides a dead code path in the vault) 3. A future commit will rewrite the RPC server to use Flask instead of aiohttp, and rewriting this view correctly is going to be hard (though possible)
See https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/132/ for more details.
I think this all looks good, thank you! Obviously, the consistency with other RPCs is much appreciated.
swh/objstorage/tests/test_objstorage_api.py | ||
---|---|---|
41–48 | I'm a bit doubtful of these overrides considering that delete is in fact implemented in the remote APIs? |
swh/objstorage/tests/test_objstorage_api.py | ||
---|---|---|
41–48 | they both error because they expect PermissionError to be raised, but it's not. 🤷 |
swh/objstorage/tests/test_objstorage_api.py | ||
---|---|---|
41–48 | Maybe it could be added to reraise_exceptions? |
swh/objstorage/tests/test_objstorage_api.py | ||
---|---|---|
41–48 | Nope, reraise_exceptions is only used when it would raise RemoteException, but here, no exception is raised at all. |