
Add test to enforce all objstorage backends follow the interface
Closed · Public

Authored by vlorentz on Mar 1 2022, 5:58 PM.

Diff Detail

Repository
rDOBJS Object storage
Branch
remove-aiohttp
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 27213
Build 42568: Phabricator diff pipeline on jenkins
Build 42567: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D7273 (id=26337)

Could not rebase; Attempt merge onto 76a5f36b41...

Updating 76a5f36..de9f7b6
Fast-forward
 requirements.txt                                   |   1 -
 swh/objstorage/api/client.py                       |  72 ++-----
 swh/objstorage/api/server.py                       | 214 +++++++------------
 swh/objstorage/backends/in_memory.py               |  10 +-
 swh/objstorage/backends/pathslicing.py             |  16 ++
 swh/objstorage/backends/seaweedfs/http.py          |   4 +-
 swh/objstorage/cli.py                              |  20 +-
 swh/objstorage/interface.py                        | 235 +++++++++++++++++++++
 .../multiplexer/multiplexer_objstorage.py          |   3 +-
 swh/objstorage/multiplexer/striping_objstorage.py  |   7 +-
 swh/objstorage/objstorage.py                       | 222 +------------------
 swh/objstorage/tests/objstorage_testing.py         |  32 +++
 swh/objstorage/tests/test_objstorage_api.py        |  10 +-
 13 files changed, 406 insertions(+), 440 deletions(-)
 create mode 100644 swh/objstorage/interface.py
Changes applied before test
commit de9f7b629a1a17410765b000b0e6cd6bb7136e26
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 17:57:58 2022 +0100

    Add test to enforce all objstorage backends follow the interface
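
As an illustration of what such an enforcement test can look like, here is a minimal sketch that compares the public methods of a backend class against an interface class using `inspect.signature`. The classes `ObjStorageInterface` and `InMemoryBackend` and the helper `interface_mismatches` are hypothetical stand-ins written for this example; they are not the code from this diff and the real interface has more methods.

```python
import inspect


class ObjStorageInterface:
    """Hypothetical, trimmed-down interface (not the real swh one)."""

    def add(self, content, obj_id, check_presence=True):
        ...

    def get(self, obj_id):
        ...


class InMemoryBackend:
    """Hypothetical backend used only for this illustration."""

    def __init__(self):
        self._objects = {}

    def add(self, content, obj_id, check_presence=True):
        if check_presence and obj_id in self._objects:
            return
        self._objects[obj_id] = content

    def get(self, obj_id):
        return self._objects[obj_id]


def interface_mismatches(interface, backend_cls):
    """List the public interface methods that the backend class is
    missing or declares with a different signature."""
    problems = []
    for name, expected in inspect.getmembers(interface, inspect.isfunction):
        if name.startswith("_"):
            continue  # only public methods are part of the contract
        actual = getattr(backend_cls, name, None)
        if actual is None:
            problems.append(f"{name}: missing")
        elif inspect.signature(actual) != inspect.signature(expected):
            problems.append(f"{name}: signature differs")
    return problems
```

A test could then assert that `interface_mismatches(ObjStorageInterface, backend_cls)` is empty for every backend class, so that a backend drifting from the interface fails the test suite rather than failing at runtime.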

commit f578d1ef703064e20fdafc70d4be7469eb554790
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:49:53 2022 +0100

    Use flask instead of aiohttp as RPC server.
    
    The azure backend cannot be run in the same thread as an asyncio
    event loop: https://forge.softwareheritage.org/T3981
    
    Additionally, this simplifies the code, by reusing the same
    method auto-generation as other SWH components.

commit 43e7d45961ef5241f4157d864cf4218dba86b55e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:17:42 2022 +0100

    server: Group multiple object ids in the same HTTP chunk
    
    This reduces the size of the response from 26 bytes per id
    to an average of 20.06 bytes (rounded up to the closest
    multiple of 26).
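
The 26-byte figure is consistent with sending one id per chunk in HTTP/1.1 chunked transfer encoding, assuming 20-byte object ids: each chunk adds a hexadecimal length line and two CRLF pairs around the payload. The sketch below only models that framing overhead. The function names and the `ids_per_chunk` parameter are hypothetical, the batch size actually used by the server is not stated in the commit message, and chunk extensions and the terminating zero-length chunk are ignored.

```python
def chunk_frame_size(payload_size):
    """Bytes on the wire for one HTTP/1.1 chunk carrying payload_size
    bytes: hex length, CRLF, payload, CRLF."""
    return len(format(payload_size, "x")) + 2 + payload_size + 2


def bytes_per_id(ids_per_chunk, id_size=20):
    """Average bytes sent per id when ids_per_chunk ids share a chunk.

    id_size=20 is an assumption (the length of a SHA-1 digest).
    """
    return chunk_frame_size(ids_per_chunk * id_size) / ids_per_chunk
```

With one id per chunk, `chunk_frame_size(20)` is 26, which matches the original cost quoted above. Larger values of `ids_per_chunk` spread the same few framing bytes over more ids, so the average approaches the bare 20 bytes.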

commit ff319b60b2221333074d16659f8222a0d19bfe66
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:13:43 2022 +0100

    client: Do not depend on how the server chunks the response
    
    Currently, the server uses HTTP chunked encoding to send object ids
    one by one, each in its own HTTP frame.
    
    I think this is a mistake to rely on such a detail of the HTTP protocol
    in a high-level API like this.
    
    Additionally, a future commit will rewrite the server to use Flask
    instead of aiohttp, which does not allow this kind of fine-grained
    control about HTTP chunks.
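
One way a client can avoid depending on chunk boundaries is to re-buffer the byte stream and cut it into fixed-size ids itself. The following is a minimal sketch of that idea, not the client code from this diff. The name `iter_object_ids` and the 20-byte id size are assumptions made for the example.

```python
ID_SIZE = 20  # assumed id length in bytes (e.g. a SHA-1 digest)


def iter_object_ids(chunks, id_size=ID_SIZE):
    """Yield fixed-size ids from an iterable of byte chunks, regardless
    of where the chunk boundaries fall."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # Emit every complete id currently available in the buffer.
        while len(buffer) >= id_size:
            yield buffer[:id_size]
            buffer = buffer[id_size:]
    if buffer:
        # Leftover bytes mean the stream was truncated mid-id.
        raise ValueError("stream ended in the middle of an object id")
```

Because the generator only looks at the concatenated bytes, the server is free to send one id per chunk, many ids per chunk, or chunks that split an id in two.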

commit b159f04b8d591947003cf3a4cd15fbcf0f3cb31e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 13:36:02 2022 +0100

    Remove method add_stream from the RPC API.
    
    Rationale:
    
    1. Only the pathslicing backend implements it
    2. No other package uses it (besides a dead code path in the vault)
    3. A future commit will rewrite the RPC server to use Flask instead of
       aiohttp, and rewriting this view correctly is going to be hard
       (though possible)

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/125/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/125/console

Harbormaster returned this revision to the author for changes because remote builds failed. Mar 1 2022, 5:59 PM
Harbormaster failed remote builds in B27212: Diff 26337!

Build is green

Patch application report for D7273 (id=26338)

Could not rebase; Attempt merge onto 76a5f36b41...

Updating 76a5f36..9c09ec2
Fast-forward
 requirements.txt                                   |   2 +-
 swh/objstorage/api/client.py                       |  72 ++-----
 swh/objstorage/api/server.py                       | 214 ++++++-------------
 swh/objstorage/backends/in_memory.py               |  10 +-
 swh/objstorage/backends/pathslicing.py             |  16 ++
 swh/objstorage/backends/seaweedfs/http.py          |   4 +-
 swh/objstorage/cli.py                              |  20 +-
 swh/objstorage/interface.py                        | 237 +++++++++++++++++++++
 .../multiplexer/multiplexer_objstorage.py          |   3 +-
 swh/objstorage/multiplexer/striping_objstorage.py  |   7 +-
 swh/objstorage/objstorage.py                       | 222 +------------------
 swh/objstorage/tests/objstorage_testing.py         |  32 +++
 swh/objstorage/tests/test_objstorage_api.py        |  10 +-
 13 files changed, 409 insertions(+), 440 deletions(-)
 create mode 100644 swh/objstorage/interface.py
Changes applied before test
commit 9c09ec2ef2b9df3d5a7027fa55cf872d897e09cd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 17:57:58 2022 +0100

    Add test to enforce all objstorage backends follow the interface

commit f578d1ef703064e20fdafc70d4be7469eb554790
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:49:53 2022 +0100

    Use flask instead of aiohttp as RPC server.
    
    The azure backend cannot be run in the same thread as an asyncio
    event loop: https://forge.softwareheritage.org/T3981
    
    Additionally, this simplifies the code, by reusing the same
    method auto-generation as other SWH components.

commit 43e7d45961ef5241f4157d864cf4218dba86b55e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:17:42 2022 +0100

    server: Group multiple object ids in the same HTTP chunk
    
    This reduces the size of the response from 26 bytes per id
    to an average of 20.06 bytes (rounded up to the closest
    multiple of 26).

commit ff319b60b2221333074d16659f8222a0d19bfe66
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:13:43 2022 +0100

    client: Do not depend on how the server chunks the response
    
    Currently, the server uses HTTP chunked encoding to send object ids
    one by one, each in its own HTTP frame.
    
    I think this is a mistake to rely on such a detail of the HTTP protocol
    in a high-level API like this.
    
    Additionally, a future commit will rewrite the server to use Flask
    instead of aiohttp, which does not allow this kind of fine-grained
    control about HTTP chunks.

commit b159f04b8d591947003cf3a4cd15fbcf0f3cb31e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 13:36:02 2022 +0100

    Remove method add_stream from the RPC API.
    
    Rationale:
    
    1. Only the pathslicing backend implements it
    2. No other package uses it (besides a dead code path in the vault)
    3. A future commit will rewrite the RPC server to use Flask instead of
       aiohttp, and rewriting this view correctly is going to be hard
       (though possible)

See https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/126/ for more details.

olasd added a subscriber: olasd.

Great, thanks.

This revision is now accepted and ready to land. Mar 2 2022, 11:29 AM

Build is green

Patch application report for D7273 (id=26350)

Could not rebase; Attempt merge onto 76a5f36b41...

Updating 76a5f36..33a786c
Fast-forward
 requirements.txt                                   |   2 +-
 swh/objstorage/api/client.py                       |  72 ++-----
 swh/objstorage/api/server.py                       | 214 ++++++-------------
 swh/objstorage/backends/in_memory.py               |  10 +-
 swh/objstorage/backends/pathslicing.py             |  16 ++
 swh/objstorage/backends/seaweedfs/http.py          |   4 +-
 swh/objstorage/cli.py                              |  20 +-
 swh/objstorage/interface.py                        | 237 +++++++++++++++++++++
 .../multiplexer/multiplexer_objstorage.py          |   3 +-
 swh/objstorage/multiplexer/striping_objstorage.py  |   7 +-
 swh/objstorage/objstorage.py                       | 222 +------------------
 swh/objstorage/tests/objstorage_testing.py         |  32 +++
 swh/objstorage/tests/test_objstorage_api.py        |  10 +-
 13 files changed, 409 insertions(+), 440 deletions(-)
 create mode 100644 swh/objstorage/interface.py
Changes applied before test
commit 33a786ce15c093c7ad9a9b040937850f0a51affd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 17:57:58 2022 +0100

    Add test to enforce all objstorage backends follow the interface

commit 32cfda0406aa2a8b9f4727abf2ea56b7563460ea
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:49:53 2022 +0100

    Use flask instead of aiohttp as RPC server.
    
    The azure backend cannot be run in the same thread as an asyncio
    event loop: https://forge.softwareheritage.org/T3981
    
    Additionally, this simplifies the code, by reusing the same
    method auto-generation as other SWH components.

commit 9be723dcbc9813a4a66e0068de22e36dbaee09c2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:17:42 2022 +0100

    server: Group multiple object ids in the same HTTP chunk
    
    This reduces the size of the response from 26 bytes per id
    to an average of 20.06 bytes (rounded up to the closest
    multiple of 26).

commit a2424b7a13e9a696329a9e9efa6c86f9864dd60b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 15:13:43 2022 +0100

    client: Do not depend on how the server chunks the response
    
    Currently, the server uses HTTP chunked encoding to send object ids
    one by one, each in its own HTTP frame.
    
    I think this is a mistake to rely on such a detail of the HTTP protocol
    in a high-level API like this.
    
    Additionally, a future commit will rewrite the server to use Flask
    instead of aiohttp, which does not allow this kind of fine-grained
    control about HTTP chunks.

commit b159f04b8d591947003cf3a4cd15fbcf0f3cb31e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Mar 1 13:36:02 2022 +0100

    Remove method add_stream from the RPC API.
    
    Rationale:
    
    1. Only the pathslicing backend implements it
    2. No other package uses it (besides a dead code path in the vault)
    3. A future commit will rewrite the RPC server to use Flask instead of
       aiohttp, and rewriting this view correctly is going to be hard
       (though possible)

See https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/133/ for more details.