
Start introducing composite ObjId in the interface
Closed, Public

Authored by vlorentz on Jun 22 2022, 5:19 PM.

Details

Summary

For now, any hash algo other than ID_HASH_ALGO ("sha1") is ignored.
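
As a rough illustration of what "ignored" means here, the sketch below resolves a composite identifier to the hex form of its sha1 and disregards every other entry. The `ObjId` alias, the acceptance of bare `bytes`, and the helper name `to_default_hex` are assumptions made for this example, not the code from the diff.

```python
import hashlib
from typing import Dict, Union

ID_HASH_ALGO = "sha1"  # value quoted in the summary

# Assumption: a composite id maps algorithm names to raw digests,
# and a bare sha1 digest is still accepted for compatibility.
ObjId = Union[bytes, Dict[str, bytes]]


def to_default_hex(obj_id: ObjId) -> str:
    """Return the hex sha1 of an id (hypothetical helper).

    Entries for algorithms other than ID_HASH_ALGO are ignored.
    """
    if isinstance(obj_id, bytes):
        return obj_id.hex()
    return obj_id[ID_HASH_ALGO].hex()


data = b"example content"
composite = {
    "sha1": hashlib.sha1(data).digest(),
    "sha256": hashlib.sha256(data).digest(),  # present but ignored
}
print(to_default_hex(composite))
```

A composite id without a "sha1" entry raises `KeyError` in this sketch; how the actual interface handles that case is not stated in this revision.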

This changes the network protocol used by list_content from a
concatenation of fixed-length hashes to a stream of msgpacked
dictionaries.
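
The difference between the two framings can be sketched as follows. This is a minimal illustration assuming 20-byte sha1 digests for the old format and the third-party `msgpack` package for the new one; the function names are hypothetical and do not come from swh-objstorage.

```python
import hashlib
from typing import Dict, Iterable, Iterator, List

import msgpack  # third-party package, assumed to be installed

SHA1_LEN = 20  # size of a raw sha1 digest in bytes


def decode_fixed_length(payload: bytes) -> List[bytes]:
    """Old-style framing: split a concatenation of fixed-length hashes."""
    if len(payload) % SHA1_LEN:
        raise ValueError("payload is not a whole number of digests")
    return [payload[i:i + SHA1_LEN] for i in range(0, len(payload), SHA1_LEN)]


def encode_stream(obj_ids: Iterable[Dict[str, bytes]]) -> Iterator[bytes]:
    """New-style framing: one msgpack-encoded dictionary per object id."""
    for obj_id in obj_ids:
        yield msgpack.packb(obj_id, use_bin_type=True)


def decode_stream(chunks: Iterable[bytes]) -> Iterator[Dict[str, bytes]]:
    """Decode dictionaries from chunks that may split an item anywhere."""
    unpacker = msgpack.Unpacker(raw=False)
    for chunk in chunks:
        unpacker.feed(chunk)
        # Yields every complete item buffered so far, then waits for more.
        yield from unpacker


ids = [
    {"sha1": hashlib.sha1(b"a").digest()},
    {"sha1": hashlib.sha1(b"b").digest(), "sha256": hashlib.sha256(b"b").digest()},
]
wire = b"".join(encode_stream(ids))
# Re-chunk at an arbitrary boundary, as a network transport might.
assert list(decode_stream([wire[:7], wire[7:]])) == ids
```

Because each msgpack item is self-delimiting, the stream can carry dictionaries of varying size, which a fixed-length concatenation cannot do once ids hold more than one hash.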

Diff Detail

Repository
rDOBJS Object storage
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build has FAILED

Patch application report for D8029 (id=28913)

Could not rebase; Attempt merge onto 847285a28d...

Updating 847285a..3bc1985
Fast-forward
 mypy.ini                                           |  3 +
 requirements.txt                                   |  1 +
 swh/objstorage/api/client.py                       | 21 ++++---
 swh/objstorage/api/server.py                       |  8 ++-
 swh/objstorage/backends/azure.py                   |  4 +-
 swh/objstorage/backends/generator.py               | 15 +++--
 swh/objstorage/backends/http.py                    |  6 +-
 swh/objstorage/backends/in_memory.py               | 10 ++--
 swh/objstorage/backends/libcloud.py                |  7 +--
 swh/objstorage/backends/pathslicing.py             | 66 +++++++---------------
 swh/objstorage/backends/seaweedfs/objstorage.py    |  6 +-
 swh/objstorage/backends/winery/objstorage.py       |  8 +--
 swh/objstorage/interface.py                        | 47 +++++++--------
 swh/objstorage/multiplexer/filter/filter.py        |  8 +--
 swh/objstorage/multiplexer/filter/id_filter.py     |  9 +--
 .../multiplexer/multiplexer_objstorage.py          | 24 +-------
 swh/objstorage/objstorage.py                       | 25 +++++---
 swh/objstorage/tests/objstorage_testing.py         |  5 +-
 swh/objstorage/tests/test_multiplexer_filter.py    | 19 -------
 .../tests/test_objstorage_multiplexer.py           |  7 ---
 .../tests/test_objstorage_pathslicing.py           |  7 ---
 .../tests/test_objstorage_random_generator.py      |  8 ++-
 swh/objstorage/tests/test_objstorage_winery.py     | 13 ++---
 23 files changed, 132 insertions(+), 195 deletions(-)
Changes applied before test
commit 3bc19851a1600520e58edb66fba6a440a4b189a7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 17:18:02 2022 +0200

    Start introducing composite ObjId in the interface
    
    For now, any hash algo other than ID_HASH_ALGO ("sha1") is ignored.
    
    This changes the network protocol used by `list_content` from a
    concatenation of fixed-length hashes to a stream of msgpacked
    dictionaries.

commit 3491bf7eb24c7aaed2eabdef664787cb6a1aabdd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 16:56:39 2022 +0200

    Make WineryWriter.add return None
    
    Like WineryObjStorage and other ObjStorage backends.

commit 0fd9cefc33d50b1a345a3c965b7f63d7bcf58d61
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 16:18:47 2022 +0200

    Remove get_random()
    
    It is not used anywhere (and swh-storage provides content_get_random() that
    fills the same need), and cloud backends do not implement it anyway.

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/163/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/163/console

Build has FAILED

Patch application report for D8029 (id=28914)

Could not rebase; Attempt merge onto 847285a28d...

Updating 847285a..4246996
Fast-forward
 mypy.ini                                           |  3 +
 requirements.txt                                   |  1 +
 swh/objstorage/api/client.py                       | 21 ++++---
 swh/objstorage/api/server.py                       |  8 ++-
 swh/objstorage/backends/azure.py                   |  4 +-
 swh/objstorage/backends/generator.py               | 15 +++--
 swh/objstorage/backends/http.py                    |  6 +-
 swh/objstorage/backends/in_memory.py               | 10 ++--
 swh/objstorage/backends/libcloud.py                |  7 +--
 swh/objstorage/backends/pathslicing.py             | 66 +++++++---------------
 swh/objstorage/backends/seaweedfs/objstorage.py    |  6 +-
 swh/objstorage/backends/winery/objstorage.py       |  8 +--
 swh/objstorage/interface.py                        | 46 ++++++---------
 swh/objstorage/multiplexer/filter/filter.py        |  8 +--
 swh/objstorage/multiplexer/filter/id_filter.py     |  9 +--
 .../multiplexer/multiplexer_objstorage.py          | 24 +-------
 swh/objstorage/objstorage.py                       | 25 +++++---
 swh/objstorage/tests/objstorage_testing.py         |  5 +-
 swh/objstorage/tests/test_multiplexer_filter.py    | 19 -------
 .../tests/test_objstorage_multiplexer.py           |  7 ---
 .../tests/test_objstorage_pathslicing.py           |  7 ---
 .../tests/test_objstorage_random_generator.py      |  8 ++-
 swh/objstorage/tests/test_objstorage_winery.py     | 13 ++---
 23 files changed, 129 insertions(+), 197 deletions(-)
Changes applied before test
commit 42469967ceb2ea2ea4866c3d4caed67ab1d1c0bd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 17:18:02 2022 +0200

    Start introducing composite ObjId in the interface
    
    For now, any hash algo other than ID_HASH_ALGO ("sha1") is ignored.
    
    This changes the network protocol used by `list_content` from a
    concatenation of fixed-length hashes to a stream of msgpacked
    dictionaries.

commit 3491bf7eb24c7aaed2eabdef664787cb6a1aabdd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 16:56:39 2022 +0200

    Make WineryWriter.add return None
    
    Like WineryObjStorage and other ObjStorage backends.

commit 0fd9cefc33d50b1a345a3c965b7f63d7bcf58d61
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 16:18:47 2022 +0200

    Remove get_random()
    
    It is not used anywhere (and swh-storage provides content_get_random() that
    fills the same need), and cloud backends do not implement it anyway.

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/164/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/164/console

Harbormaster returned this revision to the author for changes because remote builds failed. Jun 22 2022, 5:24 PM
Harbormaster failed remote builds in B30008: Diff 28914!

Build is green

Patch application report for D8029 (id=28925)

Could not rebase; Attempt merge onto d7f4daa242...

Updating d7f4daa..91c6308
Fast-forward
 mypy.ini                                           |  3 +
 requirements.txt                                   |  1 +
 swh/objstorage/api/client.py                       | 21 ++++---
 swh/objstorage/api/server.py                       |  8 ++-
 swh/objstorage/backends/azure.py                   |  6 +-
 swh/objstorage/backends/generator.py               | 15 +++--
 swh/objstorage/backends/http.py                    |  8 +--
 swh/objstorage/backends/in_memory.py               | 12 ++--
 swh/objstorage/backends/libcloud.py                |  9 ++-
 swh/objstorage/backends/pathslicing.py             | 68 +++++++---------------
 swh/objstorage/backends/seaweedfs/objstorage.py    |  8 +--
 swh/objstorage/backends/winery/objstorage.py       |  8 +--
 swh/objstorage/interface.py                        | 46 +++++----------
 swh/objstorage/multiplexer/filter/filter.py        |  8 +--
 swh/objstorage/multiplexer/filter/id_filter.py     |  9 +--
 .../multiplexer/multiplexer_objstorage.py          | 26 ++-------
 swh/objstorage/objstorage.py                       | 25 +++++---
 swh/objstorage/tests/objstorage_testing.py         |  5 +-
 swh/objstorage/tests/test_multiplexer_filter.py    | 19 ------
 .../tests/test_objstorage_multiplexer.py           |  7 ---
 .../tests/test_objstorage_pathslicing.py           |  7 ---
 .../tests/test_objstorage_random_generator.py      |  8 ++-
 swh/objstorage/tests/test_objstorage_winery.py     | 13 ++---
 23 files changed, 136 insertions(+), 204 deletions(-)
Changes applied before test
commit 91c630852ab8204edcd068987cdaa0c23bfd9cff
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 17:18:02 2022 +0200

    Start introducing composite ObjId in the interface
    
    For now, any hash algo other than ID_HASH_ALGO ("sha1") is ignored.
    
    This changes the network protocol used by `list_content` from a
    concatenation of fixed-length hashes to a stream of msgpacked
    dictionaries.

commit e712f28684a1a17233a700dd22606b594a3b617b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 16:56:39 2022 +0200

    Make WineryWriter.add return None
    
    Like WineryObjStorage and other ObjStorage backends.

commit 2caa05e869c952a3e8c4f3418d41ea17955f5447
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 16:18:47 2022 +0200

    Remove get_random()
    
    It is not used anywhere (and swh-storage provides content_get_random() that
    fills the same need), and cloud backends do not implement it anyway.

See https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/167/ for more details.

swh/objstorage/api/server.py
98–109

this code is actually covered, but it runs in a separate process

ardumont added a subscriber: ardumont.

lgtm, couple of remarks inline.

swh/objstorage/api/client.py
54–55
swh/objstorage/interface.py
77

no need, it's already typed in the signature.

swh/objstorage/objstorage.py
19

Please add a docstring so one can immediately see why we use a specific implementation over the default swh.model.hashutil.bytes_to_hex.

This revision is now accepted and ready to land. Jun 24 2022, 11:36 AM
vlorentz marked 3 inline comments as done.

apply comments

swh/objstorage/interface.py
77

it's also incorrect; looks like I accidentally reverted a previous diff.

Build is green

Patch application report for D8029 (id=28944)

Rebasing onto e712f28684...

Current branch diff-target is up to date.
Changes applied before test
commit d5974f38ea1c5f56f83c98ae5fd92e0c869e9e3f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 17:18:02 2022 +0200

    Start introducing composite ObjId in the interface
    
    For now, any hash algo other than ID_HASH_ALGO ("sha1") is ignored.
    
    This changes the network protocol used by `list_content` from a
    concatenation of fixed-length hashes to a stream of msgpacked
    dictionaries.

See https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/168/ for more details.

douardda added inline comments.
swh/objstorage/backends/seaweedfs/objstorage.py
129

Shouldn't this be changed to objid_to_default_hex?

seaweed: replace with objid_to_default_hex

Build is green

Patch application report for D8029 (id=29111)

Rebasing onto e712f28684...

Current branch diff-target is up to date.
Changes applied before test
commit 667cb87b9367864523092c3e3f78ae50d4c24d71
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Jun 22 17:18:02 2022 +0200

    Start introducing composite ObjId in the interface
    
    For now, any hash algo other than ID_HASH_ALGO ("sha1") is ignored.
    
    This changes the network protocol used by `list_content` from a
    concatenation of fixed-length hashes to a stream of msgpacked
    dictionaries.

See https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/169/ for more details.