Page MenuHomeSoftware Heritage

Add support for pathslicing in seaweedfs backend
AbandonedPublic

Authored by douardda on Oct 15 2021, 5:32 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Summary

this actually requires quite some refactoring in the seaweedfs backend,
mostly to support the (somewhat useless) list_content() method.

On the plus side, tests for this backend are now much more realistic,
mocking a seaweedfs Filer server instead of the WeedFiler directly.
On the minus side, the mock object is a bit too complex for a test
helper object...

Diff Detail

Event Timeline

Build has FAILED

Patch application report for D6492 (id=23578)

Rebasing onto 23b7f81c14...

Current branch diff-target is up to date.
Changes applied before test
commit 911c9bd227efe3a8d3bce7306565debf0528d429
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Oct 8 14:01:24 2021 +0200

    Add support for pathslicing in seaweedfs backend
    
    this actually requires quite some refactoring in the seaweedfs backend,
    mostly to support the (somewhat useless) `list_content()` method.
    
    On the plus side, tests for this backend are now much more realistic,
    mocking a seaweedfs Filer server instead of the `WeedFiler` directly.
    On the minus side, the mock object is a bit too complex for a test
    helper object...

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/36/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/36/console

Harbormaster returned this revision to the author for changes because remote builds failed.Oct 15 2021, 5:33 PM
Harbormaster failed remote builds in B24478: Diff 23578!

Build has FAILED

Patch application report for D6492 (id=23586)

Rebasing onto 23b7f81c14...

Current branch diff-target is up to date.
Changes applied before test
commit 22927a595b5de2b75e3aa7fe49bc9a5e57768a7d
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Oct 8 14:01:24 2021 +0200

    Add support for pathslicing in seaweedfs backend
    
    this actually requires quite some refactoring in the seaweedfs backend,
    mostly to support the (somewhat useless) `list_content()` method.
    
    On the plus side, tests for this backend are now much more realistic,
    mocking a seaweedfs Filer server instead of the `WeedFiler` directly.
    On the minus side, the mock object is a bit too complex for a test
    helper object...

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/37/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/37/console

Harbormaster returned this revision to the author for changes because remote builds failed.Oct 15 2021, 6:24 PM
Harbormaster failed remote builds in B24482: Diff 23586!

Build has FAILED

Patch application report for D6492 (id=23587)

Rebasing onto 23b7f81c14...

Current branch diff-target is up to date.
Changes applied before test
commit 084746eed24a9a6c22e1469d946cef8c193ce23a
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Oct 8 14:01:24 2021 +0200

    Add support for pathslicing in seaweedfs backend
    
    this actually requires quite some refactoring in the seaweedfs backend,
    mostly to support the (somewhat useless) `list_content()` method.
    
    On the plus side, tests for this backend are now much more realistic,
    mocking a seaweedfs Filer server instead of the `WeedFiler` directly.
    On the minus side, the mock object is a bit too complex for a test
    helper object...

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/38/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/38/console

Harbormaster returned this revision to the author for changes because remote builds failed.Oct 15 2021, 6:30 PM
Harbormaster failed remote builds in B24483: Diff 23587!

Build is green

Patch application report for D6492 (id=23588)

Rebasing onto 23b7f81c14...

Current branch diff-target is up to date.
Changes applied before test
commit 6e5999f077f6dc72858552fed5fc120d6643577b
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Oct 8 14:01:24 2021 +0200

    Add support for pathslicing in seaweedfs backend
    
    this actually requires quite some refactoring in the seaweedfs backend,
    mostly to support the (somewhat useless) `list_content()` method.
    
    On the plus side, tests for this backend are now much more realistic,
    mocking a seaweedfs Filer server instead of the `WeedFiler` directly.
    On the minus side, the mock object is a bit too complex for a test
    helper object...

See https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/39/ for more details.

this actually requires quite some refactoring in the seaweedfs backend,
mostly to support the (somewhat useless) list_content() method.

If it's useless (why?), that would not shock me if you just raise a NotImplemented
exception (and bypass the related check for that implementation)

swh/objstorage/backends/seaweed.py
26

unrelated to this diff but still, how one can actually use debug messages with this hard-coded in?

36–37

That's how it's done in other modules (it does nothing if that won't end with / iirc).

On the other hand, this will strip spurious "/" [7] [8] so don't know if that's what we want...

$ ipython
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.27.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: url = "blah"

In [2]: url.rstrip("/")
Out[2]: 'blah'

In [3]: url = "blah/"

In [4]: url.rstrip("/")
Out[4]: 'blah'

In [5]: url.rstrip("//")
Out[5]: 'blah'

In [6]: url.rstrip("/")
Out[6]: 'blah'

In [7]: url = "blah//"

In [8]: url.rstrip("/")
Out[8]: 'blah'
73–76

you already typed it (and it was a bit out of sync with the type for the 2nd param)

107

i'll believe you on that one...

split the diff in 2

D6517 for improvements related to the tests (and a bit the backend itself), and this one for the pathslicing stuff (may not be kept in the end).

Build has FAILED

Patch application report for D6492 (id=23683)

Could not rebase; Attempt merge onto 23b7f81c14...

Updating 23b7f81..57ee6a4
Fast-forward
 mypy.ini                                          |   3 +
 requirements-test.txt                             |   2 +
 requirements.txt                                  |   1 +
 swh/objstorage/backends/seaweed.py                | 127 ++++++++---
 swh/objstorage/tests/test_objstorage_seaweedfs.py | 251 +++++++++++++++++++---
 5 files changed, 321 insertions(+), 63 deletions(-)
Changes applied before test
commit 57ee6a4faa90f4e4204b35c159faf4d2a9b098a7
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Oct 18 17:32:18 2021 +0200

    Add support for pathslicing in seaweedfs backend
    
    this actually requires quite some refactoring in the seaweedfs backend,
    mostly to support the (somewhat useless) `list_content()` method.

commit 78b1e82b8f87fbf7d6bb034204d5210ed71824cc
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Oct 20 10:34:41 2021 +0200

    Improve a bit the seaweedfs backend

commit 55ff4b95d3062421a7be6465c6ca07f8cd4909ab
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Oct 8 14:01:24 2021 +0200

    Improve tests of the seaweedfs backend
    
    tests for this backend are now much more realistic,
    mocking a seaweedfs Filer server instead of the `WeedFiler` directly.
    
    And actually fix a bug in WeedFiler.get()...
    
    Also remove tests for each compression algo, keeping only one
    is enought.

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/41/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/41/console

swh/objstorage/backends/seaweed.py
26

good catch, will explore this later on

36–37

ah yes sure, thx

73–76

yep, let me fix this

107

it's not very well documented in seaweedfs Filer API doc, but it actually comes from go FileMode stuff: https://pkg.go.dev/io/fs#FileMode
Maybe I should add this url in a comment...

Build has FAILED

Patch application report for D6492 (id=23685)

Could not rebase; Attempt merge onto 23b7f81c14...

Updating 23b7f81..c950bf4
Fast-forward
 mypy.ini                                          |   3 +
 requirements-test.txt                             |   2 +
 requirements.txt                                  |   1 +
 swh/objstorage/backends/seaweed.py                | 128 ++++++++---
 swh/objstorage/tests/test_objstorage_seaweedfs.py | 251 +++++++++++++++++++---
 5 files changed, 322 insertions(+), 63 deletions(-)
Changes applied before test
commit c950bf461d8636fce8f16c87c672608049cc4514
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Oct 18 17:32:18 2021 +0200

    Add support for pathslicing in seaweedfs backend
    
    this actually requires quite some refactoring in the seaweedfs backend,
    mostly to support the (somewhat useless) `list_content()` method.

commit be26482e93e15a2a699f9f35ce63286ccbf737b5
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Oct 20 10:34:41 2021 +0200

    Improve a bit the seaweedfs backend
    
    - use a persistent session object in WeedFiler,
    - better implementation of listing methods

commit 55ff4b95d3062421a7be6465c6ca07f8cd4909ab
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Oct 8 14:01:24 2021 +0200

    Improve tests of the seaweedfs backend
    
    tests for this backend are now much more realistic,
    mocking a seaweedfs Filer server instead of the `WeedFiler` directly.
    
    And actually fix a bug in WeedFiler.get()...
    
    Also remove tests for each compression algo, keeping only one
    is enought.

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/43/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/43/console

Build has FAILED

Patch application report for D6492 (id=23692)

Could not rebase; Attempt merge onto 23b7f81c14...

Updating 23b7f81..88702bc
Fast-forward
 mypy.ini                                          |   3 +
 requirements-test.txt                             |   2 +
 requirements.txt                                  |   1 +
 swh/objstorage/backends/seaweed.py                | 129 ++++++++---
 swh/objstorage/tests/test_objstorage_seaweedfs.py | 251 +++++++++++++++++++---
 5 files changed, 323 insertions(+), 63 deletions(-)
Changes applied before test
commit 88702bc4115e68d128656f52b855abf76c8a5f3b
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Oct 18 17:32:18 2021 +0200

    Add support for pathslicing in seaweedfs backend
    
    this actually requires quite some refactoring in the seaweedfs backend,
    mostly to support the (somewhat useless) `list_content()` method.

commit 96eadebd2d05f06bc87fb5461892e80fa98ebefe
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Oct 20 10:34:41 2021 +0200

    Improve a bit the seaweedfs backend
    
    - use a persistent session object in WeedFiler,
    - better implementation of listing methods

commit 55ff4b95d3062421a7be6465c6ca07f8cd4909ab
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Oct 8 14:01:24 2021 +0200

    Improve tests of the seaweedfs backend
    
    tests for this backend are now much more realistic,
    mocking a seaweedfs Filer server instead of the `WeedFiler` directly.
    
    And actually fix a bug in WeedFiler.get()...
    
    Also remove tests for each compression algo, keeping only one
    is enought.

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/49/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/49/console

Build has FAILED

Patch application report for D6492 (id=23694)

Could not rebase; Attempt merge onto 23b7f81c14...

Updating 23b7f81..c616aa6
Fast-forward
 mypy.ini                                          |   3 +
 requirements-test.txt                             |   2 +
 requirements.txt                                  |   1 +
 swh/objstorage/backends/seaweed.py                | 127 ++++++++---
 swh/objstorage/tests/test_objstorage_seaweedfs.py | 251 +++++++++++++++++++---
 5 files changed, 321 insertions(+), 63 deletions(-)
Changes applied before test
commit c616aa6321660c273c618af4056d4e34b1df5f25
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Oct 18 17:32:18 2021 +0200

    Add support for pathslicing in seaweedfs backend
    
    this actually requires quite some refactoring in the seaweedfs backend,
    mostly to support the (somewhat useless) `list_content()` method.

commit 6269067ca7b3599ecc7cc2436ee72a4f284fdd85
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Oct 20 10:34:41 2021 +0200

    Improve a bit the seaweedfs backend
    
    - use a persistent session object in WeedFiler,
    - better implementation of listing methods

commit 55ff4b95d3062421a7be6465c6ca07f8cd4909ab
Author: David Douard <david.douard@sdfa3.org>
Date:   Fri Oct 8 14:01:24 2021 +0200

    Improve tests of the seaweedfs backend
    
    tests for this backend are now much more realistic,
    mocking a seaweedfs Filer server instead of the `WeedFiler` directly.
    
    And actually fix a bug in WeedFiler.get()...
    
    Also remove tests for each compression algo, keeping only one
    is enought.

Link to build: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/51/
See console output for more information: https://jenkins.softwareheritage.org/job/DOBJS/job/tests-on-diff/51/console

and this one for the pathslicing stuff (may not be kept in the end).

Do you keep the diff in the end?

probably won't be used