
cassandra: Add support for ScyllaDB
Closed · Public

Authored by vlorentz on May 18 2021, 3:37 PM.

Details

Summary

All features work except snapshot_count_branches, because ScyllaDB does not
support user-defined aggregates yet.
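
For context, here is a minimal sketch of what such an aggregate computes, done on the client side instead. This is not what the diff does (snapshot_count_branches is simply left unsupported on ScyllaDB); the function name and the assumption that each branch row yields one target type (or None for a dangling branch) are illustrative only.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def count_branches_by_target_type(
    target_types: Iterable[Optional[str]],
) -> Dict[Optional[str], int]:
    """Count branches per target type on the client side.

    Hypothetical helper: `target_types` stands for the target type of each
    branch row fetched from the database (None for a dangling branch).
    A server-side user-defined aggregate would return the same mapping
    without transferring every row.
    """
    return dict(Counter(target_types))


counts = count_branches_by_target_type(["revision", "revision", None, "release"])
```

The trade-off is that every branch row has to be read by the client, which is why an aggregate running on the server is preferable when the database supports it.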

Migration tests hang when run after the regular tests, but I can't
figure out why. This should not be an issue for now, as we won't run
Scylla tests on the CI.

Depends on D5748.

Test Plan

It (kind of) works on my machine©

Diff Detail

Repository
rDSTO Storage manager
Branch
scylla
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 21561
Build 33497: Phabricator diff pipeline on jenkins
Build 33496: arc lint + arc unit

Event Timeline

Build was aborted

Patch application report for D5750 (id=20550)

Could not rebase; Attempt merge onto 53c21d4c7b...

Updating 53c21d4c..8794ec5b
Fast-forward
 swh/storage/cassandra/cql.py        | 22 ++++++++--
 swh/storage/cassandra/schema.py     | 80 +++++++++++++++++++++---------------
 swh/storage/tests/storage_tests.py  |  4 +-
 swh/storage/tests/test_cassandra.py | 82 +++++++++++++++++++++++++++----------
 4 files changed, 129 insertions(+), 59 deletions(-)
Changes applied before test
commit 8794ec5bc364db79ce9b5e43ebda6688c5d2657e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:35:41 2021 +0200

    cassandra: Add support for ScyllaDB
    
    All features work but snapshot_count_branches, because ScyllaDB does not
    support user-defined aggregates yet.

commit eb31e6244236fc29aeb31234af25e7b5a17ac5fd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:34:17 2021 +0200

    tests: Make test parameters order deterministic, so they don't crash pytest-xdist
    
    pytest-xdist expects the parameters to be in the same order
    in all processes.

commit 916ab07cbf18a04867857eecbad05ea1052b46b2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:33:19 2021 +0200

    test_cassandra: Improve error when the process is started but not listening

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1330/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1330/console
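
Note on the "Make test parameters order deterministic" commit listed in this report: pytest-xdist requires every worker process to collect the same tests in the same order. The sketch below shows one common way to get that when parameters come from a set of strings, whose iteration order can differ between processes because of hash randomization. The method names and the helper are hypothetical, not taken from the diff.

```python
from typing import Iterable, List


def deterministic_params(names: Iterable[str]) -> List[str]:
    """Return the parameters in a stable order.

    Iterating over a set of strings can give a different order in each
    Python process, so sort the values before handing them to
    pytest.mark.parametrize.
    """
    return sorted(names)


# Hypothetical set of method names used as test parameters.
METHOD_NAMES = {"snapshot_add", "content_add", "revision_add"}
PARAMS = deterministic_params(METHOD_NAMES)

# Usage in a test module (requires pytest):
#
#     @pytest.mark.parametrize("method_name", PARAMS)
#     def test_method(method_name): ...
```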

Harbormaster returned this revision to the author for changes because remote builds failed. (May 18 2021, 4:25 PM)
Harbormaster failed remote builds in B21527: Diff 20550!

Build has FAILED

Patch application report for D5750 (id=20550)

Could not rebase; Attempt merge onto 0ed4a975d1...

Merge made by the 'recursive' strategy.
 swh/storage/cassandra/cql.py        | 22 ++++++++--
 swh/storage/cassandra/schema.py     | 80 +++++++++++++++++++++---------------
 swh/storage/tests/storage_tests.py  |  4 +-
 swh/storage/tests/test_cassandra.py | 82 +++++++++++++++++++++++++++----------
 4 files changed, 129 insertions(+), 59 deletions(-)
Changes applied before test
commit 1f199582d73597fb7e3900256bf389d142e5a1d0
Merge: 0ed4a975 8794ec5b
Author: Jenkins user <jenkins@localhost>
Date:   Wed May 19 06:51:36 2021 +0000

    Merge branch 'diff-target' into HEAD

commit 8794ec5bc364db79ce9b5e43ebda6688c5d2657e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:35:41 2021 +0200

    cassandra: Add support for ScyllaDB
    
    All features work but snapshot_count_branches, because ScyllaDB does not
    support user-defined aggregates yet.

commit eb31e6244236fc29aeb31234af25e7b5a17ac5fd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:34:17 2021 +0200

    tests: Make test parameters order deterministic, so they don't crash pytest-xdist
    
    pytest-xdist expects the parameters to be in the same order
    in all processes.

commit 916ab07cbf18a04867857eecbad05ea1052b46b2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:33:19 2021 +0200

    test_cassandra: Improve error when the process is started but not listening

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1331/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1331/console
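
Note on the "Improve error when the process is started but not listening" commit listed in this report: a test fixture that spawns a database process usually has to wait until the port accepts connections, and fail with an explicit message otherwise. The following is a minimal sketch of that idea using only the standard library; the function name, defaults, and error wording are assumptions, not the code from test_cassandra.py.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 5.0, interval: float = 0.1) -> None:
    """Block until something accepts TCP connections on (host, port).

    Raises TimeoutError with an explicit message if nothing is listening
    before `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection to probe the port.
            with socket.create_connection((host, port), timeout=interval):
                return
        except OSError as exc:
            last_error = exc
            time.sleep(interval)
    raise TimeoutError(
        f"process started, but nothing is listening on {host}:{port} "
        f"after {timeout}s (last error: {last_error})"
    )
```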

Build is green

Patch application report for D5750 (id=20558)

Could not rebase; Attempt merge onto 0ed4a975d1...

Merge made by the 'recursive' strategy.
 swh/storage/cassandra/cql.py        | 22 ++++++++--
 swh/storage/cassandra/schema.py     | 79 +++++++++++++++++++--------------
 swh/storage/tests/storage_tests.py  |  4 +-
 swh/storage/tests/test_cassandra.py | 87 +++++++++++++++++++++++++++----------
 4 files changed, 132 insertions(+), 60 deletions(-)
Changes applied before test
commit a7f4f845df3c36c60b3c4e2803bd8db064a50c4b
Merge: 0ed4a975 fe8bbea5
Author: Jenkins user <jenkins@localhost>
Date:   Wed May 19 07:31:40 2021 +0000

    Merge branch 'diff-target' into HEAD

commit fe8bbea582886b986b7a43bd1f9f5367be840b5c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:35:41 2021 +0200

    cassandra: Add partial support for ScyllaDB
    
    All features work but snapshot_count_branches, because ScyllaDB does not
    support user-defined aggregates yet.
    
    Migration tests hang when run after the regular tests, but I can't
    figure out why. This should not be an issue for now, as we won't run
    Scylla tests on the CI.

commit eb31e6244236fc29aeb31234af25e7b5a17ac5fd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:34:17 2021 +0200

    tests: Make test parameters order deterministic, so they don't crash pytest-xdist
    
    pytest-xdist expects the parameters to be in the same order
    in all processes.

commit 916ab07cbf18a04867857eecbad05ea1052b46b2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:33:19 2021 +0200

    test_cassandra: Improve error when the process is started but not listening

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1333/ for more details.

douardda added a subscriber: douardda.

Please also add some documentation somewhere on this alternative "cassandra" backend configuration (i.e. "how to use ScyllaDB" or something similar).

swh/storage/cassandra/schema.py
5

It's not clear to me: who is responsible for setting this variable? Is there a big "feature switch" somewhere else that handles this?

This revision now requires changes to proceed. (May 19 2021, 10:51 AM)
swh/storage/cassandra/schema.py
5

No, for now you need to manually change the code to use the schema.

Build is green

Patch application report for D5750 (id=20563)

Rebasing onto a92a96846b...

Current branch diff-target is up to date.
Changes applied before test
commit a9d6f7e004a0189ccbe4bd7b9df61ce31ac084c5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:35:41 2021 +0200

    cassandra: Add partial support for ScyllaDB
    
    All features work but snapshot_count_branches, because ScyllaDB does not
    support user-defined aggregates yet.
    
    Migration tests hang when run after the regular tests, but I can't
    figure out why. This should not be an issue for now, as we won't run
    Scylla tests on the CI.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1337/ for more details.

Document it and add an environment variable switch.
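
A minimal sketch of what an environment variable switch for the schema could look like. The variable name, the accepted values, and the placeholder query lists are assumptions made for illustration; the README added in this revision is the reference for the actual name and behaviour.

```python
import os
from typing import List, Mapping, Optional

# Assumed variable name, for illustration only.
SCYLLADB_ENVVAR = "SWH_USE_SCYLLADB"

# Placeholders standing in for the real CQL statements.
BASE_QUERIES = ["<table definitions shared by both backends>"]
CASSANDRA_ONLY_QUERIES = ["<user-defined aggregate definitions>"]


def use_scylladb(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the switch is set to a truthy value."""
    env = os.environ if environ is None else environ
    return env.get(SCYLLADB_ENVVAR, "").strip().lower() in {"1", "true", "yes", "on"}


def select_schema_queries(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Skip the statements ScyllaDB cannot run when the switch is on."""
    if use_scylladb(environ):
        return list(BASE_QUERIES)
    return BASE_QUERIES + CASSANDRA_ONLY_QUERIES
```

Passing the environment as an argument keeps the helper easy to test without modifying os.environ.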

Build is green

Patch application report for D5750 (id=20596)

Rebasing onto a92a96846b...

Current branch diff-target is up to date.
Changes applied before test
commit 73588c64de7137c1a04733010e0fe438adc3f2c8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:35:41 2021 +0200

    cassandra: Add partial support for ScyllaDB
    
    All features work but snapshot_count_branches, because ScyllaDB does not
    support user-defined aggregates yet.
    
    Migration tests hang when run after the regular tests, but I can't
    figure out why. This should not be an issue for now, as we won't run
    Scylla tests on the CI.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1338/ for more details.

Consider the minor fix in the README file; otherwise OK, thanks!

README.md
223

maybe prefer 'environment variable' here (I find 'configuration variable' confusing, looks like something one has to put in the swh config file)

This revision is now accepted and ready to land. (May 21 2021, 10:27 AM)
README.md
223

yes indeed

Build is green

Patch application report for D5750 (id=20614)

Rebasing onto 8e3731ace4...

First, rewinding head to replay your work on top of it...
Applying: cassandra: Add partial support for ScyllaDB
Changes applied before test
commit 041fe7b22cd115ca90544389372cc77f844889f2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue May 18 15:35:41 2021 +0200

    cassandra: Add partial support for ScyllaDB
    
    All features work but snapshot_count_branches, because ScyllaDB does not
    support user-defined aggregates yet.
    
    Migration tests hang when run after the regular tests, but I can't
    figure out why. This should not be an issue for now, as we won't run
    Scylla tests on the CI.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1340/ for more details.

This revision was landed with ongoing or failed builds. (May 21 2021, 12:15 PM)
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D5750 (id=20615)

Rebasing onto 1d880a526c...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-1341-D5750.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1341/ for more details.