cql: Explicitly set protocol version supported by cassandra-driver
AbandonedPublic

Authored by anlambert on May 10 2021, 3:48 PM.

Details

Reviewers
None
Group Reviewers
Reviewers
Summary

Cassandra 4.0rc1 moved protocol version 5 out of beta but it is only
supported in cassandra-driver >= 3.25.0.

This is why the Debian package build of swh-storage is currently failing.

So explicitly set the maximum protocol version supported by the
driver, to avoid errors when using cassandra-driver < 3.25.0
with cassandra >= 4.0rc1.
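
A minimal sketch of the idea, not the actual patch: pick the highest non-beta protocol version the installed driver knows about and pass it explicitly when creating the cluster. The helper name `capped_cluster_kwargs` and the version tuples are assumptions for illustration; with the real driver, the values would come from `cassandra.ProtocolVersion` and the resulting dict would be passed to `cassandra.cluster.Cluster`.

```python
from typing import Any, Dict, Iterable, List


def capped_cluster_kwargs(
    hosts: List[str],
    port: int,
    supported_versions: Iterable[int],
    beta_versions: Iterable[int],
) -> Dict[str, Any]:
    """Build keyword arguments for a Cluster with an explicit protocol version.

    Hypothetical helper: the version collections are passed in so that the
    sketch runs without cassandra-driver installed.
    """
    beta = set(beta_versions)
    stable = [v for v in supported_versions if v not in beta]
    if not stable:
        raise ValueError("no stable protocol version available")
    return {
        "contact_points": hosts,
        "port": port,
        # Explicit cap, so the version is not left to auto-detection.
        "protocol_version": max(stable),
    }


# Assumed values for a driver older than 3.25.0, where v5 is still beta.
kwargs = capped_cluster_kwargs(["127.0.0.1"], 9042, (5, 4, 3, 2, 1), (5,))
print(kwargs["protocol_version"])  # prints 4
```

With cassandra-driver installed, the dict would be used as `Cluster(**kwargs)`.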

Diff Detail

Repository
rDSTO Storage manager
Branch
cassandra-protocol-version-fix
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 21419
Build 33271: Phabricator diff pipeline on jenkins
Build 33270: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D5722 (id=20442)

Rebasing onto 761709957a...

Current branch diff-target is up to date.
Changes applied before test
commit d3e5ae6aa32d5fa7e90132d56c3f95e3662caccc
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Mon May 10 15:22:41 2021 +0200

    cql: Explicitely set protocol version supported by cassandra-driver
    
    Cassandra 4.0rc1 moved protocol version 5 out of beta but it is only
    supported in cassandra-driver >= 3.25.0.
    
    So ensure to explicitely set the maximum protocol version supported
    by the driver to avoid errors when using cassandra-driver < 3.25.0
    and cassandra >= 4.0rc1.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1319/ for more details.

olasd added inline comments.
swh/storage/cassandra/cql.py
80

It's weird that the function is called "lower"_supported. Maybe the comment should be updated to reflect that. What does the magic "10" do?

swh/storage/cassandra/cql.py
80

Yes, the naming is wrong; I did not test that function before today because of that.

The current beta protocol version is 6. I put 10 as previous_version to avoid updating that code at each protocol version bump.

It's very unclear to me what is going on... The driver is supposed to find the right version on its own; and it seems weird it would need this kind of boilerplate to work...

I also don't understand why you mention protocol versions 4 and 5, but the error is:

16:08:51 E cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', {'127.0.0.1:37715': ProtocolError('This version of the driver does not support protocol version 6')})

swh/storage/cassandra/cql.py
80

Then use ProtocolVersion.MAX_SUPPORTED+1 instead of 10.

> It's very unclear to me what is going on... The driver is supposed to find the right version on its own; and it seems weird it would need this kind of boilerplate to work...
>
> I also don't understand why you mention protocol versions 4 and 5, but the error is:
>
> 16:08:51 E cassandra.cluster.NoHostAvailable: ('Unable to connect to any servers', {'127.0.0.1:37715': ProtocolError('This version of the driver does not support protocol version 6')})

I checked the value of the Cluster class's protocol_version attribute, as automatically determined by the driver when the protocol_version parameter is not set explicitly:

  • when using cassandra-driver < 3.25.0 and cassandra < 4.0rc1, it was set to 4
  • when using cassandra-driver >= 3.25.0 and cassandra==4.0rc1, it was set to 5

Starting with cassandra 4.0rc1, cassandra-driver < 3.25.0 fails to automatically set the protocol to version 4.
My guess is that there is some kind of regression in cassandra 4.0rc1 that breaks protocol
version detection by old driver versions.

swh/storage/cassandra/cql.py
80

That does not work: in that case ProtocolVersion.DSE_V2 is picked and we get the same error.

We could use that version as previous_version but DSE_V2 was introduced in cassandra-driver 3.21.0 and we use cassandra-driver 3.20.2 on buster.
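
To make the effect of the sentinel concrete, here is a small pure-Python model of the selection discussed in this thread: return the highest non-beta supported version strictly below `previous_version`. It only mirrors the behaviour described above and is not the driver's code; the version tables, including the numeric values of the DSE versions, are assumptions for illustration.

```python
from typing import Iterable

# Assumed version tables, for illustration only.
V4, V5, V6 = 4, 5, 6
DSE_V1, DSE_V2 = 0x41, 0x42  # assumed numeric values, well above 10

# Older driver: v5 still beta, no DSE versions.
OLD_DRIVER = {"supported": (V5, V4, 3, 2, 1), "beta": (V5,)}
# Newer driver: v5 stable, v6 beta, DSE versions present.
NEW_DRIVER = {"supported": (DSE_V2, DSE_V1, V6, V5, V4, 3, 2, 1), "beta": (V6,)}


def lower_supported(
    supported: Iterable[int], beta: Iterable[int], previous_version: int
) -> int:
    """Highest non-beta version strictly below previous_version, or 0."""
    beta_set = set(beta)
    candidates = [
        v for v in supported if v not in beta_set and v < previous_version
    ]
    return max(candidates, default=0)


# A sentinel of 10 sits above every native version but below the DSE ones.
print(lower_supported(**OLD_DRIVER, previous_version=10))  # prints 4
print(lower_supported(**NEW_DRIVER, previous_version=10))  # prints 5

# Using "max supported + 1" as the sentinel lets a DSE version through.
max_plus_one = max(NEW_DRIVER["supported"]) + 1
print(lower_supported(**NEW_DRIVER, previous_version=max_plus_one) == DSE_V2)  # prints True
```

Under these assumptions, a fixed sentinel such as 10 keeps the result within the native protocol versions, while a sentinel derived from the maximum supported version selects DSE_V2.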

swh/storage/cassandra/cql.py
80

Maybe the right solution would be to backport cassandra-driver 3.25.0 to unstable and buster.

For the record, I managed to build the packages locally.

swh/storage/cassandra/cql.py
80

s/backport/update/ , as cassandra-driver 3.20.0 is already from our own repos.

And updating is the best solution IMO

swh/storage/cassandra/cql.py
80

Yeah, definitely. @anlambert, can you push your changes for the cassandra-driver 3.25.0 packages - at least the unstable version? I'm happy to review and push them to the repo.

swh/storage/cassandra/cql.py
80

@olasd, I did not manage to create a diff, as the debian/unstable-swh branch was missing in our python-cassandra-driver repository, so I pushed my debian/unstable-swh branch instead. Feel free to improve its current state before building the package.

Nevertheless, a call to gbp buildpackage --git-builder=sbuild -As --no-clean-source should work.

For the record, to build the package on buster, this simple diff is enough:

diff --git a/debian/changelog b/debian/changelog
index c31fbc9..058c8e1 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+python-cassandra-driver (3.25.0-1+swh1~bpo10+1) buster-swh; urgency=medium
+
+  * Rebuild for buster-swh
+
+ -- Antoine Lambert <antoine.lambert@inria.fr>  Thu, 06 May 2021 18:56:42 +0200
+
 python-cassandra-driver (3.25.0-1+swh1) unstable-swh; urgency=medium
 
   * New upstream version 3.25.0
diff --git a/debian/control b/debian/control
index b3aa4a9..dfa0b7b 100644
--- a/debian/control
+++ b/debian/control
@@ -6,7 +6,7 @@ Uploaders:
  Emmanuel Arias <eamanu@yaerobi.com>,
 Build-Depends:
  cython3,
- debhelper-compat (= 13),
+ debhelper-compat (= 12),
  dh-python,
  libev-dev,
  python3-all-dbg,
diff --git a/debian/gbp.conf b/debian/gbp.conf
index 7f63e3e..60babf3 100644
--- a/debian/gbp.conf
+++ b/debian/gbp.conf
@@ -1,5 +1,5 @@
 [DEFAULT]
 upstream-branch=debian/upstream
 upstream-tag=debian/upstream/%(version)s
-debian-branch=debian/unstable-swh
+debian-branch=debian/buster-swh
 pristine-tar=True

Nevertheless, buster lacks the python3-pure-sasl dependency, which needs to be backported.
I managed to build that locally as well.

Thanks, @anlambert ! I've ended up pulling the cassandra-driver packaging from Debian, updating that (which was basically a gbp import-orig --pristine-tar + dch), and pushing to the repo on the forge.

I've also pulled the packaging of the new dependency from Debian, adapted it for git-buildpackage, and built that in our repo.

I've configured jenkins to get autobuilds of these packages. Finally, I've triggered a new build of swh.storage through a new tag, which worked.

So I guess this diff can be abandoned now.

cassandra-driver 3.25.0 is now in our debian repository, abandoning this.