Page MenuHomeSoftware Heritage

cassandra.storage: Use next token for pagination instead of computing it
ClosedPublic

Authored by ardumont on Aug 6 2020, 1:10 PM.

Diff Detail

Repository
rDSTO Storage manager
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 14387
Build 22117: Phabricator diff pipeline on jenkinsJenkins console · Jenkins
Build 22116: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D3723 (id=13118)

Rebasing onto be9e958d6c...

Current branch diff-target is up to date.
Changes applied before test
commit c3fb51611ca3336f528063a7e85b9ded12a2936e
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Aug 6 12:55:23 2020 +0200

    cassandra.storage: Use the next token for pagination
    
    Related to T2518

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/702/ for more details.

vlorentz added a subscriber: vlorentz.

Please reword the commit message to include the text of the task

This revision is now accepted and ready to land.Aug 6 2020, 2:30 PM
vlorentz requested changes to this revision.Aug 6 2020, 2:32 PM

oh actually, please add a test. There's already a collision test for cassandra using mocking somewhere, you can copy it

This revision now requires changes to proceed.Aug 6 2020, 2:32 PM
  • cassandra.storage: Use next token for pagination instead of computing it
  • wip: add tests for a collision
  • rework commit message

Build is green

Patch application report for D3723 (id=13127)

Rebasing onto be9e958d6c...

Current branch diff-target is up to date.
Changes applied before test
commit 6ca28641dae614253bfa874c243621c2aa3278b6
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Aug 6 16:42:02 2020 +0200

    wip: add tests for a collision

commit 6909d507f9e0966346ff9c4c807a1478d419f19f
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Aug 6 12:55:23 2020 +0200

    cassandra.storage: Use next token for pagination instead of computing it
    
    The existing implementation computed the next token using the tok+1. It's not
    good enough as it would skip some contents in case of collision on
    tok (collisions exist as the tok is a noncryptographic hash on 32 bits).
    
    Related to T2518

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/703/ for more details.

swh/storage/tests/test_cassandra.py
354

@vlorentz is that good enough?

366

jsyk, here the partition is not really relevant as we always return everything in the mock functions...
(I kept the 0, 1 voluntarily so be in accordance with the code nonetheless)

vlorentz added inline comments.
swh/storage/tests/test_cassandra.py
340

rename the variable "rows", as tokens are only the keys, not the values.

This revision is now accepted and ready to land.Aug 6 2020, 6:41 PM
  • Adapt according to remark
  • squash commits
  • improve a bit commit wording

Build has FAILED

Patch application report for D3723 (id=13128)

Rebasing onto be9e958d6c...

Current branch diff-target is up to date.
Changes applied before test
commit 653b1f9ada2fb9c7b2189f41fa277fc55c7eab0c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Aug 6 12:55:23 2020 +0200

    cassandra.storage: Use next token for pagination instead of computing it
    
    The existing implementation computed the next token using the tok (adding 1).
    It's not good enough as it would skip some contents in case of collision on
    tok (collisions exist as the tok here is a noncryptographic hash on 32 bits).
    
    Related to T2518

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/704/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/704/console

ardumont retitled this revision from cassandra.storage: Use next token for pagination to cassandra.storage: Use next token for pagination instead of computing it.Aug 6 2020, 7:05 PM

Build has FAILED

Patch application report for D3723 (id=13128)

Rebasing onto be9e958d6c...

Current branch diff-target is up to date.
Changes applied before test
commit 653b1f9ada2fb9c7b2189f41fa277fc55c7eab0c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Aug 6 12:55:23 2020 +0200

    cassandra.storage: Use next token for pagination instead of computing it
    
    The existing implementation computed the next token using the tok (adding 1).
    It's not good enough as it would skip some contents in case of collision on
    tok (collisions exist as the tok here is a noncryptographic hash on 32 bits).
    
    Related to T2518

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/705/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/705/console

Build has FAILED

Build is now broken due to latest model (even master storage build is broken)
D3727 is one probable fix (align model objects)

Build is green

Patch application report for D3723 (id=13128)

Rebasing onto be9e958d6c...

Current branch diff-target is up to date.
Changes applied before test
commit 653b1f9ada2fb9c7b2189f41fa277fc55c7eab0c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Aug 6 12:55:23 2020 +0200

    cassandra.storage: Use next token for pagination instead of computing it
    
    The existing implementation computed the next token using the tok (adding 1).
    It's not good enough as it would skip some contents in case of collision on
    tok (collisions exist as the tok here is a noncryptographic hash on 32 bits).
    
    Related to T2518

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/706/ for more details.