
cassandra: add a TABLE class attribute to row classes, and use it to deduplicate prepared statements logic.
Closed, Public

Authored by vlorentz on Aug 11 2020, 11:38 AM.

Details

Summary

It will also be used in a future commit to generate 'select' prepared statements.

Depends on D3759.
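
As a rough illustration of the idea in the summary, the sketch below shows how a TABLE class attribute on each row class can drive a single generic routine that builds and caches 'insert' prepared statements. The row classes, their columns, `insert_query`, and `StatementCache` are hypothetical names invented for this example and are not taken from the diff; the `prepare` callable is assumed to stand in for the Cassandra driver's `Session.prepare`.

```python
import dataclasses
from typing import Any, Callable, ClassVar, Dict, List, Type


class BaseRow:
    """Hypothetical base class: each subclass names its table in TABLE."""

    TABLE: ClassVar[str]

    @classmethod
    def cols(cls) -> List[str]:
        # ClassVar attributes such as TABLE are not dataclass fields,
        # so only the real columns are listed here.
        return [f.name for f in dataclasses.fields(cls)]


@dataclasses.dataclass
class ContentRow(BaseRow):  # illustrative columns only
    TABLE: ClassVar[str] = "content"
    sha1: bytes
    length: int


@dataclasses.dataclass
class OriginRow(BaseRow):  # illustrative columns only
    TABLE: ClassVar[str] = "origin"
    sha1: bytes
    url: str


def insert_query(row_class: Type[BaseRow]) -> str:
    """Build the CQL text of an insert statement from TABLE and the columns."""
    cols = row_class.cols()
    placeholders = ", ".join("?" for _ in cols)
    return (
        f"INSERT INTO {row_class.TABLE} ({', '.join(cols)}) "
        f"VALUES ({placeholders})"
    )


class StatementCache:
    """Prepare each insert statement at most once per row class."""

    def __init__(self, prepare: Callable[[str], Any]) -> None:
        # Assumption: `prepare` behaves like the driver's Session.prepare.
        self._prepare = prepare
        self._insert_statements: Dict[Type[BaseRow], Any] = {}

    def insert_statement(self, row_class: Type[BaseRow]) -> Any:
        if row_class not in self._insert_statements:
            query = insert_query(row_class)
            self._insert_statements[row_class] = self._prepare(query)
        return self._insert_statements[row_class]
```

With this shape, adding a new table only requires declaring a new row class. A 'select' statement generator could presumably reuse TABLE and `cols()` in the same way.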

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D3760 (id=13233)

Could not rebase; Attempt merge onto 7d332f5967...

Updating 7d332f59..0b78b390
Fast-forward
 requirements.txt                    |   1 +
 swh/storage/cassandra/common.py     |   8 +-
 swh/storage/cassandra/converters.py |  37 ++-
 swh/storage/cassandra/cql.py        | 570 +++++++++++++++++-------------------
 swh/storage/cassandra/model.py      | 228 +++++++++++++++
 swh/storage/cassandra/schema.py     |   1 -
 swh/storage/cassandra/storage.py    | 229 ++++++++-------
 swh/storage/in_memory.py            |   4 +-
 swh/storage/interface.py            |   4 +-
 swh/storage/storage.py              |   2 +-
 swh/storage/tests/test_cassandra.py |  30 +-
 11 files changed, 681 insertions(+), 433 deletions(-)
 create mode 100644 swh/storage/cassandra/model.py
Changes applied before test
commit 0b78b3908248ade3ea7c1dabb14613d305448693
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Tue Aug 11 11:37:13 2020 +0200

    cassandra: add a TABLE class attribute to row classes, and use it to deduplicate prepared statements logic.
    
    It will also be used in a future commit to generate 'select' prepared statements.

commit 92e7a21e8b7360e4186c29e475f013c933bee8aa
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:46:27 2020 +0200

    cassandra: Add annotations to make mypy actually type-check calls to CqlRunner.
    
    All methods of CqlRunner were decorated, which prevented mypy from doing
    anything useful.
    
    Now that I have found a way to type the decorator (using
    mypy_extensions.NamedArg), I can make mypy aware of the methods' types.
    
    This commit (as well as the three preceding commits) also fixes issues
    that mypy found as a result.

commit b11b890894e9112d403a3fe372cfd639d59b6953
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:42:59 2020 +0200

    cassandra.storage: remove dead code

commit f954714d95fa3e2124fbeddd3e81ad09e18ca313
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:36:41 2020 +0200

    Fix type of snapshot_count_branches.

commit 319de05d5fbebbebb47532209490a2f8380f5343
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:33:00 2020 +0200

    cassandra.cql: Use static dataclasses instead of generating namedtuples on the fly.
    
    Before this commit, python-cassandra used the default row factory,
    which creates anonymous named tuples on each query, making it
    impossible to type CqlRunner properly.
    
    This commit replaces the row factory with dict_factory, which creates
    only dicts, and converts them to well-defined dataclasses.
    Additionally, this stops leaking python-cassandra internals to
    cassandra.storage.
    
    This also has some great side-effects:
    
    * methods of CqlRunner are now consistent with each other (e.g. _add_one
      methods used to take a mix of objects, dictionaries, and individual
      values as arguments)
    * it will allow me to deduplicate more code in further commits (I
      already deduplicated insertion methods to use self._add_one, as
      was intended when this class was first written)
    * CqlRunner no longer needs to define lists with column names, they are
      automatically detected from the dataclasses
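
The conversion described in the last commit message above can be sketched roughly as follows: the driver is assumed to return plain dicts (as it does when `cassandra.query.dict_factory` is set as the session's row factory), and each dict is turned into a statically defined dataclass whose fields also give the column names. `Row`, `SnapshotBranchRow`, its columns, and `rows_to_dataclasses` are hypothetical names for illustration, not the code from this diff.

```python
import dataclasses
from typing import Any, Dict, Iterable, List, Type, TypeVar

TRow = TypeVar("TRow", bound="Row")


class Row:
    """Hypothetical base class for statically defined row dataclasses."""

    @classmethod
    def from_dict(cls: Type[TRow], d: Dict[str, Any]) -> TRow:
        # Each key of the dict is expected to match a dataclass field.
        return cls(**d)

    @classmethod
    def cols(cls) -> List[str]:
        # Column names are derived from the dataclass fields, so no
        # separate hand-written list of columns is needed.
        return [f.name for f in dataclasses.fields(cls)]

    def to_dict(self) -> Dict[str, Any]:
        return dataclasses.asdict(self)


@dataclasses.dataclass
class SnapshotBranchRow(Row):  # illustrative columns only
    snapshot_id: bytes
    name: bytes
    target: bytes


def rows_to_dataclasses(
    row_class: Type[TRow], rows: Iterable[Dict[str, Any]]
) -> List[TRow]:
    """Convert dict rows (as a dict-based row factory would yield) to dataclasses."""
    return [row_class.from_dict(row) for row in rows]


# Stand-in for a query result; no Cassandra connection is involved here.
fake_rows = [{"snapshot_id": b"\x01", "name": b"HEAD", "target": b"\x02"}]
branches = rows_to_dataclasses(SnapshotBranchRow, fake_rows)
```

Because callers receive `SnapshotBranchRow` objects rather than driver-generated tuples, a type checker can verify attribute access such as `branches[0].name`.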

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/732/ for more details.

This revision is now accepted and ready to land.
Aug 11 2020, 2:22 PM