Page MenuHomeSoftware Heritage

cassandra: Add annotations to make mypy actually type-check calls to CqlRunner.
ClosedPublic

Authored by vlorentz on Aug 10 2020, 9:46 PM.

Details

Summary

All methods of CqlRunner were decorated, which prevented mypy from doing
anything useful.

As I finally found a way to type the decorator (using
mypy_extensions.NamedArg), I can finally make mypy aware of the methods' types.

This commit (as well as all three of the last commits) also fixes issues found
by mypy thanks to this.

Depends on D3758.

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build has FAILED

Patch application report for D3759 (id=13223)

Could not rebase; Attempt merge onto 7d332f5967...

Updating 7d332f59..9cc7875e
Fast-forward
 requirements.txt                    |   1 +
 swh/storage/cassandra/common.py     |   8 +-
 swh/storage/cassandra/converters.py |  37 ++-
 swh/storage/cassandra/cql.py        | 561 +++++++++++++++++-------------------
 swh/storage/cassandra/model.py      | 196 +++++++++++++
 swh/storage/cassandra/schema.py     |   1 -
 swh/storage/cassandra/storage.py    | 223 +++++++-------
 swh/storage/in_memory.py            |   4 +-
 swh/storage/interface.py            |   4 +-
 swh/storage/storage.py              |   2 +-
 swh/storage/tests/test_cassandra.py |  30 +-
 11 files changed, 640 insertions(+), 427 deletions(-)
 create mode 100644 swh/storage/cassandra/model.py
Changes applied before test
commit 9cc7875e0ab8be5f4c4ebdc3ea7d61bc93182938
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:46:27 2020 +0200

    cassandra: Add annotations to make mypy actually type-check calls to CqlRunner.
    
    All methods of CqlRunner were decorated, which prevented mypy from doing
    anything useful.
    
    As I finally found a way to type the decorator (using
    mypy_extensions.NamedArg), I can finally make mypy aware of the methods' types.
    
    This commit (as well as all three of the last commits) also fixes issues found
    by mypy thanks to this.

commit b967ca48ce6f97e9315bc25d5b8a48fb0bf59340
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:42:59 2020 +0200

    cassandra.storage: remove dead code

commit 564a4d2590ae7f468f2b0b7377d0b2d91eeaac56
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:36:41 2020 +0200

    Fix type of snapshot_count_branches.

commit 3d3ac1605129097740c24ccc48a8519a1b9b78a3
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:33:00 2020 +0200

    cassandra.cql: Use static dataclasses instead of generating namedtuples on the fly.
    
    Before this commit, python-cassandra used the default row factory,
    which creates anonymous named tuple on each query, which makes it
    impossible to type CqlRunner properly.
    
    This commit replaces the row factory with dict_factory, which creates
    only dicts, and converts them to well-defined dataclasses.
    Additionally, this stop leaking python-cassandra internals to
    cassandra.storage.
    
    This also has some great side-effects:
    
    * methods of CqlRunner are now consistent with each other (eg. _add_one
      methods used to be a mix of objects, dictionaries, and taking each value
      as argument)
    * it will allow me to deduplicate more codes in further commits (I
      already deduplicated insertions methods to use self._add_one, as
      it was meant on the initial write of this class)
    * CqlRunner no longer needs to define lists with column names, they are
      automatically detected from the dataclasses

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/723/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/723/console

disable false positive of flake8 3.8

also add type to _prepared_exists_statement.

Build has FAILED

Patch application report for D3759 (id=13230)

Could not rebase; Attempt merge onto 7d332f5967...

Updating 7d332f59..c45d27c8
Fast-forward
 requirements.txt                    |   1 +
 swh/storage/cassandra/common.py     |   8 +-
 swh/storage/cassandra/converters.py |  37 ++-
 swh/storage/cassandra/cql.py        | 561 +++++++++++++++++-------------------
 swh/storage/cassandra/model.py      | 196 +++++++++++++
 swh/storage/cassandra/schema.py     |   1 -
 swh/storage/cassandra/storage.py    | 223 +++++++-------
 swh/storage/in_memory.py            |   4 +-
 swh/storage/interface.py            |   4 +-
 swh/storage/storage.py              |   2 +-
 swh/storage/tests/test_cassandra.py |  30 +-
 11 files changed, 640 insertions(+), 427 deletions(-)
 create mode 100644 swh/storage/cassandra/model.py
Changes applied before test
commit c45d27c82b11567fb765ee202411363bd81a3638
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:46:27 2020 +0200

    cassandra: Add annotations to make mypy actually type-check calls to CqlRunner.
    
    All methods of CqlRunner were decorated, which prevented mypy from doing
    anything useful.
    
    As I finally found a way to type the decorator (using
    mypy_extensions.NamedArg), I can finally make mypy aware of the methods' types.
    
    This commit (as well as all three of the last commits) also fixes issues found
    by mypy thanks to this.

commit b11b890894e9112d403a3fe372cfd639d59b6953
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:42:59 2020 +0200

    cassandra.storage: remove dead code

commit f954714d95fa3e2124fbeddd3e81ad09e18ca313
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:36:41 2020 +0200

    Fix type of snapshot_count_branches.

commit 319de05d5fbebbebb47532209490a2f8380f5343
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:33:00 2020 +0200

    cassandra.cql: Use static dataclasses instead of generating namedtuples on the fly.
    
    Before this commit, python-cassandra used the default row factory,
    which creates anonymous named tuple on each query, which makes it
    impossible to type CqlRunner properly.
    
    This commit replaces the row factory with dict_factory, which creates
    only dicts, and converts them to well-defined dataclasses.
    Additionally, this stop leaking python-cassandra internals to
    cassandra.storage.
    
    This also has some great side-effects:
    
    * methods of CqlRunner are now consistent with each other (eg. _add_one
      methods used to be a mix of objects, dictionaries, and taking each value
      as argument)
    * it will allow me to deduplicate more codes in further commits (I
      already deduplicated insertions methods to use self._add_one, as
      it was meant on the initial write of this class)
    * CqlRunner no longer needs to define lists with column names, they are
      automatically detected from the dataclasses

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/729/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/729/console

Build is green

Patch application report for D3759 (id=13231)

Could not rebase; Attempt merge onto 7d332f5967...

Updating 7d332f59..124e2988
Fast-forward
 requirements.txt                    |   1 +
 swh/storage/cassandra/common.py     |   8 +-
 swh/storage/cassandra/converters.py |  37 ++-
 swh/storage/cassandra/cql.py        | 561 +++++++++++++++++-------------------
 swh/storage/cassandra/model.py      | 196 +++++++++++++
 swh/storage/cassandra/schema.py     |   1 -
 swh/storage/cassandra/storage.py    | 223 +++++++-------
 swh/storage/in_memory.py            |   4 +-
 swh/storage/interface.py            |   4 +-
 swh/storage/storage.py              |   2 +-
 swh/storage/tests/test_cassandra.py |  30 +-
 11 files changed, 640 insertions(+), 427 deletions(-)
 create mode 100644 swh/storage/cassandra/model.py
Changes applied before test
commit 124e298827cd11c066a4ff6aa87009a3a8438306
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:46:27 2020 +0200

    cassandra: Add annotations to make mypy actually type-check calls to CqlRunner.
    
    All methods of CqlRunner were decorated, which prevented mypy from doing
    anything useful.
    
    As I finally found a way to type the decorator (using
    mypy_extensions.NamedArg), I can finally make mypy aware of the methods' types.
    
    This commit (as well as all three of the last commits) also fixes issues found
    by mypy thanks to this.

commit b11b890894e9112d403a3fe372cfd639d59b6953
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:42:59 2020 +0200

    cassandra.storage: remove dead code

commit f954714d95fa3e2124fbeddd3e81ad09e18ca313
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:36:41 2020 +0200

    Fix type of snapshot_count_branches.

commit 319de05d5fbebbebb47532209490a2f8380f5343
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:33:00 2020 +0200

    cassandra.cql: Use static dataclasses instead of generating namedtuples on the fly.
    
    Before this commit, python-cassandra used the default row factory,
    which creates anonymous named tuple on each query, which makes it
    impossible to type CqlRunner properly.
    
    This commit replaces the row factory with dict_factory, which creates
    only dicts, and converts them to well-defined dataclasses.
    Additionally, this stop leaking python-cassandra internals to
    cassandra.storage.
    
    This also has some great side-effects:
    
    * methods of CqlRunner are now consistent with each other (eg. _add_one
      methods used to be a mix of objects, dictionaries, and taking each value
      as argument)
    * it will allow me to deduplicate more codes in further commits (I
      already deduplicated insertions methods to use self._add_one, as
      it was meant on the initial write of this class)
    * CqlRunner no longer needs to define lists with column names, they are
      automatically detected from the dataclasses

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/730/ for more details.

Build is green

Patch application report for D3759 (id=13232)

Could not rebase; Attempt merge onto 7d332f5967...

Updating 7d332f59..92e7a21e
Fast-forward
 requirements.txt                    |   1 +
 swh/storage/cassandra/common.py     |   8 +-
 swh/storage/cassandra/converters.py |  37 ++-
 swh/storage/cassandra/cql.py        | 568 +++++++++++++++++-------------------
 swh/storage/cassandra/model.py      | 196 +++++++++++++
 swh/storage/cassandra/schema.py     |   1 -
 swh/storage/cassandra/storage.py    | 229 ++++++++-------
 swh/storage/in_memory.py            |   4 +-
 swh/storage/interface.py            |   4 +-
 swh/storage/storage.py              |   2 +-
 swh/storage/tests/test_cassandra.py |  30 +-
 11 files changed, 649 insertions(+), 431 deletions(-)
 create mode 100644 swh/storage/cassandra/model.py
Changes applied before test
commit 92e7a21e8b7360e4186c29e475f013c933bee8aa
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:46:27 2020 +0200

    cassandra: Add annotations to make mypy actually type-check calls to CqlRunner.
    
    All methods of CqlRunner were decorated, which prevented mypy from doing
    anything useful.
    
    As I finally found a way to type the decorator (using
    mypy_extensions.NamedArg), I can finally make mypy aware of the methods' types.
    
    This commit (as well as all three of the last commits) also fixes issues found
    by mypy thanks to this.

commit b11b890894e9112d403a3fe372cfd639d59b6953
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:42:59 2020 +0200

    cassandra.storage: remove dead code

commit f954714d95fa3e2124fbeddd3e81ad09e18ca313
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:36:41 2020 +0200

    Fix type of snapshot_count_branches.

commit 319de05d5fbebbebb47532209490a2f8380f5343
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Aug 10 21:33:00 2020 +0200

    cassandra.cql: Use static dataclasses instead of generating namedtuples on the fly.
    
    Before this commit, python-cassandra used the default row factory,
    which creates anonymous named tuple on each query, which makes it
    impossible to type CqlRunner properly.
    
    This commit replaces the row factory with dict_factory, which creates
    only dicts, and converts them to well-defined dataclasses.
    Additionally, this stop leaking python-cassandra internals to
    cassandra.storage.
    
    This also has some great side-effects:
    
    * methods of CqlRunner are now consistent with each other (eg. _add_one
      methods used to be a mix of objects, dictionaries, and taking each value
      as argument)
    * it will allow me to deduplicate more codes in further commits (I
      already deduplicated insertions methods to use self._add_one, as
      it was meant on the initial write of this class)
    * CqlRunner no longer needs to define lists with column names, they are
      automatically detected from the dataclasses

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/731/ for more details.

anlambert added inline comments.
swh/storage/cassandra/cql.py
33

mypy documentation says that the Extended Callable types feature is deprecated and callbacks protocol should be used instead.

Nevertheless, I did not find an alternative solution based on protocol so far.

anlambert added inline comments.
swh/storage/cassandra/cql.py
33

Nevermind, using Protocol is for specific function signatures and its use is too complex here.

I am accepting the diff then.

Do not forget to add the new python3-mypy-extensions dependency in debian/control.

This revision is now accepted and ready to land.Aug 11 2020, 2:20 PM