cassandra.cql: reorder origin_visit_* and origin_visit_status_* methods to be properly grouped.
Closed, Public

Authored by vlorentz on Aug 12 2020, 6:02 PM.

Diff Detail

Repository
rDSTO Storage manager
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 14550
Build 22398: Phabricator diff pipeline on jenkins
Build 22397: arc lint + arc unit

Event Timeline

Build was aborted

Patch application report for D3779 (id=13264)

Could not rebase; Attempt merge onto 6675286c4f...

Updating 6675286c..35a316e8
Fast-forward
 swh/storage/cassandra/cql.py        |  131 ++---
 swh/storage/cassandra/model.py      |   35 +-
 swh/storage/cassandra/storage.py    |   16 +-
 swh/storage/db.py                   |   17 -
 swh/storage/in_memory.py            | 1049 +++++++++++++----------------------
 swh/storage/interface.py            |   31 --
 swh/storage/storage.py              |   12 -
 swh/storage/tests/test_in_memory.py |  106 +++-
 swh/storage/tests/test_storage.py   |  125 -----
 swh/storage/writer.py               |    4 +-
 10 files changed, 590 insertions(+), 936 deletions(-)
Changes applied before test
commit 35a316e8cdbf6a4d7ce5e751f44e72dfb1ecf3c5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 17:24:58 2020 +0200

    cassandra.cql: reorder origin_visit_* and origin_visit_status_* methods to be properly grouped.

commit d361a24ee14cfc8e3418362ee73bf6dccb360b01
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 17:23:55 2020 +0200

    Remove unused arguments of CqlRunner.origin_visit_status_get.

commit 861018cbdc5ffb105e96b0131e05ecb79892a0f8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 16:45:17 2020 +0200

    in_memory: Remove InMemoryStorage.origin_* and implement InMemoryCqlRunner.origin_*

commit 56bbeebfcef9979fb9a680c20e7bfbf0031cb71b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 16:21:01 2020 +0200

    in_memory: Remove InMemoryStorage.snapshot_* and implement InMemoryCqlRunner.snapshot_*

commit 117d7edd2d23c408d1cfa8a6324e868e7e54b0c7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:38:14 2020 +0200

    Remove endpoint snapshot_get_by_origin_visit.
    
    It's not used anywhere.

commit 9980e80a9f3ab1e41b2c3579f4fa1b0e915cc83c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:27:15 2020 +0200

    in_memory: Remove InMemoryStorage.release_* and implement InMemoryCqlRunner.release_*

commit f3de9492b2c3cf87cb710817dc98421edb3ab077
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:23:40 2020 +0200

    in_memory: Remove InMemoryStorage.revision_* and implement InMemoryCqlRunner.revision_*

commit 0850cb85b5af4286c0fdc916ffaa5fb78075ec51
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 14:40:20 2020 +0200

    in_memory: Remove InMemoryStorage.directory_* and implement InMemoryCqlRunner.directory_*

commit 43e93f6bbb256f79510ab95d6929e539a2e5ff98
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 14:01:22 2020 +0200

    in_memory: Remove InMemoryStorage.skipped_content_* and implement InMemoryCqlRunner.skipped_content_*

commit 7f80ba884d1de90bc97bb3124d17707b3fc737a5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:54:32 2020 +0200

    in_memory: Remove InMemoryStorage.content_* and implement InMemoryCqlRunner.content_*

commit 5f37ca48fe645bcb5512a374baf2fb170621cd0e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:44:50 2020 +0200

    in_memory: make object_find_by_sha1_git merge results from the CassandraStorage.
    
    For now this has no effect. However, in the near future, the CassandraStorage
    will be in charge of some object types, so we need to merge objects
    found in the CassandraStorage and those found directly in the InMemoryStorage.

commit 385364bdb4fbc6af87c433e10fd9a68c247fb2ab
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:41:42 2020 +0200

    in_memory: Add InMemoryCqlRunner, a class that emulates cassandra.cql.CqlRunner without Cassandra.
    
    For now it's only used for object counters; but future commits will
    progressively move the in-memory's storage features to it.

commit 6a3ef9a907c76e8a99c06690af3dffb7d93318a8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:34:30 2020 +0200

    Make InMemoryStorage inherit from CassandraStorage.
    
    This has no effect for now, other than deduplicating a method
    and causing a name clash.

commit d9bfed69178e07f882bade35b50fa99a2130211c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:24:33 2020 +0200

    in_memory: Add class Table, which emulates a Cassandra table.
    
    It will be used to implement the in-memory storage as a backend for the
    cassandra storage.

commit fc80a7f06ae8e1fccbe3ce2ee7576d2e9c5ff93a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:16:44 2020 +0200

    cassandra.cql: Fix return type of stat_counters.

commit 267f48024bfd257e91aa95f6e1ae99320587add7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:16:13 2020 +0200

    cassandra.model: Add PARTITION_KEY and CLUSTERING_KEY to the model classes.
    
    They will be used by the in-mem implementation of CqlRunner.

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/749/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/749/console
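
The commits above describe a `Table` class that emulates a Cassandra table, driven by `PARTITION_KEY` and `CLUSTERING_KEY` attributes on the model classes. The actual code is not shown on this page, so the following is only a minimal sketch of the idea, under assumptions: the `OriginVisitRow` fields, the method names `insert` and `get_partition`, and the nested-dict layout are all hypothetical and are not taken from swh-storage.

```python
from dataclasses import dataclass
from typing import Any, ClassVar, Dict, List, Tuple


@dataclass
class OriginVisitRow:
    """Hypothetical row class; the field names are illustrative only."""

    # Class-level key declarations, as the commit message describes.
    PARTITION_KEY: ClassVar[Tuple[str, ...]] = ("origin",)
    CLUSTERING_KEY: ClassVar[Tuple[str, ...]] = ("visit",)

    origin: str
    visit: int
    type: str


class Table:
    """Toy stand-in for a Cassandra table (not the class from the diff)."""

    def __init__(self, row_class: Any) -> None:
        self.row_class = row_class
        # partition key -> {clustering key -> row}
        self.data: Dict[Tuple, Dict[Tuple, Any]] = {}

    def _key(self, row: Any, columns: Tuple[str, ...]) -> Tuple:
        # Build a key tuple from the named columns of the row.
        return tuple(getattr(row, column) for column in columns)

    def insert(self, row: Any) -> None:
        # Like a CQL INSERT, this is an upsert: a row with the same
        # partition and clustering key replaces the previous one.
        partition_key = self._key(row, self.row_class.PARTITION_KEY)
        clustering_key = self._key(row, self.row_class.CLUSTERING_KEY)
        self.data.setdefault(partition_key, {})[clustering_key] = row

    def get_partition(self, *partition_key: Any) -> List[Any]:
        # Rows of one partition, ordered by clustering key.
        partition = self.data.get(tuple(partition_key), {})
        return [partition[key] for key in sorted(partition)]


table = Table(OriginVisitRow)
table.insert(OriginVisitRow(origin="https://example.org/repo", visit=2, type="git"))
table.insert(OriginVisitRow(origin="https://example.org/repo", visit=1, type="git"))
visits = [row.visit for row in table.get_partition("https://example.org/repo")]
```

Keeping the key columns on the row class lets one generic container serve every object type, which may be why the keys were added to the model classes first.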

Build is green

Patch application report for D3779 (id=13299)

Could not rebase; Attempt merge onto 6675286c4f...

Updating 6675286c..fee69b5b
Fast-forward
 swh/storage/cassandra/cql.py         |  131 ++---
 swh/storage/cassandra/model.py       |   35 +-
 swh/storage/cassandra/storage.py     |   61 +-
 swh/storage/db.py                    |   17 -
 swh/storage/in_memory.py             | 1055 +++++++++++++---------------------
 swh/storage/interface.py             |   31 -
 swh/storage/replay.py                |    2 +-
 swh/storage/storage.py               |   12 -
 swh/storage/tests/test_api_client.py |   45 +-
 swh/storage/tests/test_filter.py     |    2 +-
 swh/storage/tests/test_in_memory.py  |  112 +++-
 swh/storage/tests/test_replay.py     |   89 ++-
 swh/storage/tests/test_retry.py      |    3 +-
 swh/storage/tests/test_storage.py    |  181 ++----
 swh/storage/writer.py                |    4 +-
 15 files changed, 791 insertions(+), 989 deletions(-)
Changes applied before test
commit fee69b5b0259118a0b8a5f007129685ff12975d4
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 17:24:58 2020 +0200

    cassandra.cql: reorder origin_visit_* and origin_visit_status_* methods to be properly grouped.

commit 1cb26aeb587aa0b09f44bee3b52acb2919f50d0d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 17:23:55 2020 +0200

    Remove unused arguments of CqlRunner.origin_visit_status_get.

commit 671d581fb528743e7f1ce80a38e86b0187d3f30c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 16:45:17 2020 +0200

    in_memory: Remove InMemoryStorage.origin_* and implement InMemoryCqlRunner.origin_*

commit 083bfc5b04b0c0367ea4059b020c64769133230a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 16:21:01 2020 +0200

    in_memory: Remove InMemoryStorage.snapshot_* and implement InMemoryCqlRunner.snapshot_*

commit 327a28291308b314615d25fcd820de6b29a97ba8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:38:14 2020 +0200

    Remove endpoint snapshot_get_by_origin_visit.
    
    It's not used anywhere.

commit 66b18eab890b09950ec631f3b5f208944ce49756
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:27:15 2020 +0200

    in_memory: Remove InMemoryStorage.release_* and implement InMemoryCqlRunner.release_*

commit 30d498a5914758a57fe9c76338ea4821c9e74c9d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:23:40 2020 +0200

    in_memory: Remove InMemoryStorage.revision_* and implement InMemoryCqlRunner.revision_*

commit 941c110793f9302938f9bf94467ed98b39ab39e4
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 14:40:20 2020 +0200

    in_memory: Remove InMemoryStorage.directory_* and implement InMemoryCqlRunner.directory_*

commit e6a986db0f47a476a7579b4ae880774b2e64006f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 14:01:22 2020 +0200

    in_memory: Remove InMemoryStorage.skipped_content_* and implement InMemoryCqlRunner.skipped_content_*

commit 0263fbb377291214d9e61f14278038b41d7103de
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:54:32 2020 +0200

    in_memory: Remove InMemoryStorage.content_* and implement InMemoryCqlRunner.content_*

commit e6957ca4a912f8689fd8bfd919efeb955f21589e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:44:50 2020 +0200

    in_memory: make object_find_by_sha1_git merge results from the CassandraStorage.
    
    For now this has no effect. However, in the near future, the CassandraStorage
    will be in charge of some object types, so we need to merge objects
    found in the CassandraStorage and those found directly in the InMemoryStorage.

commit 892880c4d86ca433594d602ab5e5b25eb6b50229
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:41:42 2020 +0200

    in_memory: Add InMemoryCqlRunner, a class that emulates cassandra.cql.CqlRunner without Cassandra.
    
    For now it's only used for object counters; but future commits will
    progressively move the in-memory's storage features to it.

commit 69cb557387a8a5e490501c0f2bd9aec47762ee36
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:34:30 2020 +0200

    Make InMemoryStorage inherit from CassandraStorage.
    
    This has no effect for now, other than deduplicating a method
    and causing a name clash.

commit 30d65e6bd9e290e2eb0b1e029a93be2428fb33fa
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:24:33 2020 +0200

    in_memory: Add class Table, which emulates a Cassandra table.
    
    It will be used to implement the in-memory storage as a backend for the
    cassandra storage.

commit a0037c5d12a5694bfab749896db8546b7176b633
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:16:44 2020 +0200

    cassandra.cql: Fix return type of stat_counters.

commit f53cc6fc8606e9f97af0c3eea0ce0d7ec46d486a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:16:13 2020 +0200

    cassandra.model: Add PARTITION_KEY and CLUSTERING_KEY to the model classes.
    
    They will be used by the in-mem implementation of CqlRunner.

commit dd320a63b32b0e79c6f90a83236d607093219359
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 20:25:23 2020 +0200

    cassandra: Make origin_visit_get_latest filter using any status of a visit, instead of just the last.
    
    This fixes a mismatch in behavior with the pg and the in-mem storages

commit 1d45c5a0debc2e45ceee5effe565aab36905eb1c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 19:08:17 2020 +0200

    cassandra: Fix wrong algo reported in HashCollision, because of variable shadowing.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/779/ for more details.
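
The HashCollision commit above attributes the wrong reported algorithm to variable shadowing. The commit message does not show the faulty code, so the snippet below only illustrates that class of bug in general. The function names, the `ALGOS` tuple, and the dict-based content representation are assumptions made for this example, not the swh-storage code.

```python
from typing import Dict, Optional

# Assumed set of hash algorithms for this illustration.
ALGOS = ("sha1", "sha1_git", "sha256")


def colliding_algo_buggy(new: Dict[str, str], existing: Dict[str, str]) -> Optional[str]:
    for algo in ("sha1", "sha1_git"):
        if new[algo] == existing[algo]:
            differs = False
            # Bug: this inner loop rebinds the outer loop variable `algo`.
            for algo in ALGOS:
                if new[algo] != existing[algo]:
                    differs = True
            if differs:
                # `algo` is now the last element of ALGOS, not the
                # algorithm on which the two contents collide.
                return algo
    return None


def colliding_algo_fixed(new: Dict[str, str], existing: Dict[str, str]) -> Optional[str]:
    for algo in ("sha1", "sha1_git"):
        if new[algo] == existing[algo]:
            # A distinct name keeps the outer `algo` intact.
            if any(new[other] != existing[other] for other in ALGOS):
                return algo
    return None


new = {"sha1": "aa", "sha1_git": "b1", "sha256": "c1"}
existing = {"sha1": "aa", "sha1_git": "b2", "sha256": "c2"}
buggy_result = colliding_algo_buggy(new, existing)
fixed_result = colliding_algo_fixed(new, existing)
```

In Python, a `for` loop does not create a new scope, so reusing a loop variable name inside a nested loop silently overwrites the outer value.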

This revision is now accepted and ready to land. Aug 14 2020, 2:14 PM

Build is green

Patch application report for D3779 (id=13347)

Could not rebase; Attempt merge onto 6675286c4f...

Updating 6675286c..e5f450c3
Fast-forward
 swh/storage/cassandra/cql.py         |  131 ++---
 swh/storage/cassandra/model.py       |   35 +-
 swh/storage/cassandra/storage.py     |   63 +-
 swh/storage/db.py                    |   17 -
 swh/storage/in_memory.py             | 1055 +++++++++++++---------------------
 swh/storage/interface.py             |   31 -
 swh/storage/replay.py                |    2 +-
 swh/storage/storage.py               |   12 -
 swh/storage/tests/test_api_client.py |   45 +-
 swh/storage/tests/test_filter.py     |    2 +-
 swh/storage/tests/test_in_memory.py  |  112 +++-
 swh/storage/tests/test_replay.py     |   89 ++-
 swh/storage/tests/test_retry.py      |    3 +-
 swh/storage/tests/test_storage.py    |  191 ++----
 swh/storage/writer.py                |    4 +-
 15 files changed, 799 insertions(+), 993 deletions(-)
Changes applied before test
commit e5f450c336147ab29fd38c0ccf0150f3b4cb1105
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 17:24:58 2020 +0200

    cassandra.cql: reorder origin_visit_* and origin_visit_status_* methods to be properly grouped.

commit 249e4afdc5935870bdf4b2166583a82bb56dd355
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 17:23:55 2020 +0200

    Remove unused arguments of CqlRunner.origin_visit_status_get.

commit e1eb6cd18180776149f76ec9b0250b8a0773e406
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 16:45:17 2020 +0200

    in_memory: Remove InMemoryStorage.origin_* and implement InMemoryCqlRunner.origin_*

commit f78c76f5b796c1b0f3ed4a1aeed0fb28200e0ff6
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 16:21:01 2020 +0200

    in_memory: Remove InMemoryStorage.snapshot_* and implement InMemoryCqlRunner.snapshot_*

commit 66511303a8b16d328c91b77689c241fec6f45330
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:38:14 2020 +0200

    Remove endpoint snapshot_get_by_origin_visit.
    
    It's not used anywhere.

commit 1104c53acaa6b5ac8a1e92ba59266d690a7ec60c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:27:15 2020 +0200

    in_memory: Remove InMemoryStorage.release_* and implement InMemoryCqlRunner.release_*

commit 237c400653c75d00e143a9dd97b0f50fb7a51614
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 15:23:40 2020 +0200

    in_memory: Remove InMemoryStorage.revision_* and implement InMemoryCqlRunner.revision_*

commit 8e7eed4461d013ef5e8487b8a824f860899ea2c7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 14:40:20 2020 +0200

    in_memory: Remove InMemoryStorage.directory_* and implement InMemoryCqlRunner.directory_*

commit d5f41f8b96527ad83166b89bd1e8d22670ce446a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 14:01:22 2020 +0200

    in_memory: Remove InMemoryStorage.skipped_content_* and implement InMemoryCqlRunner.skipped_content_*

commit b3af39a97bd59e3aa8aa84b19e9512421b2c075e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:54:32 2020 +0200

    in_memory: Remove InMemoryStorage.content_* and implement InMemoryCqlRunner.content_*

commit 397a645ebf1ed8d7537dcde61ef7f01ffc4127ac
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:44:50 2020 +0200

    in_memory: make object_find_by_sha1_git merge results from the CassandraStorage.
    
    For now this has no effect. However, in the near future, the CassandraStorage
    will be in charge of some object types, so we need to merge objects
    found in the CassandraStorage and those found directly in the InMemoryStorage.

commit a96c253ab428e043de3a9c4ccc3ff04c179c3fa2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:41:42 2020 +0200

    in_memory: Add InMemoryCqlRunner, a class that emulates cassandra.cql.CqlRunner without Cassandra.
    
    For now it's only used for object counters; but future commits will
    progressively move the in-memory's storage features to it.

commit bc47283ddc8ca5397d8e25f25c670a9c4f21c435
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:34:30 2020 +0200

    Make InMemoryStorage inherit from CassandraStorage.
    
    This has no effect for now, other than deduplicating a method
    and causing a name clash.

commit 20971864bb1638b22bef00c79c7f0cb1be8afeed
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:24:33 2020 +0200

    in_memory: Add class Table, which emulates a Cassandra table.
    
    It will be used to implement the in-memory storage as a backend for the
    cassandra storage.

commit ef0600539bb7716380791e9c49619387c30407f4
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:16:44 2020 +0200

    cassandra.cql: Fix return type of stat_counters.

commit 1266b6a7fe5006746e579e1ec28b4fb600b1188e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 13:16:13 2020 +0200

    cassandra.model: Add PARTITION_KEY and CLUSTERING_KEY to the model classes.
    
    They will be used by the in-mem implementation of CqlRunner.

commit 3dc69aaa42e76e373c4d23c1e8d98af5a804970f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 20:25:23 2020 +0200

    cassandra: Make origin_visit_get_latest filter using any status of a visit, instead of just the last.
    
    This fixes a mismatch in behavior with the pg and the in-mem storages

commit 006eeecaba7523f407fa0e253d6c832ea246d959
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Aug 12 19:08:17 2020 +0200

    cassandra: Fix wrong algo reported in HashCollision, because of variable shadowing.

commit da287313765da62d34dc0dca7c22dbd61f40504f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 14 15:15:54 2020 +0200

    cassandra: Fix content_missing_per_sha1 when its parameter has length != 1.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/808/ for more details.
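
One of the commits in this report makes `origin_visit_get_latest` filter using any status of a visit instead of just the last one. The sketch below shows one possible reading of that difference. It is an assumption-laden illustration: the data layout (a dict from visit id to its chronological status strings) and both function names are hypothetical, and the real method signature is not shown on this page.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical layout: visit id -> statuses in chronological order.
Visits = Dict[int, List[str]]


def latest_visit_last_status_only(
    visits: Visits, allowed_statuses: Sequence[str]
) -> Optional[int]:
    # Only the most recent status of each visit is compared to the filter.
    for visit_id in sorted(visits, reverse=True):
        if visits[visit_id][-1] in allowed_statuses:
            return visit_id
    return None


def latest_visit_any_status(
    visits: Visits, allowed_statuses: Sequence[str]
) -> Optional[int]:
    # A visit matches if any of its statuses passes the filter.
    for visit_id in sorted(visits, reverse=True):
        if any(status in allowed_statuses for status in visits[visit_id]):
            return visit_id
    return None


visits = {
    1: ["created", "full"],
    2: ["created", "partial", "full"],
}
last_only = latest_visit_last_status_only(visits, ["partial"])
any_status = latest_visit_any_status(visits, ["partial"])
```

Under this reading, the two strategies disagree whenever a visit passed through a matching status before ending in a different one, which is the kind of backend mismatch the commit message mentions.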