
cassandra: Rewrite content_missing to run queries concurrently.
Changes Planned · Public · Draft

Authored by vlorentz on Oct 18 2021, 1:26 PM.

Details

Summary

This should solve the latest bottleneck in @vsellier's loader benchmarks.

Depends on D6423.
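
As a rough illustration of what "run queries concurrently" can mean for a missing-object check, here is a minimal sketch. It is not the code from this diff: `missing_concurrent`, `exists`, and `max_workers` are hypothetical names, and a standard-library thread pool stands in for whatever concurrency mechanism the Cassandra backend actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def missing_concurrent(
    ids: Iterable[bytes],
    exists: Callable[[bytes], bool],
    max_workers: int = 8,
) -> List[bytes]:
    """Return the ids for which `exists` is False, one lookup per id.

    `exists` is a hypothetical stand-in for a single-row query against
    the storage backend; it is an assumption, not an API from this diff.
    """
    ids = list(ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with ids.
        found = list(pool.map(exists, ids))
    return [id_ for id_, ok in zip(ids, found) if not ok]
```

Each id gets its own independent lookup, so no request has to be fanned out to several servers on behalf of a mixed batch; the trade-off is a larger number of small queries in flight.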

Diff Detail

Unit Tests: Failed

Time  Test
282 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_api_client.TestStorageApi::test_content_missing
self = <swh.storage.tests.test_api_client.TestStorageApi object at 0x7f1bf8e406a0> swh_storage = <RemoteStorage url=mock://example.com/> sample_data = <swh.storage.tests.storage_data.StorageData object at 0x7f1bf994e978>
43 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_api_client.TestStorageApi::test_content_missing_per_sha1
self = <swh.storage.tests.test_api_client.TestStorageApi object at 0x7f1bf841b588> swh_storage = <RemoteStorage url=mock://example.com/> sample_data = <swh.storage.tests.storage_data.StorageData object at 0x7f1bf9752f98>
41 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_api_client.TestStorageApi::test_content_missing_per_sha1_git
self = <swh.storage.tests.test_api_client.TestStorageApi object at 0x7f1bf841b550> swh_storage = <RemoteStorage url=mock://example.com/> sample_data = <swh.storage.tests.storage_data.StorageData object at 0x7f1bf841b978>
138 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_api_client.TestStorageApi::test_content_missing_unknown_algo
self = <swh.storage.tests.test_api_client.TestStorageApi object at 0x7f1bf959d240> swh_storage = <RemoteStorage url=mock://example.com/> sample_data = <swh.storage.tests.storage_data.StorageData object at 0x7f1bf94b4908>
48 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_api_client.TestStorageApi::test_object_find_by_sha1_git
self = <swh.storage.tests.test_api_client.TestStorageApi object at 0x7f1bf8497eb8> swh_storage = <RemoteStorage url=mock://example.com/> sample_data = <swh.storage.tests.storage_data.StorageData object at 0x7f1bf85d2be0>
View Full Test Results (27 Failed · 1,668 Passed · 61 Skipped)

Event Timeline

Build has FAILED

Patch application report for D6495 (id=23599)

Could not rebase; Attempt merge onto e9fd74d72d...

Updating e9fd74d7..c71d9c3a
Fast-forward
 swh/storage/cassandra/cql.py        | 77 +++++++++++++++++++++++++++----------
 swh/storage/cassandra/storage.py    | 53 +++++++++++++++++++------
 swh/storage/tests/test_cassandra.py |  7 ++--
 3 files changed, 102 insertions(+), 35 deletions(-)
Changes applied before test
commit c71d9c3a7608ed9c6506f9dca5561ef234aa14b2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Oct 18 13:25:20 2021 +0200

    cassandra: Rewrite content_missing to run queries concurrently.

commit 56474d6667071785078f977ff6cf16064f326143
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Oct 6 15:22:24 2021 +0200

    cassandra: Add alternative algorithms to list missing objects
    
    The existing implementation is now referred to as 'grouped-naive'.
    This is pretty bad, because it groups together requests that need
    to be dispatched to multiple servers.
    
    'concurrent' is a new naive strategy, that is easy to implement
    and should perform nicely.
    
    'grouped-pk-serial' and 'grouped-pk-concurrent' still group
    the ids they request, but in a smarter way.
    I expect 'grouped-pk-concurrent' to be faster than 'grouped-pk-serial',
    and it *may* be faster than 'concurrent' but we need benchmarks to know.

commit a9867104771069c0f5f964a1995a2f2ddf47da93
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Oct 18 13:00:00 2021 +0200

    cassandra: Fix incomplete check of content existence in object_find_by_sha1_git
    
    content_missing_by_sha1_git only checks the index and not the main table.
    
    This is incorrect, because contents should not be considered written
    before an entry is written to the main table, even if an entry
    exists in one of the indexes.
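
The rule stated in the last commit message above (an index entry alone does not make a content "written") can be sketched as follows. This is a toy model under stated assumptions, not the swh.storage implementation: `content_exists`, `index`, and `main_table` are hypothetical names, with a dict and a set standing in for the index table and the main table.

```python
from typing import Dict, Hashable, Set


def content_exists(
    sha1_git: bytes,
    index: Dict[bytes, Set[Hashable]],
    main_table: Set[Hashable],
) -> bool:
    """Report a content as present only if the main table confirms it.

    `index` maps a hash to the row keys it points at; `main_table` holds
    the keys of rows that were actually written. Both are hypothetical
    stand-ins for the real tables.
    """
    # An index hit alone is not enough: at least one of the keys it
    # points at must also exist in the main table.
    return any(key in main_table for key in index.get(sha1_git, ()))
```

With this check, an index entry whose main-table row has not been written yet is still reported as missing.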

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1457/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1457/console

Harbormaster returned this revision to the author for changes because remote builds failed. Oct 18 2021, 1:45 PM
Harbormaster failed remote builds in B24494: Diff 23599!