Details

Reviewers

None

Group Reviewers

Commits

rDSTO50fb54f212bc: directory_entry_add_batch: Remove the temporary prepared statement entirely
rDSTOfc950deb20b4: cassandra: Add option to select (hopefully) more efficient batch insertion algos
rDSTOda7e63ea683c: directory_entry_add_batch: Reduce churn of prepared statements

Summary

This adds a new config option for the cassandra backend,
'directory_entries_insert_algo', with three possible values:

'one-per-one' is the default, and preserves the current naive behavior
'concurrent' and 'batch' are attempts at being more efficient

Addresses the current performance bottleneck, according to T3493#68979

Diff Detail

Repository

rDSTO Storage manager

Branch

cassandra-dir-concurrent

Lint

No Linters Available

Unit

No Unit Test Coverage

Build Status

Buildable 23181
Build 36169: Phabricator diff pipeline on jenkins	Jenkins console · Jenkins
Build 36168: arc lint + arc unit

Unit TestsFailed
View All

	Time	Test
	377 ms	Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.tests.test_cassandra.TestCassandraStorage::test_directory_add_atomic self = <swh.storage.tests.test_cassandra.TestCassandraStorage object at 0x7f2a623dcd68> swh_storage = <swh.storage.cassandra.storage.CassandraStorage object at 0x7f2a5e9f17b8> sample_data = <swh.storage.tests.storage_data.StorageData object at 0x7f2a5e9eef98>
	4 ms	Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.fixer::swh.storage.fixer._fix_content
	2 ms	Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.fixer::swh.storage.fixer._fix_origin
	3 ms	Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.fixer::swh.storage.fixer._fix_origin_visit
	1 ms	Jenkins > .tox.py3.lib.python3.7.site-packages.swh.storage.fixer::swh.storage.fixer._fix_raw_extrinsic_metadata
		View Full Test Results (1 Failed · 1,076 Passed · 39 Skipped)

Event Timeline

vlorentz created this revision.Aug 26 2021, 11:10 AM

Herald added a reviewer: Reviewers. · View Herald TranscriptAug 26 2021, 11:10 AM

vlorentz edited the test plan for this revision. (Show Details)Aug 26 2021, 11:11 AM

Build has FAILED

Patch application report for D6139 (id=22222)

Rebasing onto b110d1b69c...

First, rewinding head to replay your work on top of it...
Applying: cassandra: Add option to select (hopefully) more efficient batch insertion algos

Changes applied before test

commit cc8c2d10d4d1d36c0fdc0a3fa815162a6c74ed28
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:08:15 2021 +0200

    cassandra: Add option to select (hopefully) more efficient batch insertion algos
    
    This adds a new config option for the cassandra backend,
    'directory_entries_insert_algo', with three possible values:
    
    * 'one-per-one' is the default, and preserves the current naive behavior
    * 'concurrent' and 'batch' are attempts at being more efficient

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1373/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1373/console

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 26 2021, 11:17 AM

Harbormaster failed remote builds in B23180: Diff 22222!

vlorentz mentioned this in T3493: [cassandra] Git loader performance are very bad.Aug 26 2021, 11:20 AM

Thanks I will test this on g5k with the directory replayer and the git loader.
I will keep you in touch

Fix in-mem

Build has FAILED

Patch application report for D6139 (id=22223)

Rebasing onto b110d1b69c...

First, rewinding head to replay your work on top of it...
Applying: cassandra: Add option to select (hopefully) more efficient batch insertion algos
Applying: Fix in-mem

Changes applied before test

commit febe31d66e35063fe7330427bec648c2f9f8d070
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:45:22 2021 +0200

    Fix in-mem

commit 58070f87089745fbca3526e0b35dcc533c05caa9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:08:15 2021 +0200

    cassandra: Add option to select (hopefully) more efficient batch insertion algos
    
    This adds a new config option for the cassandra backend,
    'directory_entries_insert_algo', with three possible values:
    
    * 'one-per-one' is the default, and preserves the current naive behavior
    * 'concurrent' and 'batch' are attempts at being more efficient

Link to build: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1374/
See console output for more information: https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1374/console

Harbormaster returned this revision to the author for changes because remote builds failed.Aug 26 2021, 11:53 AM

Harbormaster failed remote builds in B23181: Diff 22223!

fix test

Build is green

Patch application report for D6139 (id=22232)

Rebasing onto b110d1b69c...

First, rewinding head to replay your work on top of it...
Applying: cassandra: Add option to select (hopefully) more efficient batch insertion algos
Applying: Fix in-mem
Applying: Update test

Changes applied before test

commit a9f95df65ddfe9085ca89eef2fd0841b806fc685
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 12:20:26 2021 +0200

    Update test

commit 749b46ae5a80b870b5ef090a8652203383ea9533
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:45:22 2021 +0200

    Fix in-mem

commit cd0898b85b7e3a537c13a2b9495a688e834b3ff7
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:08:15 2021 +0200

    cassandra: Add option to select (hopefully) more efficient batch insertion algos
    
    This adds a new config option for the cassandra backend,
    'directory_entries_insert_algo', with three possible values:
    
    * 'one-per-one' is the default, and preserves the current naive behavior
    * 'concurrent' and 'batch' are attempts at being more efficient

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1375/ for more details.

Harbormaster completed remote builds in B23190: Diff 22232.Aug 26 2021, 12:28 PM

vlorentz requested review of this revision.Aug 26 2021, 12:28 PM

The patch was test in a loader and in the replayers.
The difference was not really significant on the loader but I'm not really confident in the tests as the cluster had a pretty high load (running replayers + second datacenter synchronization).
I will retry with a more quieter environment to be able to isolate the loader behavior.

It was different with the directory replayers. With the concurrent and batch option, the performance seems to be really better:

The first plateau is with the unpatched version.
The peak is with the concurrent parameter
The problem is it seems there is a lock contention on the counter and all the replayers stopped with this error:

Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]: Traceback (most recent call last):
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/bin/swh", line 11, in <module>
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     load_entry_point('swh.core==0.14.4', 'console_scripts', 'swh')()
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/core/cli/__init__.py", line 185, in main
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return swh(auto_envvar_prefix="SWH")
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return self.main(*args, **kwargs)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     rv = self.invoke(ctx)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return _process_result(sub_ctx.command.invoke(sub_ctx))
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return _process_result(sub_ctx.command.invoke(sub_ctx))
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return ctx.invoke(self.callback, **ctx.params)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return callback(*args, **kwargs)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/click/decorators.py", line 17, in new_func
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return f(get_current_context(), *args, **kwargs)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/cli.py", line 194, in replay
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     client.process(worker_fn)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/journal/client.py", line 265, in process
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     batch_processed, at_eof = self.handle_messages(messages, worker_fn)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/journal/client.py", line 292, in handle_messages
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     worker_fn(dict(objects))
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/replay.py", line 62, in process_replay_objects
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     _insert_objects(object_type, objects, storage)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/replay.py", line 144, in _insert_objects
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     storage.origin_add(origins)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/cassandra/storage.py", line 1021, in origin_add
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     OriginRow(sha1=hash_url(origin.url), url=origin.url, next_visit_id=1)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/cassandra/cql.py", line 162, in newf
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     self, *args, **kwargs, statement=self._prepared_statements[f.__name__]
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/cassandra/cql.py", line 913, in origin_add_one
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     self._add_one(statement, origin)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/cassandra/cql.py", line 315, in _add_one
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     self._increment_counter(obj.TABLE, 1)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/cassandra/cql.py", line 162, in newf
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     self, *args, **kwargs, statement=self._prepared_statements[f.__name__]
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/cassandra/cql.py", line 312, in _increment_counter                                                                                                                                                             
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     self._execute_with_retries(statement, [nb, object_type])
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/tenacity/__init__.py", line 329, in wrapped_f
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return self.call(f, *args, **kw)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/tenacity/__init__.py", line 409, in call
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     do = self.iter(retry_state=retry_state)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/tenacity/__init__.py", line 356, in iter
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return fut.result()
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 425, in result
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return self.__get_result()
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     raise self._exception
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/tenacity/__init__.py", line 412, in call
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     result = fn(*args, **kwargs)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "/usr/lib/python3/dist-packages/swh/storage/cassandra/cql.py", line 293, in _execute_with_retries                                                                                                                                                          
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:     return self._session.execute(statement, args, timeout=1000.0)
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "cassandra/cluster.py", line 2345, in cassandra.cluster.Session.execute
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]:   File "cassandra/cluster.py", line 4304, in cassandra.cluster.ResponseFuture.result
Aug 26 18:36:59 paravance-70.rennes.grid5000.fr swh[9885]: cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'consistency': 'ONE', 'required_responses': 1, 'received_responses': 0, 'write_type': 'COUNTER'}

Tested with 40 and 10 directory replayers

This should be addressed by D6149 + D6150

rebase + squash + add tests

vlorentz edited the test plan for this revision. (Show Details)Aug 27 2021, 2:47 PM

Build is green

Patch application report for D6139 (id=22276)

Could not rebase; Attempt merge onto 47a6919fee...

Updating 47a6919f..d272e70b
Fast-forward
 swh/storage/cassandra/cql.py        | 105 ++++++++++++++++++++++++++++--------
 swh/storage/cassandra/schema.py     |   8 ---
 swh/storage/cassandra/storage.py    |  32 +++++++++--
 swh/storage/in_memory.py            |   1 +
 swh/storage/tests/storage_tests.py  |  67 ++++++++++++++++-------
 swh/storage/tests/test_cassandra.py |  31 ++++++++---
 6 files changed, 187 insertions(+), 57 deletions(-)

Changes applied before test

commit d272e70b9c2a2ceec484ffb7511eb6e742255319
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:08:15 2021 +0200

    cassandra: Add option to select (hopefully) more efficient batch insertion algos
    
    This adds a new config option for the cassandra backend,
    'directory_entries_insert_algo', with three possible values:
    
    * 'one-per-one' is the default, and preserves the current naive behavior
    * 'concurrent' and 'batch' are attempts at being more efficient

commit c8a1ed7ac4adb7ca4f0bda9b12ace14b2fc521ce
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 27 11:45:18 2021 +0200

    cassandra: Remove stat_counters.
    
    They were inaccurate and a performance bottleneck.
    
    We can/should use swh-counters instead, now.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1387/ for more details.

Harbormaster completed remote builds in B23233: Diff 22276.Aug 27 2021, 2:54 PM

thanks. I will test that once the monitoring is updated to use the statsd statistics instead of the object_count table content.

rebase

Build is green

Patch application report for D6139 (id=22328)

Could not rebase; Attempt merge onto 3ad1bec113...

Updating 3ad1bec1..df045d45
Fast-forward
 swh/storage/cassandra/cql.py        | 105 ++++++++++++++++++++++++++++--------
 swh/storage/cassandra/schema.py     |   8 ---
 swh/storage/cassandra/storage.py    |  32 +++++++++--
 swh/storage/in_memory.py            |   1 +
 swh/storage/tests/storage_tests.py  |  67 ++++++++++++++++-------
 swh/storage/tests/test_cassandra.py |  31 ++++++++---
 6 files changed, 187 insertions(+), 57 deletions(-)

Changes applied before test

commit df045d45cb90d9ba7f1fb87cdf20945ac42fa800
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:08:15 2021 +0200

    cassandra: Add option to select (hopefully) more efficient batch insertion algos
    
    This adds a new config option for the cassandra backend,
    'directory_entries_insert_algo', with three possible values:
    
    * 'one-per-one' is the default, and preserves the current naive behavior
    * 'concurrent' and 'batch' are attempts at being more efficient

commit e8aad0fffc5341a78f98cb3b4c4845b6f4e95527
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Aug 27 11:45:18 2021 +0200

    cassandra: Remove stat_counters.
    
    They were inaccurate and a performance bottleneck.
    
    We can/should use swh-counters instead, now.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1394/ for more details.

Harbormaster completed remote builds in B23282: Diff 22328.Aug 31 2021, 10:56 AM

Test with 10 replayers with the 3 kind of algorithm:

first interval: one-by-one
second interval: concurremt
third interval: batch:

The stats are based on the directory_add statsd metrics

concurrent and batch seem to have almost the same performances. Both are faster than the one-by-one algo but batch looks more stable.

The concurrent algo have a higher impact on the cpu consumption.

I will test with more replayers in parallel to confirm the tendency

Some feedback, I had to delay the benchmarks because the servers was almost full and the cluster needed to be expanded to 7 nodes. The cluster is in stabilization phase (rebuild/repair of the new node and cleanup of the old one)
When it will be done, I will be able to finalize the tests Hopefully at the beginning of the next week

These are more results with different number of replayers.
Each line represents a server with 20 directory replayers, the renages are for one-by-one, concurrent, batch

1 node

2 nodes

3 nodes

4 nodes

5 nodes

For 4 nodes, and only for 4 nodes, the batch version collapse and is doing nothing visible and nothing is logged.
There is some communication with cassandra but nothing is ingested.

After analyze, the batch mode uses huge prepared statements (PS). It seems with the concurrency of the replayers, they are evicted from the PS cache before they can be executed. The client silently reprepare the statement and retry fall in a prepared/evicted loop.
Some interesting read:
https://docs.datastax.com/en/developer/java-driver/3.0/manual/statements/prepared/#preparing-on-multiple-nodes

finally, if the driver tries to execute a statement and finds out that the coordinator doesn’t know about it, it will re-prepare the statement on the fly (this is transparent for the client, but will cost two extra roundtrips):

From a tcpdump:

...
Prepared query with ID a6a4d5fc60c93bd33e1df45d36a045b5 not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache)
BEGIN UNLOGGED BATCH
Prepared query with ID a6a4d5fc60c93bd33e1df45d36a045b5 not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache)
BEGIN UNLOGGED BATCH
Prepared query with ID a6a4d5fc60c93bd33e1df45d36a045b5 not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache)
BEGIN UNLOGGED BATCH
Prepared query with ID a6a4d5fc60c93bd33e1df45d36a045b5 not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache)
BEGIN UNLOGGED BATCH
Prepared query with ID a6a4d5fc60c93bd33e1df45d36a045b5 not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache)
....

This example if for a big PS with 904

INSERT INTO directory_entry (directory_id, name, target, perms, type) VALUES (?, ?, ?, ?, ?);

In cassandra, there are a lot of logs like this one:

WARN  [ScheduledTasks:1] 2021-09-07 07:45:41,884 QueryProcessor.java:101 - 576 prepared statements discarded in the last minute because cache limit reached (63 MB)

According to the documentation of the cassandra concurrent api[1], it seems the concurrency can by specified as an argument of the execute_concurrent_with_args method. The default is 100, but it could be interesting to check with higher or lower values

The concurrency parameter controls how many statements will be executed concurrently. When Cluster.protocol_version is set to 1 or 2, it is recommended that this be kept below 100 times the number of core connections per host times the number of connected hosts (see Cluster.set_core_connections_per_host()). If that amount is exceeded, the event loop thread may attempt to block on new connection creation, substantially impacting throughput. If protocol_version is 3 or higher, you can safely experiment with higher levels of concurrency.

[1] https://docs.datastax.com/en/developer/python-driver/3.25/api/cassandra/concurrent/

I will probably dedicate a benchmark session for the concurrent algo with different concurrencies

In D6139#160624, @vsellier wrote:

After analyze, the batch mode uses huge prepared statements (PS). It seems with the concurrency of the replayers, they are evicted from the PS cache before they can be executed. The client silently reprepare the statement and retry fall in a prepared/evicted loop.

Ooooh, so that's what it is! Good to know.

Can you try with lower values for BATCH_INSERT_MAX_SIZE? (eg. 100)

rebase
directory_entry_add_batch: Reduce churn of prepared statements

Build is green

Patch application report for D6139 (id=22479)

Could not rebase; Attempt merge onto 834a49d067...

Updating 834a49d0..da7e63ea
Fast-forward
 swh/storage/cassandra/cql.py              | 86 ++++++++++++++++++++++++++++---
 swh/storage/cassandra/storage.py          | 32 ++++++++++--
 swh/storage/in_memory.py                  |  1 +
 swh/storage/migrate_extrinsic_metadata.py | 23 +++++++--
 swh/storage/tests/storage_tests.py        |  2 +-
 swh/storage/tests/test_cassandra.py       | 55 +++++++++++++++++---
 6 files changed, 178 insertions(+), 21 deletions(-)

Changes applied before test

commit da7e63ea683c5dad0da64e83615a7df24156c6a2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 8 11:56:49 2021 +0200

    directory_entry_add_batch: Reduce churn of prepared statements
    
    By reusing the 'steady state' main statement (which is quite large)
    across calls.

commit fc950deb20b4a2ea1f40a2617f51e3eb73f78612
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:08:15 2021 +0200

    cassandra: Add option to select (hopefully) more efficient batch insertion algos
    
    This adds a new config option for the cassandra backend,
    'directory_entries_insert_algo', with three possible values:
    
    * 'one-per-one' is the default, and preserves the current naive behavior
    * 'concurrent' and 'batch' are attempts at being more efficient

commit 7dc2863ef1fa0701e0cf632c0ce68db8e2965817
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Sep 6 14:45:40 2021 +0200

    migrate_extrinsic_metadata: Add an option to limit the number of revisions
    
    This will be used as a second pass on objects that failed with
    older versions of the script.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1397/ for more details.

Harbormaster completed remote builds in B23430: Diff 22479.Sep 8 2021, 12:05 PM

Thanks for the last fix, it looks better with a smaller batch size:
5 nodes:

The ingestion is ~7500 ops/s in batch compared to ~6500 before

6 nodes:

It seems the maximum rate we can afford is ~ 7500 ops

4 nodes:
4 nodes run still looks problematics, 3 runs were launched, 2 succeed and 1 failed.

I can't find anything indicating a particular load on the servers that can explain this behavior.
There is no particular cpu memory or I/O load, the compaction jobs looks stable, no client timeout.
The only difference is the PS eviction behavior

directory_entry_add_batch: Remove the temporary prepared statement entirely

Build is green

Patch application report for D6139 (id=22516)

Could not rebase; Attempt merge onto 834a49d067...

Updating 834a49d0..50fb54f2
Fast-forward
 swh/storage/cassandra/cql.py              | 80 ++++++++++++++++++++++++++++---
 swh/storage/cassandra/storage.py          | 32 +++++++++++--
 swh/storage/in_memory.py                  |  1 +
 swh/storage/migrate_extrinsic_metadata.py | 23 +++++++--
 swh/storage/tests/storage_tests.py        |  2 +-
 swh/storage/tests/test_cassandra.py       | 55 ++++++++++++++++++---
 6 files changed, 172 insertions(+), 21 deletions(-)

Changes applied before test

commit 50fb54f212bc36c1d28d80749cb435bbddaf2db8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 9 11:35:22 2021 +0200

    directory_entry_add_batch: Remove the temporary prepared statement entirely
    
    And fall back to concurrent insertion.

commit da7e63ea683c5dad0da64e83615a7df24156c6a2
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Sep 8 11:56:49 2021 +0200

    directory_entry_add_batch: Reduce churn of prepared statements
    
    By reusing the 'steady state' main statement (which is quite large)
    across calls.

commit fc950deb20b4a2ea1f40a2617f51e3eb73f78612
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Aug 26 11:08:15 2021 +0200

    cassandra: Add option to select (hopefully) more efficient batch insertion algos
    
    This adds a new config option for the cassandra backend,
    'directory_entries_insert_algo', with three possible values:
    
    * 'one-per-one' is the default, and preserves the current naive behavior
    * 'concurrent' and 'batch' are attempts at being more efficient

commit 7dc2863ef1fa0701e0cf632c0ce68db8e2965817
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Mon Sep 6 14:45:40 2021 +0200

    migrate_extrinsic_metadata: Add an option to limit the number of revisions
    
    This will be used as a second pass on objects that failed with
    older versions of the script.

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1398/ for more details.

Harbormaster completed remote builds in B23463: Diff 22516.Sep 9 2021, 11:44 AM

This revision was not accepted when it landed; it landed in state Needs Review.Sep 9 2021, 3:27 PM

This revision was not accepted when it landed; it landed in state Needs Review.

Closed by commit rDSTOfc950deb20b4: cassandra: Add option to select (hopefully) more efficient batch insertion algos (authored by vlorentz). · Explain Why

This revision was automatically updated to reflect the committed changes.

vlorentz added a commit: rDSTOfc950deb20b4: cassandra: Add option to select (hopefully) more efficient batch insertion algos.

vlorentz added a commit: rDSTOda7e63ea683c: directory_entry_add_batch: Reduce churn of prepared statements.

vlorentz added a commit: rDSTO50fb54f212bc: directory_entry_add_batch: Remove the temporary prepared statement entirely.

final tests with the last version, everything looks good with ~~almost the same performances~~ a better ingestion rate in batch:
4 nodes before (batch only):

4 nodes after:

5 nodes before:
(don't have the measure yet due to the reservation expiration)

5 nodes after:

~~The performance is a little better for all the tests because there is one more server on the cluster.~~

cassandra: Add option to select (hopefully) more efficient batch insertion algos
ClosedPublic
Actions

Details

Diff Detail

Unit TestsFailed
View All

Event Timeline

Patch application report for D6139 (id=22222)

Changes applied before test

Patch application report for D6139 (id=22223)

Changes applied before test

Patch application report for D6139 (id=22232)

Changes applied before test

Patch application report for D6139 (id=22276)

Changes applied before test

Patch application report for D6139 (id=22328)

Changes applied before test

Patch application report for D6139 (id=22479)

Changes applied before test

Patch application report for D6139 (id=22516)

Changes applied before test

Revision Contents
Changeset List

Diff 22223

swh/storage/cassandra/cql.py

swh/storage/cassandra/storage.py

swh/storage/in_memory.py

cassandra: Add option to select (hopefully) more efficient batch insertion algosClosedPublicActions

Details

Diff Detail

Unit TestsFailedView All

Event Timeline

Patch application report for D6139 (id=22222)

Changes applied before test

Patch application report for D6139 (id=22223)

Changes applied before test

Patch application report for D6139 (id=22232)

Changes applied before test

Patch application report for D6139 (id=22276)

Changes applied before test

Patch application report for D6139 (id=22328)

Changes applied before test

Patch application report for D6139 (id=22479)

Changes applied before test

Patch application report for D6139 (id=22516)

Changes applied before test

Revision ContentsChangeset List

Diff 22223

swh/storage/cassandra/cql.py

swh/storage/cassandra/storage.py

swh/storage/in_memory.py

cassandra: Add option to select (hopefully) more efficient batch insertion algos
ClosedPublic
Actions

Unit TestsFailed
View All

Revision Contents
Changeset List