@olasd E. Lacour completed a study for a Ceph cluster today, with hardware specifications and pricing. He is available to discuss if you'd like.
May 31 2021
May 19 2021
May 17 2021
- display the time to first byte for random reads (see the sketch below)
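A minimal sketch of how such a time-to-first-byte measurement could look, assuming a seekable file-like handle to a mounted image; the names are illustrative, not the actual bench.py code:

```python
import time

def time_to_first_byte(fobj, offset: int) -> float:
    # Seek to a random offset and time how long the first byte takes to arrive.
    start = time.monotonic()
    fobj.seek(offset)
    fobj.read(1)
    return time.monotonic() - start
```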
May 15 2021
Reducing the number of read workers to 20 allows writes to perform as expected. The test results are collected in the README file for the record.
Reads now perform a lot better, both because the miscalculation is fixed and because the RBD is mounted read-only. They must be throttled, otherwise they put too much pressure on the cluster, which then underperforms on writes.
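A minimal sketch of the kind of read throttle meant here, assuming a simple bandwidth cap per reader; the class name and numbers are illustrative, not the bench.py implementation:

```python
import time

class ReadThrottle:
    """Pace reads so the consumed bandwidth stays under a cap (illustrative)."""

    def __init__(self, max_bytes_per_second: float):
        self.max_bytes_per_second = max_bytes_per_second
        self.start = time.monotonic()
        self.consumed = 0

    def wait(self, nbytes: int) -> None:
        # Sleep just long enough that consumed / elapsed never exceeds the cap.
        self.consumed += nbytes
        target_elapsed = self.consumed / self.max_bytes_per_second
        elapsed = time.monotonic() - self.start
        if target_elapsed > elapsed:
            time.sleep(target_elapsed - elapsed)
```

Each read worker would call something like throttle.wait(len(chunk)) after every read.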
- estimate the number of objects read sequentially, based on the median object size (see the sketch after this list)
- implement a read-only mode to experiment with various settings on an existing Read Storage
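For the estimate mentioned in the first item above, a back-of-the-envelope sketch; the throughput and median object size are placeholders, not measured values:

```python
def estimated_objects_per_second(read_bytes_per_second: float, median_object_size: int) -> float:
    # Sequential read throughput divided by the median object size gives an
    # approximate object rate.
    return read_bytes_per_second / median_object_size

# e.g. 100 MB/s of sequential reads over ~4KB median objects -> ~25K objects/s
print(round(estimated_objects_per_second(100e6, 4000)))
```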
May 10 2021
- remap RBD images read-only when they are full so that there is no need to acquire them read-write (not sure it matters, just an idea at this point, and it's a simple thing to do)
- clobber postgres when starting the benchmarks, in case there are leftovers
- the postgres standby does not need to be hot (see above)
- add recommended tuning for PostgreSQL (assuming a machine that has 128GB RAM)
- zap the grid5000 nvme drives used for PostgreSQL because they are not reset when the machine is deployed
With hot_standby = off the WAL is quickly flushed to the standby server when the writes finish. As soon as the writes finish, the benchmark starts reading all databases as fast as it can, which significantly slows down the replication because it needs to ensure strong consistency between the master and the standby.
Tune PostgreSQL and verify it improves the situation as follows:
$ ansible-playbook -i inventory tests-run.yml && ssh -t $runner direnv exec bench python bench/bench.py --file-count-ro 500 --rw-workers 40 --ro-workers 40 --file-size 50000 --no-warmup
...
WARNING:root:Objects write 6.8K/s
WARNING:root:Bytes write 137.7MB/s
WARNING:root:Objects read 1.5K/s
WARNING:root:Bytes read 109.9MB/s
May 8 2021
After writing 1TB in 40 DB (40 * 25GB), the WAL is ~200GB i.e. ~20%:
$ ansible-playbook -i inventory tests-run.yml && ssh -t $runner direnv exec bench python bench/bench.py --file-count-ro 500 --rw-workers 40 --ro-workers 40 --file-size 50000 --no-warmup
May 3 2021
While this is very creative, there is no benefit in storing small objects in git for the Software Heritage workload.
There is no need to use Ceph for the Write Storage: PostgreSQL performs well and there is no scaling problem. The size of the Write Storage is limited, by design.
This was discussed during the Ceph Developer Summit 2021, and the conclusion was that RADOS is not the place to implement optimizations for immutable objects; RGW is a better fit.
- Group the two PostgreSQL NVMe drives into a single logical volume to get more storage: 30 write workers using 100GB Shards require 3TB of PostgreSQL storage
- Set up a second PostgreSQL server as a standby replica of the master: it may negatively impact the performance of the master and should be included in the benchmark
- Explain the benchmark methodology & assumptions
$ bench.py --file-count-ro 200 --rw-workers 20 --ro-workers 80 --file-size 50000 --no-warmup
...
WARNING:root:Objects write 5.8K/s
WARNING:root:Bytes write 117.9MB/s
WARNING:root:Objects read 1.3K/s
WARNING:root:Bytes read 100.4MB/s
May 2 2021
$ bench.py --file-count-ro 200 --rw-workers 20 --ro-workers 80 --file-size 50000 --rand-ratio 10
...
WARNING:root:Objects write 5.8K/s
WARNING:root:Bytes write 118.4MB/s
WARNING:root:Objects read 12.3K/s
WARNING:root:Bytes read 850.3MB/s
May 1 2021
Fixed a race condition that caused PostgreSQL database drops to fail.
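For context, a hedged sketch of one common cause of this kind of failure (not necessarily the bug fixed here): DROP DATABASE refuses to run while other sessions are still connected, so leftover benchmark connections must be terminated first. psycopg2 is used with an illustrative DSN:

```python
import psycopg2
from psycopg2 import sql

def drop_database(dsn: str, name: str) -> None:
    # Connect to a maintenance database (e.g. "postgres"), not the one to drop.
    conn = psycopg2.connect(dsn)
    conn.autocommit = True  # DROP DATABASE cannot run inside a transaction
    try:
        with conn.cursor() as cur:
            # Kick out any session still connected to the target database.
            cur.execute(
                "SELECT pg_terminate_backend(pid) FROM pg_stat_activity "
                "WHERE datname = %s AND pid <> pg_backend_pid()",
                (name,),
            )
            cur.execute(sql.SQL("DROP DATABASE IF EXISTS {}").format(sql.Identifier(name)))
    finally:
        conn.close()
```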
Apr 27 2021
The rewrite to use processes was trivial and preliminary tests yielded the expected results. Most of the time was spent on two problems:
Apr 20 2021
Struggled most of today because there is a bottleneck when using threads and postgres from a single client. However, when running 4 processes, it performs as expected. The benchmark should be rewritten to use a process pool instead of the thread pool, which should not be too complicated (see the sketch below). I tried to add a warmup phase so that all concurrent threads/processes do not start at the same time, but it does not make any visible difference.
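A minimal sketch of that thread-pool to process-pool switch using concurrent.futures, assuming a hypothetical write_worker entry point; the actual code lives in the biceps repository:

```python
# Sketch only: write_worker is an illustrative stand-in for the bench.py workers.
from concurrent.futures import ProcessPoolExecutor  # was ThreadPoolExecutor

def write_worker(worker_id: int) -> int:
    # Each process opens its own PostgreSQL connection, so workers are no
    # longer serialized behind a single client the way threads were.
    return worker_id

if __name__ == "__main__":
    # Workers must be top-level functions so they can be pickled for the pool.
    with ProcessPoolExecutor(max_workers=4) as executor:
        print(list(executor.map(write_worker, range(4))))
```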
Apr 19 2021
Completed the tests for the rewrite; it is working.
Apr 18 2021
rbd bench on the images created
Apr 17 2021
There is a 3% space overhead on the RBD data pool. 6TB data, 3TB parity = 9TB. Actual 9.3TB, i.e. ~+3%.
https://www.grid5000.fr/w/Grenoble:Network shows the network topology
Complete rewrite to:
Apr 14 2021
- Add a reader that continuously reads from images to simulate a read workload
- Randomize the payload instead of using easily compressible data (postgres compresses it well, which does not reflect reality); a sketch follows below
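A minimal sketch of the payload change mentioned in the last item, with illustrative helper names rather than the actual bench.py functions:

```python
import os

def compressible_payload(size: int) -> bytes:
    # What the benchmark effectively used before: trivially compressible data.
    return b"\x00" * size

def random_payload(size: int) -> bytes:
    # Incompressible random bytes, closer to the real Software Heritage objects.
    return os.urandom(size)
```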
Apr 12 2021
- bench.py --file-count-ro 20 --rw-workers 20 --packer-workers 20 --file-size 1024 --fake-ro yields WARNING:root:Objects write 17.7K/s
- bench.py --file-count-ro 40 --rw-workers 40 --packer-workers 20 --file-size 1024 --fake-ro yields WARNING:root:Objects write 13.8K/s
Apr 7 2021
The benchmark was moved to a temporary repository for convenience (easier than uploading here every time). https://git.easter-eggs.org/biceps/biceps
Apr 6 2021
Takeaways from the session:
Mar 30 2021
Mar 26 2021
Mar 25 2021
Mar 24 2021
Refactored the cluster provisioning to use all available disks instead of the existing file system (using cephadm instead of a hand-made Ceph cluster).
Mar 23 2021
The benchmark runs and it's not too complicated, which is a relief. I'll clean up the mess I made and move forward to finish writing the software.
The benchmarks are not fully functional but they produce a write load that matches the object storage design. They run (README.txt) via libvirt and are being tested on Grid5000 to ensure all the pieces are in place (i.e. that reserving machines, provisioning them, and running actually works) before moving forward.
Mar 17 2021
Mail thread with Chris Lu on SeaweedFS use cases with 100+ billion objects.
First draft for layer 0.
Mar 15 2021
Bookmarking https://leo-project.net/leofs/
Mar 10 2021
With a little help from the mattermost channel and after approval of the account, it was possible to boot a physical machine with Debian GNU/Linux installed from scratch and get root access to it.
Thanks for helping with the labelling @rdicosmo 👍
Added a section about TCO in the design document.
Mar 9 2021
There is a mattermost channel dedicated to Grid5000 but one has to be invited to join; it is not open to the public.
Additional nvme drives for yeti should be something similar to https://www.samsung.com/semiconductor/ssd/enterprise-ssd/ but confirmation is needed to verify the machines actually have the required SFF-8639 connectors to plug them in.
The account request was approved, I'll proceed with a minimal reservation to figure out how it is done.
Thanks for the feedback. https://www.grid5000.fr/w/Grenoble:Hardware#yeti has 1.6TB nvme drives, which seems better. It would be better to have a total of 4TB nvme available to get closer to the target global index size (i.e. 40 bytes × 100 billion entries = 4TB). I'm told it is possible to donate hardware to Grid5000: if testing with the current configuration is not convincing enough, 4 more nvme pcie drives could be donated and they would be installed in the machines. No idea how much delay to expect but it's good to know it is possible.
Looking at the available hardware, here is what could be used:
Followed the instructions at https://www.grid5000.fr/w/Grid5000:Get_an_account to get an account. Waiting for approval.