dachary (Loïc Dachary)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 8 2021, 11:21 PM (22 w, 2 d)

Recent Activity

Yesterday

dachary added a comment to T3149: Benchmark software for the object storage.

Creating a 20 billion entry global index fails because there is not enough disk space (2.9TB is full even with tunefs -m 0).

Sun, Jun 13, 11:36 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

The equilibrium between reads and writes is reached with 5 readers and 10 writers, which leads to 1.2% of random reads above the threshold, the worst one being 2 sec. This means that care must be taken, on the application side, to throttle reads and writes; otherwise the penalty is a significant degradation in latency.

Sun, Jun 13, 11:31 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

When the benchmark writes, the pressure of 40 workers slows down the reads significantly.

Sun, Jun 13, 7:03 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

Running the benchmark with a read workload only (the Ceph cluster is doing nothing else), with 20 workers shows 8% of requests with a latency above the threshold:

Sun, Jun 13, 3:40 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

I interrupted the benchmarks because they show that reads are not as expected, i.e. a large number of reads take very long and the number of reads per second is way more than what is needed. There is no throttling on reads: only the number of workers is the limit. I was expecting they would be slowed down by other factors and not apply too much pressure on the cluster. But I was apparently wrong and throttling must be implemented to slow them down.
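One common way to throttle reads is a token bucket. The following is a minimal sketch, not the benchmark's actual code: the TokenBucket class and its rate and burst parameters are hypothetical names chosen for illustration. Each reader would call acquire() before issuing a read.

```python
import time


class TokenBucket:
    """Allow at most `rate` operations per second, with bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def acquire(self):
        """Block until one token is available, then consume it."""
        while True:
            now = self.clock()
            # Refill proportionally to the elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return
            # Sleep just long enough for the missing fraction of a token.
            self.sleep((1.0 - self.tokens) / self.rate)
```

A bucket like this limits a single process. With several reader processes, each one would hold its own bucket and the aggregate rate would be the sum of the per-process rates.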

Sun, Jun 13, 7:52 AM · Object storage

Sat, Jun 12

dachary added a comment to T3149: Benchmark software for the object storage.

For the record, creating 10 billion entries in the global index took:

Sat, Jun 12, 5:43 PM · Object storage

Mon, Jun 7

dachary added a comment to T3149: Benchmark software for the object storage.
In T3149#65906, @zack wrote:

how about just collecting all raw timings in an output CSV file (or several files if needed) and compute the stats downstream (e.g., with pandas)?
that would allow changing the percentiles later on as well as compute different stats, without having to rerun the benchmarks

Mon, Jun 7, 3:35 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

I still think that returning a histogram of response times, in buckets 5 or 10 ms wide, may be valuable? We can then derive percentiles from that if we're so inclined.
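A sketch of what such a histogram could look like, assuming latencies are available as a list of milliseconds. The function names and the 10 ms default are illustrative only. A percentile derived this way is only accurate to the bucket width, so the helper returns the upper bound of the bucket that contains it.

```python
from collections import Counter


def latency_histogram(latencies_ms, bucket_ms=10):
    """Count latencies per bucket; the key is the bucket's lower bound in ms."""
    return Counter(int(t // bucket_ms) * bucket_ms for t in latencies_ms)


def percentile_upper_bound(histogram, percent, bucket_ms=10):
    """Upper bound (ms) of the bucket that contains the given percentile."""
    total = sum(histogram.values())
    seen = 0
    for lower in sorted(histogram):
        seen += histogram[lower]
        if seen * 100 >= percent * total:
            return lower + bucket_ms
    raise ValueError("empty histogram")
```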

Mon, Jun 7, 12:45 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
In T3149#65880, @olasd wrote:

While you're at it, could you report quantiles for the time to first byte, instead of just a raw maximum?

Something like:

  • best 1%
  • best 10%
  • best 25%
  • median
  • worst 25% / best 75%
  • worst 10%
  • worst 1%
  • maximum

(this all might be overkill, but...)
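For illustration, the quantiles listed above can be computed from raw samples with the Python standard library. This is a sketch under the assumption that all time to first byte samples fit in memory; ttfb_summary is a hypothetical helper, not part of the benchmark.

```python
import statistics


def ttfb_summary(samples):
    """Summarize time to first byte samples (lower is better)."""
    # 99 cut points: cuts[p - 1] is the p-th percentile.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")

    def pct(p):
        return cuts[p - 1]

    return {
        "best 1%": pct(1),
        "best 10%": pct(10),
        "best 25%": pct(25),
        "median": pct(50),
        "worst 25%": pct(75),
        "worst 10%": pct(90),
        "worst 1%": pct(99),
        "maximum": max(samples),
    }
```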

Mon, Jun 7, 12:09 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
  • Collect and display the worst time to first byte, not the average
Mon, Jun 7, 11:59 AM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

and this needs fixing.

do you mean the bench code needs fixing (to report the proper stats)?

Mon, Jun 7, 11:34 AM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

This weekend's run was not very fruitful: the global index could not be populated as expected and, since this was only discovered on Sunday morning, there was no time to fall back to a smaller one, for instance 10 billion entries. A run was launched and lasted ~24h to show:

Mon, Jun 7, 9:10 AM · Object storage

Sat, Jun 5

dachary updated the task description for T3327: Hardware architecture for the object storage.
Sat, Jun 5, 8:21 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

20 billion entries were inserted in the global index. After building the index it occupies 2.5TB, therefore each entry uses ~125 bytes of raw space. That's 25% more than with a 1 billion entry global index (i.e. 100 bytes).
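The arithmetic behind these figures, as a short sketch. It assumes decimal terabytes (10^12 bytes), which the comment does not state.

```python
TB = 10**12  # assumption: decimal terabyte, the comment does not say TB vs TiB


def bytes_per_entry(index_bytes, entries):
    """Average raw space used by one global index entry."""
    return index_bytes / entries


per_entry_20b = bytes_per_entry(2.5 * TB, 20 * 10**9)
increase = per_entry_20b / 100 - 1  # compared to 100 bytes with 1 billion entries
print(f"{per_entry_20b:.0f} bytes per entry, {increase:.0%} more than 100 bytes")
```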

Sat, Jun 5, 7:59 PM · Object storage
dachary updated the task description for T3149: Benchmark software for the object storage.
Sat, Jun 5, 7:54 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
  • Add insertion in the global index to the benchmark
Sat, Jun 5, 1:07 PM · Object storage

Wed, Jun 2

dachary added a comment to T3327: Hardware architecture for the object storage.

My notes on the meeting:

Wed, Jun 2, 4:28 PM · Object storage

Mon, May 31

dachary added a comment to T3149: Benchmark software for the object storage.
  • Add the generate script to ingest entries in the global index.
Mon, May 31, 5:55 PM · Object storage
dachary updated the task description for T3327: Hardware architecture for the object storage.
Mon, May 31, 5:45 PM · Object storage
dachary added a comment to T3327: Hardware architecture for the object storage.

The call is set to Wednesday June 2nd, 2021 4pm UTC+2 at https://meet.jit.si/ApparentStreetsJokeOk

Mon, May 31, 5:10 PM · Object storage
dachary added a comment to T3327: Hardware architecture for the object storage.
Mon, May 31, 9:57 AM · Object storage

Wed, May 19

dachary added a comment to T3327: Hardware architecture for the object storage.

@olasd E. Lacour completed a study for a Ceph cluster today, with hardware specifications and pricing. He is available to discuss if you'd like.

Wed, May 19, 5:03 PM · Object storage

Mon, May 17

dachary updated the task description for T3054: Scale out object storage design.
Mon, May 17, 4:17 PM · Roadmap 2021, meta-task, Object storage
dachary updated the task description for T3054: Scale out object storage design.
Mon, May 17, 4:16 PM · Roadmap 2021, meta-task, Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
  • display the time to first byte for random reads
Mon, May 17, 3:21 PM · Object storage
dachary updated the task description for T3327: Hardware architecture for the object storage.
Mon, May 17, 2:10 PM · Object storage
dachary updated the task description for T3327: Hardware architecture for the object storage.
Mon, May 17, 2:07 PM · Object storage
dachary triaged T3327: Hardware architecture for the object storage as Normal priority.
Mon, May 17, 1:59 PM · Object storage

May 15 2021

dachary added a comment to T3149: Benchmark software for the object storage.

Reducing the number of read workers to 20 allows writes to perform as expected. The test results are collected in the README file for archive.

May 15 2021, 6:33 PM · Object storage
dachary updated the task description for T3327: Hardware architecture for the object storage.
May 15 2021, 1:12 PM · Object storage
dachary updated the task description for T3327: Hardware architecture for the object storage.
May 15 2021, 1:11 PM · Object storage
dachary removed a parent task for T3327: Hardware architecture for the object storage: T3054: Scale out object storage design.
May 15 2021, 1:09 PM · Object storage
dachary removed a subtask for T3054: Scale out object storage design: T3327: Hardware architecture for the object storage.
May 15 2021, 1:09 PM · Roadmap 2021, meta-task, Object storage
dachary added a subtask for T3054: Scale out object storage design: T3327: Hardware architecture for the object storage.
May 15 2021, 1:08 PM · Roadmap 2021, meta-task, Object storage
dachary added a parent task for T3327: Hardware architecture for the object storage: T3054: Scale out object storage design.
May 15 2021, 1:08 PM · Object storage
dachary changed the status of T3327: Hardware architecture for the object storage from Open to Work in Progress.
May 15 2021, 1:07 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

Now reads perform a lot better because the miscalculation is fixed but also because the RBD is mounted read-only. Reads must be throttled, otherwise they put too much pressure on the cluster, which then underperforms on writes.

May 15 2021, 6:31 AM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
  • estimate the number of objects with sequential read based on the median size
  • implement read-only to experiment with various settings on an existing Read Storage
May 15 2021, 6:25 AM · Object storage

May 10 2021

dachary added a comment to T3149: Benchmark software for the object storage.
  • remap RBD images readonly when they are full so that there is no need to acquire read-write (not sure it matters, just an idea at this point and it's a simple thing to do)
  • clobber postgres when starting the benchmarks, in case there are leftovers
  • the postgres standby does not need to be hot (see above)
  • add recommended tuning for PostgreSQL (assuming a machine that has 128GB RAM)
  • zap the grid5000 nvme for PostgreSQL because they are not reset when the machine is deployed
May 10 2021, 5:54 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

With hot_standby = off the WAL is quickly flushed to the standby server when the writes finish.
As soon as the writes finish, the benchmark starts to read all databases as fast as it can, which
significantly slows down the replication because it needs to ensure strong consistency between the
master and the standby.

May 10 2021, 3:28 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

Tune PostgreSQL and verify it improves the situation as follows:

May 10 2021, 12:13 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
$ ansible-playbook -i inventory tests-run.yml && ssh -t $runner direnv exec bench python bench/bench.py --file-count-ro 500 --rw-workers 40 --ro-workers 40 --file-size 50000 --no-warmup
...
WARNING:root:Objects write 6.8K/s
WARNING:root:Bytes write 137.7MB/s
WARNING:root:Objects read 1.5K/s
WARNING:root:Bytes read 109.9MB/s
May 10 2021, 8:37 AM · Object storage

May 8 2021

dachary added a comment to T3149: Benchmark software for the object storage.

After writing 1TB in 40 DB (40 * 25GB), the WAL is ~200GB i.e. ~20%:

May 8 2021, 10:35 AM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
$ ansible-playbook -i inventory tests-run.yml && ssh -t $runner direnv exec bench python bench/bench.py --file-count-ro 500 --rw-workers 40 --ro-workers 40 --file-size 50000 --no-warmup
May 8 2021, 8:46 AM · Object storage

May 3 2021

dachary closed T3065: Using git to store objects, a subtask of T3054: Scale out object storage design, as Wontfix.
May 3 2021, 5:49 PM · Roadmap 2021, meta-task, Object storage
dachary closed T3065: Using git to store objects as Wontfix.
May 3 2021, 5:49 PM · Object storage
dachary added a comment to T3065: Using git to store objects.

While this is very creative, there is no benefit in storing small objects in git for the Software Heritage workload.

May 3 2021, 5:48 PM · Object storage
dachary closed T3050: Using libcephsqlite to store objects, a subtask of T3054: Scale out object storage design, as Wontfix.
May 3 2021, 5:47 PM · Roadmap 2021, meta-task, Object storage
dachary closed T3050: Using libcephsqlite to store objects as Wontfix.
May 3 2021, 5:47 PM · Object storage
dachary added a comment to T3050: Using libcephsqlite to store objects.

There is no need to use Ceph for the Write Storage: PostgreSQL performs well and there is no scaling problem. The size of the Write Storage is limited, by design.

May 3 2021, 5:47 PM · Object storage
dachary closed T3055: Ceph and immutable & append only storage, a subtask of T3056: Ceph as an object storage, as Wontfix.
May 3 2021, 5:45 PM · Object storage
dachary closed T3055: Ceph and immutable & append only storage as Wontfix.
May 3 2021, 5:45 PM · Object storage
dachary added a comment to T3055: Ceph and immutable & append only storage.

It was discussed during the Ceph Developer Summit 2021, and the conclusion was that RADOS is not the place to implement immutable optimizations. RGW is a better fit.

May 3 2021, 5:45 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
  • Group the two postgresql nvme drives in a single logical volume to get more storage: 30 write workers using 100GB Shards require 3TB of postgresql storage
  • Set up a second postgresql server as a standby replica of the master: it may negatively impact the performance of the master cluster and should be included in the benchmark
  • Explain the benchmark methodology & assumptions
May 3 2021, 2:59 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
$ bench.py --file-count-ro 200 --rw-workers 20 --ro-workers 80 --file-size 50000 --no-warmup
...
WARNING:root:Objects write 5.8K/s
WARNING:root:Bytes write 117.9MB/s
WARNING:root:Objects read 1.3K/s
WARNING:root:Bytes read 100.4MB/s
May 3 2021, 7:26 AM · Object storage

May 2 2021

dachary added a comment to T3149: Benchmark software for the object storage.
$ bench.py --file-count-ro 200 --rw-workers 20 --ro-workers 80 --file-size 50000 --rand-ratio 10
...
WARNING:root:Objects write 5.8K/s
WARNING:root:Bytes write 118.4MB/s
WARNING:root:Objects read 12.3K/s
WARNING:root:Bytes read 850.3MB/s
May 2 2021, 9:41 AM · Object storage

May 1 2021

dachary added a comment to T3149: Benchmark software for the object storage.

Fix a race condition that caused postgresql database drops to fail.

May 1 2021, 5:19 PM · Object storage

Apr 27 2021

dachary added a comment to T3149: Benchmark software for the object storage.

The rewrite to use processes was trivial and preliminary tests yield the expected results. Most of the time was spent on two problems:

Apr 27 2021, 2:02 PM · Object storage

Apr 20 2021

dachary added a comment to T3149: Benchmark software for the object storage.

Struggled most of today because there is a bottleneck when using threads and postgres from a single client. However, when running 4 processes, it performs as expected. The benchmark should be rewritten to use a process pool instead of the thread pool, which should not be too complicated. I tried to add a warmup phase so that all concurrent threads/processes do not start at the same time, but it does not make any visible difference.
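With concurrent.futures, switching from a thread pool to a process pool can be limited to changing the executor class. The sketch below is an assumption about how such a rewrite might look, not the benchmark's code: worker() is a hypothetical stand-in that only counts, where a real worker would write objects to PostgreSQL.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def worker(worker_id, count):
    """Hypothetical stand-in for one benchmark worker; returns objects handled."""
    handled = 0
    for _ in range(count):
        handled += 1  # a real worker would write one object here
    return handled


def run_workers(workers, count, executor_cls=ProcessPoolExecutor):
    """Run `workers` copies of worker() and return the total objects handled."""
    with executor_cls(max_workers=workers) as pool:
        futures = [pool.submit(worker, i, count) for i in range(workers)]
        return sum(f.result() for f in futures)
```

Calling run_workers(4, 1000) uses processes, while run_workers(4, 1000, executor_cls=ThreadPoolExecutor) keeps the thread-based behaviour for comparison. When using processes, the call should sit under an `if __name__ == "__main__":` guard and worker() must be defined at module level so that it can be pickled.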

Apr 20 2021, 9:05 PM · Object storage

Apr 19 2021

dachary added a comment to T3149: Benchmark software for the object storage.

Completed the tests for the rewrite; it is working.

Apr 19 2021, 2:49 PM · Object storage

Apr 18 2021

dachary added a comment to T3149: Benchmark software for the object storage.

rbd bench on the images created

Apr 18 2021, 3:15 PM · Object storage

Apr 17 2021

dachary added a comment to T3149: Benchmark software for the object storage.

There is a 3% space overhead on the RBD data pool. 6TB data, 3TB parity = 9TB. Actual 9.3TB, i.e. ~+3%.
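The overhead figure follows from comparing the expected raw usage (data plus parity) with the actual usage. A short check of the numbers quoted above:

```python
data_tb = 6.0    # data written, from the comment
parity_tb = 3.0  # parity, from the comment
actual_tb = 9.3  # raw space actually used, from the comment

expected_tb = data_tb + parity_tb
overhead = actual_tb / expected_tb - 1
print(f"expected {expected_tb:.1f}TB, actual {actual_tb:.1f}TB, overhead {overhead:.1%}")
```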

Apr 17 2021, 10:19 PM · Object storage
dachary added a comment to T3108: Grid5000 for benchmarking.

https://www.grid5000.fr/w/Grenoble:Network shows the network topology

Apr 17 2021, 5:26 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

Complete rewrite to:

Apr 17 2021, 5:24 PM · Object storage

Apr 14 2021

dachary renamed T3249: Deleting and erasing an object from Object deletion to Deleting and erasing an object.
Apr 14 2021, 5:37 PM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3249: Deleting and erasing an object.
Apr 14 2021, 5:37 PM · Roadmap 2021, meta-task, Object storage
dachary added a parent task for T3249: Deleting and erasing an object: T3054: Scale out object storage design.
Apr 14 2021, 5:37 PM · Object storage
dachary changed the status of T3249: Deleting and erasing an object from Open to Work in Progress.
Apr 14 2021, 5:37 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.
  • Add reader to continuously read from images to simulate a read workload
  • Randomize the payload instead of using easily compressible data (postgres does a good job compressing them and this does not reflect the reality)
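The effect of a compressible payload can be illustrated with a small sketch. zlib is used here only as a stand-in, since PostgreSQL applies its own compression, and the 50,000 byte size is an arbitrary choice for the example.

```python
import os
import zlib


def compression_ratio(payload):
    """Compressed size divided by original size (lower is more compressible)."""
    return len(zlib.compress(payload)) / len(payload)


size = 50_000  # hypothetical object size in bytes
repetitive = b"a" * size
random_payload = os.urandom(size)

print(f"repetitive: {compression_ratio(repetitive):.3f}")
print(f"random:     {compression_ratio(random_payload):.3f}")
```

The repetitive payload shrinks to a tiny fraction of its size while the random one does not shrink at all, which is why a benchmark using repetitive data would underestimate the storage and I/O needed.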
Apr 14 2021, 5:33 PM · Object storage

Apr 12 2021

dachary added a comment to T3149: Benchmark software for the object storage.
  • bench.py --file-count-ro 20 --rw-workers 20 --packer-workers 20 --file-size 1024 --fake-ro yields WARNING:root:Objects write 17.7K/s
  • bench.py --file-count-ro 40 --rw-workers 40 --packer-workers 20 --file-size 1024 --fake-ro yields WARNING:root:Objects write 13.8K/s
Apr 12 2021, 9:21 AM · Object storage

Apr 7 2021

dachary added a comment to T3149: Benchmark software for the object storage.

The benchmark was moved to a temporary repository for convenience (easier than uploading here every time). https://git.easter-eggs.org/biceps/biceps

Apr 7 2021, 6:25 PM · Object storage

Apr 6 2021

dachary closed T3210: Ceph Quincy CDS & immutable objects as Resolved.
Apr 6 2021, 11:33 PM · Object storage
dachary closed T3210: Ceph Quincy CDS & immutable objects, a subtask of T3054: Scale out object storage design, as Resolved.
Apr 6 2021, 11:33 PM · Roadmap 2021, meta-task, Object storage
dachary updated the task description for T3054: Scale out object storage design.
Apr 6 2021, 11:33 PM · Roadmap 2021, meta-task, Object storage
dachary added a comment to T3210: Ceph Quincy CDS & immutable objects.

Takeaways from the session:

Apr 6 2021, 6:35 PM · Object storage
dachary updated the task description for T3210: Ceph Quincy CDS & immutable objects.
Apr 6 2021, 1:46 PM · Object storage
dachary updated the task description for T3210: Ceph Quincy CDS & immutable objects.
Apr 6 2021, 1:46 PM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3210: Ceph Quincy CDS & immutable objects.
Apr 6 2021, 1:39 PM · Roadmap 2021, meta-task, Object storage
dachary added a parent task for T3210: Ceph Quincy CDS & immutable objects: T3054: Scale out object storage design.
Apr 6 2021, 1:39 PM · Object storage
dachary changed the status of T3210: Ceph Quincy CDS & immutable objects from Open to Work in Progress.
Apr 6 2021, 1:33 PM · Object storage

Mar 30 2021

dachary renamed T3186: Ceph Sepia lab for performance testing from Ceph Sepia lab for testing to Ceph Sepia lab for performance testing.
Mar 30 2021, 10:14 AM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3186: Ceph Sepia lab for performance testing.
Mar 30 2021, 10:13 AM · Roadmap 2021, meta-task, Object storage
dachary added a parent task for T3186: Ceph Sepia lab for performance testing: T3054: Scale out object storage design.
Mar 30 2021, 10:13 AM · Object storage
dachary changed the status of T3186: Ceph Sepia lab for performance testing from Open to Work in Progress.
Mar 30 2021, 10:13 AM · Object storage

Mar 26 2021

dachary updated the task description for T3054: Scale out object storage design.
Mar 26 2021, 11:52 PM · Roadmap 2021, meta-task, Object storage

Mar 25 2021

dachary updated the task description for T3054: Scale out object storage design.
Mar 25 2021, 10:18 AM · Roadmap 2021, meta-task, Object storage

Mar 24 2021

dachary added a comment to T3149: Benchmark software for the object storage.

Refactored the cluster provisioning to use all available disks instead of the existing file system (using cephadm instead of a hand-made Ceph cluster).

Mar 24 2021, 11:50 AM · Object storage

Mar 23 2021

dachary added a comment to T3149: Benchmark software for the object storage.

The benchmark runs and it's not too complicated, which is a relief. I'll clean up the mess I made and move forward to finish writing the software.

Mar 23 2021, 3:27 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

The benchmarks are not fully functional but they produce a write load that matches the object storage design. They run (README.txt) via libvirt and are being tested on Grid5000 to ensure all the pieces are in place (i.e. does it actually work to reserve machines + provision them + run) before moving forward.

Mar 23 2021, 3:03 PM · Object storage

Mar 17 2021

dachary added a comment to T3057: Using seaweedfs to store objects.

Mail thread with Chris Lu on SeaweedFS use cases with 100+ billion objects.

Mar 17 2021, 4:22 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Mar 17 2021, 4:18 PM · Roadmap 2021, meta-task, Object storage
dachary added a subtask for T3054: Scale out object storage design: T3149: Benchmark software for the object storage.
Mar 17 2021, 4:16 PM · Roadmap 2021, meta-task, Object storage
dachary added a parent task for T3149: Benchmark software for the object storage: T3054: Scale out object storage design.
Mar 17 2021, 4:16 PM · Object storage
dachary added a comment to T3149: Benchmark software for the object storage.

First draft for layer 0.

Mar 17 2021, 4:16 PM · Object storage
dachary changed the status of T3149: Benchmark software for the object storage from Open to Work in Progress.
Mar 17 2021, 4:15 PM · Object storage

Mar 15 2021

dachary added a comment to T3054: Scale out object storage design.

Bookmarking https://leo-project.net/leofs/

Mar 15 2021, 5:21 PM · Roadmap 2021, meta-task, Object storage

Mar 10 2021

dachary closed T3108: Grid5000 for benchmarking as Resolved.
Mar 10 2021, 9:10 PM · Object storage
dachary closed T3108: Grid5000 for benchmarking, a subtask of T3054: Scale out object storage design, as Resolved.
Mar 10 2021, 9:10 PM · Roadmap 2021, meta-task, Object storage
dachary added a comment to T3108: Grid5000 for benchmarking.

With a little help from the mattermost channel and after approval of the account, it was possible to boot a physical machine with a Debian GNU/Linux installed from scratch and get root access to it.

Mar 10 2021, 9:09 PM · Object storage
dachary updated the task description for T3108: Grid5000 for benchmarking.
Mar 10 2021, 5:41 PM · Object storage