The number of slow random reads reaches ~3.5% presumably because there is too much write pressure (the throttling of writes was removed).
Aug 12 2021
The benchmarks were modified to (i) use a fixed number of random / sequential readers instead of a random choice, for better predictability, and (ii) introduce throttling to cap the sequential read speed at approximately 200MB/s. A read-only run was launched:
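As a rough illustration of that cap (a minimal sketch, not the actual bench.py code; the class name and usage below are hypothetical):

```python
import time

class BandwidthThrottle:
    """Cap the average read rate to `limit_bps` bytes per second."""

    def __init__(self, limit_bps):
        self.limit_bps = limit_bps
        self.start = time.monotonic()
        self.consumed = 0

    def wait(self, nbytes):
        # Record the bytes just read and sleep until the average rate
        # since `start` falls back under the limit.
        self.consumed += nbytes
        expected = self.consumed / self.limit_bps  # seconds the reads should have taken
        elapsed = time.monotonic() - self.start
        if expected > elapsed:
            time.sleep(expected - elapsed)

# Hypothetical usage in a sequential reader loop:
# throttle = BandwidthThrottle(200 * 1024 * 1024)
# for chunk in read_shard_sequentially(shard):
#     throttle.wait(len(chunk))
```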
The run terminated August 11th @ 15:21 because of what appears to be a rare race condition, but it was mostly finished. The results show an unexpected degradation of the read performance, which keeps getting worse over time and deserves further investigation. The write performance is however stable, which suggests the benchmark code itself may be responsible for the degradation: if the Ceph cluster were globally slowing down, both reads and writes would degrade, since previous benchmark results showed a correlation between the two.
Aug 2 2021
- Improve the readability of the graphs
- Rehearse the run and make minor updates to make sure it runs right away this Friday
Jul 20 2021
In the global read index, I would consider storing, for each object, alongside the shard id, the length and offset of the object (which are comparatively cheap to store)
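As an illustration of that suggestion, a hypothetical fixed-size entry for the global read index could look like the following (the field widths are assumptions, not an actual schema; the keys are 32 bytes as elsewhere in these notes):

```python
import struct

# Hypothetical global read index entry:
# 32-byte object id, 4-byte shard id, 8-byte offset, 8-byte length.
ENTRY = struct.Struct(">32sIQQ")  # 52 bytes per entry

def pack_entry(object_id, shard_id, offset, length):
    return ENTRY.pack(object_id, shard_id, offset, length)

def unpack_entry(raw):
    object_id, shard_id, offset, length = ENTRY.unpack(raw)
    return object_id, shard_id, offset, length
```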
Jul 19 2021
The Compress, Hash and Displace (CHD) algorithm described in http://cmph.sourceforge.net/papers/esa09.pdf generates a hash function under 4MB for ~30M keys of 32 bytes each.
A 100GB file can have 25M objects (4KB median size). If a perfect hash function requires 4 bits per entry, the function occupies 25M × 4 bits ≈ 12.5MB, i.e. ~12MB to read for every lookup.
I just realized that since a perfect hash function needs parameters that may require additional sequential reads at the beginning of the file, it would actually make more sense to use a regular hash function with a format that allows for collisions. Even if collisions are relatively frequent, the colliding entries may be stored adjacent to each other and will not require an additional read: they are likely to be in the same block most of the time. That would save the trouble of implementing a perfect hash function.
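A minimal sketch of that idea, assuming a fixed-size slot table with linear probing so that colliding entries end up in adjacent slots (names and sizes are illustrative, not the actual index format):

```python
SLOT_COUNT = 2 ** 25  # assumed table size, sized from the expected object count
EMPTY = None

def slot_of(key: bytes) -> int:
    # The keys are already content hashes (32 bytes), so a prefix of the key
    # is usable as a regular, non-perfect hash value.
    return int.from_bytes(key[:8], "big") % SLOT_COUNT

def insert(table, key, offset, length):
    # Linear probing: a colliding entry lands in the next adjacent slot,
    # hence most of the time in the same disk block as its home slot.
    i = slot_of(key)
    while table[i] is not EMPTY:
        i = (i + 1) % SLOT_COUNT
    table[i] = (key, offset, length)

def lookup(table, key):
    i = slot_of(key)
    while table[i] is not EMPTY:
        k, offset, length = table[i]
        if k == key:
            return offset, length
        i = (i + 1) % SLOT_COUNT
    return None
```

On disk, the table would be an array of fixed-size slots, so a lookup reads one block and scans at most a handful of neighbouring entries.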
what are "the parameters to the perfect hash functions"? what are the possible formats?
The content of a file:
On the topic of throttling, the following discussion happened on IRC:
I misrepresented @olasd's suggestions; here is the chat log on the matter.
In D6006#154829, @vlorentz wrote: why *args, **kwargs on all methods?
Jul 12 2021
We have a procedure for this kind of case; I added you to the
"oar-unrestricted-adv-reservations" group, which should lift all the
restrictions on advance reservations of resources. You should therefore be
able to redo your reservation with the correct walltime.
I set an expiration date of September 12th on this group to make sure that
is enough, but remember to submit another special usage request if you have
a new need outside the charter after the August one.
Mail sent today:
Jul 10 2021
$ oarsub -t exotic -l "{cluster='dahu'}/host=30+{cluster='yeti'}/host=3,walltime=216" --reservation '2021-08-06 19:00:00' -t deploy
[ADMISSION RULE] Include exotic resources in the set of reservable resources (this does NOT exclude non-exotic resources).
[ADMISSION RULE] Error: Walltime too big for this job, it is limited to 168 hours
Received yesterday:
Jul 6 2021
Quote for the write storage nodes.
- Storage node 8TB
Special permission request sent:
The benchmark results using grid5000 turned out to be good enough, so there will be no need to use the resources of the Sepia lab.
Using a hash table is a better option because lookups are O(1) instead of O(log(n)).
It is not worth the effort; using a hash table is a better option.
After some cleanup, the final version is https://git.easter-eggs.org/biceps/biceps/-/tree/7d137fcd54f265253a27346b3652e26c6c5dd5e8. This concludes this (long) task, which can be closed.
Jun 28 2021
Jun 26 2021
This run used a warmup phase and 100GB Shards. The number of PGs was incorrectly set on the ro pool instead of the ro-data pool: background recovery happened during approximately the last third of the run.
Jun 22 2021
- Add RBD QoS dynamically to avoid bursts (see the sketch after this list)
- Implement throttling for writes
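A hedged illustration of the RBD QoS item above, assuming a Ceph release (Nautilus or later) where per-image QoS settings such as rbd_qos_write_bps_limit can be changed with `rbd config image set`; the image name is hypothetical:

```python
import subprocess

def set_rbd_write_limit(image_spec, bytes_per_second):
    # Dynamically cap the write bandwidth of an RBD image to smooth out bursts.
    subprocess.run(
        ["rbd", "config", "image", "set", image_spec,
         "rbd_qos_write_bps_limit", str(bytes_per_second)],
        check=True,
    )

# Hypothetical usage: limit writes on a benchmark image to 200MB/s for the
# duration of a run, then set the limit back to 0 (unlimited) afterwards.
# set_rbd_write_limit("benchmark/write-image-1", 200 * 1024 * 1024)
```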
For the record, this blog post published in April 2021 has pointers on how to benchmark and tune Ceph.
Jun 21 2021
New stats look like this, with a Ceph cluster of 15 OSDs:
- The statistics are no longer displayed as the benchmark runs; they are stored in CSV files, with one line added every 5 seconds
- IO stats are collected from the Ceph cluster every five seconds and included in the CSV files
- A stats.py script was implemented to analyze the content of the CSV files and display statistics on the benchmark run (see the sketch after this list)
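As an illustration, the downstream analysis of those CSV files could look like the following sketch (the column names and file name are assumptions, not the actual stats.py schema):

```python
import pandas as pd

# One row every 5 seconds; the columns below are assumed, not the real schema:
# bytes_read / bytes_write are the bytes transferred during the interval and
# ttfb_ms is the time to first byte of a sampled random read.
df = pd.read_csv("bench-stats.csv")

interval = 5  # seconds between two rows
print("read throughput  (MB/s):", df["bytes_read"].mean() / interval / 1e6)
print("write throughput (MB/s):", df["bytes_write"].mean() / interval / 1e6)

# Percentiles of the time to first byte and share of slow random reads.
print(df["ttfb_ms"].quantile([0.01, 0.25, 0.50, 0.75, 0.90, 0.99]))
print("random reads slower than 100ms:", (df["ttfb_ms"] > 100).mean() * 100, "%")
```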
$ bench.py --file-count-ro 350 --rw-workers 10 --ro-workers 5 --file-size $((100 * 1024)) --no-warmup
...
WARNING:root:Objects write 6.4K/s
WARNING:root:Bytes write 131.3MB/s
WARNING:root:Objects read 24.3K/s
WARNING:root:Bytes read 99.4MB/s
WARNING:root:2.0859388857985817% of random reads took longer than 100.0ms
WARNING:root:Worst times to first byte on random reads (ms) [10751, 8217, 7655, 7446, 7366, 6919, 6722, 6515, 6481, 6079, 5918, 5839, 5823, 5759, 5634, 5573, 5492, 5335, 5114, 5105, 5009, 4976, 4963, 4914, 4913, 4854, 4822, 4668, 4658, 4605, 4593, 4551, 4537, 4489, 4470, 4431, 4418, 4411, 4385, 4327, 4298, 4224, 4090, 4082, 4070, 4010, 3868, 3865, 3819, 3818, 3815, 3805, 3798, 3755, 3719, 3716, 3711, 3704, 3688, 3612, 3608, 3606, 3579, 3543, 3537, 3527, 3493, 3450, 3441, 3356, 3346, 3338, 3319, 3313, 3294, 3272, 3264, 3258, 3244, 3183, 3179, 3160, 3145, 3136, 3127, 3123, 3119, 3107, 3098, 3093, 3090, 3083, 3082, 3068, 3057, 3052, 3029, 3028, 3022, 3022]
Jun 20 2021
When the Read Storage went over 20TB, the number of PGs of the Ceph pool was automatically doubled. As a consequence backfilling started, but it is throttled so as not to have a negative impact on performance.
Jun 13 2021
Creating a 20 billion entry global index fails because there is not enough disk space (the 2.9TB volume is full even with tunefs -m 0).
The equilibrium between reads and writes is at 5 readers and 10 writers, which leads to 1.2% of random reads above the threshold, the worst one taking 2sec. This means that care must be taken, application side, to throttle reads and writes, otherwise the penalty is a significant degradation in latency.
When the benchmark writes, the pressure of the 40 workers slows down the reads significantly.
Running the benchmark with a read-only workload (the Ceph cluster is doing nothing else) and 20 workers shows 8% of requests with a latency above the threshold:
I interrupted the benchmarks because they show reads are not behaving as expected, i.e. a large number of reads take very long and the number of reads per second is far more than what is needed. There is no throttling on reads; the number of workers is the only limit. I expected the reads would be slowed down by other factors and would not apply too much pressure on the cluster, but I was apparently wrong: throttling must be implemented to slow them down.
Jun 12 2021
For the record, creating 10 billion entries in the global index took:
Jun 7 2021
In T3149#65906, @zack wrote: how about just collecting all raw timings in an output CSV file (or several files if needed) and compute the stats downstream (e.g., with pandas)?
that would allow changing the percentiles later on as well as compute different stats, without having to rerun the benchmarks
I still think that returning a histogram of response times, in buckets of 5 or 10 ms wide ranges, may be valuable? We can then derive percentiles from that if we're so inclined.
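For illustration, deriving a percentile from such a histogram is straightforward; the sketch below assumes 10ms-wide buckets and is not part of the benchmark code:

```python
import numpy as np

def percentile_from_histogram(counts, bucket_width_ms, q):
    """Approximate the q-th percentile (0-100) of response times, where
    counts[i] is the number of requests that fell in the bucket
    [i * bucket_width_ms, (i + 1) * bucket_width_ms)."""
    cumulative = np.cumsum(np.asarray(counts, dtype=float))
    target = cumulative[-1] * q / 100.0
    i = int(np.searchsorted(cumulative, target))
    # Report the upper edge of the bucket containing the target rank.
    return (i + 1) * bucket_width_ms

# Example with made-up counts for buckets 0-10ms, 10-20ms, ...:
# percentile_from_histogram([500, 300, 120, 50, 20, 10], 10, 99)  # -> 50
```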
In T3149#65880, @olasd wrote: While you're at it, could you report quantiles for the time to first byte, instead of just a raw maximum?
Something like:
- best 1%
- best 10%
- best 25%
- median
- worst 25% / best 75%
- worst 10%
- worst 1%
- maximum
(this all might be overkill, but...)
- Collect and display the worst time to first byte, not the average
In T3149#65877, @douardda wrote: and this needs fixing.
do you mean the bench code needs fixing (to report the proper stats)?
This weekend's run was not very fruitful: the global index could not be populated as expected, and since this was only discovered Sunday morning there was no time to fall back to a smaller one, for instance 10 billion entries. A run was launched and lasted ~24h to show:
Jun 5 2021
20 billion entries were inserted in the global index. After building, the index occupies 2.5TB, i.e. ~125 bytes of raw space per entry. That's 25% more than with a 1 billion entry global index (~100 bytes per entry).
- Add insertion in the global index to the benchmark
Jun 2 2021
My notes on the meeting:
May 31 2021
- Add the generate script to ingest entries in the global index.
The call is set to Wednesday June 2nd, 2021 4pm UTC+2 at https://meet.jit.si/ApparentStreetsJokeOk