Software Heritage

dachary (Loïc Dachary)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 8 2021, 11:21 PM (36 w, 2 d)

Recent Activity

Mon, Aug 30

dachary updated the task description for T3054: Scale out object storage design.
Mon, Aug 30, 12:53 PM · Roadmap 2021, meta-task, Object storage

Sun, Aug 29

dachary added a subtask for T3432: Add winery backend: T3533: Winery backend: implementation.
Sun, Aug 29, 2:43 PM · Object storage
dachary added a parent task for T3533: Winery backend: implementation: T3432: Add winery backend.
Sun, Aug 29, 2:43 PM · Object storage
dachary triaged T3533: Winery backend: implementation as Normal priority.
Sun, Aug 29, 2:41 PM · Object storage
dachary removed a subtask for T3432: Add winery backend: T3530: IO throttling: implementation.
Sun, Aug 29, 2:37 PM · Object storage
dachary edited parent tasks for T3530: IO throttling: implementation, added: T3532: IO throttling; removed: T3432: Add winery backend.
Sun, Aug 29, 2:37 PM · Object storage
dachary added a subtask for T3532: IO throttling: T3530: IO throttling: implementation.
Sun, Aug 29, 2:37 PM · Object storage
dachary added a parent task for T3531: IO throttling: benchmark: T3532: IO throttling.
Sun, Aug 29, 2:37 PM · Object storage
dachary added a subtask for T3532: IO throttling: T3531: IO throttling: benchmark.
Sun, Aug 29, 2:37 PM · Object storage
dachary added a parent task for T3532: IO throttling: T3432: Add winery backend.
Sun, Aug 29, 2:36 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3532: IO throttling.
Sun, Aug 29, 2:36 PM · Object storage
dachary triaged T3532: IO throttling as Normal priority.
Sun, Aug 29, 2:36 PM · Object storage
dachary updated the task description for T3530: IO throttling: implementation.
Sun, Aug 29, 2:36 PM · Object storage
dachary triaged T3531: IO throttling: benchmark as Normal priority.
Sun, Aug 29, 2:34 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3530: IO throttling: implementation.
Sun, Aug 29, 2:31 PM · Object storage
dachary added a parent task for T3530: IO throttling: implementation: T3432: Add winery backend.
Sun, Aug 29, 2:31 PM · Object storage
dachary triaged T3530: IO throttling: implementation as Normal priority.
Sun, Aug 29, 2:30 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3529: Publish object storage benchmark results.
Sun, Aug 29, 2:25 PM · Object storage
dachary added a parent task for T3529: Publish object storage benchmark results: T3432: Add winery backend.
Sun, Aug 29, 2:25 PM · Object storage
dachary triaged T3529: Publish object storage benchmark results as Normal priority.
Sun, Aug 29, 2:24 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3528: Add winery backend: grid5000 benchmark.
Sun, Aug 29, 2:21 PM · Object storage
dachary added a parent task for T3528: Add winery backend: grid5000 benchmark: T3432: Add winery backend.
Sun, Aug 29, 2:21 PM · Object storage
dachary triaged T3528: Add winery backend: grid5000 benchmark as Normal priority.
Sun, Aug 29, 2:20 PM · Object storage
dachary added a parent task for T3527: Self-host Software Heritage on grid5000: T3432: Add winery backend.
Sun, Aug 29, 2:15 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3527: Self-host Software Heritage on grid5000.
Sun, Aug 29, 2:15 PM · Object storage
dachary triaged T3527: Self-host Software Heritage on grid5000 as Normal priority.
Sun, Aug 29, 2:15 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3526: Add winery backend: learning the CI.
Sun, Aug 29, 2:12 PM · Object storage
dachary added a parent task for T3526: Add winery backend: learning the CI: T3432: Add winery backend.
Sun, Aug 29, 2:12 PM · Object storage
dachary triaged T3526: Add winery backend: learning the CI as Normal priority.
Sun, Aug 29, 2:11 PM · Object storage
dachary renamed T3525: grid5000 tools and documentation from grid5000 tools to grid5000 tools and documentation.
Sun, Aug 29, 2:05 PM · Object storage
dachary added a parent task for T3525: grid5000 tools and documentation: T3432: Add winery backend.
Sun, Aug 29, 2:05 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3525: grid5000 tools and documentation.
Sun, Aug 29, 2:05 PM · Object storage
dachary triaged T3525: grid5000 tools and documentation as Normal priority.
Sun, Aug 29, 2:04 PM · Object storage
dachary updated the task description for T3524: Add winery backend: create the Ceph cluster.
Sun, Aug 29, 1:59 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3524: Add winery backend: create the Ceph cluster.
Sun, Aug 29, 1:58 PM · Object storage
dachary added a parent task for T3524: Add winery backend: create the Ceph cluster: T3432: Add winery backend.
Sun, Aug 29, 1:58 PM · Object storage
dachary added a project to T3523: Add winery backend: create the PostgreSQL cluster: Object storage.
Sun, Aug 29, 1:57 PM · Object storage
dachary triaged T3524: Add winery backend: create the Ceph cluster as Normal priority.
Sun, Aug 29, 1:57 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3523: Add winery backend: create the PostgreSQL cluster.
Sun, Aug 29, 1:56 PM · Object storage
dachary added a parent task for T3523: Add winery backend: create the PostgreSQL cluster: T3432: Add winery backend.
Sun, Aug 29, 1:56 PM · Object storage
dachary triaged T3523: Add winery backend: create the PostgreSQL cluster as Normal priority.
Sun, Aug 29, 1:56 PM · Object storage
dachary updated the task description for T3521: Persistent readonly perfect hash table: benchmarks.
Sun, Aug 29, 1:44 PM · Object storage
dachary updated the task description for T3520: Persistent readonly perfect hash table: implementation.
Sun, Aug 29, 1:42 PM · Object storage
dachary updated the task description for T3519: Persistent readonly perfect hash table: CI and package.
Sun, Aug 29, 1:41 PM · Object storage
dachary updated the task description for T3522: Add winery backend: learning the codebase.
Sun, Aug 29, 1:39 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3522: Add winery backend: learning the codebase.
Sun, Aug 29, 1:36 PM · Object storage
dachary added a parent task for T3522: Add winery backend: learning the codebase: T3432: Add winery backend.
Sun, Aug 29, 1:36 PM · Object storage
dachary triaged T3522: Add winery backend: learning the codebase as Normal priority.
Sun, Aug 29, 1:35 PM · Object storage
dachary added a parent task for T3521: Persistent readonly perfect hash table: benchmarks: T3104: Persistent readonly perfect hash table.
Sun, Aug 29, 1:26 PM · Object storage
dachary added a subtask for T3104: Persistent readonly perfect hash table: T3521: Persistent readonly perfect hash table: benchmarks.
Sun, Aug 29, 1:26 PM · Object storage
dachary updated the task description for T3521: Persistent readonly perfect hash table: benchmarks.
Sun, Aug 29, 1:26 PM · Object storage
dachary triaged T3521: Persistent readonly perfect hash table: benchmarks as Normal priority.
Sun, Aug 29, 1:26 PM · Object storage
dachary added a parent task for T3520: Persistent readonly perfect hash table: implementation: T3104: Persistent readonly perfect hash table.
Sun, Aug 29, 1:22 PM · Object storage
dachary added a subtask for T3104: Persistent readonly perfect hash table: T3520: Persistent readonly perfect hash table: implementation.
Sun, Aug 29, 1:22 PM · Object storage
dachary updated the task description for T3520: Persistent readonly perfect hash table: implementation.
Sun, Aug 29, 1:21 PM · Object storage
dachary updated the task description for T3520: Persistent readonly perfect hash table: implementation.
Sun, Aug 29, 1:20 PM · Object storage
dachary triaged T3520: Persistent readonly perfect hash table: implementation as Normal priority.
Sun, Aug 29, 1:20 PM · Object storage
dachary added a subtask for T3104: Persistent readonly perfect hash table: T3519: Persistent readonly perfect hash table: CI and package.
Sun, Aug 29, 1:14 PM · Object storage
dachary added a parent task for T3519: Persistent readonly perfect hash table: CI and package: T3104: Persistent readonly perfect hash table.
Sun, Aug 29, 1:14 PM · Object storage
dachary triaged T3519: Persistent readonly perfect hash table: CI and package as Normal priority.
Sun, Aug 29, 1:14 PM · Object storage
dachary changed the status of T3104: Persistent readonly perfect hash table, a subtask of T3432: Add winery backend, from Work in Progress to Open.
Sun, Aug 29, 1:08 PM · Object storage
dachary changed the status of T3104: Persistent readonly perfect hash table from Work in Progress to Open.
Sun, Aug 29, 1:08 PM · Object storage
dachary changed the status of T3104: Persistent readonly perfect hash table, a subtask of T3054: Scale out object storage design, from Work in Progress to Open.
Sun, Aug 29, 1:08 PM · Roadmap 2021, meta-task, Object storage
dachary changed the status of T3432: Add winery backend from Work in Progress to Open.
Sun, Aug 29, 1:08 PM · Object storage
dachary changed the status of T3249: Deleting and erasing an object, a subtask of T3054: Scale out object storage design, from Work in Progress to Open.
Sun, Aug 29, 1:05 PM · Roadmap 2021, meta-task, Object storage
dachary changed the status of T3249: Deleting and erasing an object from Work in Progress to Open.
Sun, Aug 29, 1:05 PM · Object storage

Mon, Aug 23

dachary closed T3422: Running the benchmarks: August 6th, 2021, 9 days, a subtask of T3054: Scale out object storage design, as Resolved.
Mon, Aug 23, 12:26 PM · Roadmap 2021, meta-task, Object storage
dachary closed T3422: Running the benchmarks: August 6th, 2021, 9 days as Resolved.
Mon, Aug 23, 12:26 PM · Object storage
dachary updated the task description for T3422: Running the benchmarks: August 6th, 2021, 9 days.
Mon, Aug 23, 12:25 PM · Object storage

Aug 12 2021

dachary added a comment to T3422: Running the benchmarks: August 6th, 2021, 9 days.

Throttling writes to 120MB/s to reduce the pressure:

Aug 12 2021, 1:26 PM · Object storage
dachary added a comment to T3422: Running the benchmarks: August 6th, 2021, 9 days.

The number of slow random reads reaches ~3.5%, presumably because there is too much write pressure (the throttling of writes was removed).

Aug 12 2021, 1:25 PM · Object storage
dachary added a comment to T3422: Running the benchmarks: August 6th, 2021, 9 days.

The benchmarks were modified to (i) use a fixed number of random and sequential readers instead of a random choice, for better predictability, and (ii) introduce throttling to cap the sequential read speed at approximately 200MB/s. A read-only run was performed:

Aug 12 2021, 12:56 PM · Object storage
dachary added a comment to T3422: Running the benchmarks: August 6th, 2021, 9 days.

The run terminated on August 11th at 15:21 because of what appears to be a rare race condition. It was, however, mostly finished. The results show an unexpected degradation in read performance that deserves further investigation because it keeps worsening over time. Write performance is stable, however, which suggests the benchmark code itself may be responsible for the degradation: previous benchmark results showed that read and write performance are correlated, so if the Ceph cluster were globally slowing down, both would degrade.

Aug 12 2021, 7:35 AM · Object storage
dachary updated the task description for T3422: Running the benchmarks: August 6th, 2021, 9 days.
Aug 12 2021, 7:28 AM · Object storage

Aug 2 2021

dachary added a comment to T3054: Scale out object storage design.

Improve the readability of the graphs

Aug 2 2021, 11:46 AM · Roadmap 2021, meta-task, Object storage
dachary updated the task description for T3422: Running the benchmarks: August 6th, 2021, 9 days.
Aug 2 2021, 10:34 AM · Object storage
dachary added a comment to T3422: Running the benchmarks: August 6th, 2021, 9 days.

Rehearse the run and make minor updates to make sure it runs right away this Friday.

Aug 2 2021, 10:31 AM · Object storage

Jul 20 2021

dachary added a comment to T3104: Persistent readonly perfect hash table.

In the global read index, I would consider storing, for each object, the length and offset of the object alongside the shard id (both are comparatively cheap to store).
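
As a rough illustration of what such an index entry could look like, here is a minimal sketch of a fixed-size record holding the shard id, offset and length. The field widths, the byte order and the names `RECORD`, `pack_entry` and `unpack_entry` are assumptions made for this example; they are not taken from the task or from any existing code.

```python
import struct

# Hypothetical layout (an assumption, not the actual index format):
#   32s  object id (e.g. a 32-byte digest)
#   Q    shard id, unsigned 64-bit
#   Q    offset of the object within the shard, unsigned 64-bit
#   I    length of the object in bytes, unsigned 32-bit
# ">" selects big-endian with no padding between fields.
RECORD = struct.Struct(">32sQQI")


def pack_entry(object_id: bytes, shard_id: int, offset: int, length: int) -> bytes:
    """Serialize one index entry into a fixed-size record."""
    return RECORD.pack(object_id, shard_id, offset, length)


def unpack_entry(raw: bytes) -> tuple:
    """Return (object_id, shard_id, offset, length) from a record."""
    return RECORD.unpack(raw)


entry = pack_entry(b"\x01" * 32, 7, 4096, 1234)
print(RECORD.size, unpack_entry(entry)[1:])
```

Under this assumed layout the offset and length add 12 bytes to each entry, on top of the object id and shard id.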

Jul 20 2021, 11:29 AM · Object storage

Jul 19 2021

dachary added a comment to T3104: Persistent readonly perfect hash table.

The Compress, Hash and Displace (CHD) algorithm described in http://cmph.sourceforge.net/papers/wads07.pdf generates a hash function under 4MB for ~30M keys of 32 bytes each.

Jul 19 2021, 8:59 PM · Object storage
dachary added a comment to T3104: Persistent readonly perfect hash table.

A 100GB file can hold 25M objects (4KB median size). If a perfect hash function requires 4 bits per entry, that means reading ~12MB for every lookup.
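
The arithmetic behind these figures can be checked with a short script. This is only a back-of-the-envelope sketch: it assumes decimal units (1GB = 10^9 bytes, 1KB = 10^3 bytes) and treats the median object size as if it were the average, and the constant and function names are made up for the example.

```python
# Figures quoted in the comment, with decimal units assumed.
FILE_SIZE_BYTES = 100 * 10**9    # 100GB file
MEDIAN_OBJECT_BYTES = 4 * 10**3  # 4KB median object size
BITS_PER_ENTRY = 4               # assumed cost of the perfect hash function per key


def object_count(file_size: int, object_size: int) -> int:
    """Number of objects of the given size that fit in the file."""
    return file_size // object_size


def hash_function_bytes(entries: int, bits_per_entry: int) -> float:
    """Size in bytes of a hash function description costing bits_per_entry per key."""
    return entries * bits_per_entry / 8


objects = object_count(FILE_SIZE_BYTES, MEDIAN_OBJECT_BYTES)
size = hash_function_bytes(objects, BITS_PER_ENTRY)
print(f"{objects / 1e6:.0f}M objects, {size / 1e6:.1f}MB of hash function data")
```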

Jul 19 2021, 7:26 PM · Object storage
dachary added a comment to T3104: Persistent readonly perfect hash table.

the colliding entries may be stored adjacent to each other...

Jul 19 2021, 6:35 PM · Object storage
dachary updated the task description for T3104: Persistent readonly perfect hash table.
Jul 19 2021, 6:17 PM · Object storage
dachary added a comment to T3104: Persistent readonly perfect hash table.

I just realized that since a perfect hash function needs parameters that may require additional sequential reads at the beginning of the file, it would actually make more sense to use a regular hash function with a format that allows for collisions. Even if collisions are relatively frequent, the colliding entries may be stored adjacent to each other and will not require an additional read. They are likely to be in the same block most of the time. That would save the trouble of implementing a perfect hash function.
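
One way to keep colliding entries adjacent is open addressing with linear probing: an entry whose slot is taken goes into the next free slot, so a lookup only scans consecutive slots. The sketch below is an in-memory illustration under that assumption, not the format discussed in the task; the names `slot_of`, `build` and `lookup`, the 0.5 load factor and the `(offset, length)` values are all hypothetical.

```python
import hashlib

EMPTY = None


def slot_of(key: bytes, slots: int) -> int:
    # Keys are assumed to be cryptographic digests already, so their
    # first 8 bytes are used directly as a well-distributed integer.
    return int.from_bytes(key[:8], "big") % slots


def build(entries: dict, load_factor: float = 0.5) -> list:
    """Build a read-only table; colliding keys land in neighbouring slots."""
    slots = max(1, int(len(entries) / load_factor))
    table = [EMPTY] * slots
    for key, value in entries.items():
        i = slot_of(key, slots)
        while table[i] is not EMPTY:  # linear probing: try the next slot
            i = (i + 1) % slots
        table[i] = (key, value)
    return table


def lookup(table: list, key: bytes):
    """Scan consecutive slots from the home slot until a match or an empty slot."""
    slots = len(table)
    i = slot_of(key, slots)
    for _ in range(slots):
        entry = table[i]
        if entry is EMPTY:
            return None
        if entry[0] == key:
            return entry[1]
        i = (i + 1) % slots
    return None


# Hypothetical content: 1000 digests mapped to (offset, length) pairs.
objects = {hashlib.sha256(str(n).encode()).digest(): (n * 4096, 4096) for n in range(1000)}
table = build(objects)
print(lookup(table, hashlib.sha256(b"42").digest()))
```

If the slots were fixed-size records written sequentially to a file, a probe sequence would map to a contiguous byte range, which is why the colliding entries would usually sit in the same block.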

Jul 19 2021, 6:17 PM · Object storage
dachary updated the task description for T3104: Persistent readonly perfect hash table.
Jul 19 2021, 6:04 PM · Object storage
dachary updated the task description for T3104: Persistent readonly perfect hash table.
Jul 19 2021, 6:02 PM · Object storage
dachary added a comment to T3104: Persistent readonly perfect hash table.

What are "the parameters to the perfect hash functions"? What are the possible formats?

Jul 19 2021, 5:51 PM · Object storage
dachary added a comment to T3104: Persistent readonly perfect hash table.

The content of a file:

Jul 19 2021, 3:34 PM · Object storage
dachary renamed T3104: Persistent readonly perfect hash table from Using a custom Hash Table format to Persistent readonly perfect hash table.
Jul 19 2021, 2:29 PM · Object storage
dachary added a subtask for T3432: Add winery backend: T3104: Persistent readonly perfect hash table.
Jul 19 2021, 2:24 PM · Object storage
dachary added a parent task for T3104: Persistent readonly perfect hash table: T3432: Add winery backend.
Jul 19 2021, 2:24 PM · Object storage
dachary updated the task description for T3432: Add winery backend.
Jul 19 2021, 2:21 PM · Object storage
dachary added a comment to T3432: Add winery backend.

On the topic of throttling, the following discussion happened on IRC:

Jul 19 2021, 2:09 PM · Object storage
dachary updated the task description for T3432: Add winery backend.
Jul 19 2021, 2:07 PM · Object storage
dachary updated subscribers of T3432: Add winery backend.

I misrepresented @olasd's suggestions; here is the chat log on the matter.

Jul 19 2021, 2:02 PM · Object storage
dachary updated the task description for T3432: Add winery backend.
Jul 19 2021, 12:47 PM · Object storage
dachary updated the task description for T3432: Add winery backend.
Jul 19 2021, 12:46 PM · Object storage
dachary added inline comments to D6006: add winery backend.
Jul 19 2021, 12:43 PM
dachary added a comment to D6006: add winery backend.

why *args, **kwargs on all methods?

Jul 19 2021, 12:40 PM
dachary requested review of D6006: add winery backend.
Jul 19 2021, 12:31 PM
dachary changed the status of T3432: Add winery backend from Open to Work in Progress.
Jul 19 2021, 12:03 PM · Object storage