Feb 22 2021

dachary changed the status of T3050: Using libcephsqlite to store objects from Work in Progress to Open.
Feb 22 2021, 12:25 AM · Object storage
dachary changed the status of T3050: Using libcephsqlite to store objects, a subtask of T3054: Scale out object storage design, from Work in Progress to Open.
Feb 22 2021, 12:25 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary closed T3064: Using ambry to store objects, a subtask of T3054: Scale out object storage design, as Invalid.
Feb 22 2021, 12:25 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary closed T3064: Using ambry to store objects as Invalid.
Feb 22 2021, 12:25 AM · Object storage
dachary added a comment to T3064: Using ambry to store objects.

Ambry has been a great source of inspiration and is the best fit for the Software Heritage use case. Including the partition UUID in the object takes advantage of the immutability of the objects and gives all readers a scale-out object storage.

Feb 22 2021, 12:24 AM · Object storage
dachary changed the status of T3065: Using git to store objects, a subtask of T3054: Scale out object storage design, from Work in Progress to Open.
Feb 22 2021, 12:17 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary changed the status of T3065: Using git to store objects from Work in Progress to Open.
Feb 22 2021, 12:17 AM · Object storage
dachary closed T3048: Using a custom Sorted String Table format, a subtask of T3054: Scale out object storage design, as Invalid.
Feb 22 2021, 12:16 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary closed T3048: Using a custom Sorted String Table format as Invalid.
Feb 22 2021, 12:16 AM · Object storage
dachary added a comment to T3048: Using a custom Sorted String Table format.

It turns out there are a number of suitable formats (SST from RocksDB for one), so there is no need to reinvent this wheel.

Feb 22 2021, 12:16 AM · Object storage
dachary closed T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K, a subtask of T3056: Ceph as an object storage, as Invalid.
Feb 22 2021, 12:13 AM · Object storage (RedHat collaboration)
dachary closed T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K as Invalid.
Feb 22 2021, 12:13 AM · Object storage
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

In the T3054 proposed design, objects are packed into larger files and there is no reason to continue in this direction. There seems to be a consensus that tens of billions of individual objects are problematic. They take very long to enumerate, for one thing. And no one is doing that, which is not a great sign.

Feb 22 2021, 12:13 AM · Object storage
dachary added a comment to T3049: Benchmarking an RBD based object container.

The T3054 design evolved and this benchmark won't be needed.

Feb 22 2021, 12:09 AM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 22 2021, 12:08 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary closed T3049: Benchmarking an RBD based object container as Invalid.
Feb 22 2021, 12:06 AM · Object storage
dachary closed T3049: Benchmarking an RBD based object container, a subtask of T3056: Ceph as an object storage, as Invalid.
Feb 22 2021, 12:06 AM · Object storage (RedHat collaboration)
dachary updated the task description for T3054: Scale out object storage design.
Feb 22 2021, 12:04 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task

Feb 21 2021

dachary updated the task description for T3054: Scale out object storage design.
Feb 21 2021, 8:54 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 21 2021, 8:49 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 21 2021, 5:45 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3064: Using ambry to store objects.

Readonly partitions are stored in Sorted String Table format.

Feb 21 2021, 5:41 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 21 2021, 12:22 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3064: Using ambry to store objects.
Feb 21 2021, 12:13 PM · Object storage
dachary added a comment to T3064: Using ambry to store objects.

"Open sourcing DataHub: LinkedIn’s metadata search and discovery platform" explains how developers work on DataHub and the relationship between code internal to LinkedIn and what is published as Free Software. It is not about ambry, and the ambry team may behave completely differently. A similar article about ambry is dated 2016:

Feb 21 2021, 12:11 PM · Object storage
dachary updated the task description for T3064: Using ambry to store objects.
Feb 21 2021, 11:59 AM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 21 2021, 11:44 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task

Feb 20 2021

dachary added a comment to T3051: Using EOS to store objects.

QuarkDB is now used for the namespace. It stores 2.5 billion objects.

Feb 20 2021, 4:45 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 20 2021, 4:42 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 20 2021, 4:31 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3065: Using git to store objects.

git partial clone

Feb 20 2021, 2:01 PM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3065: Using git to store objects.
Feb 20 2021, 2:00 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3065: Using git to store objects: T3054: Scale out object storage design.
Feb 20 2021, 2:00 PM · Object storage
dachary changed the status of T3065: Using git to store objects from Open to Work in Progress.
Feb 20 2021, 1:59 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 20 2021, 1:40 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3064: Using ambry to store objects: T3054: Scale out object storage design.
Feb 20 2021, 1:39 PM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3064: Using ambry to store objects.
Feb 20 2021, 1:39 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary changed the status of T3064: Using ambry to store objects from Open to Work in Progress.
Feb 20 2021, 1:38 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 20 2021, 1:33 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task

Feb 17 2021

dachary added a comment to T3048: Using a custom Sorted String Table format.

Let's leave it open: although T3050 is a better fit, it is not ready yet and an interim solution may be required.

Feb 17 2021, 11:24 PM · Object storage
dachary reopened T3048: Using a custom Sorted String Table format, a subtask of T3054: Scale out object storage design, as Work in Progress.
Feb 17 2021, 11:23 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary reopened T3048: Using a custom Sorted String Table format as "Work in Progress".
Feb 17 2021, 11:23 PM · Object storage
dachary closed T3048: Using a custom Sorted String Table format as Resolved.
Feb 17 2021, 11:22 PM · Object storage
dachary closed T3048: Using a custom Sorted String Table format, a subtask of T3054: Scale out object storage design, as Resolved.
Feb 17 2021, 11:22 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3048: Using a custom Sorted String Table format.

T3050 is a better fit as it does not require any specification or development.

Feb 17 2021, 11:22 PM · Object storage
dachary added a comment to T3050: Using libcephsqlite to store objects.

Although it is not a good fit to store all objects, it is a better fit than RBD + a custom format to store 1TB worth of objects, provided support for multiple concurrent readers is added.

Feb 17 2021, 11:19 PM · Object storage
dachary reopened T3050: Using libcephsqlite to store objects as "Work in Progress".
Feb 17 2021, 11:18 PM · Object storage
dachary reopened T3050: Using libcephsqlite to store objects, a subtask of T3054: Scale out object storage design, as Work in Progress.
Feb 17 2021, 11:18 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 5:40 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 5:39 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 3:14 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3054: Scale out object storage design.

In the following, "small objects" are < 4KB and "object storage software" refers to the list of software from the description for which there are no blockers.

Feb 17 2021, 3:12 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:53 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:43 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:42 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:42 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:37 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:17 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:11 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:11 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary closed T3057: Using seaweedfs to store objects as Resolved.
Feb 17 2021, 1:57 PM · Object storage
dachary closed T3057: Using seaweedfs to store objects, a subtask of T3054: Scale out object storage design, as Resolved.
Feb 17 2021, 1:57 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3057: Using seaweedfs to store objects.
Feb 17 2021, 1:56 PM · Object storage
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

We'd want a reader to try reading from the mirrored pool first, and then to fall back to the erasure coded pool if the object is larger than the cutoff. The increased latency in getting large objects may be worth the space savings? I don't know.
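
The read path described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions: `Pool` is a hypothetical in-memory stand-in and not a Ceph API, and the names `read_object`, `mirrored` and `erasure_coded` are invented for the example. The reader does not know the object size, so it always tries the mirrored pool first.

```python
from typing import Dict, Optional


class Pool:
    """Hypothetical in-memory stand-in for a storage pool (not a Ceph API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: Dict[bytes, bytes] = {}

    def put(self, key: bytes, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: bytes) -> Optional[bytes]:
        # Returns None when the object is not in this pool.
        return self._objects.get(key)


def read_object(key: bytes, mirrored: Pool, erasure_coded: Pool) -> bytes:
    """Try the mirrored pool first, then fall back to the erasure coded pool."""
    data = mirrored.get(key)
    if data is not None:
        return data
    # A miss in the mirrored pool is assumed to mean the object was larger
    # than the cutoff and therefore went to the erasure coded pool.
    data = erasure_coded.get(key)
    if data is None:
        raise KeyError(key)
    return data
```

With this approach large objects pay one extra lookup, which is the latency cost the comment weighs against the space savings.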

Feb 17 2021, 1:34 PM · Object storage
olasd added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

If the size of the object was known to the reader of the object store it would be a great way to develop storage strategies depending on the object size. So far I assumed the reader does not have that information and is therefore unable to figure out which object storage to use based on that information but maybe I missed something?

Feb 17 2021, 1:18 PM · Object storage
dachary added subtasks for T3054: Scale out object storage design: T3051: Using EOS to store objects, T3050: Using libcephsqlite to store objects.
Feb 17 2021, 11:26 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3051: Using EOS to store objects: T3054: Scale out object storage design.
Feb 17 2021, 11:26 AM · Object storage
dachary added a parent task for T3050: Using libcephsqlite to store objects: T3054: Scale out object storage design.
Feb 17 2021, 11:26 AM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3057: Using seaweedfs to store objects.
Feb 17 2021, 11:25 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3057: Using seaweedfs to store objects: T3054: Scale out object storage design.
Feb 17 2021, 11:25 AM · Object storage
dachary changed the status of T3057: Using seaweedfs to store objects from Open to Work in Progress.
Feb 17 2021, 11:24 AM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 11:21 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 11:21 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 11:13 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 11:06 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 10:38 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 10:32 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 10:31 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

The bench script and full results are in the tarball.

Feb 17 2021, 10:27 AM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 8:29 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 8:26 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3054: Scale out object storage design.
In T3054#58874, @olasd wrote:

@zack, very good point about having a target for the "time to first byte when reading an object".

I don't know what would be a "good" target for that metric; my gut says that staying within 100ms for any given object would be acceptable, as long as the number of parallel readers doesn't impact the amount too much (of course, within the IOPS of the underlying media, etc.).

Feb 17 2021, 8:26 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 8:20 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary removed a parent task for T3049: Benchmarking an RBD based object container: T3054: Scale out object storage design.
Feb 17 2021, 8:14 AM · Object storage
dachary removed a parent task for T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K: T3054: Scale out object storage design.
Feb 17 2021, 8:14 AM · Object storage
dachary added a parent task for T3056: Ceph as an object storage: T3054: Scale out object storage design.
Feb 17 2021, 8:14 AM · Object storage (RedHat collaboration)
dachary edited subtasks for T3054: Scale out object storage design, added: T3056: Ceph as an object storage; removed: T3049: Benchmarking an RBD based object container, T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.
Feb 17 2021, 8:14 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3049: Benchmarking an RBD based object container: T3056: Ceph as an object storage.
Feb 17 2021, 8:13 AM · Object storage
dachary added a parent task for T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K: T3056: Ceph as an object storage.
Feb 17 2021, 8:13 AM · Object storage
dachary added subtasks for T3056: Ceph as an object storage: T3055: Ceph and immutable & append only storage, T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K, T3049: Benchmarking an RBD based object container.
Feb 17 2021, 8:13 AM · Object storage (RedHat collaboration)
dachary added a parent task for T3055: Ceph and immutable & append only storage: T3056: Ceph as an object storage.
Feb 17 2021, 8:13 AM · Object storage
dachary changed the status of T3056: Ceph as an object storage from Open to Work in Progress.
Feb 17 2021, 8:12 AM · Object storage (RedHat collaboration)
dachary changed the status of T3055: Ceph and immutable & append only storage from Open to Work in Progress.
Feb 17 2021, 8:10 AM · Object storage
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

If the size of the object was known to the reader of the object store it would be a great way to develop storage strategies depending on the object size. So far I assumed the reader does not have that information and is therefore unable to figure out which object storage to use based on that information but maybe I missed something?

Feb 17 2021, 7:14 AM · Object storage

Feb 16 2021

dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 11:37 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
olasd added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

Maybe it would make sense to consider putting the very small objects (e.g. those <= the min alloc size) into a 3 or 4-way mirrored pool instead of an erasure coded pool;
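
A minimal sketch of the size-based placement suggested here, assuming a 4 KiB cutoff (the task discusses both 64K and 4K for `bluestore_min_alloc_size`). The constant `MIN_ALLOC_SIZE`, the function name `choose_pool` and the pool labels are hypothetical and only serve the illustration.

```python
MIN_ALLOC_SIZE = 4 * 1024  # assumed cutoff in bytes, not a measured value


def choose_pool(object_size: int, cutoff: int = MIN_ALLOC_SIZE) -> str:
    """Pick the pool a new object would be written to, based on its size."""
    if object_size < 0:
        raise ValueError("object_size must be non-negative")
    # Objects at or below the cutoff go to the mirrored pool; everything
    # else goes to the erasure coded pool.
    return "mirrored" if object_size <= cutoff else "erasure-coded"
```

The writer knows the object size at write time, so the decision is a single comparison.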

Feb 16 2021, 10:40 PM · Object storage
olasd added a comment to T3054: Scale out object storage design.

@zack, very good point about having a target for the "time to first byte when reading an object".

Feb 16 2021, 10:24 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 10:19 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 10:17 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 10:16 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task