Feb 21 2021

dachary updated the task description for T3064: Using ambry to store objects.
Feb 21 2021, 11:59 AM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 21 2021, 11:44 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task

Feb 20 2021

dachary added a comment to T3051: Using EOS to store objects.

QuarkDB is now used for the namespace. It stores 2.5 billion objects.

Feb 20 2021, 4:45 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 20 2021, 4:42 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 20 2021, 4:31 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3065: Using git to store objects.

git partial clone

Feb 20 2021, 2:01 PM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3065: Using git to store objects.
Feb 20 2021, 2:00 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3065: Using git to store objects: T3054: Scale out object storage design.
Feb 20 2021, 2:00 PM · Object storage
dachary changed the status of T3065: Using git to store objects from Open to Work in Progress.
Feb 20 2021, 1:59 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 20 2021, 1:40 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3064: Using ambry to store objects: T3054: Scale out object storage design.
Feb 20 2021, 1:39 PM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3064: Using ambry to store objects.
Feb 20 2021, 1:39 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary changed the status of T3064: Using ambry to store objects from Open to Work in Progress.
Feb 20 2021, 1:38 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 20 2021, 1:33 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task

Feb 17 2021

dachary added a comment to T3048: Using a custom Sorted String Table format.

Let's leave it open: although T3050 is a better fit, it is not ready yet and an interim solution may be required.

Feb 17 2021, 11:24 PM · Object storage
dachary reopened T3048: Using a custom Sorted String Table format, a subtask of T3054: Scale out object storage design, as Work in Progress.
Feb 17 2021, 11:23 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary reopened T3048: Using a custom Sorted String Table format as "Work in Progress".
Feb 17 2021, 11:23 PM · Object storage
dachary closed T3048: Using a custom Sorted String Table format as Resolved.
Feb 17 2021, 11:22 PM · Object storage
dachary closed T3048: Using a custom Sorted String Table format, a subtask of T3054: Scale out object storage design, as Resolved.
Feb 17 2021, 11:22 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3048: Using a custom Sorted String Table format.

T3050 is a better fit as it does not require any specification or development.

Feb 17 2021, 11:22 PM · Object storage
dachary added a comment to T3050: Using libcephsqlite to store objects.

Although it is not a good fit to store all objects, it is a better fit than RBD + a custom format to store 1TB worth of objects, provided support for multiple concurrent readers is added.

Feb 17 2021, 11:19 PM · Object storage
dachary reopened T3050: Using libcephsqlite to store objects as "Work in Progress".
Feb 17 2021, 11:18 PM · Object storage
dachary reopened T3050: Using libcephsqlite to store objects, a subtask of T3054: Scale out object storage design, as Work in Progress.
Feb 17 2021, 11:18 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 5:40 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 5:39 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 3:14 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3054: Scale out object storage design.

In the following, small objects are < 4KB and "object storage software" refers to the list of software from the description for which there are no blockers.

Feb 17 2021, 3:12 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:53 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:43 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:42 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:42 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:37 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:17 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:11 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 2:11 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary closed T3057: Using seaweedfs to store objects as Resolved.
Feb 17 2021, 1:57 PM · Object storage
dachary closed T3057: Using seaweedfs to store objects, a subtask of T3054: Scale out object storage design, as Resolved.
Feb 17 2021, 1:57 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3057: Using seaweedfs to store objects.
Feb 17 2021, 1:56 PM · Object storage
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

We'd want a reader to try reading on the mirrored pool, and then to fall back to the erasure coded pool if the object is larger than the cutoff. The increased latency in getting large objects may be worth the space savings? I don't know.
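
The fallback read described above can be sketched as follows. This is a minimal illustration only: the `Pool` class is a hypothetical in-memory stand-in (not a Ceph API), and the `CUTOFF` value is an assumption chosen for the example. The point it shows is that the reader never needs to know the object size, because a miss on the mirrored pool is what triggers the second lookup.

```python
from typing import Optional


class Pool:
    """Hypothetical in-memory stand-in for a storage pool (not a Ceph API)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> Optional[bytes]:
        # Returns None when the pool does not hold the object.
        return self._objects.get(key)


CUTOFF = 4096  # assumed size cutoff in bytes, for illustration only


def write(mirrored: Pool, erasure_coded: Pool, key: str, data: bytes) -> None:
    # The writer knows the size: small objects go to the mirrored pool,
    # everything else goes to the erasure coded pool.
    target = mirrored if len(data) < CUTOFF else erasure_coded
    target.put(key, data)


def read(mirrored: Pool, erasure_coded: Pool, key: str) -> Optional[bytes]:
    # The reader does not know the size: try the mirrored pool first and
    # fall back to the erasure coded pool on a miss.
    data = mirrored.get(key)
    if data is not None:
        return data
    return erasure_coded.get(key)
```

In this sketch every read of a large object costs two lookups, which is the latency versus space trade-off the comment raises.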

Feb 17 2021, 1:34 PM · Object storage
dachary added subtasks for T3054: Scale out object storage design: T3051: Using EOS to store objects, T3050: Using libcephsqlite to store objects.
Feb 17 2021, 11:26 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3051: Using EOS to store objects: T3054: Scale out object storage design.
Feb 17 2021, 11:26 AM · Object storage
dachary added a parent task for T3050: Using libcephsqlite to store objects: T3054: Scale out object storage design.
Feb 17 2021, 11:26 AM · Object storage
dachary added a subtask for T3054: Scale out object storage design: T3057: Using seaweedfs to store objects.
Feb 17 2021, 11:25 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3057: Using seaweedfs to store objects: T3054: Scale out object storage design.
Feb 17 2021, 11:25 AM · Object storage
dachary changed the status of T3057: Using seaweedfs to store objects from Open to Work in Progress.
Feb 17 2021, 11:24 AM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 11:21 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 11:21 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 11:13 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 11:06 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 10:38 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 10:32 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 10:31 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

The bench script and full results are in the tarball.

Feb 17 2021, 10:27 AM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 8:29 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 8:26 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3054: Scale out object storage design.
In T3054#58874, @olasd wrote:

@zack, very good point about having a target for the "time to first byte when reading an object".

I don't know what would be a "good" target for that metric; my gut says that staying within 100ms for any given object would be acceptable, as long as the number of parallel readers doesn't impact the amount too much (of course, within the IOPS of the underlying media, etc.).

Feb 17 2021, 8:26 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 17 2021, 8:20 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary removed a parent task for T3049: Benchmarking an RBD based object container: T3054: Scale out object storage design.
Feb 17 2021, 8:14 AM · Object storage
dachary removed a parent task for T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K: T3054: Scale out object storage design.
Feb 17 2021, 8:14 AM · Object storage
dachary added a parent task for T3056: Ceph as an object storage: T3054: Scale out object storage design.
Feb 17 2021, 8:14 AM · Object storage (RedHat collaboration)
dachary edited subtasks for T3054: Scale out object storage design, added: T3056: Ceph as an object storage; removed: T3049: Benchmarking an RBD based object container, T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.
Feb 17 2021, 8:14 AM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3049: Benchmarking an RBD based object container: T3056: Ceph as an object storage.
Feb 17 2021, 8:13 AM · Object storage
dachary added a parent task for T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K: T3056: Ceph as an object storage.
Feb 17 2021, 8:13 AM · Object storage
dachary added subtasks for T3056: Ceph as an object storage: T3055: Ceph and immutable & append only storage, T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K, T3049: Benchmarking an RBD based object container.
Feb 17 2021, 8:13 AM · Object storage (RedHat collaboration)
dachary added a parent task for T3055: Ceph and immutable & append only storage: T3056: Ceph as an object storage.
Feb 17 2021, 8:13 AM · Object storage
dachary changed the status of T3056: Ceph as an object storage from Open to Work in Progress.
Feb 17 2021, 8:12 AM · Object storage (RedHat collaboration)
dachary changed the status of T3055: Ceph and immutable & append only storage from Open to Work in Progress.
Feb 17 2021, 8:10 AM · Object storage
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

If the size of the object were known to the reader of the object store, it would be a great way to develop storage strategies depending on the object size. So far I assumed the reader does not have that information and is therefore unable to figure out which object storage to use, but maybe I missed something?

Feb 17 2021, 7:14 AM · Object storage

Feb 16 2021

dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 11:37 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 10:19 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 10:17 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 10:16 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3054: Scale out object storage design.

For the record, stats from January 2021

Feb 16 2021, 10:15 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

Description: Default value of bluestore compression min blob size for rotational media.
Type: Unsigned Integer
Required: No
Default: 128K

Feb 16 2021, 10:07 PM · Object storage
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

With a 4KB min alloc and a 4+2 erasure coded pool, objects that have a size < 16KB will require 16KB anyway + 8KB for parity. T3054 suggests that 75% of objects have a size < 16KB. Since the space amplification makes even the smallest object 16KB big, that's a total of 16KB * 7.5B = 120TB, or 120TB / 750TB = 16% of the total. Without the space amplification these objects only use ~5% of the total space. The space amplification therefore costs roughly 10% of the total uncompressed storage.
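
The arithmetic above can be checked with a short sketch. The 7.5 billion object count, the 750TB total and the ~5% unamplified share are the figures quoted in the comment; the variable names are illustrative, decimal units (1KB = 1000 bytes) are an assumption, and parity is left out as in the comment.

```python
# Figures taken from the comment above; decimal units are an assumption.
KB = 1000
TB = 1000**4

small_objects = 7.5e9      # objects smaller than 16KB (75% of all objects)
footprint = 16 * KB        # 4 data chunks x 4KB min alloc, parity excluded
total_storage = 750 * TB   # total uncompressed storage

amplified = small_objects * footprint
amplified_share = amplified / total_storage
unamplified_share = 0.05   # "~5%", as stated in the comment
overhead_share = amplified_share - unamplified_share

print(f"{amplified / TB:.0f}TB, {amplified_share:.0%} of total, "
      f"overhead {overhead_share:.0%}")
```

With these inputs the overhead comes out at about 11 percentage points, which the comment rounds down to 10%.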

Feb 16 2021, 7:52 PM · Object storage
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 7:27 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 7:15 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 7:03 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 7:01 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 6:55 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary updated the task description for T3054: Scale out object storage design.
Feb 16 2021, 6:53 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a task to D398: [WIP] "packing" object storage design documentation: T3054: Scale out object storage design.
Feb 16 2021, 6:46 PM · Object storage
dachary added a revision to T3054: Scale out object storage design: D398: [WIP] "packing" object storage design documentation.
Feb 16 2021, 6:46 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K: T3054: Scale out object storage design.
Feb 16 2021, 6:42 PM · Object storage
dachary added a parent task for T3048: Using a custom Sorted String Table format: T3054: Scale out object storage design.
Feb 16 2021, 6:42 PM · Object storage
dachary added subtasks for T3054: Scale out object storage design: T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K, T3049: Benchmarking an RBD based object container, T3048: Using a custom Sorted String Table format.
Feb 16 2021, 6:42 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a parent task for T3049: Benchmarking an RBD based object container: T3054: Scale out object storage design.
Feb 16 2021, 6:42 PM · Object storage
dachary changed the status of T3054: Scale out object storage design from Open to Work in Progress.
Feb 16 2021, 6:41 PM · Roadmap 2022, Object storage (RedHat collaboration), Roadmap 2021, meta-task
dachary added a project to D398: [WIP] "packing" object storage design documentation: Object storage.
Feb 16 2021, 3:22 PM · Object storage
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

Josh Durgin gave some more pointers to relevant pull requests:

Feb 16 2021, 9:41 AM · Object storage
dachary added a comment to T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.

Root cause analysis for space overhead with erasure coded pools.

Feb 16 2021, 12:13 AM · Object storage

Feb 15 2021

dachary updated the task description for T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.
Feb 15 2021, 11:44 PM · Object storage
dachary updated the task description for T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K.
Feb 15 2021, 11:43 PM · Object storage
dachary changed the status of T3052: Reducing Ceph bluestore_min_alloc_size from 64K to 4K from Open to Work in Progress.
Feb 15 2021, 11:42 PM · Object storage
dachary added a comment to T3014: Using an RBD image to store artifacts.

There is one concern that was not addressed: the metadata do not scale out, since it is a single RocksDB database.

Feb 15 2021, 10:49 PM · Object storage
dachary closed T3051: Using EOS to store objects as Resolved.
Feb 15 2021, 10:14 PM · Object storage
dachary added a comment to T3051: Using EOS to store objects.

At first glance, EOS is an entire system that addresses all the needs of the researchers at CERN. It includes an object storage with data and metadata separated, which is what the Software Heritage object storage is likely to look like as well. However, this part is not standalone. Although it is a great source of inspiration:

Feb 15 2021, 10:14 PM · Object storage
dachary added a comment to T3051: Using EOS to store objects.

The Scalla software suite provides two fundamental building blocks: an xrootd server for low latency high bandwidth data access and an olbd server for building scalable xrootd clusters. This paper describes the architecture, how low latency is achieved, and the scaling opportunities the software allows. Actual performance measurements are presented and discussed. Scalla offers a readily deployable framework in which to construct large fault-tolerant high performance data access configurations using commodity hardware with a minimum amount of administrative overhead.

Feb 15 2021, 10:10 PM · Object storage
dachary closed T3050: Using libcephsqlite to store objects as Resolved.
Feb 15 2021, 9:41 PM · Object storage
dachary added a comment to T3050: Using libcephsqlite to store objects.

There is a hard limit on the sqlite database (~280TB) so it would not work, even if perfectly optimized.
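
As a rough check of where the ~280TB figure comes from, the sketch below multiplies the maximum page size by the maximum page count. Both constants are the limits documented for recent SQLite releases and are an assumption here, since the comment does not say which version it refers to; the variable names are illustrative.

```python
# Limits documented for recent SQLite releases (assumed, not stated in the comment).
MAX_PAGE_SIZE = 65536         # bytes per page
MAX_PAGE_COUNT = 4294967294   # pages per database file

max_db_bytes = MAX_PAGE_SIZE * MAX_PAGE_COUNT
print(f"{max_db_bytes} bytes, about {max_db_bytes / 1000**4:.0f}TB")
```

Under these assumptions the ceiling is a property of the file format, so no amount of tuning raises it.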

Feb 15 2021, 9:41 PM · Object storage