
Scale out object storage design
Started, Work in Progress, Normal, Public


Event Timeline

dachary updated the task description.
dachary updated the task description.

For the record, here is the half-baked benchmark script for the proposed design that I worked on today. To be continued!


dachary updated the task description.

I have found some interesting pointers related to the management of small files in HDFS (I came across them while looking for something unrelated). Is this something you have already identified and excluded from the scope because of some blockers?

Very interesting to see how this problem was presented & solved in the Hadoop ecosystem, thanks for the links.

  • HAR (Hadoop Archives) was designed to reduce HDFS space amplification for small objects, although I found no specifics on exactly how much amplification that is. This 2009 article gives a hint: it says every HDFS file requires 150 bytes of RAM and states that "Certainly a billion files is not feasible.", which explains why HAR was introduced.
dachary updated the task description.
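
To make the "a billion files is not feasible" remark concrete, here is a rough back-of-the-envelope sketch. It only multiplies the 150 bytes per file figure quoted above by a few file counts; the function names and the 10 billion row are mine and purely illustrative, and the real per-file cost may be higher if blocks and directories are counted separately.

```python
# Rough estimate of the RAM needed to track N files in HDFS,
# assuming the 150 bytes per file figure quoted from the 2009 article.
BYTES_PER_FILE = 150  # assumption taken from the comment above


def metadata_ram_bytes(file_count: int, bytes_per_file: int = BYTES_PER_FILE) -> int:
    """Return the estimated RAM (in bytes) needed to hold metadata for file_count files."""
    return file_count * bytes_per_file


def human(n_bytes: float) -> str:
    """Format a byte count using decimal (power of 1000) units."""
    for unit in ("B", "kB", "MB", "GB", "TB"):
        if n_bytes < 1000:
            return f"{n_bytes:.1f} {unit}"
        n_bytes /= 1000
    return f"{n_bytes:.1f} PB"


if __name__ == "__main__":
    # The file counts below are illustrative, not measurements.
    for count in (10_000_000, 1_000_000_000, 10_000_000_000):
        print(f"{count:>14,} files -> {human(metadata_ram_bytes(count))}")
```

With these assumptions a billion files already needs about 150 GB of RAM for metadata alone, which is consistent with the article calling it infeasible at the time.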
rdicosmo moved this task from Backlog to Done on the Roadmap 2021 board.
rdicosmo moved this task from Done to Work in progress on the Roadmap 2021 board.
rdicosmo added a subscriber: rdicosmo.

Thanks for helping with the labelling @rdicosmo 👍

dachary updated the task description.
dachary updated the task description.
dachary updated the task description.
dachary updated the task description.
dachary changed the status of subtask T3249: Deleting and erasing an object from Work in Progress to Open. Sun, Aug 29, 1:05 PM