A mail was sent to Patrick Donnelly to ask for his opinion on the matter.
Feb 15 2021
This preliminary exploration is complete; the work has moved on to benchmarking to discover blockers.
Updated the description, even simpler.
Thanks for the comment. Let's keep just the SWHID then.
Although simple and close to what is needed, Xz is not an exact match: the index would need to be maintained.
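To make "the index would need to be maintained" concrete, here is a minimal sketch using Python's standard `lzma` module. It assumes each object is compressed as its own xz stream and that the index is a plain in-memory mapping kept next to the archive. The `pack` and `fetch` helpers and the string keys are hypothetical names for illustration, not part of any existing tool.

```python
import lzma


def pack(objects):
    """Concatenate independently compressed xz streams.

    Returns the archive bytes and an index mapping each key to the
    (offset, compressed length) of its stream inside the archive.
    """
    blob = bytearray()
    index = {}
    for key, data in objects.items():
        stream = lzma.compress(data, format=lzma.FORMAT_XZ)
        index[key] = (len(blob), len(stream))
        blob += stream
    return bytes(blob), index


def fetch(blob, index, key):
    """Decompress a single object without touching the other streams."""
    offset, length = index[key]
    return lzma.decompress(blob[offset:offset + length], format=lzma.FORMAT_XZ)


# Hypothetical keys and payloads, for illustration only.
archive, index = pack({"object-a": b"hello" * 100, "object-b": b"world"})
second = fetch(archive, index, "object-b")
```

The archive stays a valid concatenation of xz streams, but the index lives outside the format, so it has to be stored and kept in sync separately.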
Xz format inadequate for long-term archiving
The zstd format is tightly associated with the compression algorithm and is therefore more complex. It can however be a sequence of independently compressed content and could be used for the same purpose as xz.
The 7z format is more complex because it knows about files, directories, etc. It is not just a compressed data format.
There are two blockers:
When extracting a single file (-x file), the in-memory index is walked sequentially looking for the file.
The index is located at the end of the file.
The content of the archive is compressed as successive blocks of a given size.
The index is compressed as a single block of unlimited size.
Feb 14 2021
In D398#24602, @douardda wrote: No idea whether it's of some interest for our subject, but we may also have a look at openio
About Ceph RGW and the lack of packing https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/AEMW6O7WVJFMUIX7QGI2KM7HKDSTNIYT/
https://github.com/vasi/pixz is a candidate for the 1TB archive content
For the record, yesterday's IRC log:
Feb 13 2021
For the record, today's IRC log:
Feb 6 2021
Benchmarking S3 in Ceph with COSBench could be interesting (the video is not yet available). In the past COSBench was difficult to use but maybe it improved. This is off-topic though, but I don't know where to write that down at the moment.
Feb 4 2021
Feb 2 2021
Feb 1 2021
A trivial test case (attached) shows that an RBD image backed by a k=4,m=2 erasure coded pool (RAID6 equivalent) can store 4GB of data using 6GB of disk. The metadata overhead is small. It would be great if someone could repeat the test to make sure I did not obtain these results by accident.
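The 4GB to 6GB figure matches the expected arithmetic for erasure coding: each object is split into k data chunks plus m coding chunks, so raw usage is roughly (k + m) / k times the stored data. A small sketch of that calculation follows; it ignores metadata and padding, and `raw_usage` is a hypothetical helper, not a Ceph API.

```python
def raw_usage(data_bytes, k, m):
    """Approximate raw bytes used to store data_bytes in a k+m erasure coded pool.

    Ignores metadata and per-object padding, which the test above found small.
    """
    return data_bytes * (k + m) / k


GB = 10**9
usage = raw_usage(4 * GB, k=4, m=2)   # raw bytes for 4GB of data
ratio = usage / (4 * GB)              # space amplification factor
```

With k=4 and m=2 the ratio is 1.5, compared with 3.0 for a pool using three full replicas.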
Jan 11 2021
Thanks for the merge :-) It feels really good to see a commit, however simple, being merged on a Monday morning!
Jan 9 2021
bin/update contains
reword commit title
Jan 8 2021
forgot the leading (swh)