
dataset: document the AWS S3 bucket for content objects
Closed, Migrated

Description

The public Amazon S3 bucket at s3://softwareheritage/content/ contains a copy of every content object in the archive.
The format is one file per blob. Each file is named after the SHA1 of the blob's contents (the plain SHA1, not the salted Git blob hash) and contains the actual byte sequence, gzip-compressed.
We should document this as a dataset, side-by-side with the graph dataset, at https://docs.softwareheritage.org/devel/swh-dataset/
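
To make the naming scheme above concrete, here is a minimal Python sketch of how an object name could be derived and how a downloaded object could be checked. It rests on assumptions drawn only from the description: that the object key is the lowercase hex SHA1 appended directly to the bucket prefix, and that the payload is a single gzip stream. The helper names (`plain_sha1`, `git_salted_sha1`, `object_url`, `verify_object`) are hypothetical and not part of any Software Heritage tool.

```python
import gzip
import hashlib

# Prefix taken from the description; the exact key layout below it is an assumption.
BUCKET_PREFIX = "s3://softwareheritage/content/"


def plain_sha1(data: bytes) -> str:
    """SHA1 of the raw bytes: the hash the description says is used for naming."""
    return hashlib.sha1(data).hexdigest()


def git_salted_sha1(data: bytes) -> str:
    """Git blob hash (SHA1 over a 'blob <len>\\0' header plus the data).

    Shown only for contrast: this is NOT the name used in the bucket.
    """
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def object_url(data: bytes) -> str:
    """Presumed location of a blob, assuming the key is the bare hex SHA1."""
    return BUCKET_PREFIX + plain_sha1(data)


def verify_object(name: str, raw: bytes) -> bool:
    """Decompress a downloaded object and check its contents against its name."""
    return plain_sha1(gzip.decompress(raw)) == name


if __name__ == "__main__":
    blob = b"example contents\n"
    stored = gzip.compress(blob)  # stands in for the bytes fetched from S3
    print(object_url(blob))
    print(verify_object(plain_sha1(blob), stored))
```

Fetching the object itself is left out on purpose: any S3 client can download the file, after which `verify_object` confirms that the decompressed bytes match the name they were stored under.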