Object storage
Active · Public

Recent Activity

Aug 23 2022

vlorentz added a revision to T4402: Pass dict of hashes instead of single sha1 to objstorage.get(): D8286: Pass 'obj_id' argument to objstorage.add().
Aug 23 2022, 10:34 AM · Object storage

Jul 19 2022

vlorentz added revisions to T4403: Update objstorage interface to return dicts of hashes instead of single sha1: D8009: Make obj_id argument of ObjStorage.add() required, D8013: Drop the now unused add_stream and get_stream methods, D8017: Make add() and restore() return None instead of ObjId, D8026: Remove get_random(), D8029: Start introducing composite ObjId in the interface, D8074: Remove ID-based filters, D8076: Make __iter__ actually return composite objids.
Jul 19 2022, 3:19 PM · Object storage
vlorentz triaged T4403: Update objstorage interface to return dicts of hashes instead of single sha1 as Normal priority.
Jul 19 2022, 3:18 PM · Object storage
vlorentz added a revision to T4402: Pass dict of hashes instead of single sha1 to objstorage.get(): D8029: Start introducing composite ObjId in the interface.
Jul 19 2022, 3:16 PM · Object storage
vlorentz added revisions to T4402: Pass dict of hashes instead of single sha1 to objstorage.get(): D8138: Update for swh-objstorage >= 2.0.0, D8137: Call objstorage.get() with a HashDict instead of single hash, D8135: rehash: Call objstorage.content_get() with a HashDict instead of single hash, D8127: Call objstorage.content_get() with a HashDict instead of single hash, D8126: Replace Dict[str, bytes] with a TypedDict to represent dicts of hashes, D8122: Fix crash when calling __contains__/get/check/delete with composite obj ids.
Jul 19 2022, 3:16 PM · Object storage
vlorentz triaged T4402: Pass dict of hashes instead of single sha1 to objstorage.get() as Normal priority.
Jul 19 2022, 3:16 PM · Object storage
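
The revisions above move the interface from a single sha1 to a dict of hashes ("composite ObjId"). As a rough illustration only, here is a minimal Python sketch of that idea. It is not the swh-objstorage API: the names HashDict, compute_hashes, InMemoryObjStorage and primary_hash are hypothetical, chosen for this example, and the choice of sha1 plus sha256 is an assumption.

```python
import hashlib
from typing import Dict, TypedDict


class HashDict(TypedDict, total=False):
    """Hypothetical composite object id: one digest per hash algorithm."""

    sha1: bytes
    sha256: bytes


def compute_hashes(data: bytes) -> HashDict:
    """Compute every digest of the composite id for the given content."""
    return HashDict(
        sha1=hashlib.sha1(data).digest(),
        sha256=hashlib.sha256(data).digest(),
    )


class InMemoryObjStorage:
    """Toy store (an assumption, not a real backend) keyed on one primary hash."""

    def __init__(self, primary_hash: str = "sha1") -> None:
        self.primary_hash = primary_hash
        self._objects: Dict[bytes, bytes] = {}

    def add(self, content: bytes, obj_id: HashDict) -> None:
        # The caller passes the whole dict; the store picks its own key.
        self._objects[obj_id[self.primary_hash]] = content

    def get(self, obj_id: HashDict) -> bytes:
        return self._objects[obj_id[self.primary_hash]]


ids = compute_hashes(b"hello")
store = InMemoryObjStorage(primary_hash="sha256")
store.add(b"hello", obj_id=ids)
print(store.get(ids))
```

Because callers always pass the full dict, a store in this sketch can switch its primary key from sha1 to sha256 without any change on the caller side.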

Jul 1 2022

douardda added a comment to T2309: Add support for other hash algo than sha1 in current objstorage implementation.

Do you have in mind to make the hash used as the primary key of an objstorage a configuration option of that storage instance? E.g., would creating a pathslicer or S3 objstorage that uses sha256 be just a matter of configuring the objstorage?

Jul 1 2022, 10:38 AM · Object storage
douardda added a comment to T2309: Add support for other hash algo than sha1 in current objstorage implementation.

Do you have in mind to make the hash used as the primary key of an objstorage a configuration option of that storage instance? E.g., would creating a pathslicer or S3 objstorage that uses sha256 be just a matter of configuring the objstorage?

Jul 1 2022, 10:34 AM · Object storage

Jun 27 2022

bchauvet added a revision to T2309: Add support for other hash algo than sha1 in current objstorage implementation: D8029: Start introducing composite ObjId in the interface.
Jun 27 2022, 2:35 PM · Object storage

Jun 21 2022

vlorentz added a parent task for T2309: Add support for other hash algo than sha1 in current objstorage implementation: T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab).
Jun 21 2022, 2:41 PM · Object storage
olasd added a revision to T2309: Add support for other hash algo than sha1 in current objstorage implementation: D8008: Set object id when calling objstorage.add.
Jun 21 2022, 2:35 PM · Object storage

May 1 2022

seirl closed T1848: refresh graph dataset export, a subtask of T3085: Complete and updated copy of the archive on S3 (objects+graph), as Resolved.
May 1 2022, 12:08 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage

Apr 29 2022

seirl changed the status of T1848: refresh graph dataset export, a subtask of T3085: Complete and updated copy of the archive on S3 (objects+graph), from Open to Work in Progress.
Apr 29 2022, 6:23 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
seirl closed T1743: create a nice landing web page for exported dataset, a subtask of T3085: Complete and updated copy of the archive on S3 (objects+graph), as Resolved.
Apr 29 2022, 6:14 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage

Apr 8 2022

anlambert closed T4119: TestRemoteObjStorage::test_content_iterator is failing since werkzeug 2.1.0 release as Resolved by committing rDOBJSdd99e5d64e20: api/server: Fix streaming responses implementation.
Apr 8 2022, 3:10 PM · Object storage
anlambert added a revision to T4119: TestRemoteObjStorage::test_content_iterator is failing since werkzeug 2.1.0 release: D7534: api/server: Fix streaming responses implementation.
Apr 8 2022, 12:15 PM · Object storage

Apr 5 2022

zack changed the status of T1743: create a nice landing web page for exported dataset, a subtask of T3085: Complete and updated copy of the archive on S3 (objects+graph), from Open to Work in Progress.
Apr 5 2022, 1:39 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage

Mar 30 2022

anlambert triaged T4119: TestRemoteObjStorage::test_content_iterator is failing since werkzeug 2.1.0 release as Normal priority.
Mar 30 2022, 3:07 PM · Object storage
anlambert created T4119: TestRemoteObjStorage::test_content_iterator is failing since werkzeug 2.1.0 release.
Mar 30 2022, 3:07 PM · Object storage

Mar 25 2022

bchauvet lowered the priority of T3085: Complete and updated copy of the archive on S3 (objects+graph) from High to Low.
Mar 25 2022, 5:28 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage

Mar 23 2022

bchauvet added a project to T3085: Complete and updated copy of the archive on S3 (objects+graph): Roadmap 2022.
Mar 23 2022, 4:39 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage

Jan 25 2022

vlorentz updated subscribers of T3527: Self-host Software Heritage on grid5000.

@vsellier already did this to benchmark Cassandra. It's indeed necessary to see how the backends behave with real loader and vault workloads (less so for the objstorage, since the workloads should be much more uniform).

Jan 25 2022, 11:30 PM · Object storage
dachary added a comment to T3527: Self-host Software Heritage on grid5000.

I'm not exactly sure why I thought that would be necessary for benchmarking. In any case... it's not ;-)

Jan 25 2022, 9:53 PM · Object storage
dachary closed T3527: Self-host Software Heritage on grid5000, a subtask of T3432: Add winery backend, as Wontfix.
Jan 25 2022, 9:53 PM · Object storage
dachary closed T3527: Self-host Software Heritage on grid5000 as Wontfix.
Jan 25 2022, 9:53 PM · Object storage
dachary closed T3525: grid5000 tools and documentation, a subtask of T3432: Add winery backend, as Resolved.
Jan 25 2022, 9:52 PM · Object storage
dachary closed T3525: grid5000 tools and documentation as Resolved.
Jan 25 2022, 9:52 PM · Object storage
dachary added a comment to T3525: grid5000 tools and documentation.

The documentation is at:

Jan 25 2022, 9:52 PM · Object storage
dachary closed T3634: Create swh-perfecthash module as Resolved.
Jan 25 2022, 9:51 PM · Object storage
dachary closed T3528: Add winery backend: grid5000 benchmark, a subtask of T3432: Add winery backend, as Resolved.
Jan 25 2022, 9:50 PM · Object storage
dachary closed T3528: Add winery backend: grid5000 benchmark as Resolved.
Jan 25 2022, 9:50 PM · Object storage
dachary added a comment to T3528: Add winery backend: grid5000 benchmark.

It's documented in the winery test environment, and I was actually able to use the instructions successfully (after a few fixes...). It does work and this can be closed as resolved.

Jan 25 2022, 9:50 PM · Object storage
dachary added a comment to T3432: Add winery backend.

Added a wiki page to be a more accessible version of the benchmark process than the README in the sources.

Jan 25 2022, 9:48 PM · Object storage

Jan 22 2022

dachary changed the status of T3532: IO throttling, a subtask of T3432: Add winery backend, from Open to Work in Progress.
Jan 22 2022, 4:14 PM · Object storage

Dec 16 2021

olasd closed T1954: Up-to-date objstorage mirror on S3, a subtask of T3085: Complete and updated copy of the archive on S3 (objects+graph), as Resolved.
Dec 16 2021, 3:12 PM · Roadmap 2022, meta-task, Roadmap 2021, System administration, Object storage
olasd closed T1954: Up-to-date objstorage mirror on S3 as Resolved.
Dec 16 2021, 3:12 PM · System administration, Object storage

Dec 14 2021

dachary added a comment to D6834: docker: add the swh-winery-db and swh-winery services.

Ah wait, I got it: docker/services/swh-winery/entrypoint.sh launches winery itself, not an actual objstorage backend; you're only reusing the scaffolding.

Dec 14 2021, 5:35 PM · Object storage
vlorentz accepted D6834: docker: add the swh-winery-db and swh-winery services.

Ah wait, I got it: docker/services/swh-winery/entrypoint.sh launches winery itself, not an actual objstorage backend; you're only reusing the scaffolding.

Dec 14 2021, 5:03 PM · Object storage
vlorentz updated the summary of D6834: docker: add the swh-winery-db and swh-winery services.
Dec 14 2021, 4:57 PM · Object storage
vlorentz added a comment to D6834: docker: add the swh-winery-db and swh-winery services.

I don't understand why. Currently, we have: client --> objstorage pathslicing backend (port 5003) --> disk.
What you want to do is: client --> objstorage proxy (port 5003) --> objstorage winery backend (port 5012) --> winery --> ceph, right?
(objstorage winery backend is what is launched by docker/services/swh-winery/entrypoint.sh in this diff)

Dec 14 2021, 4:54 PM · Object storage
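
The chain described in the comment above (client --> objstorage proxy --> backend) can be pictured with a small Python sketch. This is an illustration under stated assumptions, not code from the diff: ProxyObjStorage and InMemoryBackend are hypothetical names, the in-memory dict stands in for the winery/Ceph side, and no network ports are involved.

```python
from typing import Dict, List, Protocol


class ObjStorage(Protocol):
    """Minimal interface assumed for this sketch (not the real swh API)."""

    def add(self, content: bytes, obj_id: bytes) -> None: ...

    def get(self, obj_id: bytes) -> bytes: ...


class InMemoryBackend:
    """Stand-in for the final backend in the chain."""

    def __init__(self) -> None:
        self._objects: Dict[bytes, bytes] = {}

    def add(self, content: bytes, obj_id: bytes) -> None:
        self._objects[obj_id] = content

    def get(self, obj_id: bytes) -> bytes:
        return self._objects[obj_id]


class ProxyObjStorage:
    """Forwards every call to the wrapped backend and records what it saw."""

    def __init__(self, backend: ObjStorage) -> None:
        self.backend = backend
        self.calls: List[str] = []

    def add(self, content: bytes, obj_id: bytes) -> None:
        self.calls.append("add")
        self.backend.add(content, obj_id)

    def get(self, obj_id: bytes) -> bytes:
        self.calls.append("get")
        return self.backend.get(obj_id)


# The client only talks to the proxy; the backend can be swapped freely.
proxy = ProxyObjStorage(InMemoryBackend())
proxy.add(b"some content", obj_id=b"id-1")
print(proxy.get(b"id-1"), proxy.calls)
```

In this sketch, swapping the pathslicing backend for a winery one would only change the object passed to ProxyObjStorage; the client code stays the same.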
dachary added a comment to D6834: docker: add the swh-winery-db and swh-winery services.

Instead of defining a new service, could you provide an alternative docker-compose config file? This way, it can be used to switch all services over to it, just by adding a CLI parameter. E.g., we do this to replace the postgres storage backend with cassandra: https://docs.softwareheritage.org/devel/getting-started/using-docker.html#cassandra

Dec 14 2021, 4:17 PM · Object storage
vlorentz added a comment to D6834: docker: add the swh-winery-db and swh-winery services.

Instead of defining a new service, could you provide an alternative docker-compose config file? This way, it can be used to switch all services over to it, just by adding a CLI parameter. E.g., we do this to replace the postgres storage backend with cassandra: https://docs.softwareheritage.org/devel/getting-started/using-docker.html#cassandra

Dec 14 2021, 3:57 PM · Object storage
dachary updated the test plan for D6834: docker: add the swh-winery-db and swh-winery services.
Dec 14 2021, 3:54 PM · Object storage
dachary updated the summary of D6834: docker: add the swh-winery-db and swh-winery services.
Dec 14 2021, 3:53 PM · Object storage
dachary added a comment to D6834: docker: add the swh-winery-db and swh-winery services.

It depends on https://forge.softwareheritage.org/D6796 and will fail until it is merged. It can be tested from sources with an override like this:

Dec 14 2021, 3:52 PM · Object storage
dachary added a project to D6834: docker: add the swh-winery-db and swh-winery services: Object storage.
Dec 14 2021, 3:49 PM · Object storage

Dec 13 2021

dachary updated the task description for T3804: Winery backend server.
Dec 13 2021, 6:23 PM · Object storage
dachary renamed T3804: Winery backend server from Winery backend proxy to Winery backend server.
Dec 13 2021, 6:22 PM · Object storage
olasd added a comment to T3804: Winery backend server.

In practical terms, the two winery objstorage database servers and Ceph itself will be hosted at CEA, while the main ingestion storage / graph storage / ... will remain in Rocquencourt (separate sites, with fairly high-bandwidth networking between them).

Dec 13 2021, 6:07 PM · Object storage
vsellier added a watcher for Object storage: vsellier.
Dec 13 2021, 5:58 PM