Add support for other hash algo than sha1 in current objstorage implementation
Closed, Migrated; Edits Locked

Assigned To

Authored By

	douardda
	Mar 12 2020, 1:43 PM

Description

The idea is that a secondary objstorage (e.g. in a multiplexer configuration) would let us keep a copy of the few contents that collide on sha1.

It is not a proper long-term solution.
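
A minimal sketch of the idea, not the swh.objstorage implementation: plain dicts stand in for the two objstorages, and `CollisionAwareStore`, `sha256_hex` and `weak_hash` are hypothetical names introduced here. The primary store is keyed by a configurable hash; a content whose primary hash is already taken by different bytes is kept in a secondary store keyed by sha256. Since real sha1 collisions are impractical to produce in an example, `weak_hash` (the first hex digit of sha1) is used only to force collisions.

```python
import hashlib
from typing import Callable, Dict, Optional


def sha256_hex(data: bytes) -> str:
    """Hex sha256 digest, used as the key of the secondary store."""
    return hashlib.sha256(data).hexdigest()


def weak_hash(data: bytes) -> str:
    """Deliberately weak stand-in for sha1 (first hex digit only) to force collisions."""
    return hashlib.sha1(data).hexdigest()[:1]


class CollisionAwareStore:
    """Toy primary + secondary store; dicts stand in for real objstorages."""

    def __init__(self, primary_hash: Callable[[bytes], str]):
        self.primary_hash = primary_hash
        self.primary: Dict[str, bytes] = {}    # keyed by the primary hash
        self.secondary: Dict[str, bytes] = {}  # keyed by sha256

    def add(self, data: bytes) -> None:
        key = self.primary_hash(data)
        existing = self.primary.get(key)
        if existing is None:
            self.primary[key] = data
        elif existing != data:
            # Same primary hash, different bytes: keep a copy in the secondary store.
            self.secondary[sha256_hex(data)] = data

    def get(self, primary_key: str, sha256: Optional[str] = None) -> bytes:
        if sha256 is None:
            return self.primary[primary_key]
        data = self.primary.get(primary_key)
        if data is not None and sha256_hex(data) == sha256:
            return data
        # Fall back to the secondary store; raises KeyError if unknown.
        return self.secondary[sha256]
```

In this sketch a caller that only knows the primary hash always gets the first content stored under it; retrieving a colliding content requires its sha256 as well, which is one reason this is a stopgap rather than a long-term design.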

Revisions and Commits

rDDATASET Datasets
	D8008	rDDATASETe31bdb26a827 Set object id when calling objstorage.add
rDOBJS Object storage
	D8756	rDOBJSdf4be2d87c30 azure: Add tests based on Azurite in addition to mocks
	D8029	rDOBJS667cb87b9367 Start introducing composite ObjId in the interface

Related Objects

Status	Assigned	Task
Migrated	gitlab-migration	T3775 Dealing with repositories with contents that produces hash conflicts (example included from GitLab)
Migrated	gitlab-migration	T2309 Add support for other hash algo than sha1 in current objstorage implementation
Migrated	gitlab-migration	T4402 Pass dict of hashes instead of single sha1 to objstorage.get()
Migrated	gitlab-migration	T4403 Update objstorage interface to return dicts of hashes instead of single sha1

Event Timeline

douardda triaged this task as Normal priority. Mar 12 2020, 1:43 PM

douardda created this task.

olasd added a revision: D8008: Set object id when calling objstorage.add. Jun 21 2022, 2:35 PM

vlorentz added a parent task: T3775: Dealing with repositories with contents that produces hash conflicts (example included from GitLab). Jun 21 2022, 2:41 PM

olasd added a commit: rDDATASETe31bdb26a827: Set object id when calling objstorage.add. Jun 21 2022, 4:03 PM

bchauvet added a revision: D8029: Start introducing composite ObjId in the interface. Jun 27 2022, 2:35 PM

Do you have in mind to make the hash actually used as the primary key in an objstorage a configuration option of that storage instance? E.g., would creating a pathslicer or s3 objstorage using sha256 be just a matter of configuring the objstorage?

In T2309#87779, @douardda wrote:

Do you have in mind to make the hash actually used as the primary key in an objstorage a configuration option of that storage instance? E.g., would creating a pathslicer or s3 objstorage using sha256 be just a matter of configuring the objstorage?

Also, is the idea to make any swh objstorage queryable for a content using any supported hash? Or will the only query API require a multihash object?
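
To make the two questions above concrete, here is a hedged sketch of one possible answer, not the actual swh.objstorage interface. `ToyObjStorage`, `compute_hashes` and `SUPPORTED` are hypothetical names. The primary hash is chosen at construction time (as a configuration value would be), and both `add` and `get` take a dict of hashes in the spirit of the subtasks T4402 and T4403. In this variant only the configured primary hash is used for lookup, so querying by another hash alone fails.

```python
import hashlib
from typing import Dict

# Hash algorithms this toy store accepts (assumption for the example).
SUPPORTED = ("sha1", "sha256")


def compute_hashes(data: bytes) -> Dict[str, bytes]:
    """Build a composite object id: one digest per supported algorithm."""
    return {algo: hashlib.new(algo, data).digest() for algo in SUPPORTED}


class ToyObjStorage:
    """In-memory store whose primary key hash is a configuration value."""

    def __init__(self, primary_hash: str = "sha1"):
        if primary_hash not in SUPPORTED:
            raise ValueError(f"unsupported primary hash: {primary_hash}")
        self.primary_hash = primary_hash
        self._objects: Dict[bytes, bytes] = {}

    def add(self, data: bytes, obj_id: Dict[str, bytes]) -> None:
        # Store under the configured primary hash taken from the composite id.
        self._objects[obj_id[self.primary_hash]] = data

    def get(self, obj_id: Dict[str, bytes]) -> bytes:
        # Only the primary hash is used for lookup; other entries are ignored.
        if self.primary_hash not in obj_id:
            raise KeyError(f"obj_id lacks the {self.primary_hash} hash")
        return self._objects[obj_id[self.primary_hash]]
```

Supporting lookups by any supported hash would additionally need a secondary index from each non-primary digest to the primary key, which this sketch leaves out.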

vlorentz added a commit: rDOBJS667cb87b9367: Start introducing composite ObjId in the interface. Jul 4 2022, 2:07 PM

vlorentz mentioned this in D8756: azure: Add tests based on Azurite in addition to mocks. Oct 24 2022, 2:45 PM

vlorentz added a revision: D8756: azure: Add tests based on Azurite in addition to mocks. Oct 24 2022, 2:45 PM

vlorentz added a commit: rDOBJSdf4be2d87c30: azure: Add tests based on Azurite in addition to mocks. Oct 25 2022, 3:12 PM

Possibly relevant for the Azure storage: https://learn.microsoft.com/en-us/rest/api/storageservices/find-blobs-by-tags

This task has been migrated to GitLab.

gitlab-migration closed subtask T4402: Pass dict of hashes instead of single sha1 to objstorage.get() as Migrated. Jan 8 2023, 5:04 PM

gitlab-migration closed subtask T4403: Update objstorage interface to return dicts of hashes instead of single sha1 as Migrated.
