Add support for slices when getting objects from the objstorage.
Closed, Migrated

Assigned To

Authored By

	vlorentz
	Dec 18 2018, 4:40 PM

Description

Currently, the get methods only support returning the full blob. We should add new parameters (e.g. start and end) to specify the range of bytes the caller wants.

These get methods are defined in swh-objstorage/swh/objstorage/objstorage.py (the abstract base class) and in three different backends:

swh/objstorage/objstorage_pathslicing.py manipulates a file object, so it's a matter of using seek() and read().
swh/objstorage/objstorage_in_memory.py manipulates Python bytes objects, so it only needs slicing.
swh/objstorage/objstorage_rados.py uses RADOS, so it's a bit more tedious. Fortunately, it already uses the slicing logic of RADOS (self.ioctx.read(_obj_id, offset, READ_SIZE)), so it's a matter of changing values of the arguments to self.ioctx.read.
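
For the first two backends, the change could look roughly like the sketch below. It is only an illustration: the helper names (slice_bytes, read_range) and the start/end semantics (end exclusive, None meaning "to the end", as in Python slicing) are assumptions, not the actual objstorage API, and it assumes the file object is seekable.

```python
import io
from typing import BinaryIO, Optional


def slice_bytes(content: bytes, start: int = 0, end: Optional[int] = None) -> bytes:
    """In-memory case (hypothetical helper): plain Python slicing."""
    return content[start:end]


def read_range(fobj: BinaryIO, start: int = 0, end: Optional[int] = None) -> bytes:
    """File-object case (hypothetical helper): seek() to start, then read().

    Assumes fobj is seekable and that end is exclusive.
    """
    fobj.seek(start)
    if end is None:
        # No upper bound given: read everything after start.
        return fobj.read()
    # Guard against end < start so read() never gets a negative size.
    return fobj.read(max(end - start, 0))


if __name__ == "__main__":
    blob = b"hello world"
    print(slice_bytes(blob, 6, 11))
    print(read_range(io.BytesIO(blob), 6, 11))
```

The RADOS backend is left out of the sketch, since it depends on the arguments passed to self.ioctx.read.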

Related Objects

Status	Assigned	Task
Migrated	gitlab-migration	T803 Indexer - Retrieval error when contents is too big
Migrated	gitlab-migration	T1446 Add support for slices in Storage.content_get
Migrated	gitlab-migration	T1447 Add support for slices when getting objects from the objstorage.

Event Timeline

vlorentz triaged this task as Low priority. Dec 18 2018, 4:40 PM

vlorentz created this task.

objstorage_pathslicing manipulates a *gzipped* file object, which means that TTBOMK seek is not supported, and we will have to decompress the complete beginning of the file to get to the range that we really want to read.
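
To make the cost concrete, here is a minimal sketch of what a range read on a gzipped object would involve, assuming the objects are readable with Python's standard gzip module. The function name read_gzipped_range and its parameters are hypothetical and not part of objstorage. Everything before start is decompressed and thrown away. As far as I know, gzip.GzipFile does offer seek(), but it does the same work internally for forward seeks, so it would not avoid this cost.

```python
import gzip
import io
from typing import BinaryIO, Optional


def read_gzipped_range(
    fobj: BinaryIO,
    start: int = 0,
    end: Optional[int] = None,
    chunk_size: int = 64 * 1024,
) -> bytes:
    """Return bytes [start, end) of the *decompressed* content (hypothetical helper).

    The prefix before start still has to be decompressed; it is read in
    chunks and discarded so that memory use stays bounded.
    """
    with gzip.GzipFile(fileobj=fobj, mode="rb") as gz:
        remaining = start
        while remaining > 0:
            chunk = gz.read(min(chunk_size, remaining))
            if not chunk:
                # start is past the end of the decompressed data.
                return b""
            remaining -= len(chunk)
        if end is None:
            return gz.read()
        return gz.read(max(end - start, 0))


if __name__ == "__main__":
    data = bytes(range(256)) * 1000
    compressed = gzip.compress(data)
    print(read_gzipped_range(io.BytesIO(compressed), 1000, 1010))
```

The work done is proportional to end rather than to end - start, which is why this is not much cheaper than fetching the whole blob for ranges near the end of an object.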

Same issue for the azure blob storage (the objects there are compressed), and likely for the S3 storage as well. Whether that's a good design decision or not (it probably isn't) is beside the point, but that's what we have to work with now.

I'm not convinced that feature is really something that we want to implement (it's not really needed for the indexer, for instance), and I'm not convinced about the Easy hack classification either :)

Same here. Not much of an easy hack. And what is the real-life use case that drives this feature request? YAGNI?

I didn't know objects are compressed. That indeed makes the issue harder.

vlorentz mentioned this in D1363: Add support for slices in Storage.content_get (T1446). Apr 6 2019, 9:30 AM

This task has been migrated to GitLab.
