test_storage: Force LazyContent __type__ attribute to be serializable
Abandoned, Public

Authored by ardumont on Mar 2 2020, 1:52 PM.

Details

Reviewers
olasd
Group Reviewers
Reviewers
Summary

This fixes the Debian package build failure [1].

I do not like it much, as I am tricking the extra serializer into doing my bidding. I do
not see a better way though; ideas welcome.

Also, I do not get why the tox build does not need it...

[1]
https://jenkins.softwareheritage.org/view/Debian%20packages/job/debian/job/packages/job/DSTO/job/gbp-buildpackage/137/console

Diff Detail

Repository
rDSTO Storage manager
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 10826
Build 16259: tox-on-jenkins (Jenkins)
Build 16258: arc lint + arc unit

Event Timeline

ardumont retitled this revision from "test_storage: Force the __type__ attribute to something serializable" to "test_storage: Force LazyContent __type__ attribute to be serializable". Mar 2 2020, 1:53 PM

LazyContents should not be going through the RPC layer. If they go through the RPC layer, then the data is lost.

One of the endpoints must be missing a call to .with_data().

olasd requested changes to this revision. Mar 2 2020, 2:15 PM
This revision now requires changes to proceed. Mar 2 2020, 2:15 PM

LazyContents should not be going through the RPC layer. If they go through the RPC layer, then the data is lost.
One of the endpoints must be missing a call to .with_data().

OK, I don't get it yet, but I sense the logic in the reasoning. Looking for the missing call.
Thanks.

The "lazy Content objects" were introduced when refactoring the swh.model.from_disk module.

The idea is to allow Content-like objects to be carried around in memory while having their full data loaded as late as possible. In the end, this should allow us to run the full pipeline of object existence checks with only hashes in memory, limiting the loading of data from disk to the contents which we're going to need to transfer to storage anyway.

Of course, there are lots of details that need to be sorted out (e.g. the lifetime of temporary files) before we can make full use of this feature in all loaders.

Currently, the storage RPC client should materialize lazy content objects by calling with_data before they're passed down the wire.

This needs a version of swh.core containing the following commit: rDCORE041728f8e4c9
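
To make the mechanism described above concrete, here is a minimal, self-contained sketch. It does not use the real swh.model or swh.core code: the `Content` and `LazyContent` classes, the `loader` field, and the `to_wire` and `content_add` helpers are all hypothetical stand-ins, written only to illustrate why a lazy object has to be materialized with with_data() before it reaches the serializer.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Content:
    """Toy content object (hypothetical, not the swh.model class)."""
    sha1: bytes
    length: int
    data: Optional[bytes] = None

    def with_data(self) -> "Content":
        # A plain Content already carries its data (or has none to load).
        return self


@dataclass(frozen=True)
class LazyContent(Content):
    """Toy lazy variant: holds hashes only, loads the bytes on demand."""
    loader: Optional[Callable[[], bytes]] = None

    def with_data(self) -> Content:
        # Materialize into a plain Content right before it is needed.
        assert self.loader is not None
        return Content(sha1=self.sha1, length=self.length, data=self.loader())


def to_wire(content: Content) -> Dict[str, object]:
    """Stand-in for the RPC serializer: it only knows plain Content."""
    if type(content) is not Content:
        raise TypeError(f"cannot serialize {type(content).__name__}")
    return {
        "sha1": content.sha1.hex(),
        "length": content.length,
        "data": content.data,
    }


def content_add(contents: List[Content]) -> List[Dict[str, object]]:
    """Stand-in for a client endpoint: materialize, then serialize."""
    return [to_wire(c.with_data()) for c in contents]


payload = b"hello"
lazy = LazyContent(
    sha1=hashlib.sha1(payload).digest(),
    length=len(payload),
    loader=lambda: payload,
)
sent = content_add([lazy])
```

In this sketch, passing `lazy` straight to `to_wire` raises a `TypeError`, which is roughly the situation a missing with_data() call would produce; going through `content_add` loads the bytes at the last moment and sends a plain object.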

In D2737#65379, @olasd wrote:

Currently, the storage RPC client should materialize lazy content objects by calling with_data before they're passed down the wire.

This needs a version of swh.core containing the following commit: rDCORE041728f8e4c9

That's introduced in swh.core 0.0.94, this build was done with 0.0.93. We need to tighten our dependencies here.

The "lazy Content objects" were introduced when refactoring the swh.model.from_disk module.
...

Thanks for the thorough explanation.
I kinda followed that from afar.
And did not grasp the relation ;)

That's introduced in swh.core 0.0.94, this build was done with 0.0.93. We need to tighten our dependencies here.

Yes, indeed. I did fix the core, but I must have triggered the storage build too soon
(without updating the requirements, indeed ;)

Closing this now and fixing the build appropriately ;)