Page MenuHomeSoftware Heritage

loader: Stop materializing full lists of objects to be stored.
ClosedPublic

Authored by vlorentz on Sep 17 2020, 2:31 PM.

Details

Summary

Since rDLDBASE43728c596498979cd5083b61e93360b4c2071c31, store_data consumes the entire iterator
of contents, and since 3b97703d7f14e145d6124f1c61f5f283ee8eecf2, it does the same for
other object types.

This causes all the (new) objects of the loaded repository to be loaded
in memory at the same time before being sent to the storage, which can
cause OOM errors.

Instead, with this commit, objects are added one by one to the storage,
which restores the lazy behavior we had before these two commits using
the buffered storage proxy.

Resolves T2373.

Event Timeline

Build is green

Patch application report for D3976 (id=14010)

Rebasing onto fbe906c0c9...

First, rewinding head to replay your work on top of it...
Applying: loader: Stop materializing full lists of objects to be stored.
Changes applied before test
commit dbf4271aad183f96a22843dc7c5ae84ce5f72ce9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Sep 17 14:31:22 2020 +0200

    loader: Stop materializing full lists of objects to be stored.
    
    Since 43728c596498979cd5083b61e93360b4c2071c31, store_data consumes the entire iterator
    of contents, and since 3b97703d7f14e145d6124f1c61f5f283ee8eecf2, it does the same for
    other object types.
    
    This causes all the (new) objects of the loaded repository to be loaded
    in memory at the same time before being sent to the storage, which can
    cause OOM errors.
    
    Instead, with this commit, objects are added one by one to the storage,
    which restores the lazy behavior we had before these two commits using
    the buffered storage proxy.

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/294/ for more details.

I really hate this, but save from making *_add able to take generators as arguments again, this is probably the most decent way out.

Begrudgingly giving this an Accept.

This definitely needs a comment in the code, because I'm sure the next person reading this code will be very confused.

This revision is now accepted and ready to land.Sep 17 2020, 4:00 PM
This revision was landed with ongoing or failed builds.Sep 17 2020, 6:16 PM
This revision was automatically updated to reflect the committed changes.

Build is green

Patch application report for D3976 (id=14028)

Rebasing onto 452fa224f9...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-297-D3976.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/297/ for more details.

Build is green

Patch application report for D3976 (id=14029)

Rebasing onto 452fa224f9...

First, rewinding head to replay your work on top of it...
Fast-forwarded diff-target to base-revision-298-D3976.
Changes applied before test

See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/298/ for more details.