Return an accurate summary from buffer's flush() method
Closed, Public

Authored by olasd on Feb 4 2021, 3:26 PM.

Details

Summary

The earlier implementation would only return summary data from keys that
existed in the last _add backend method run, rather than collating all
the results.
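
To illustrate the difference described above, here is a minimal sketch of one way such a bug can arise, next to a collating version. It is not the code from `swh/storage/buffer.py`: the function names and the summary keys used in the example (`content:add`, `directory:add`) are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable

Summary = Dict[str, int]


def summarize_last_keys_only(results: Iterable[Summary]) -> Summary:
    """Hypothetical buggy pattern: the summary is rebuilt from each
    result's keys, so keys missing from the last result are lost."""
    summary: Summary = {}
    for stats in results:
        # Only keys present in `stats` survive this rebinding.
        summary = {key: count + summary.get(key, 0) for key, count in stats.items()}
    return summary


def summarize_collated(results: Iterable[Summary]) -> Summary:
    """Collating pattern: counts from every result are accumulated."""
    summary: Counter = Counter()
    for stats in results:
        # Counter.update adds counts instead of replacing them.
        summary.update(stats)
    return dict(summary)


# Illustrative per-object-type results, as the backend `_add` calls might return them.
results = [
    {"content:add": 2},
    {"directory:add": 1},
]
lossy = summarize_last_keys_only(results)
accurate = summarize_collated(results)
```

In this sketch, `lossy` only holds the `directory:add` key, while `accurate` holds both keys.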

Diff Detail

Repository
rDSTO Storage manager
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D5018 (id=17890)

Could not rebase; Attempt merge onto 9a9f234e0a...

Updating 9a9f234e..8075b513
Fast-forward
 swh/storage/buffer.py            |  10 +-
 swh/storage/tests/test_buffer.py | 199 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 181 insertions(+), 28 deletions(-)
Changes applied before test
commit 8075b513631d2e40949c2c73d8813bda2842afd7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Feb 4 14:24:50 2021 +0100

    Return an accurate summary from buffer's flush() method
    
    The earlier implementation would only return summary data from keys that
    existed in the last `_add` backend method run, rather than collating all
    the results.

commit c63ea0d5bcf1138dbbab244ea1a6376591126260
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Feb 4 09:57:50 2021 +0100

    buffer: ensure objects are added in topological order
    
    This new integration test checks that, when flushing the buffer storage,
    the addition functions of the underlying storage backend are called in
    topological order (content, directory, revision, release then snapshot).
    
    This reduces the probability of "data consistency" regressions caused by
    the use of the buffering storage proxy alone.

commit 5b3e6c9f70f94f4ae7440de453fd75cd1d3bbeb6
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Feb 4 09:56:03 2021 +0100

    buffer: add support for snapshots
    
    This is mostly a consistency addition, considering that most (if not
    all) loaders will only add a single snapshot.
    
    The common pattern of loading objects in topological order (content >
    directory > revision > release > snapshot), then flushing the storage,
    is now fully consistent; without this addition, the snapshot addition
    would reach the backend storage before all other objects are added,
    leading to potential inconsistencies if the flush of other object types
    fails.

commit 18967ed4a5f2eb67c57ae34ed913073b45bedc90
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Feb 4 10:14:59 2021 +0100

    buffer: add type annotations for tests

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1125/ for more details.
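
The commit messages above describe flushing buffered objects in topological order (content, directory, revision, release, then snapshot). A rough sketch of that idea, with a recording backend of the kind an integration test might use, could look like the following. `ToyBuffer`, `RecordingBackend` and their methods are hypothetical names made up for this example, not the actual swh.storage classes or test code.

```python
from typing import Dict, List

# Flush order taken from the commit message above.
OBJECT_TYPES = ("content", "directory", "revision", "release", "snapshot")


class RecordingBackend:
    """Hypothetical backend stand-in that records the order of add calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def add(self, object_type: str, objects: List[str]) -> Dict[str, int]:
        self.calls.append(object_type)
        return {f"{object_type}:add": len(objects)}


class ToyBuffer:
    """Hypothetical buffering proxy: holds objects until flush() is called."""

    def __init__(self, backend: RecordingBackend) -> None:
        self.backend = backend
        self._pending: Dict[str, List[str]] = {t: [] for t in OBJECT_TYPES}

    def add(self, object_type: str, objects: List[str]) -> None:
        self._pending[object_type].extend(objects)

    def flush(self) -> Dict[str, int]:
        summary: Dict[str, int] = {}
        # Iterate in topological order, regardless of the order of add() calls.
        for object_type in OBJECT_TYPES:
            batch = self._pending[object_type]
            if not batch:
                continue
            for key, count in self.backend.add(object_type, batch).items():
                summary[key] = summary.get(key, 0) + count
            self._pending[object_type] = []
        return summary


backend = RecordingBackend()
buffer = ToyBuffer(backend)
# Objects are deliberately added out of order.
buffer.add("snapshot", ["snp1"])
buffer.add("content", ["c1", "c2"])
buffer.add("revision", ["rev1"])
flush_summary = buffer.flush()
```

Because the snapshot is buffered like every other object type in this sketch, it only reaches the backend after the content and revision batches, which is the consistency property the snapshot commit describes.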

olasd requested review of this revision. Feb 4 2021, 3:31 PM
This revision is now accepted and ready to land. Feb 4 2021, 4:44 PM

Rebase and factor out the summary update function

Build is green

Patch application report for D5018 (id=17901)

Could not rebase; Attempt merge onto 9a9f234e0a...

Updating 9a9f234e..1526107b
Fast-forward
 swh/storage/buffer.py            |  14 +++-
 swh/storage/tests/test_buffer.py | 157 ++++++++++++++++++++++++++++++++-------
 2 files changed, 143 insertions(+), 28 deletions(-)
Changes applied before test
commit 1526107b12eb2b3f607edcb3558ee12648d3f154
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Feb 4 14:24:50 2021 +0100

    Return an accurate summary from buffer's flush() method
    
    The earlier implementation would only return summary data from keys that
    existed in the last `_add` backend method run, rather than collating all
    the results.

commit 5b3e6c9f70f94f4ae7440de453fd75cd1d3bbeb6
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Feb 4 09:56:03 2021 +0100

    buffer: add support for snapshots
    
    This is mostly a consistency addition, considering that most (if not
    all) loaders will only add a single snapshot.
    
    The common pattern of loading objects in topological order (content >
    directory > revision > release > snapshot), then flushing the storage,
    is now fully consistent; without this addition, the snapshot addition
    would reach the backend storage before all other objects are added,
    leading to potential inconsistencies if the flush of other object types
    fails.

commit 18967ed4a5f2eb67c57ae34ed913073b45bedc90
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Feb 4 10:14:59 2021 +0100

    buffer: add type annotations for tests

See https://jenkins.softwareheritage.org/job/DSTO/job/tests-on-diff/1129/ for more details.