Differential D6693

Add support for a redis-based reporter for failed replayed objects
ClosedPublic
Actions

Authored by douardda on Nov 26 2021, 1:33 PM.

Details

Reviewers

vlorentz

Group Reviewers

Reviewers

Maniphest Tasks

T3693: Provide a mecanism to report (with persistence) objects that fails to get replayed (mirror)

Commits

rDOBJSRPLf5051ce1cbf4: Add support for a redis-based reporter for failed replayed objects

Summary

this is not an ideal solution since it's using a global variable
(swh.objstorage.replayer.replay.REPORTER) to activate the replaying
error reporting feature, but according the nature of tenacity (used to
implement retry capacity of interactions with the source and destination
objstorages), there is no easy solution (that I can see).

Depends on D6692
Related to T3693

Diff Detail

Repository

rDOBJSRPL Content replayer

Branch

master

Lint

Lint Skipped

Unit

Unit Tests Skipped

Build Status

Buildable 25312
Build 39567: Phabricator diff pipeline on jenkins	Jenkins console · Jenkins
Build 39566: arc lint + arc unit

Event Timeline

douardda created this revision.Nov 26 2021, 1:33 PM

Herald added a reviewer: Reviewers. · View Herald TranscriptNov 26 2021, 1:33 PM

douardda edited the summary of this revision. (Show Details)Nov 26 2021, 1:33 PM

douardda added a task: T3693: Provide a mecanism to report (with persistence) objects that fails to get replayed (mirror).

Build is green

Patch application report for D6693 (id=24313)

Could not rebase; Attempt merge onto 720b7af2bf...

Updating 720b7af..10f620a
Fast-forward
 mypy.ini                                  |   3 +
 requirements-test.txt                     |   2 +
 requirements.txt                          |   1 +
 swh/objstorage/replayer/cli.py            |   5 +
 swh/objstorage/replayer/replay.py         | 111 ++++++++++++--------
 swh/objstorage/replayer/tests/test_cli.py | 167 ++++++++++++++++++++++++------
 6 files changed, 216 insertions(+), 73 deletions(-)

Changes applied before test

commit 10f620acc13d9490f44e282329d84d8568483f90
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Nov 25 18:05:01 2021 +0100

    Add support for a redis-based reporter for failed replayed objects
    
    this is not an ideal solution since it's using a global variable
    (swh.objstorage.replayer.replay.REPORTER) to activate the replaying
    error reporting feature, but according the nature of tenacity (used to
    implement retry capacity of interactions with the source and destination
    objstorages), there is no easy solution (that I can see).

commit cbb915109c82a63d98c4006e19dbc3d2bfa572a5
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Nov 25 18:02:54 2021 +0100

    Add doctrings and comments in test_cli.py

commit 604e00e30439ba5bd9dce24fd505637da81dd75f
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Nov 25 17:49:00 2021 +0100

    Rework the retry and reporting system in replay.py
    
    split the 'copy_object()' in 2 parts, 'get_object()' and 'put_object()'
    and make each of these decorated by '@content_replay_retry' (instead of
    the whole 'copy_object()'), so that a failing 'put_object' does not
    trigger getting the object from the src objstorage again.
    
    Also, only log in statsd the end result (error or success) instead of
    logging each attempt. We don't need these intermediate results in
    statsd, and it makes them much harder to use in a dashboard.
    
    Last, for the sake of implementation ease, use function name as
    "operation" tag in stastds reports (so "get_object", "put_object" and
    "obj_in_storage") instead of "copy" and "in_dst".
    
    Doing so, the 'operation' argument of the ReplayError exception has been
    dropped.

commit 002443567791de9ffef1a2b443eb7471d120bebe
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Nov 25 17:36:27 2021 +0100

    Small code refactoring in test_cli
    
    make '_fill_objstorage_and_kafka()' take the actual objstorage as
    argument, instead of the dict of objstorages ("src", "dst"). The "dst"
    objstorage is not used in the function, and it make the intent of this
    later clearer.

See https://jenkins.softwareheritage.org/job/DOBJSRPL/job/tests-on-diff/17/ for more details.

Harbormaster completed remote builds in B25187: Diff 24313.Nov 26 2021, 1:35 PM

douardda requested review of this revision.Nov 26 2021, 1:35 PM

there is no easy solution (that I can see).

There is one: moving all functions in replay.py to their own class, and set tenacity when instantiating the class. But I don't think it's worth it.

This revision is now accepted and ready to land.Nov 29 2021, 10:54 AM

rebase

Build is green

Patch application report for D6693 (id=24435)

Could not rebase; Attempt merge onto 720b7af2bf...

Updating 720b7af..f5051ce
Fast-forward
 mypy.ini                                     |   3 +
 requirements-test.txt                        |   2 +
 requirements.txt                             |   1 +
 swh/objstorage/replayer/cli.py               |   5 +
 swh/objstorage/replayer/replay.py            | 112 +++++++++++-------
 swh/objstorage/replayer/tests/test_cli.py    | 167 ++++++++++++++++++++++-----
 swh/objstorage/replayer/tests/test_statsd.py | 119 +++++++++++++++++++
 7 files changed, 336 insertions(+), 73 deletions(-)
 create mode 100644 swh/objstorage/replayer/tests/test_statsd.py

Changes applied before test

commit f5051ce1cbf4f8e5a73c57a38c689ea69d8c282e
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Nov 25 18:05:01 2021 +0100

    Add support for a redis-based reporter for failed replayed objects
    
    this is not an ideal solution since it's using a global variable
    (swh.objstorage.replayer.replay.REPORTER) to activate the replaying
    error reporting feature, but according the nature of tenacity (used to
    implement retry capacity of interactions with the source and destination
    objstorages), there is no easy solution (that I can see).

commit a7bd6bc4427bf35741c8953751ec1330f91445df
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Nov 25 18:02:54 2021 +0100

    Add doctrings and comments in test_cli.py

commit 8098798820bb2025f9bfc7c6b2b46e08a818b797
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Nov 25 17:49:00 2021 +0100

    Rework the retry and reporting system in replay.py
    
    split the 'copy_object()' in 2 parts, 'get_object()' and 'put_object()'
    and make each of these decorated by '@content_replay_retry' (instead of
    the whole 'copy_object()'), so that a failing 'put_object' does not
    trigger getting the object from the src objstorage again.
    
    Also, only log in statsd the end result (error or success) instead of
    logging each attempt. We don't need these intermediate results in
    statsd, and it makes them much harder to use in a dashboard.
    
    Last, for the sake of implementation ease, use function name as
    "operation" tag in stastds reports (so "get_object", "put_object" and
    "obj_in_storage") instead of "copy" and "in_dst".
    
    Doing so, the 'operation' argument of the ReplayError exception has been
    dropped.

commit 1d8ea80c7d01847dba16343508b74a037a1da864
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Dec 1 17:28:56 2021 +0100

    Add tests for expected statsd reports during a content replay session

commit 002443567791de9ffef1a2b443eb7471d120bebe
Author: David Douard <david.douard@sdfa3.org>
Date:   Thu Nov 25 17:36:27 2021 +0100

    Small code refactoring in test_cli
    
    make '_fill_objstorage_and_kafka()' take the actual objstorage as
    argument, instead of the dict of objstorages ("src", "dst"). The "dst"
    objstorage is not used in the function, and it make the intent of this
    later clearer.

See https://jenkins.softwareheritage.org/job/DOBJSRPL/job/tests-on-diff/23/ for more details.

Harbormaster completed remote builds in B25312: Diff 24435.Dec 2 2021, 11:36 AM

Closed by commit rDOBJSRPLf5051ce1cbf4: Add support for a redis-based reporter for failed replayed objects (authored by douardda). · Explain WhyDec 2 2021, 1:12 PM

This revision was automatically updated to reflect the committed changes.

douardda added a commit: rDOBJSRPLf5051ce1cbf4: Add support for a redis-based reporter for failed replayed objects.