Page MenuHomeSoftware Heritage

Rework the replaying exception handling
ClosedPublic

Authored by douardda on Dec 7 2022, 4:30 PM.

Details

Summary

to make it a bit more robust and give more details on what's going on in
case of errors.

Add tests for those error situations.

Also improve logging in replay.py:

Add a few debug logging statements and improve the info-level summary
statement at the end of process_replay_objects_content(), and fix tests accordingly.

Diff Detail

Event Timeline

Build is green

Patch application report for D8939 (id=32209)

Rebasing onto 0c0fa5c733...

Current branch diff-target is up to date.
Changes applied before test
commit 82e3a9258a56a064acf05dd6cf2a6ffab90d907d
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Dec 7 16:21:15 2022 +0100

    Add tests for replay scenarios with add errors

commit 9156f6e1d5e68e195f7c98e6666102dfa61e2244
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Dec 7 16:19:11 2022 +0100

    Improve logging in replay.py
    
    Add a few debug logging statements and improve the info-level summary
    statement at the end of `process_replay_objects_content()`.
    
    Fix tests accordingly.

commit 062d603ebb5128354137beaaec35d92614d28eb4
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Nov 21 16:05:24 2022 +0100

    Rework the replaying exception handling
    
    to make it a bit more robust and give more details on what's going on in
    case of errors.
    
    Add tests for those error situations.

See https://jenkins.softwareheritage.org/job/DOBJSRPL/job/tests-on-diff/46/ for more details.

douardda edited the summary of this revision. (Show Details)

Could you use a logger instance, and add if logger.isEnabledFor(logging.DEBUG): before logger.debug statements that use hash_to_hex?

swh/objstorage/replayer/tests/test_replay_errors.py
1

2022

Could you use a logger instance,

what do you mean by "use a logger instance"?

and add if logger.isEnabledFor(logging.DEBUG): before logger.debug statements that use hash_to_hex?

sure

Could you use a logger instance,

what do you mean by "use a logger instance"?

and add if logger.isEnabledFor(logging.DEBUG): before logger.debug statements that use hash_to_hex?

sure

Actually, why would this be needed? I mean I don't expect has_to_hex() to be a fat cpu burner, do I?

Could you use a logger instance,

what do you mean by "use a logger instance"?

and add if logger.isEnabledFor(logging.DEBUG): before logger.debug statements that use hash_to_hex?

sure

Actually, why would this be needed? I mean I don't expect has_to_hex() to be a fat cpu burner, do I?

And hash_to_hex() is already lru_cached, so I don't see the true benefit of adding these if statements everywhere...

Build is green

Patch application report for D8939 (id=32252)

Rebasing onto 0c0fa5c733...

Current branch diff-target is up to date.
Changes applied before test
commit 11b1b233e2b113e622f256928ec7e6273f8e60d1
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Dec 7 16:21:15 2022 +0100

    Add tests for replay scenarios with add errors

commit b0989f7ed9d7771e81d8a8fd989d50b8fa4ce3c9
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Dec 7 16:19:11 2022 +0100

    Improve logging in replay.py
    
    Add a few debug logging statements and improve the info-level summary
    statement at the end of `process_replay_objects_content()`.
    
    Fix tests accordingly.

commit a3e366823d810220dbfac2b24cf02fc98866ff2e
Author: David Douard <david.douard@sdfa3.org>
Date:   Mon Nov 21 16:05:24 2022 +0100

    Rework the replaying exception handling
    
    to make it a bit more robust and give more details on what's going on in
    case of errors.
    
    Add tests for those error situations.

See https://jenkins.softwareheritage.org/job/DOBJSRPL/job/tests-on-diff/47/ for more details.

Could you use a logger instance,

what do you mean by "use a logger instance"?

I meant using the result of logging.getLogger() instead of logging directly, but you are already doing that. I don't see why I wrote that comment...

And hash_to_hex() is already lru_cached, so I don't see the true benefit of adding these if statements everywhere...

would it be called if not for logging, though?

Actually, why would this be needed? I mean I don't expect has_to_hex() to be a fat cpu burner, do I?

Heh you're right, it's only ~0.3µs on cache miss

This revision is now accepted and ready to land.Dec 15 2022, 10:04 AM