Ensure origins are not visited faster than twice a day
Closed, Public

Authored by olasd on Oct 25 2022, 3:52 PM.

Details

Summary

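The scheduled_cooldown only applies to tasks that have been scheduled but not
yet executed. The new absolute_cooldown also covers origins whose visit has
already run, so that no origin is archived more often than that interval
allows (twice a day).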
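
The interplay of the two cooldowns can be sketched roughly as follows. This is an illustration, not the code in this diff: the helper name `can_schedule_visit`, its arguments and the 7-day scheduled_cooldown default are assumptions, and the 12-hour absolute_cooldown default is only inferred from "twice a day" in the title.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def can_schedule_visit(
    now: datetime,
    last_scheduled: Optional[datetime],
    last_visit: Optional[datetime],
    # Assumed placeholder value, not taken from the diff.
    scheduled_cooldown: timedelta = timedelta(days=7),
    # Inferred from "not visited faster than twice a day".
    absolute_cooldown: timedelta = timedelta(hours=12),
) -> bool:
    """Hypothetical helper: may this origin be scheduled for a visit at `now`?"""
    # scheduled_cooldown: only relevant while a scheduled visit has not run yet.
    visit_pending = last_scheduled is not None and (
        last_visit is None or last_visit < last_scheduled
    )
    if visit_pending and now - last_scheduled < scheduled_cooldown:
        return False
    # absolute_cooldown: applies even when the previous visit did run.
    if last_visit is not None and now - last_visit < absolute_cooldown:
        return False
    return True


now = datetime(2022, 10, 25, 16, 0, tzinfo=timezone.utc)
# Visited 30 minutes ago: only the absolute cooldown blocks this origin.
recently_visited = can_schedule_visit(
    now,
    last_scheduled=now - timedelta(hours=1),
    last_visit=now - timedelta(minutes=30),
)
```

Without the second check, an origin whose visit completed shortly after being scheduled would be eligible again immediately, which appears to be the gap this revision closes.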

Test Plan

Tests need to be written; I expect this to fail testing.

Diff Detail

Repository
rDSCH Scheduling utilities
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D8770 (id=31619)

Could not rebase; Attempt merge onto aeb870a700...

Updating aeb870a..719e53b
Fast-forward
 swh/scheduler/backend.py                         | 10 ++++++++++
 swh/scheduler/celery_backend/config.py           |  4 ++--
 swh/scheduler/celery_backend/recurrent_visits.py |  6 ++++++
 swh/scheduler/interface.py                       |  4 +++-
 4 files changed, 21 insertions(+), 3 deletions(-)
Changes applied before test
commit 719e53b9e49bb9e12a218ef5185b407dc98dfc48
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:48:55 2022 +0200

    Ensure origins are not visited faster than twice a day
    
    The scheduled_cooldown only applies to tasks that have not been executed
    yet. absolute_cooldown avoids archiving objects faster than that.

commit 5e3ecb339ed74fcbb59724dff3c7bb63217f16d3
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:47:37 2022 +0200

    Refresh task type data from the database every time recurrent tasks are run
    
    Avoids inconsistencies between the database state and an ongoing
    recurrent task scheduler.

commit bde27a9e4262373486ed8c3e48bfbfe5171a9432
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:46:26 2022 +0200

    Use json instead of msgpack for serializers
    
    Recent celery versions generate serialized messages with mime types incompatible
    with older versions when using msgpack

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/566/ for more details.
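
For context on the "Use json instead of msgpack for serializers" commit above, a switch of this kind might look like the sketch below. The setting keys are standard Celery configuration names; the dictionary name and the exact values are assumptions and are not copied from swh/scheduler/celery_backend/config.py.

```python
# Hypothetical settings fragment: the keys are standard Celery settings,
# the values are assumed rather than taken from the diff.
CELERY_SERIALIZER_SETTINGS = {
    # Serialize task messages as JSON instead of msgpack.
    "task_serializer": "json",
    # Serialize task results the same way.
    "result_serializer": "json",
    # Content types that workers agree to deserialize.
    "accept_content": ["json"],
}

# With a Celery application object `app`, these could be applied with:
#     app.conf.update(CELERY_SERIALIZER_SETTINGS)
```

Whether msgpack stays in `accept_content` during a transition period is a deployment choice that this sketch leaves out.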

olasd requested review of this revision. Oct 25 2022, 3:56 PM
This revision is now accepted and ready to land. Oct 25 2022, 4:05 PM

Add tests for the new absolute_cooldown

Build is green

Patch application report for D8770 (id=31623)

Could not rebase; Attempt merge onto aeb870a700...

Updating aeb870a..539374a
Fast-forward
 swh/scheduler/backend.py                         | 10 ++++++++++
 swh/scheduler/celery_backend/config.py           |  4 ++--
 swh/scheduler/celery_backend/recurrent_visits.py |  6 ++++++
 swh/scheduler/interface.py                       |  4 +++-
 swh/scheduler/tests/test_scheduler.py            | 17 +++++++++++++----
 5 files changed, 34 insertions(+), 7 deletions(-)
Changes applied before test
commit 539374a1e1edbf18453dfd1dfec9ee9e7259ca50
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:48:55 2022 +0200

    Ensure origins are not visited faster than twice a day
    
    The scheduled_cooldown only applies to tasks that have not been executed
    yet. absolute_cooldown avoids archiving objects faster than that.

commit 5e3ecb339ed74fcbb59724dff3c7bb63217f16d3
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:47:37 2022 +0200

    Refresh task type data from the database every time recurrent tasks are run
    
    Avoids inconsistencies between the database state and an ongoing
    recurrent task scheduler.

commit bde27a9e4262373486ed8c3e48bfbfe5171a9432
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:46:26 2022 +0200

    Use json instead of msgpack for serializers
    
    Recent celery versions generate serialized messages with mime types incompatible
    with older versions when using msgpack

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/567/ for more details.

Build is green

Patch application report for D8770 (id=31626)

Could not rebase; Attempt merge onto aeb870a700...

Updating aeb870a..ff75e74
Fast-forward
 swh/scheduler/backend.py                         | 10 ++++++++++
 swh/scheduler/celery_backend/config.py           |  4 ++--
 swh/scheduler/celery_backend/recurrent_visits.py | 11 +++++++++--
 swh/scheduler/interface.py                       |  4 +++-
 swh/scheduler/tests/test_scheduler.py            | 17 +++++++++++++----
 5 files changed, 37 insertions(+), 9 deletions(-)
Changes applied before test
commit ff75e742ee45e150d326f0007380dd1d1b45859d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:48:55 2022 +0200

    Ensure origins are not visited faster than twice a day
    
    The scheduled_cooldown only applies to tasks that have not been executed
    yet. absolute_cooldown avoids archiving objects faster than that.

commit 1f9109fa4d66f42fcc67bd6ba06c91d1eeffedec
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:47:37 2022 +0200

    Refresh task type data from the database every time recurrent tasks are run
    
    Avoids inconsistencies between the database state and an ongoing
    recurrent task scheduler.

commit bde27a9e4262373486ed8c3e48bfbfe5171a9432
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Tue Oct 25 15:46:26 2022 +0200

    Use json instead of msgpack for serializers
    
    Recent celery versions generate serialized messages with mime types incompatible
    with older versions when using msgpack

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/569/ for more details.