recurrent_visits: Allow to set no origins scheduled backoff in config
Closed · Public

Authored by anlambert on Sep 14 2022, 4:27 PM.

Details

Summary

The send_visits_for_visit_type function uses a default schedule backoff
of 20 minutes when there are no origins to schedule for a given visit
type.

There are use cases where we would like that schedule backoff to be
shorter in order to schedule listed origins for loading into the
archive more rapidly, typically in the docker environment.

So allow that backoff value to be set through configuration.

The purpose of this diff is to schedule loading tasks for listed
origins in the docker environment without having to restart the
swh-scheduler-schedule-recurrent service.
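
To illustrate the idea described above, here is a minimal sketch of reading such a backoff from a configuration mapping with a fallback to the 20-minute default. It is not the code from this diff: the helper name `get_no_origins_scheduled_backoff`, the config key `no_origins_scheduled_backoff`, and the choice of seconds as the unit are assumptions made for illustration. Only the constant name (taken from the review comment) and the 20-minute default come from this page.

```python
from typing import Any, Mapping

# Default from the summary: 20 minutes. Expressing it in seconds is an assumption.
DEFAULT_NO_ORIGINS_SCHEDULED_BACKOFF = 20 * 60


def get_no_origins_scheduled_backoff(config: Mapping[str, Any]) -> int:
    """Return the backoff to apply when no origins were scheduled.

    The "no_origins_scheduled_backoff" key is a hypothetical name used
    for this sketch; the real configuration key may differ.
    """
    value = config.get(
        "no_origins_scheduled_backoff", DEFAULT_NO_ORIGINS_SCHEDULED_BACKOFF
    )
    backoff = int(value)
    if backoff < 0:
        raise ValueError("backoff must be non-negative")
    return backoff
```

With an empty configuration the helper returns the default, while a docker-oriented configuration could set a much smaller value (for example 30) so that newly listed origins are picked up quickly.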

Test Plan

I did what I could to test this but things are hard to mock here
as it involves threading.
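
One possible way to make a backoff loop of this kind easier to exercise in tests is to wait on a `threading.Event` with a timeout and pass in a very small backoff. The sketch below is an assumption about how that could look, not the structure of the real code: `run_until_stopped` and `schedule_once` are hypothetical names.

```python
import threading
from typing import Callable


def run_until_stopped(
    schedule_once: Callable[[], int],
    backoff: float,
    stop: threading.Event,
) -> int:
    """Call schedule_once repeatedly, backing off when nothing was scheduled.

    Returns the number of iterations performed before stop was set.
    """
    iterations = 0
    while not stop.is_set():
        iterations += 1
        if schedule_once() == 0:
            # Event.wait returns True as soon as stop is set, so a shutdown
            # request does not have to wait for the full backoff to elapse.
            if stop.wait(timeout=backoff):
                break
    return iterations
```

A test can then supply a fake `schedule_once` that records its calls and sets the stop event after a few iterations, with a backoff of a few milliseconds instead of 20 minutes.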

Diff Detail

Repository
rDSCH Scheduling utilities
Branch
config-no-origins-scheduled-backoff
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 31532
Build 49326: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 49325: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D8475 (id=30531)

Rebasing onto 7cfaa986c2...

Current branch diff-target is up to date.
Changes applied before test
commit e086db1855f00f2212b7f605086ccffcc200a9a2
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Sep 14 16:18:51 2022 +0200

    recurrent_visits: Allow to set no origins scheduled backoff in config
    
    The send_visits_for_visit_type function uses a default schedule backoff
    of 20 minutes where there is no origins to schedule for a given visit
    type.
    
    It exists use cases when we would like that schedule backoff to be
    shorter in order to schedule listed origins for loading into the
    archive more rapidly, typically in the docker environment.
    
    So allow to set that backoff value through configuration.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/559/ for more details.

vlorentz added inline comments.
swh/scheduler/celery_backend/recurrent_visits.py, line 71:

should be renamed to DEFAULT_NO_ORIGINS_SCHEDULED_BACKOFF

This revision now requires changes to proceed. (Sep 15 2022, 9:02 AM)

Build is green

Patch application report for D8475 (id=30543)

Rebasing onto 7cfaa986c2...

Current branch diff-target is up to date.
Changes applied before test
commit b1afdab9200327cdff89a2377525317ae736bcb1
Author: Antoine Lambert <anlambert@softwareheritage.org>
Date:   Wed Sep 14 16:18:51 2022 +0200

    recurrent_visits: Allow to set no origins scheduled backoff in config
    
    The send_visits_for_visit_type function uses a default schedule backoff
    of 20 minutes where there is no origins to schedule for a given visit
    type.
    
    It exists use cases when we would like that schedule backoff to be
    shorter in order to schedule listed origins for loading into the
    archive more rapidly, typically in the docker environment.
    
    So allow to set that backoff value through configuration.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/560/ for more details.

This revision was not accepted when it landed; it landed in state Needs Review. (Sep 15 2022, 1:55 PM)
This revision was automatically updated to reflect the committed changes.