Add scheduling policy for already visited origins with known last update
Closed, Public

Authored by vlorentz on Jan 20 2021, 5:46 PM.

Details

Summary

This policy schedules origins by decreasing order of "visit lag" (that
is, origins with the most lag are scheduled first).

Related to T2444
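
To make the ordering concrete, here is a minimal sketch of the idea in plain Python. It is not the code from this diff, which implements the policy in the SQL query in `swh/scheduler/backend.py`. The `OriginState` record, its field names, and the definition of lag as `last_update - last_visit` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class OriginState:
    """Hypothetical record for this sketch; not the scheduler's data model."""

    url: str
    last_update: Optional[datetime]  # last known upstream change, if any
    last_visit: Optional[datetime]  # last visit performed, if any


def visit_lag(origin: OriginState) -> timedelta:
    """Assumed definition of lag: how far the upstream update is ahead of our visit."""
    assert origin.last_update is not None and origin.last_visit is not None
    return origin.last_update - origin.last_visit


def order_by_lag(origins: Iterable[OriginState], count: int) -> List[OriginState]:
    """Return up to `count` origins, those with the largest lag first."""
    # The policy only covers origins that were already visited and whose
    # last update is known; everything else is left to other policies.
    candidates = [
        o for o in origins if o.last_update is not None and o.last_visit is not None
    ]
    return sorted(candidates, key=visit_lag, reverse=True)[:count]
```

Origins with no recorded visit or no known last update are skipped here, which is where the separate policy for never visited origins would come in. Whether origins with zero or negative lag should also be excluded is a design choice this sketch leaves open.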

Diff Detail

Repository
rDSCH Scheduling utilities
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D4899 (id=17411)

Could not rebase; Attempt merge onto 898820fac5...

Updating 898820f..f1a890c
Fast-forward
 swh/scheduler/backend.py              |  82 +++++++++++++----
 swh/scheduler/tests/test_scheduler.py | 169 +++++++++++++++++++++++++++++-----
 2 files changed, 209 insertions(+), 42 deletions(-)
Changes applied before test
commit f1a890c13a90ea963ceff81d41c1456638bd7c90
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:29:16 2021 +0100

    Add scheduling policy for already visited origins with known last update
    
    This policy schedules origins by decreasing order of "visit lag" (that
    is, origins with the most lag are scheduled first).

commit a5ece5f6aa7a378062eeb6ab8e7c9b0faea35c11
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:25:46 2021 +0100

    Add scheduling policy for never visited origins
    
    This policy orders never visited origins by increasing date of last
    update (scheduling the "oldest" never visited origins first).

commit e03158823653265bd5ebcb60d7bcc67c0e8beb4e
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:23:03 2021 +0100

    Reorganize grab_next_visits tests to better check sorting behavior
    
     - factor out test setup and results checking
     - properly exercize corner cases of the oldest_scheduled_first policy

commit 8bab1ba37aebbb9921e73ffbb17a9cb25a94c264
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:17:17 2021 +0100

    Make the grab_next_visits sql query modular
    
    This will allow us to easily plug new scheduling policies in that
    function.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/198/ for more details.

Build is green

Patch application report for D4899 (id=17460)

Could not rebase; Attempt merge onto b641ac83eb...

Updating b641ac8..ea8bb15
Fast-forward
 swh/scheduler/backend.py              |  34 +++++++
 swh/scheduler/tests/test_scheduler.py | 178 ++++++++++++++++++++++++++++------
 2 files changed, 184 insertions(+), 28 deletions(-)
Changes applied before test
commit ea8bb155b917af2d641817eaa9aa100ee49a9cbf
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:29:16 2021 +0100

    Add scheduling policy for already visited origins with known last update
    
    This policy schedules origins by decreasing order of "visit lag" (that
    is, origins with the most lag are scheduled first).

commit ae71389ef603c6454d18e180c6c9019f99dcda9d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:25:46 2021 +0100

    Add scheduling policy for never visited origins
    
    This policy orders never visited origins by increasing date of last
    update (scheduling the "oldest" never visited origins first).

commit 6ea71f094c81df4d60a5d5872e4b60c2b4dc0f7c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:23:03 2021 +0100

    Reorganize grab_next_visits tests to better check sorting behavior
    
     - factor out test setup and results checking
     - properly exercize corner cases of the oldest_scheduled_first policy

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/216/ for more details.

Build has FAILED

Patch application report for D4899 (id=17472)

Could not rebase; Attempt merge onto af3789891f...

Updating af37898..2f47936
Fast-forward
 swh/scheduler/backend.py              |  34 +++++++
 swh/scheduler/tests/test_scheduler.py | 178 ++++++++++++++++++++++++++++------
 2 files changed, 184 insertions(+), 28 deletions(-)
Changes applied before test
commit 2f47936731cf438a5195978a2af3250597b693b5
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:29:16 2021 +0100

    Add scheduling policy for already visited origins with known last update
    
    This policy schedules origins by decreasing order of "visit lag" (that
    is, origins with the most lag are scheduled first).

commit acad712ad3f71f88f99e45e9b4f571ad751945dc
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:25:46 2021 +0100

    Add scheduling policy for never visited origins
    
    This policy orders never visited origins by increasing date of last
    update (scheduling the "oldest" never visited origins first).

commit 03460207a17d82635ef5a6f12358392143eb9eef
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Wed Jan 20 17:23:03 2021 +0100

    Reorganize grab_next_visits tests to better check sorting behavior
    
     - factor out test setup and results checking
     - properly exercize corner cases of the oldest_scheduled_first policy

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/220/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/220/console

Build has FAILED

Patch application report for D4899 (id=17472)

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/223/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/223/console

Build is green

Patch application report for D4899 (id=17472)

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/227/ for more details.

This revision is now accepted and ready to land. (Jan 22 2021, 10:58 AM)