
Introduce new scheduling policy to grab origins without last update
Closed, Public

Authored by olasd on Jul 1 2021, 12:34 PM.

Diff Detail

Repository
rDSCH Scheduling utilities
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

swh/scheduler/backend.py
435

we probably want those joins to be computed as well now.

Build has FAILED

Patch application report for D5956 (id=21386)

Could not rebase; Attempt merge onto 1006f0aee4...

Updating 1006f0a..97d7828
Fast-forward
 sql/updates/29.sql                         |  27 +++++++
 swh/scheduler/backend.py                   |  57 ++++++++++---
 swh/scheduler/interface.py                 |  21 +++++
 swh/scheduler/journal_client.py            |  98 ++++++++++++++++++++---
 swh/scheduler/model.py                     |   8 ++
 swh/scheduler/sql/30-schema.sql            |  21 ++++-
 swh/scheduler/tests/test_api_client.py     |   2 +
 swh/scheduler/tests/test_journal_client.py | 115 ++++++++++++++++++---------
 swh/scheduler/tests/test_scheduler.py      | 123 ++++++++++++++++++++++++++---
 9 files changed, 403 insertions(+), 69 deletions(-)
 create mode 100644 sql/updates/29.sql
Changes applied before test
commit 97d7828110f5df2160084090c831ade723524c0f
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 1 12:18:49 2021 +0200

    Introduce new scheduling policy to grab origins without last update
    
    Related to T2345

commit f3d182b0d38ca6c617805bf2011d2001a0c8bb6c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 16:00:01 2021 +0200

    journal_client: Compute next position for origin visit
    
    For origin without any last_update information [1], the journal client is now also in
    charge of moving their next position in the queue for rescheduling. Depending on their
    status, the next position offset and next_visit_queue_position are updated after each
    visit completes:
    
    - if the visit has failed, increase the next visit target by the minimal visit
      interval (to take into account transient loading issues)
    - if the visit is successful, and records some changes, decrease the visit interval
      index by 2 (visit the origin *way* more often).
    - if the visit is successful, and records no changes, increase the visit interval index
      by 1 (visit the origin less often).
    
    We then set the next visit target to its current value + the new visit interval
    multiplied by a random fudge factor (picked in the -/+ 10% range).
    
    The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins
    e.g. when a number of origins from a single hoster are processed at once.
    
    Note that the computations happen for all origins for simplicity and code maintenance
    but it will only be used by a new soon-to-be scheduling policy.
    
    [1] Lister cannot provide it for some reason.

commit cb1edf1ab24d1c8db5821578a7fb2633fab50ff4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 18:07:59 2021 +0200

    Introduce storage for the recurrent visit scheduler queue position

commit ec6e69f6415a007611c46f25e7c48e909a793d53
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:42:26 2021 +0200

    Start handling of recurrent loading tasks in scheduler
    
    This deals first and foremost with the next_position_offset update done by the scheduler
    journal client.

commit c486b28ece7c0b127fea10bbb4d7f5d1ad5c50ba
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 14:41:07 2021 +0200

    journal_client: Explicit docstring

commit 98f99b9fd457820dc2d4b5dab7e89cb8261a34a4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:39:40 2021 +0200

    journal_client: Only check last_* fields for some permutation tests
    
    In a future commit, we will add new fields whose values will be permutation dependent.

Link to build: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/400/
See console output for more information: https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/400/console
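
Note on the "journal_client: Compute next position for origin visit" commit message above: the rescheduling rules it lists can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this diff. The interval table `VISIT_INTERVALS`, the function name `reschedule`, its parameters, and the clamping of the index to the table bounds are all hypothetical. The commit message does not say whether the fudge factor also applies to failed visits; this sketch assumes it does.

```python
import random
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Hypothetical interval table; the commit message does not give the real
# mapping from visit interval index to visit interval.
VISIT_INTERVALS = [timedelta(days=d) for d in (1, 2, 4, 7, 14, 28)]


def reschedule(
    current_target: datetime,
    interval_index: int,
    visit_failed: bool,
    has_changes: bool,
    fudge_factor: Optional[float] = None,
) -> Tuple[datetime, int]:
    """Return (next visit target, new interval index) after a visit."""
    if visit_failed:
        # Failed visit: retry after the minimal interval, index unchanged.
        interval = VISIT_INTERVALS[0]
    else:
        # Changes recorded: index - 2 (visit more often).
        # No changes recorded: index + 1 (visit less often).
        step = -2 if has_changes else 1
        # Assumption: keep the index within the bounds of the table.
        interval_index = min(
            max(interval_index + step, 0), len(VISIT_INTERVALS) - 1
        )
        interval = VISIT_INTERVALS[interval_index]
    if fudge_factor is None:
        # Random factor in the -/+ 10% range, to spread visits out.
        fudge_factor = random.uniform(0.9, 1.1)
    return current_target + interval * fudge_factor, interval_index


if __name__ == "__main__":
    start = datetime(2021, 7, 1, tzinfo=timezone.utc)
    print(reschedule(start, 3, visit_failed=False, has_changes=True))
```

Passing an explicit `fudge_factor` makes the computation deterministic, which may help when writing permutation-dependent tests like the ones mentioned in the earlier commit.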

Harbormaster returned this revision to the author for changes because remote builds failed. (Jul 1 2021, 12:37 PM)
Harbormaster failed remote builds in B22365: Diff 21386!

Build is green

Patch application report for D5956 (id=21403)

Could not rebase; Attempt merge onto 1006f0aee4...

Updating 1006f0a..cebd184
Fast-forward
 sql/updates/29.sql                         |  27 +++
 swh/scheduler/backend.py                   |  57 ++++-
 swh/scheduler/interface.py                 |  21 ++
 swh/scheduler/journal_client.py            |  98 ++++++++-
 swh/scheduler/model.py                     |   8 +
 swh/scheduler/sql/30-schema.sql            |  21 +-
 swh/scheduler/tests/test_api_client.py     |   2 +
 swh/scheduler/tests/test_journal_client.py | 336 ++++++++++++++++++-----------
 swh/scheduler/tests/test_scheduler.py      | 123 +++++++++--
 9 files changed, 533 insertions(+), 160 deletions(-)
 create mode 100644 sql/updates/29.sql
Changes applied before test
commit cebd1842e1be4738c280125ea0bbeb30bf40180a
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 1 12:18:49 2021 +0200

    Introduce new scheduling policy to grab origins without last update
    
    Related to T2345

commit faac6f895e0bd3565441c48c5ad3207ce141e8cb
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 16:00:01 2021 +0200

    journal_client: Compute next position for origin visit
    
    For origin without any last_update information [1], the journal client is now also in
    charge of moving their next position in the queue for rescheduling. Depending on their
    status, the next position offset and next_visit_queue_position are updated after each
    visit completes:
    
    - if the visit has failed, increase the next visit target by the minimal visit
      interval (to take into account transient loading issues)
    - if the visit is successful, and records some changes, decrease the visit interval
      index by 2 (visit the origin *way* more often).
    - if the visit is successful, and records no changes, increase the visit interval index
      by 1 (visit the origin less often).
    
    We then set the next visit target to its current value + the new visit interval
    multiplied by a random fudge factor (picked in the -/+ 10% range).
    
    The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins
    e.g. when a number of origins from a single hoster are processed at once.
    
    Note that the computations happen for all origins for simplicity and code maintenance
    but it will only be used by a new soon-to-be scheduling policy.
    
    [1] Lister cannot provide it for some reason.

commit cb1edf1ab24d1c8db5821578a7fb2633fab50ff4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 18:07:59 2021 +0200

    Introduce storage for the recurrent visit scheduler queue position

commit ec6e69f6415a007611c46f25e7c48e909a793d53
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:42:26 2021 +0200

    Start handling of recurrent loading tasks in scheduler
    
    This deals first and foremost with the next_position_offset update done by the scheduler
    journal client.

commit c486b28ece7c0b127fea10bbb4d7f5d1ad5c50ba
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 14:41:07 2021 +0200

    journal_client: Explicit docstring

commit 98f99b9fd457820dc2d4b5dab7e89cb8261a34a4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:39:40 2021 +0200

    journal_client: Only check last_* fields for some permutation tests
    
    In a future commit, we will add new fields whose values will be permutation dependent.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/402/ for more details.

ardumont edited the test plan for this revision.
ardumont edited the summary of this revision.
swh/scheduler/tests/test_scheduler.py
1060 (On Diff #21403)

¯\_(ツ)_/¯

Build is green

Patch application report for D5956 (id=21407)

Could not rebase; Attempt merge onto 1006f0aee4...

Updating 1006f0a..cc41f0c
Fast-forward
 sql/updates/29.sql                         |  27 +++
 swh/scheduler/backend.py                   |  59 ++++-
 swh/scheduler/interface.py                 |  21 ++
 swh/scheduler/journal_client.py            |  98 ++++++++-
 swh/scheduler/model.py                     |   8 +
 swh/scheduler/sql/30-schema.sql            |  21 +-
 swh/scheduler/tests/test_api_client.py     |   2 +
 swh/scheduler/tests/test_journal_client.py | 336 ++++++++++++++++++-----------
 swh/scheduler/tests/test_scheduler.py      | 140 ++++++++++--
 9 files changed, 552 insertions(+), 160 deletions(-)
 create mode 100644 sql/updates/29.sql
Changes applied before test
commit cc41f0cd579011034e52c245738929fb77fe4a01
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 1 12:18:49 2021 +0200

    Introduce new scheduling policy to grab origins without last update
    
    Related to T2345

commit faac6f895e0bd3565441c48c5ad3207ce141e8cb
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 16:00:01 2021 +0200

    journal_client: Compute next position for origin visit
    
    For origin without any last_update information [1], the journal client is now also in
    charge of moving their next position in the queue for rescheduling. Depending on their
    status, the next position offset and next_visit_queue_position are updated after each
    visit completes:
    
    - if the visit has failed, increase the next visit target by the minimal visit
      interval (to take into account transient loading issues)
    - if the visit is successful, and records some changes, decrease the visit interval
      index by 2 (visit the origin *way* more often).
    - if the visit is successful, and records no changes, increase the visit interval index
      by 1 (visit the origin less often).
    
    We then set the next visit target to its current value + the new visit interval
    multiplied by a random fudge factor (picked in the -/+ 10% range).
    
    The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins
    e.g. when a number of origins from a single hoster are processed at once.
    
    Note that the computations happen for all origins for simplicity and code maintenance
    but it will only be used by a new soon-to-be scheduling policy.
    
    [1] Lister cannot provide it for some reason.

commit cb1edf1ab24d1c8db5821578a7fb2633fab50ff4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 18:07:59 2021 +0200

    Introduce storage for the recurrent visit scheduler queue position

commit ec6e69f6415a007611c46f25e7c48e909a793d53
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:42:26 2021 +0200

    Start handling of recurrent loading tasks in scheduler
    
    This deals first and foremost with the next_position_offset update done by the scheduler
    journal client.

commit c486b28ece7c0b127fea10bbb4d7f5d1ad5c50ba
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 14:41:07 2021 +0200

    journal_client: Explicit docstring

commit 98f99b9fd457820dc2d4b5dab7e89cb8261a34a4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:39:40 2021 +0200

    journal_client: Only check last_* fields for some permutation tests
    
    In a future commit, we will add new fields whose values will be permutation dependent.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/403/ for more details.

swh/scheduler/backend.py
431–443

Use the limit parameter in the call to the method.

433–434
435

Drop this join which is no longer needed with the previous comment.

475–477

This needs to be a template, conditioned on the right policy (the new one from this diff).

478

Build is green

Patch application report for D5956 (id=21518)

Could not rebase; Attempt merge onto 1006f0aee4...

Updating 1006f0a..e3fc744
Fast-forward
 sql/updates/29.sql                         |  27 ++
 swh/scheduler/backend.py                   |  57 +++-
 swh/scheduler/interface.py                 |  19 ++
 swh/scheduler/journal_client.py            | 136 +++++++++-
 swh/scheduler/model.py                     |   8 +
 swh/scheduler/sql/30-schema.sql            |  21 +-
 swh/scheduler/tests/test_api_client.py     |   2 +
 swh/scheduler/tests/test_journal_client.py | 411 ++++++++++++++++++++---------
 swh/scheduler/tests/test_scheduler.py      | 173 +++++++++++-
 9 files changed, 695 insertions(+), 159 deletions(-)
 create mode 100644 sql/updates/29.sql
Changes applied before test
commit e3fc744f5224c29863b621bcc44a26877b97cc99
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 1 12:18:49 2021 +0200

    Introduce new scheduling policy to grab origins without last update
    
    This is in charge of scheduling origins without last update. This also updates the
    global queue position so the journal client can initialize correctly the next position
    per origin and visit type.
    
    Related to T2345

commit 8c4ae9f14d6abdca41a4f01b438310501ecb6259
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 16:00:01 2021 +0200

    journal_client: Compute next position for origin visit
    
    For origin without any last_update information [1], the journal client is now also in
    charge of moving their next position in the queue for rescheduling. Depending on their
    status, the next position offset and next_visit_queue_position are updated after each
    visit completes:
    
    - if the visit has failed, increase the next visit target by the minimal visit
      interval (to take into account transient loading issues)
    - if the visit is successful, and records some changes, decrease the visit interval
      index by 2 (visit the origin *way* more often).
    - if the visit is successful, and records no changes, increase the visit interval index
      by 1 (visit the origin less often).
    
    We then set the next visit target to its current value + the new visit interval
    multiplied by a random fudge factor (picked in the -/+ 10% range).
    
    The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins
    e.g. when a number of origins from a single hoster are processed at once.
    
    Note that the computations happen for all origins for simplicity and code maintenance
    but it will only be used by a new soon-to-be scheduling policy.
    
    [1] Lister cannot provide it for some reason.

commit cb1edf1ab24d1c8db5821578a7fb2633fab50ff4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 18:07:59 2021 +0200

    Introduce storage for the recurrent visit scheduler queue position

commit ec6e69f6415a007611c46f25e7c48e909a793d53
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:42:26 2021 +0200

    Start handling of recurrent loading tasks in scheduler
    
    This deals first and foremost with the next_position_offset update done by the scheduler
    journal client.

commit c486b28ece7c0b127fea10bbb4d7f5d1ad5c50ba
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 14:41:07 2021 +0200

    journal_client: Explicit docstring

commit 98f99b9fd457820dc2d4b5dab7e89cb8261a34a4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:39:40 2021 +0200

    journal_client: Only check last_* fields for some permutation tests
    
    In a future commit, we will add new fields whose values will be permutation dependent.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/407/ for more details.

Build is green

Patch application report for D5956 (id=21519)

Could not rebase; Attempt merge onto 1006f0aee4...

Updating 1006f0a..474a337
Fast-forward
 sql/updates/29.sql                         |  27 ++
 swh/scheduler/backend.py                   |  43 ++-
 swh/scheduler/interface.py                 |  19 ++
 swh/scheduler/journal_client.py            | 136 +++++++++-
 swh/scheduler/model.py                     |   8 +
 swh/scheduler/sql/30-schema.sql            |  21 +-
 swh/scheduler/tests/test_api_client.py     |   2 +
 swh/scheduler/tests/test_journal_client.py | 411 ++++++++++++++++++++---------
 swh/scheduler/tests/test_scheduler.py      | 173 +++++++++++-
 9 files changed, 687 insertions(+), 153 deletions(-)
 create mode 100644 sql/updates/29.sql
Changes applied before test
commit 474a3379d53f876241c265fb6619c7dd3910199d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 1 12:18:49 2021 +0200

    Introduce new scheduling policy to grab origins without last update
    
    This is in charge of scheduling origins without last update. This also updates the
    global queue position so the journal client can initialize correctly the next position
    per origin and visit type.
    
    Related to T2345

commit 8c4ae9f14d6abdca41a4f01b438310501ecb6259
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 16:00:01 2021 +0200

    journal_client: Compute next position for origin visit
    
    For origin without any last_update information [1], the journal client is now also in
    charge of moving their next position in the queue for rescheduling. Depending on their
    status, the next position offset and next_visit_queue_position are updated after each
    visit completes:
    
    - if the visit has failed, increase the next visit target by the minimal visit
      interval (to take into account transient loading issues)
    - if the visit is successful, and records some changes, decrease the visit interval
      index by 2 (visit the origin *way* more often).
    - if the visit is successful, and records no changes, increase the visit interval index
      by 1 (visit the origin less often).
    
    We then set the next visit target to its current value + the new visit interval
    multiplied by a random fudge factor (picked in the -/+ 10% range).
    
    The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins
    e.g. when a number of origins from a single hoster are processed at once.
    
    Note that the computations happen for all origins for simplicity and code maintenance
    but it will only be used by a new soon-to-be scheduling policy.
    
    [1] Lister cannot provide it for some reason.

commit cb1edf1ab24d1c8db5821578a7fb2633fab50ff4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 18:07:59 2021 +0200

    Introduce storage for the recurrent visit scheduler queue position

commit ec6e69f6415a007611c46f25e7c48e909a793d53
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:42:26 2021 +0200

    Start handling of recurrent loading tasks in scheduler
    
    This deals first and foremost with the next_position_offset update done by the scheduler
    journal client.

commit c486b28ece7c0b127fea10bbb4d7f5d1ad5c50ba
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 14:41:07 2021 +0200

    journal_client: Explicit docstring

commit 98f99b9fd457820dc2d4b5dab7e89cb8261a34a4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:39:40 2021 +0200

    journal_client: Only check last_* fields for some permutation tests
    
    In a future commit, we will add new fields whose values will be permutation dependent.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/408/ for more details.

Revert one last unneeded change

Build is green

Patch application report for D5956 (id=21521)

Could not rebase; Attempt merge onto 1006f0aee4...

Updating 1006f0a..b02db7c
Fast-forward
 sql/updates/29.sql                         |  27 ++
 swh/scheduler/backend.py                   |  41 ++-
 swh/scheduler/interface.py                 |  19 ++
 swh/scheduler/journal_client.py            | 136 +++++++++-
 swh/scheduler/model.py                     |   8 +
 swh/scheduler/sql/30-schema.sql            |  21 +-
 swh/scheduler/tests/test_api_client.py     |   2 +
 swh/scheduler/tests/test_journal_client.py | 411 ++++++++++++++++++++---------
 swh/scheduler/tests/test_scheduler.py      | 173 +++++++++++-
 9 files changed, 686 insertions(+), 152 deletions(-)
 create mode 100644 sql/updates/29.sql
Changes applied before test
commit b02db7ce6222feeb5db7a7aff83a11c3a3697bd3
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 1 12:18:49 2021 +0200

    Introduce new scheduling policy to grab origins without last update
    
    This is in charge of scheduling origins without last update. This also updates the
    global queue position so the journal client can initialize correctly the next position
    per origin and visit type.
    
    Related to T2345

commit 8c4ae9f14d6abdca41a4f01b438310501ecb6259
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 16:00:01 2021 +0200

    journal_client: Compute next position for origin visit
    
    For origin without any last_update information [1], the journal client is now also in
    charge of moving their next position in the queue for rescheduling. Depending on their
    status, the next position offset and next_visit_queue_position are updated after each
    visit completes:
    
    - if the visit has failed, increase the next visit target by the minimal visit
      interval (to take into account transient loading issues)
    - if the visit is successful, and records some changes, decrease the visit interval
      index by 2 (visit the origin *way* more often).
    - if the visit is successful, and records no changes, increase the visit interval index
      by 1 (visit the origin less often).
    
    We then set the next visit target to its current value + the new visit interval
    multiplied by a random fudge factor (picked in the -/+ 10% range).
    
    The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins
    e.g. when a number of origins from a single hoster are processed at once.
    
    Note that the computations happen for all origins for simplicity and code maintenance
    but it will only be used by a new soon-to-be scheduling policy.
    
    [1] Lister cannot provide it for some reason.

commit cb1edf1ab24d1c8db5821578a7fb2633fab50ff4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 18:07:59 2021 +0200

    Introduce storage for the recurrent visit scheduler queue position

commit ec6e69f6415a007611c46f25e7c48e909a793d53
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:42:26 2021 +0200

    Start handling of recurrent loading tasks in scheduler
    
    This deals first and foremost with the next_position_offset update done by the scheduler
    journal client.

commit c486b28ece7c0b127fea10bbb4d7f5d1ad5c50ba
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 14:41:07 2021 +0200

    journal_client: Explicit docstring

commit 98f99b9fd457820dc2d4b5dab7e89cb8261a34a4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:39:40 2021 +0200

    journal_client: Only check last_* fields for some permutation tests
    
    In a future commit, we will add new fields whose values will be permutation dependent.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/409/ for more details.

swh/scheduler/backend.py
430

We discussed with olasd the possibility of making this an enum (but that's a refactoring for another time).

olasd added a reviewer: ardumont.

make the handling of CTEs more modular

Build is green

Patch application report for D5956 (id=21740)

Could not rebase; Attempt merge onto 1006f0aee4...

Updating 1006f0a..d58776a
Fast-forward
 sql/updates/29.sql                         |  27 ++
 swh/scheduler/backend.py                   | 116 +++++---
 swh/scheduler/interface.py                 |  19 ++
 swh/scheduler/journal_client.py            | 136 +++++++++-
 swh/scheduler/model.py                     |   8 +
 swh/scheduler/sql/30-schema.sql            |  21 +-
 swh/scheduler/tests/test_api_client.py     |   2 +
 swh/scheduler/tests/test_journal_client.py | 411 ++++++++++++++++++++---------
 swh/scheduler/tests/test_scheduler.py      | 173 +++++++++++-
 9 files changed, 730 insertions(+), 183 deletions(-)
 create mode 100644 sql/updates/29.sql
Changes applied before test
commit d58776ab0b41ccaf93cc64c86688712db5b44c07
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 22 12:22:24 2021 +0200

    Introduce new scheduling policy to grab origins without last update
    
    This is in charge of scheduling origins without last update. This also updates the
    global queue position so the journal client can initialize correctly the next position
    per origin and visit type.
    
    Related to T2345

commit 825e8cfe7d245d025c70384439d0f739b878eadd
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jul 22 12:19:42 2021 +0200

    grab_next_visits: make the handling of CTEs more modular
    
    This allows us to insert extra CTEs if a scheduling policy needs it.

commit 8c4ae9f14d6abdca41a4f01b438310501ecb6259
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 16:00:01 2021 +0200

    journal_client: Compute next position for origin visit
    
    For origin without any last_update information [1], the journal client is now also in
    charge of moving their next position in the queue for rescheduling. Depending on their
    status, the next position offset and next_visit_queue_position are updated after each
    visit completes:
    
    - if the visit has failed, increase the next visit target by the minimal visit
      interval (to take into account transient loading issues)
    - if the visit is successful, and records some changes, decrease the visit interval
      index by 2 (visit the origin *way* more often).
    - if the visit is successful, and records no changes, increase the visit interval index
      by 1 (visit the origin less often).
    
    We then set the next visit target to its current value + the new visit interval
    multiplied by a random fudge factor (picked in the -/+ 10% range).
    
    The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins
    e.g. when a number of origins from a single hoster are processed at once.
    
    Note that the computations happen for all origins for simplicity and code maintenance
    but it will only be used by a new soon-to-be scheduling policy.
    
    [1] Lister cannot provide it for some reason.

commit cb1edf1ab24d1c8db5821578a7fb2633fab50ff4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 18:07:59 2021 +0200

    Introduce storage for the recurrent visit scheduler queue position

commit ec6e69f6415a007611c46f25e7c48e909a793d53
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:42:26 2021 +0200

    Start handling of recurrent loading tasks in scheduler
    
    This deals first and foremost with the next_position_offset update done by the scheduler
    journal client.

commit c486b28ece7c0b127fea10bbb4d7f5d1ad5c50ba
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 14:41:07 2021 +0200

    journal_client: Explicit docstring

commit 98f99b9fd457820dc2d4b5dab7e89cb8261a34a4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:39:40 2021 +0200

    journal_client: Only check last_* fields for some permutation tests
    
    In a future commit, we will add new fields whose values will be permutation dependent.

See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/417/ for more details.
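
Note on the "grab_next_visits: make the handling of CTEs more modular" commit above: one way to let a scheduling policy insert extra CTEs is to keep them as a list of (name, body) pairs and render the WITH clause at the end. The sketch below only illustrates that idea and is not the code from this diff. The helper names, the policy name, the table names and the SQL bodies are all hypothetical placeholders, and the SQL is never executed here.

```python
from typing import List, Tuple

CTE = Tuple[str, str]  # (name, SQL body)


def render_query(ctes: List[CTE], final_statement: str) -> str:
    """Prefix final_statement with a WITH clause built from ctes, in order."""
    if not ctes:
        return final_statement
    rendered = ",\n".join(f"{name} AS (\n{body}\n)" for name, body in ctes)
    return f"WITH {rendered}\n{final_statement}"


def build_grab_query(policy: str) -> str:
    """Assemble a query, adding a policy-specific CTE when needed."""
    ctes: List[CTE] = [
        # Hypothetical base CTE shared by every policy.
        ("selected_origins", "SELECT url, visit_type FROM candidate_origins"),
    ]
    if policy == "hypothetical_no_last_update_policy":
        # Hypothetical extra CTE: only this policy touches the queue position.
        ctes.append(
            (
                "update_queue_position",
                "UPDATE queue_position SET position = now()",
            )
        )
    return render_query(ctes, "SELECT * FROM selected_origins")


if __name__ == "__main__":
    print(build_grab_query("hypothetical_no_last_update_policy"))
```

Keeping the CTEs in a list means a new policy only appends its own entry, and the code that renders the final query stays unchanged.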

This revision is now accepted and ready to land. (Jul 22 2021, 12:53 PM)