Details
- Reviewers
- vlorentz
- Group Reviewers
- Reviewers
- Maniphest Tasks
- T2345: Improve handling of recurrent loading tasks in scheduler
- Commits
- rDSCH8281e351d6a1: journal_client: Disable origins when too many visited attempts failed
tox
Diff Detail
- Repository
- rDSCH Scheduling utilities
- Branch
- master
- Lint
- No Linters Available
- Unit
- No Unit Test Coverage
- Build Status
- Buildable 22863
- Build 35659: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
- Build 35658: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D5980 (id=21553)
Could not rebase; Attempt merge onto 1006f0aee4...
Updating 1006f0a..b16fd82
Fast-forward
 sql/updates/29.sql | 31 ++
 swh/scheduler/backend.py | 41 ++-
 swh/scheduler/interface.py | 19 +
 swh/scheduler/journal_client.py | 198 ++++++++++-
 swh/scheduler/model.py | 10 +
 swh/scheduler/sql/30-schema.sql | 23 +-
 swh/scheduler/tests/test_api_client.py | 2 +
 swh/scheduler/tests/test_journal_client.py | 548 ++++++++++++++++++++++-------
 swh/scheduler/tests/test_scheduler.py | 173 ++++++++-
 9 files changed, 894 insertions(+), 151 deletions(-)
 create mode 100644 sql/updates/29.sql
Changes applied before test
commit b16fd8252e00ac81214f4ac218d55e7eaf9beb69
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 8 11:24:42 2021 +0200

    journal_client: Deactivate origins when too many visited attempts failed

    Related to T2345

commit 64aa4458ddd2bb5bd9a913c17950732d962129e6
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help when implementing the disabling of too many failed visit attempts for a given origin.

    Related to T2345

commit b02db7ce6222feeb5db7a7aff83a11c3a3697bd3
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 1 12:18:49 2021 +0200

    Introduce new scheduling policy to grab origins without last update

    This is in charge of scheduling origins without last update. This also updates the global queue position so the journal client can initialize correctly the next position per origin and visit type.

    Related to T2345

commit 8c4ae9f14d6abdca41a4f01b438310501ecb6259
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 16:00:01 2021 +0200

    journal_client: Compute next position for origin visit

    For origin without any last_update information [1], the journal client is now also in charge of moving their next position in the queue for rescheduling.

    Depending on their status, the next position offset and next_visit_queue_position are updated after each visit completes:

    - if the visit has failed, increase the next visit target by the minimal visit interval (to take into account transient loading issues)
    - if the visit is successful, and records some changes, decrease the visit interval index by 2 (visit the origin *way* more often).
    - if the visit is successful, and records no changes, increase the visit interval index by 1 (visit the origin less often).

    We then set the next visit target to its current value + the new visit interval multiplied by a random fudge factor (picked in the -/+ 10% range).

    The fudge factor allows the visits to spread out, avoiding "bursts" of loaded origins e.g. when a number of origins from a single hoster are processed at once.

    Note that the computations happen for all origins for simplicity and code maintenance but it will only be used by a new soon-to-be scheduling policy.

    [1] Lister cannot provide it for some reason.

commit cb1edf1ab24d1c8db5821578a7fb2633fab50ff4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 18:07:59 2021 +0200

    Introduce storage for the recurrent visit scheduler queue position

commit ec6e69f6415a007611c46f25e7c48e909a793d53
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:42:26 2021 +0200

    Start handling of recurrent loading tasks in scheduler

    This deals first and foremost with the next_position_offset update done by the scheduler journal client.

commit c486b28ece7c0b127fea10bbb4d7f5d1ad5c50ba
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Tue Jun 29 14:41:07 2021 +0200

    journal_client: Explicit docstring

commit 98f99b9fd457820dc2d4b5dab7e89cb8261a34a4
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Wed Jun 23 16:39:40 2021 +0200

    journal_client: Only check last_* fields for some permutation tests

    In a future commit, we will add new fields whose values will be permutation dependent.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/415/ for more details.
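The queue-position update described in the "Compute next position for origin visit" commit message can be sketched roughly as follows. This is an illustration only, not the code in swh/scheduler/journal_client.py: the `VISIT_INTERVALS` table, the `QueueState` class, the `update_after_visit` function and its `successful` / `has_changes` parameters are hypothetical names, and the queue position is assumed to be a datetime. The commit message does not say whether the fudge factor also applies after a failed visit; this sketch applies it in every case.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical interval table: the list index stands in for the
# "visit interval index" mentioned in the commit message.
VISIT_INTERVALS = [timedelta(days=d) for d in (1, 2, 4, 7, 14, 30, 60)]


@dataclass(frozen=True)
class QueueState:
    next_position_offset: int  # index into VISIT_INTERVALS
    next_visit_queue_position: datetime  # assumed to be a datetime here


def update_after_visit(state, successful, has_changes, rng):
    """Return the new queue state after one completed visit."""
    index = state.next_position_offset
    if not successful:
        # Failed visit: keep the index, retry after the minimal interval.
        interval = VISIT_INTERVALS[0]
    else:
        # Changes found: visit way more often (-2); no changes: less often (+1).
        index += -2 if has_changes else 1
        index = max(0, min(index, len(VISIT_INTERVALS) - 1))
        interval = VISIT_INTERVALS[index]
    # Spread visits out with a random factor in the -/+ 10% range.
    fudge = rng.uniform(0.9, 1.1)
    return QueueState(index, state.next_visit_queue_position + interval * fudge)


if __name__ == "__main__":
    rng = random.Random()
    state = QueueState(3, datetime(2021, 7, 1, tzinfo=timezone.utc))
    print(update_after_visit(state, successful=True, has_changes=False, rng=rng))
```

Passing the random generator in explicitly keeps the function easy to test with a seeded `random.Random`.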
Build is green
Patch application report for D5980 (id=21882)
Could not rebase; Attempt merge onto 4fa29fe128...
Updating 4fa29fe..88d3036
Fast-forward
 sql/updates/29.sql | 4 ++
 swh/scheduler/journal_client.py | 32 +++++++++
 swh/scheduler/model.py | 2 +
 swh/scheduler/sql/30-schema.sql | 2 +
 swh/scheduler/tests/test_journal_client.py | 105 ++++++++++++++++++++++++++++-
 5 files changed, 142 insertions(+), 3 deletions(-)
Changes applied before test
commit 88d3036f407698c2615f18e9b470cf08ecb1716c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 8 11:24:42 2021 +0200

    journal_client: Deactivate origins when too many visited attempts failed

    Either for failed or not found attempts.

    Related to T2345

commit cdc2af4733752ddccea01bdbd70b9805fbdaf6f1
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin.

    Related to T2345
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/421/ for more details.
Build is green
Patch application report for D5980 (id=21886)
Rebasing onto 4fa29fe128...
Current branch diff-target is up to date.
Changes applied before test
commit d616934db615e6f53ea89c629a6b660bd24176e4
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin.

    Related to T2345
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/423/ for more details.
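As a rough illustration of the counter introduced by the commit above, the following sketch tracks how many visits in a row ended with the same status. Only the `successive_visits` name comes from the commit message; the `VisitStats` class, its `last_visit_status` field and the `record_visit` helper are assumptions made for this example, and the real model in swh/scheduler/model.py may differ.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VisitStats:
    # Hypothetical, trimmed-down stand-in for the origin visit stats record.
    last_visit_status: Optional[str] = None
    successive_visits: int = 0


def record_visit(stats: VisitStats, status: str) -> VisitStats:
    """Bump the counter on a repeated status, restart it at 1 otherwise."""
    if status == stats.last_visit_status:
        return replace(stats, successive_visits=stats.successive_visits + 1)
    return replace(stats, last_visit_status=status, successive_visits=1)


if __name__ == "__main__":
    stats = VisitStats()
    for status in ("failed", "failed", "successful", "failed"):
        stats = record_visit(stats, status)
        print(status, stats.successive_visits)
```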
Build is green
Patch application report for D5980 (id=21887)
Could not rebase; Attempt merge onto 4fa29fe128...
Updating 4fa29fe..dbb1e40
Fast-forward
 sql/updates/29.sql | 4 ++
 swh/scheduler/journal_client.py | 32 +++++++++
 swh/scheduler/model.py | 2 +
 swh/scheduler/sql/30-schema.sql | 2 +
 swh/scheduler/tests/test_journal_client.py | 105 ++++++++++++++++++++++++++++-
 5 files changed, 142 insertions(+), 3 deletions(-)
Changes applied before test
commit dbb1e40ac43783cf8ded478f1925206e44b53fef
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 8 11:24:42 2021 +0200

    journal_client: Deactivate origins when too many visited attempts failed

    Either for failed or not found attempts.

    Related to T2345

commit d616934db615e6f53ea89c629a6b660bd24176e4
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin.

    Related to T2345
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/424/ for more details.
Build is green
Patch application report for D5980 (id=21890)
Could not rebase; Attempt merge onto 4fa29fe128...
Updating 4fa29fe..196ba39
Fast-forward
 sql/updates/29.sql | 4 ++
 swh/scheduler/journal_client.py | 32 +++++++++
 swh/scheduler/model.py | 2 +
 swh/scheduler/sql/30-schema.sql | 2 +
 swh/scheduler/tests/test_journal_client.py | 107 ++++++++++++++++++++++++++++-
 5 files changed, 144 insertions(+), 3 deletions(-)
Changes applied before test
commit 196ba394712751b52c604b2f2444fe5a5d214e44
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 8 11:24:42 2021 +0200

    journal_client: Deactivate origins when too many visited attempts failed

    Either for failed or not found attempts.

    Related to T2345

commit 015d16158df9a87cdea29d76a55381d6798ee4e3
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin.

    Related to T2345
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/427/ for more details.
I'm not a huge fan of disabling origins forever; are you planning to relax this somehow (e.g. by visiting again a couple of years later)?
It's not forever.
What happens if a lister lists a disabled origin again?
The lister will activate the origin again.
- Rebase
- Adapt according to review (avoid constant)
- Update docstring to make the disabling of origins explicit
Build is green
Patch application report for D5980 (id=21895)
Could not rebase; Attempt merge onto 4fa29fe128...
Updating 4fa29fe..8d1b51f
Fast-forward
 sql/updates/29.sql | 4 ++
 swh/scheduler/journal_client.py | 35 ++++++++++
 swh/scheduler/model.py | 2 +
 swh/scheduler/sql/30-schema.sql | 2 +
 swh/scheduler/tests/test_journal_client.py | 107 ++++++++++++++++++++++++++++-
 5 files changed, 147 insertions(+), 3 deletions(-)
Changes applied before test
commit 8d1b51f0a60cf1f8b94942a490c00f7b0b4097c7
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 8 11:24:42 2021 +0200

    journal_client: Deactivate origins when too many visited attempts failed

    Either for failed or not found attempts. It's up to the lister to activate back the origins if they are getting alive at some point.

    Related to T2345

commit 1bcf84d5e66d02c006698a89d2571911d3fd0764
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin.

    Related to T2345
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/429/ for more details.
Build is green
Patch application report for D5980 (id=21896)
Could not rebase; Attempt merge onto 4fa29fe128...
Updating 4fa29fe..8d1b51f
Fast-forward
 sql/updates/29.sql | 4 ++
 swh/scheduler/journal_client.py | 35 ++++++++++
 swh/scheduler/model.py | 2 +
 swh/scheduler/sql/30-schema.sql | 2 +
 swh/scheduler/tests/test_journal_client.py | 107 ++++++++++++++++++++++++++++-
 5 files changed, 147 insertions(+), 3 deletions(-)
Changes applied before test
commit 8d1b51f0a60cf1f8b94942a490c00f7b0b4097c7
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 8 11:24:42 2021 +0200

    journal_client: Deactivate origins when too many visited attempts failed

    Either for failed or not found attempts. It's up to the lister to activate back the origins if they are getting alive at some point.

    Related to T2345

commit 1bcf84d5e66d02c006698a89d2571911d3fd0764
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin.

    Related to T2345
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/430/ for more details.
Build is green
Patch application report for D5980 (id=21897)
Could not rebase; Attempt merge onto 4fa29fe128...
Updating 4fa29fe..d92e052
Fast-forward
 sql/updates/29.sql | 4 ++
 swh/scheduler/journal_client.py | 35 ++++++++++
 swh/scheduler/model.py | 2 +
 swh/scheduler/sql/30-schema.sql | 2 +
 swh/scheduler/tests/test_journal_client.py | 107 ++++++++++++++++++++++++++++-
 5 files changed, 147 insertions(+), 3 deletions(-)
Changes applied before test
commit d92e05218f9458f11b99fcbc82ae518185c125c1
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 8 11:24:42 2021 +0200

    journal_client: Disable origins when too many visited attempts failed

    This disable origins for either failed or not found attempts 3 times in a row. It's not definitive though as it's the lister's responsibility to activate back origins if they get listed again.

    Related to T2345

commit 1bcf84d5e66d02c006698a89d2571911d3fd0764
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin.

    Related to T2345
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/431/ for more details.
Build is green
Patch application report for D5980 (id=21898)
Could not rebase; Attempt merge onto 4fa29fe128...
Updating 4fa29fe..8281e35
Fast-forward
 sql/updates/29.sql | 4 ++
 swh/scheduler/journal_client.py | 38 +++++++++-
 swh/scheduler/model.py | 2 +
 swh/scheduler/sql/30-schema.sql | 2 +
 swh/scheduler/tests/test_journal_client.py | 107 ++++++++++++++++++++++++++++-
 5 files changed, 149 insertions(+), 4 deletions(-)
Changes applied before test
commit 8281e351d6a13a55711fca5b89c7f24c71174dab
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Jul 8 11:24:42 2021 +0200

    journal_client: Disable origins when too many visited attempts failed

    This disable origins for either failed or not found attempts 3 times in a row. It's not definitive though as it's the lister's responsibility to activate back origins if they get listed again.

    Related to T2345

commit 1bcf84d5e66d02c006698a89d2571911d3fd0764
Author: David Douard <david.douard@sdfa3.org>
Date:   Wed Jul 7 16:55:57 2021 +0200

    Add a successive_visits counter to origin visit stats

    This maintains the number of successive visits resulting in the same status. This will help implementing disabling of too many successive failed or not_found visits for a given origin.

    Related to T2345
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/432/ for more details.
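The behaviour stated in the final commit message (disable an origin after 3 failed or not found attempts in a row, and let the lister enable it again when it is listed anew) might look roughly like the sketch below. It is not the code that landed: `ListedOrigin`, `after_visit`, `on_listed` and the `max_failures` parameter are hypothetical names, and the status strings "failed" and "not_found" are taken from the commit messages rather than from the actual model. The threshold is a parameter rather than a module constant, in line with the "avoid constant" review note.

```python
from dataclasses import dataclass

# Statuses that count towards disabling, per the commit message.
DISABLING_STATUSES = frozenset({"failed", "not_found"})


@dataclass
class ListedOrigin:
    # Hypothetical, trimmed-down stand-in for a listed origin record.
    url: str
    enabled: bool = True


def after_visit(origin: ListedOrigin, last_status: str,
                successive_visits: int, max_failures: int = 3) -> None:
    """Disable the origin once enough visits in a row failed or were not found."""
    if last_status in DISABLING_STATUSES and successive_visits >= max_failures:
        origin.enabled = False


def on_listed(origin: ListedOrigin) -> None:
    """A lister seeing the origin again re-enables it (disabling is not final)."""
    origin.enabled = True


if __name__ == "__main__":
    origin = ListedOrigin("https://example.org/repo.git")
    after_visit(origin, "not_found", successive_visits=3)
    print(origin.enabled)  # disabled after three not_found visits in a row
    on_listed(origin)
    print(origin.enabled)  # enabled again once listed
```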