Details
- Reviewers: None
- Group Reviewers: Reviewers
Diff Detail
- Repository: rDSCH Scheduling utilities
- Lint: No Linters Available
- Unit: No Unit Test Coverage
- Build Status: Buildable 18753
  - Build 29033: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
  - Build 29032: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D4928 (id=17538)
Could not rebase; Attempt merge onto 86b255544c...
Updating 86b2555..8988481
Fast-forward
 swh/scheduler/backend.py | 133 +++++++++++++++++------
 swh/scheduler/cli/simulator.py | 1 -
 swh/scheduler/interface.py | 15 ++-
 swh/scheduler/simulator/__init__.py | 18 ++--
 swh/scheduler/simulator/common.py | 41 +++++--
 swh/scheduler/simulator/origin_scheduler.py | 2 +-
 swh/scheduler/simulator/origins.py | 162 +++++++++++++++++++++-------
 swh/scheduler/sql/30-schema.sql | 28 +++++
 swh/scheduler/sql/60-indexes.sql | 4 +
 swh/scheduler/tests/test_scheduler.py | 21 +++-
 swh/scheduler/tests/test_simulator.py | 9 +-
 11 files changed, 339 insertions(+), 95 deletions(-)
Changes applied before test
commit 8988481d2a95f9697afd91ab45e6a755476c6989
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 22 16:10:46 2021 +0100

    [wip] add indexes to origins_to_schedule.

commit bbd42e7b430badd008371908460c73d12dc42efa
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 22 15:54:15 2021 +0100

    [wip] add materialized view origins_to_schedule and use it in grab_next_visits.

commit b71fd526a9012545b8a92412bc500c08b0dc8372
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:55:43 2021 +0100

    simulator: stop validating the scheduling policy in the CLI

    We already do that in the scheduler backend function

commit 174b8ebba99dd696c4643a6f23cf208303bb0ff7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:55:16 2021 +0100

    Run simulator tests on all known scheduling policies

commit 04cbecd89e941610b433c1378f24eb86cb1f04a7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:48:38 2021 +0100

    simulator: record visit metrics alongside scheduler metrics

    This allows us to check the behavior of the archive over time in terms
    of number of visits.

commit 417e2874f930a898d79659d876ed978ab6fdd57f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:45:23 2021 +0100

    simulator: stop using the database as a cache for origin data

    This was a significant bottleneck of the simulator. To work around this, we:

    - Generate snapshot ids consistently in the OriginModel
    - Cache the origin data locally in the simulator, to compute the
      eventfulness of visits
    - Cache the last visit time for all origins to compute the estimated
      run time of visit tasks.

commit 79b37ac6bec2e1c276b8e48b6f78821b515113c4
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:31:43 2021 +0100

    grab_next_visits: don't re-schedule visits too fast

    The earlier implementation would just schedule new visits for origins
    forever, regardless of whether they were already scheduled or not.

commit c4d02d51c1be808a50b10e9e77d3e28d82b7bb48
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:29:45 2021 +0100

    Allow overriding the timestamp of grab_next_visits

    This makes the simulator behavior more consistent with reality.

commit b1247caaeadabb13bc1502ab2e50247a62b2404e
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:27:40 2021 +0100

    Construct grab_next_visits query arguments incrementally

commit 0cb88aff9d42d297e2c272cfaecfc4a7c8460b75
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Jan 21 14:57:42 2021 +0100

    simulator: add simple lister simulation

commit bf0daa6a45764e6634b6b7b10a3eec2d937640cc
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Jan 21 14:54:53 2021 +0100

    Factor out ListedOrigin generation to use the OriginModel

    This generates consistent last_update values according to the model and
    simulated time.
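For readers following the two grab_next_visits commits above ("don't re-schedule visits too fast" and "Allow overriding the timestamp"), here is a minimal in-memory sketch of the idea. It is not the swh.scheduler implementation, which runs as SQL in the backend: the names `OriginState`, `grab_next_visits_sketch`, and the `cooldown` parameter with its 7-day default are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class OriginState:
    """Hypothetical per-origin bookkeeping; not the swh.scheduler schema."""

    url: str
    last_scheduled: Optional[datetime] = None


def grab_next_visits_sketch(
    origins: Dict[str, OriginState],
    count: int,
    timestamp: Optional[datetime] = None,
    cooldown: timedelta = timedelta(days=7),  # assumed value, for illustration
) -> List[str]:
    """Pick up to `count` origins that were not scheduled within `cooldown`."""
    # An explicit timestamp lets a simulator drive the clock instead of
    # relying on wall-clock time.
    now = timestamp if timestamp is not None else datetime.now(timezone.utc)

    # Skip origins scheduled too recently, so repeated calls do not keep
    # handing out the same origins forever.
    eligible = [
        o
        for o in origins.values()
        if o.last_scheduled is None or o.last_scheduled <= now - cooldown
    ]
    # Never-scheduled origins first, then the least recently scheduled.
    eligible.sort(key=lambda o: (o.last_scheduled is not None, o.last_scheduled or now))

    picked = eligible[:count]
    for o in picked:
        o.last_scheduled = now  # remember that this origin is now scheduled
    return [o.url for o in picked]
```

Calling the function twice with the same simulated timestamp returns disjoint sets of origins, which is the behavior the commit message describes as missing from the earlier implementation.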
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/269/ for more details.
Build is green
Patch application report for D4928 (id=17575)
Could not rebase; Attempt merge onto 2906b4e8a0...
Updating 2906b4e..e9a00dc
Fast-forward
 swh/scheduler/backend.py | 133 ++++++++++++++++------
 swh/scheduler/cli/simulator.py | 1 -
 swh/scheduler/interface.py | 15 ++-
 swh/scheduler/simulator/__init__.py | 18 ++-
 swh/scheduler/simulator/common.py | 41 +++++--
 swh/scheduler/simulator/origin_scheduler.py | 2 +-
 swh/scheduler/simulator/origins.py | 171 +++++++++++++++++++++-------
 swh/scheduler/sql/30-schema.sql | 28 +++++
 swh/scheduler/sql/60-indexes.sql | 4 +
 swh/scheduler/tests/test_scheduler.py | 21 +++-
 swh/scheduler/tests/test_simulator.py | 9 +-
 11 files changed, 347 insertions(+), 96 deletions(-)
Changes applied before test
commit e9a00dc6dc661a0e991a0b3c7f08bfd915190d99
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 22 16:10:46 2021 +0100

    [wip] add indexes to origins_to_schedule.

commit 203cd2d23bc78b0af4f18310d989993c9a973966
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 22 15:54:15 2021 +0100

    [wip] add materialized view origins_to_schedule and use it in grab_next_visits.

commit 32c8ec91bc6dedd528ae7c8e828a419fddd9e6e0
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:55:43 2021 +0100

    simulator: stop validating the scheduling policy in the CLI

    We already do that in the scheduler backend function

commit 6d588a2df1b70c46dbd7828f9d8f478fed122915
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:55:16 2021 +0100

    Run simulator tests on all known scheduling policies

commit 1e7f9d7f79b2b135f182291b94bbe64ccb6e0595
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:48:38 2021 +0100

    simulator: record visit metrics alongside scheduler metrics

    This allows us to check the behavior of the archive over time in terms
    of number of visits.

commit 31e37e80927995902c6a3550166f7b2e3336b71c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:45:23 2021 +0100

    simulator: stop using the database as a cache for origin data

    This was a significant bottleneck of the simulator. To work around this, we:

    - Generate snapshot ids consistently in the OriginModel
    - Cache the origin data locally in the simulator, to compute the
      eventfulness of visits
    - Cache the last visit time for all origins to compute the estimated
      run time of visit tasks.

commit 6784a19cdc38b8f97aa4f9c1da9859ece24865f1
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:31:43 2021 +0100

    grab_next_visits: don't re-schedule visits too fast

    The earlier implementation would just schedule new visits for origins
    forever, regardless of whether they were already scheduled or not.

commit 09a8768c30dc335afccde4df046b371a274cb2f9
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:29:45 2021 +0100

    Allow overriding the timestamp of grab_next_visits

    This makes the simulator behavior more consistent with reality.

commit a2dc72474056c2f20e255acf13ec3e662e1aad7a
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:27:40 2021 +0100

    Construct grab_next_visits query arguments incrementally

commit e5709214b4917a5fe3634d040da7a061f5978f66
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Jan 21 14:57:42 2021 +0100

    simulator: add simple lister simulation

commit 7af98e2bc048c6946679e7d95cf8620e4a0ee4bf
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Jan 21 14:54:53 2021 +0100

    Factor out ListedOrigin generation to use the OriginModel

    This generates consistent last_update values according to the model and
    simulated time.
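The commit "simulator: stop using the database as a cache for origin data" lists two ideas that a short sketch may make concrete: deriving snapshot ids deterministically, and keeping the last seen snapshot in a local cache to decide whether a visit was eventful. The code below is only an illustration under assumptions; `snapshot_id`, `VisitCache`, and the use of a revision count as the origin's state are hypothetical and are not taken from swh.scheduler.

```python
import hashlib
from typing import Dict


def snapshot_id(origin_url: str, revision_count: int) -> bytes:
    """Derive a deterministic (hypothetical) snapshot id from an origin's state.

    The same origin in the same state always yields the same id, so no
    database lookup is needed to compare two visits.
    """
    return hashlib.sha1(f"{origin_url}\0{revision_count}".encode()).digest()


class VisitCache:
    """In-memory cache of the last snapshot seen per origin (illustrative)."""

    def __init__(self) -> None:
        self._last_snapshot: Dict[str, bytes] = {}

    def record_visit(self, origin_url: str, revision_count: int) -> bool:
        """Record a visit and return True if it was eventful.

        A visit counts as eventful here when the snapshot id differs from
        the one recorded by the previous visit of the same origin.
        """
        new = snapshot_id(origin_url, revision_count)
        eventful = self._last_snapshot.get(origin_url) != new
        self._last_snapshot[origin_url] = new
        return eventful
```

With this shape, a first visit is always eventful, a repeat visit of an unchanged origin is not, and the simulator never has to query the database to find out.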
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/279/ for more details.
Build is green
Patch application report for D4928 (id=17615)
Could not rebase; Attempt merge onto 2906b4e8a0...
Updating 2906b4e..73f6ccb
Fast-forward
 docs/simulator.rst | 15 +++
 swh/scheduler/backend.py | 133 +++++++++++++++------
 swh/scheduler/cli/simulator.py | 1 -
 swh/scheduler/interface.py | 15 ++-
 swh/scheduler/simulator/__init__.py | 18 ++-
 swh/scheduler/simulator/common.py | 41 +++++--
 swh/scheduler/simulator/origin_scheduler.py | 2 +-
 swh/scheduler/simulator/origins.py | 173 ++++++++++++++++++++++------
 swh/scheduler/sql/30-schema.sql | 28 +++++
 swh/scheduler/sql/60-indexes.sql | 4 +
 swh/scheduler/tests/test_scheduler.py | 21 +++-
 swh/scheduler/tests/test_simulator.py | 9 +-
 12 files changed, 364 insertions(+), 96 deletions(-)
Changes applied before test
commit 73f6ccb5a6bdb061df6c7f832ce27b954eb61828
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 22 16:10:46 2021 +0100

    [wip] add indexes to origins_to_schedule.

commit ce0e5d58538c854b747d89bf64d119706713ec5d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Jan 22 15:54:15 2021 +0100

    [wip] add materialized view origins_to_schedule and use it in grab_next_visits.

commit cf0583b079594c85e5e4fb512aceaf9fd4151473
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:55:43 2021 +0100

    simulator: stop validating the scheduling policy in the CLI

    We already do that in the scheduler backend function

commit ebb5847ea2eec79fa9b89cd684f1b6a92059324d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:55:16 2021 +0100

    Run simulator tests on all known scheduling policies

commit 1f77521d486cfa110983b85fe0a724a347291840
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:48:38 2021 +0100

    simulator: record visit metrics alongside scheduler metrics

    This allows us to check the behavior of the archive over time in terms
    of number of visits.

commit 889839446eb8645a5520237513a54c892d3a3104
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:45:23 2021 +0100

    simulator: stop using the database as a cache for origin data

    This was a significant bottleneck of the simulator. To work around this, we:

    - Generate snapshot ids consistently in the OriginModel
    - Cache the origin data locally in the simulator, to compute the
      eventfulness of visits
    - Cache the last visit time for all origins to compute the estimated
      run time of visit tasks.

commit c92ead5875ecfd96a164eec1803398adec6eb8a8
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:31:43 2021 +0100

    grab_next_visits: don't re-schedule visits too fast

    The earlier implementation would just schedule new visits for origins
    forever, regardless of whether they were already scheduled or not.

commit 2b39cbcabf9960c1f660442e15f6c17654aec9e2
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:29:45 2021 +0100

    Allow overriding the timestamp of grab_next_visits

    This makes the simulator behavior more consistent with reality.

commit 7ffbdd1b3eb579f43e8913ea11cfd916b2f3c457
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date:   Thu Jan 21 17:27:40 2021 +0100

    Construct grab_next_visits query arguments incrementally

commit ea068b46a89e07c60ad1233afd36afc6bb29031e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Jan 21 14:57:42 2021 +0100

    simulator: add simple lister simulation

commit 7af98e2bc048c6946679e7d95cf8620e4a0ee4bf
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Thu Jan 21 14:54:53 2021 +0100

    Factor out ListedOrigin generation to use the OriginModel

    This generates consistent last_update values according to the model and
    simulated time.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/288/ for more details.