Depends on D4921
Details
- Reviewers: vlorentz
- Group Reviewers: Reviewers
- Commits: rDSCHaaffff2631a7: Simulator: allow to export results in a csv file
Diff Detail
- Repository: rDSCH Scheduling utilities
- Branch: randomize
- Lint: Lint Skipped
- Unit: Unit Tests Skipped
- Build Status: Buildable 18668
- Build 28889: Phabricator diff pipeline on jenkins
- Build 28888: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D4923 (id=17508)
Could not rebase; Attempt merge onto 03460207a1...
Updating 0346020..708b1f7
Fast-forward
 swh/scheduler/cli/simulator.py | 25 ++++++++++---
 swh/scheduler/simulator/__init__.py | 61 ++++++++++++++++++-------------
 swh/scheduler/simulator/common.py | 71 +++++++++++++++++++++++++++----------
 3 files changed, 110 insertions(+), 47 deletions(-)
Changes applied before test
commit 708b1f7a0d098fe0b78a6479998c025930264e01
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 10:49:13 2021 +0100
    Simulation: allow to export results in a csv file

commit 6ef3d2177ca00e5f42885eaf6a37ecd0c94df7ac
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 10:47:18 2021 +0100
    Simulation: log at infol level recorded metrics
    this allows to follows what the simulation is doing.

commit 6698e51903bfdfcc00cfc55be3056c25c8cb270f
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 09:07:22 2021 +0100
    Make plotting histograms optional in simulator cli command

commit 1c069ca34add6b26d060588abb7958a089cb0735
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Jan 21 11:33:19 2021 +0100
    Randomize last_upadte in generated ListedOrigins in fill_test_data
    also insert objects by batches of 10k to make it nicer with ram usage.

commit 8a9aaf3942d5585e6af038ebced3cde6faf27c7e
Author: David Douard <david.douard@sdfa3.org>
Date: Thu Jan 21 11:30:21 2021 +0100
    Add a --num-origins option to the fill-test-data cli command
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/248/ for more details.
swh/scheduler/simulator/common.py
| Line | Comment |
|---|---|
| 103 | known |
Build is green
Patch application report for D4923 (id=17519)
Could not rebase; Attempt merge onto b93aa5be2c...
Merge made by the 'recursive' strategy.
 swh/scheduler/cli/simulator.py | 12 +++++--
 swh/scheduler/simulator/__init__.py | 2 +-
 swh/scheduler/simulator/common.py | 71 +++++++++++++++++++++++++++----------
 3 files changed, 64 insertions(+), 21 deletions(-)
Changes applied before test
commit 38727d02937d62007c1c811817ecbb10ec88d582
Merge: b93aa5b d5d5c1d
Author: Jenkins user <jenkins@localhost>
Date: Fri Jan 22 10:40:56 2021 +0000
    Merge branch 'diff-target' into HEAD

commit d5d5c1dca97032eda637c6946627e5ee8b80c6f6
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 10:49:13 2021 +0100
    Simulation: allow to export results in a csv file

commit 011cc3ddff7e5683f1f00a8c578d21776a923ae7
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 09:07:22 2021 +0100
    Make plotting histograms optional in simulator cli command
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/257/ for more details.
swh/scheduler/simulator/common.py
| Line | Comment |
|---|---|
| 123 | Instead of zipping twice, we can probably just do the sums in a loop for each timestamp. |
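To make the suggestion above concrete, here is a minimal sketch of the two options. It is not the code under review: the data shape (a timestamp paired with one tuple of counters per lister) and the names `Snapshot`, `totals_with_zip`, and `totals_with_loop` are assumptions made purely for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical shape: (timestamp, [(known, enabled, never_visited), ...]),
# with one inner tuple per lister. Not taken from swh.scheduler.
Snapshot = Tuple[int, List[Tuple[int, int, int]]]


def totals_with_zip(snapshots: Iterable[Snapshot]) -> Iterator[Tuple[int, ...]]:
    """Transpose each snapshot's rows with zip(*rows), then sum each column."""
    for ts, rows in snapshots:
        yield (ts, *(sum(column) for column in zip(*rows)))


def totals_with_loop(snapshots: Iterable[Snapshot]) -> Iterator[Tuple[int, ...]]:
    """Accumulate the same totals with an explicit loop per timestamp."""
    for ts, rows in snapshots:
        known = enabled = never_visited = 0
        for k, e, n in rows:
            known += k
            enabled += e
            never_visited += n
        yield (ts, known, enabled, never_visited)


# Made-up sample data: two timestamps, two listers each.
sample: List[Snapshot] = [
    (0, [(10, 8, 5), (20, 15, 9)]),
    (3600, [(12, 9, 4), (25, 18, 7)]),
]
zip_totals = list(totals_with_zip(sample))
loop_totals = list(totals_with_loop(sample))
```

Both functions return the same totals for non-empty snapshots. They differ when a snapshot has no rows: `zip(*[])` produces no columns, so the zip variant yields only `(ts,)`, while the loop variant yields zeros for every counter.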
Build is green
Patch application report for D4923 (id=17522)
Could not rebase; Attempt merge onto b93aa5be2c...
Updating b93aa5b..bd6def0
Fast-forward
 swh/scheduler/backend.py | 61 +++++++++--
 swh/scheduler/cli/simulator.py | 17 ++-
 swh/scheduler/interface.py | 15 ++-
 swh/scheduler/simulator/__init__.py | 27 ++---
 swh/scheduler/simulator/common.py | 109 ++++++++++++++-----
 swh/scheduler/simulator/origin_scheduler.py | 2 +-
 swh/scheduler/simulator/origins.py | 162 +++++++++++++++++++++-------
 swh/scheduler/tests/test_simulator.py | 9 +-
 8 files changed, 311 insertions(+), 91 deletions(-)
Changes applied before test
commit bd6def07947375e8cc01e87e038f948a7b3ba425
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:17:00 2021 +0100
    Simulator: allow to export results in a csv file

commit bd0941c722dae7ac14385b45048eee7f9565f735
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:15:47 2021 +0100
    Make plottings optional in simulator cli output

commit f878c6036ba7400dc08fc33dc8d3858cc234b4c9
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:55:16 2021 +0100
    Run simulator tests on all known scheduling policies

commit bdbc3a86f84772ec166764ca5169ec597cf89e14
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:48:38 2021 +0100
    simulator: record visit metrics alongside scheduler metrics
    This allows us to check the behavior of the archive over time in terms of number of visits.

commit 7afb0a498432d1e2641abf3a9de859354699c5c4
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:45:23 2021 +0100
    simulator: stop using the database as a cache for origin data
    This was a significant bottleneck of the simulator. To work around this, we:
    - Generate snapshot ids consistently in the OriginModel
    - Cache the origin data locally in the simulator, to compute the eventfulness of visits
    - Cache the last visit time for all origins to compute the estimated run time of visit tasks.

commit 8e7377d8af45ef8e8234b57dc6a16be75dd74ac5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 21 17:38:41 2021 +0100
    simulator: add a trivial heartbeat process to show progress
    For now, this process only writes a log every simulated day.

commit ba303f946ecd3e15e58de0072ce71b50aa423d59
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:31:43 2021 +0100
    grab_next_visits: don't re-schedule visits too fast
    The earlier implementation would just schedule new visits for origins forever, regardless of whether they were already scheduled or not.

commit 808ae6851faee9b633e773f9150d360cdb927146
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:29:45 2021 +0100
    Allow overriding the timestamp of grab_next_visits
    This makes the simulator behavior more consistent with reality.

commit 9943195d31c51a44325cba09d07fb6e904d45a00
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:27:40 2021 +0100
    Construct grab_next_visits query arguments incrementally

commit 72070b7bf628788b6872e90a3f8ac8f0c01b70d9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 21 14:57:42 2021 +0100
    simulator: add simple lister simulation

commit 1f1aad459c4b0740ecbe96e9809e4b31f66bf999
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 21 14:54:53 2021 +0100
    Factor out ListedOrigin generation to use the OriginModel
    This generates consistent last_update values according to the model and simulated time.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/259/ for more details.
Build is green
Patch application report for D4923 (id=17536)
Could not rebase; Attempt merge onto 86b255544c...
Auto-merging swh/scheduler/simulator/__init__.py
Auto-merging swh/scheduler/cli/simulator.py
Merge made by the 'recursive' strategy.
 swh/scheduler/backend.py | 61 +++++++++--
 swh/scheduler/cli/simulator.py | 17 ++-
 swh/scheduler/interface.py | 15 ++-
 swh/scheduler/simulator/__init__.py | 27 ++---
 swh/scheduler/simulator/common.py | 109 ++++++++++++++-----
 swh/scheduler/simulator/origin_scheduler.py | 2 +-
 swh/scheduler/simulator/origins.py | 162 +++++++++++++++++++++-------
 swh/scheduler/tests/test_simulator.py | 9 +-
 8 files changed, 311 insertions(+), 91 deletions(-)
Changes applied before test
commit 522a327ced00337a239057d1d464323c1531dc92
Merge: 86b2555 df8f308
Author: Jenkins user <jenkins@localhost>
Date: Fri Jan 22 15:22:40 2021 +0000
    Merge branch 'diff-target' into HEAD

commit df8f3086db8ae4289b6a3a2c675308b25fa82165
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:17:00 2021 +0100
    Simulator: allow to export results in a csv file

commit 6a1b2e037498f5d1ad28effc5fb0a79f520ef46a
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:15:47 2021 +0100
    Make plottings optional in simulator cli output

commit f878c6036ba7400dc08fc33dc8d3858cc234b4c9
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:55:16 2021 +0100
    Run simulator tests on all known scheduling policies

commit bdbc3a86f84772ec166764ca5169ec597cf89e14
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:48:38 2021 +0100
    simulator: record visit metrics alongside scheduler metrics
    This allows us to check the behavior of the archive over time in terms of number of visits.

commit 7afb0a498432d1e2641abf3a9de859354699c5c4
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:45:23 2021 +0100
    simulator: stop using the database as a cache for origin data
    This was a significant bottleneck of the simulator. To work around this, we:
    - Generate snapshot ids consistently in the OriginModel
    - Cache the origin data locally in the simulator, to compute the eventfulness of visits
    - Cache the last visit time for all origins to compute the estimated run time of visit tasks.

commit 8e7377d8af45ef8e8234b57dc6a16be75dd74ac5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 21 17:38:41 2021 +0100
    simulator: add a trivial heartbeat process to show progress
    For now, this process only writes a log every simulated day.

commit ba303f946ecd3e15e58de0072ce71b50aa423d59
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:31:43 2021 +0100
    grab_next_visits: don't re-schedule visits too fast
    The earlier implementation would just schedule new visits for origins forever, regardless of whether they were already scheduled or not.

commit 808ae6851faee9b633e773f9150d360cdb927146
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:29:45 2021 +0100
    Allow overriding the timestamp of grab_next_visits
    This makes the simulator behavior more consistent with reality.

commit 9943195d31c51a44325cba09d07fb6e904d45a00
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Thu Jan 21 17:27:40 2021 +0100
    Construct grab_next_visits query arguments incrementally

commit 72070b7bf628788b6872e90a3f8ac8f0c01b70d9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 21 14:57:42 2021 +0100
    simulator: add simple lister simulation

commit 1f1aad459c4b0740ecbe96e9809e4b31f66bf999
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 21 14:54:53 2021 +0100
    Factor out ListedOrigin generation to use the OriginModel
    This generates consistent last_update values according to the model and simulated time.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/268/ for more details.
Build is green
Patch application report for D4923 (id=17759)
Could not rebase; Attempt merge onto cf0583b079...
Updating cf0583b..0af7420
Fast-forward
 swh/scheduler/cli/simulator.py | 17 +++++++--
 swh/scheduler/simulator/__init__.py | 2 +-
 swh/scheduler/simulator/common.py | 70 +++++++++++++++++++++++++++----------
 3 files changed, 68 insertions(+), 21 deletions(-)
Changes applied before test
commit 0af7420dbc89dc6b5b903e5fcc6565aa8f497a44
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:17:00 2021 +0100
    Simulator: allow to export results in a csv file

commit aaf7dd6f1d820012b588e780178aaefdc64e2685
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:15:47 2021 +0100
    Make plottings optional in simulator cli output
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/290/ for more details.
That's not a valid reason! A valid reason is "I agree with olasd's comments, fix them (plz)"...
swh/scheduler/simulator/common.py
| Line | Comment |
|---|---|
| 123 | well everything in this pipeline is a generator, so I see no harm in "double zipping" there. |
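As a small illustration of the point about generators, the sketch below nests two `zip()` calls over endless generators and still only computes the rows that are actually consumed. The `values` generator and its numbers are hypothetical and unrelated to the simulator's real data.

```python
import itertools
from typing import Iterator


def values(step: int) -> Iterator[int]:
    """Hypothetical endless metric stream: 0, step, 2 * step, ..."""
    for i in itertools.count():
        yield i * step


# In Python 3, zip() returns a lazy iterator, so nesting two of them
# builds no intermediate list, even though the inputs never end.
inner = zip(values(2), values(3))      # first zip: pair the two streams
outer = zip(itertools.count(), inner)  # second zip: attach a timestamp
rows = ((ts, a + b) for ts, (a, b) in outer)

# Only the three requested rows are ever computed.
first_rows = list(itertools.islice(rows, 3))
```

One caveat applies: star-unpacking, as in `zip(*iterable)`, consumes `iterable` completely before `zip()` starts, so that form is not lazy with respect to its input.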
Build is green
Patch application report for D4923 (id=17760)
Could not rebase; Attempt merge onto cf0583b079...
Updating cf0583b..baf5dce
Fast-forward
 swh/scheduler/cli/simulator.py | 17 +++++++--
 swh/scheduler/simulator/__init__.py | 2 +-
 swh/scheduler/simulator/common.py | 70 +++++++++++++++++++++++++++----------
 3 files changed, 68 insertions(+), 21 deletions(-)
Changes applied before test
commit baf5dce08ef24360ff89e92ff0cc6e5712cc20cd
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:17:00 2021 +0100
    Simulator: allow to export results in a csv file

commit aaf7dd6f1d820012b588e780178aaefdc64e2685
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:15:47 2021 +0100
    Make plottings optional in simulator cli output
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/291/ for more details.
Build is green
Patch application report for D4923 (id=17780)
Could not rebase; Attempt merge onto aaf7dd6f1d...
Updating aaf7dd6..aaffff2
Fast-forward
 swh/scheduler/cli/simulator.py | 7 ++++++-
 swh/scheduler/simulator/common.py | 36 +++++++++++++++++++++++++++++++++--
 swh/scheduler/tests/test_simulator.py | 11 ++++++++++-
 3 files changed, 50 insertions(+), 4 deletions(-)
Changes applied before test
commit aaffff2631a771b30c22b7a1fa69414bf3ed9dcd
Author: David Douard <david.douard@sdfa3.org>
Date: Fri Jan 22 12:17:00 2021 +0100
    Simulator: allow to export results in a csv file

commit 9fce3f6f2c73fe64663e9b3e41043161c5620f45
Author: David Douard <david.douard@sdfa3.org>
Date: Mon Feb 1 15:36:16 2021 +0100
    Add minimal tests for the SimulationReport.format() method
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/294/ for more details.