This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
Details
- Reviewers
  - olasd
  - douardda
- Group Reviewers
  - Reviewers
- Maniphest Tasks
  - T2973: Implement a scheduler simulator
- Commits
  - rDSCH9468bb9384f1: simulator: add basic tests for fill_test_data and run
  - rDSCH88e0b4280501: simulator: Add documentation.
  - rDSCH898820fac52c: simulator: collect and plot scheduler metrics over time
  - rDSCH9ce68f8d0e0e: simulator: stop using get_scheduler directly
  - rDSCH62c6d90867bc: simulator: Make min_batch_size a parameter defined in the setup.
  - rDSCHead7b347db9d: simulator: implement a simulator for the "old" task-based scheduler
  - rDSCHaecd27eee06a: Move the simulator cli to the main cli module
  - rDSCH05067e3ecc88: simulator: Replace attrs with dataclasses for consistency
  - rDSCH24922fe2d995: simulator: wrap tasks and task events in typechecked objects
  - rDSCH22ebb7a9a4bc: simulator: Split into smaller files in the same package
  - rDSCHd5318aea0a93: simulator: also fill data for the task-based scheduler
  - rDSCH29204199774b: simulator: add typing for Environment.scheduler
  - rDSCHad7bfbe731da: simulator: Make the run time a CLI argument
  - rDSCHdf34db0bfc61: simulator: tweak simulation environment constants
  - rDSCH21ce2c88dddc: simulator: generate more origins in fill_data
  - rDSCH6433266106dd: simulator: add support for a basic SimulationReport
  - rDSCHc474a825336a: simulator: refine origin model to follow an exponential distribution
  - rDSCHcb12449e8f57: simulator: simulate the scheduler journal client
  - rDSCH20b7f9c68f83: simulator: generate OriginVisitStatus objects in modeled visits
  - rDSCH2459badf0c05: simulator: Remove some debug statements and lower log level
  - rDSCH39ad47de2e75: simulator: Move scheduler into the simulation environment object
  - rDSCH31967fa850c3: simulator: Use datetimes instead of a floating point simulated time
  - rDSCHfc3f06bd1d77: Introduce scaffolding for a scheduler simulator
use the docs, Luke
Diff Detail
- Repository
  - rDSCH Scheduling utilities
- Branch
  - scheduling-policy
- Lint
  - No Linters Available
- Unit
  - No Unit Test Coverage
- Build Status
  - Buildable 18516
  - Build 28642: Phabricator diff pipeline on jenkins
  - Build 28641: arc lint + arc unit
Event Timeline
Build is green
Patch application report for D4856 (id=17208)
Could not rebase; Attempt merge onto a62003397d...
Updating a620033..dfa0aee
Fast-forward
 .pre-commit-config.yaml | 1 +
 docs/index.rst | 1 +
 docs/simulator.rst | 55 +++++++++++++
 mypy.ini | 3 +
 requirements-simulator.txt | 1 +
 setup.py | 5 +-
 sql/updates/20.sql | 6 ++
 swh/scheduler/backend.py | 9 ++-
 swh/scheduler/cli/__init__.py | 7 +-
 swh/scheduler/cli/origin.py | 141 ++++++++++++++++++++++++++++++++
 swh/scheduler/interface.py | 8 +-
 swh/scheduler/model.py | 9 +++
 swh/scheduler/simulator/__init__.py | 144 +++++++++++++++++++++++++++++++++
 swh/scheduler/simulator/__main__.py | 31 +++++++
 swh/scheduler/sql/30-schema.sql | 2 +-
 swh/scheduler/sql/60-indexes.sql | 2 +-
 swh/scheduler/tests/common.py | 10 +--
 swh/scheduler/tests/conftest.py | 45 ++++++++---
 swh/scheduler/tests/test_cli_origin.py | 112 +++++++++++++++++++++++++
 swh/scheduler/tests/test_model.py | 31 ++++++-
 swh/scheduler/tests/test_scheduler.py | 27 ++++---
 21 files changed, 611 insertions(+), 39 deletions(-)
 create mode 100644 docs/simulator.rst
 create mode 100644 requirements-simulator.txt
 create mode 100644 sql/updates/20.sql
 create mode 100644 swh/scheduler/cli/origin.py
 create mode 100644 swh/scheduler/simulator/__init__.py
 create mode 100644 swh/scheduler/simulator/__main__.py
 create mode 100644 swh/scheduler/tests/test_cli_origin.py
Changes applied before test
commit dfa0aee33500715f47b2e228c5462153d101a5b5
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
commit 9f843eef37313b551a158dfa11aea97e5ef2fc81
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 15:31:55 2021 +0100
Filter origins by visit type when scheduling the next visits
We have separate task queues and workers for each visit type, so it
makes sense to split this endpoint along these lines too, at least for
now.
commit 23d1b3c1883c3c955b5dd5ba1cc2270c93e156d6
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 15:25:56 2021 +0100
Reorganize ListedOrigin fixtures to generate multiple visit_types
commit da347f7f4c401a43ec34de76365ad323d0ff7b77
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 12 17:10:39 2021 +0100
Introduce a `swh scheduler origin schedule-next` cli
This creates one-shot tasks in the classic scheduler for the next visits
to run according to the visit scheduling policy.
commit 42957c9e96e6c7d8070e0b6c786c273e8c1602a0
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 12 17:28:33 2021 +0100
Rename test task types to names that match real tasks
The success of tests using these task types would depend on the test run
order, because these task types are (currently) being created by
swh/scheduler/sql/50-data.sql, but the table is truncated after the
first test completes.
commit d1393c54da99c45175dd0b6a69734d17fc887960
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 12 16:16:31 2021 +0100
Introduce a `swh scheduler origin grab-next` cli
This returns, as CSV, the next origins to be visited according to the
passed scheduling policy.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/115/ for more details.
Build is green
Patch application report for D4856 (id=17217)
Could not rebase; Attempt merge onto a62003397d...
Updating a620033..0cde030
Fast-forward
 .pre-commit-config.yaml | 1 +
 docs/index.rst | 1 +
 docs/simulator.rst | 55 +++++++++++++
 mypy.ini | 3 +
 requirements-simulator.txt | 1 +
 setup.py | 5 +-
 sql/updates/20.sql | 6 ++
 swh/scheduler/backend.py | 9 ++-
 swh/scheduler/cli/__init__.py | 7 +-
 swh/scheduler/cli/origin.py | 142 ++++++++++++++++++++++++++++++++
 swh/scheduler/interface.py | 8 +-
 swh/scheduler/model.py | 9 +++
 swh/scheduler/simulator/__init__.py | 144 +++++++++++++++++++++++++++++++++
 swh/scheduler/simulator/__main__.py | 31 +++++++
 swh/scheduler/sql/30-schema.sql | 2 +-
 swh/scheduler/sql/60-indexes.sql | 2 +-
 swh/scheduler/tests/common.py | 10 +--
 swh/scheduler/tests/conftest.py | 45 ++++++++---
 swh/scheduler/tests/test_cli_origin.py | 112 +++++++++++++++++++++++++
 swh/scheduler/tests/test_model.py | 31 ++++++-
 swh/scheduler/tests/test_scheduler.py | 27 ++++---
 21 files changed, 612 insertions(+), 39 deletions(-)
 create mode 100644 docs/simulator.rst
 create mode 100644 requirements-simulator.txt
 create mode 100644 sql/updates/20.sql
 create mode 100644 swh/scheduler/cli/origin.py
 create mode 100644 swh/scheduler/simulator/__init__.py
 create mode 100644 swh/scheduler/simulator/__main__.py
 create mode 100644 swh/scheduler/tests/test_cli_origin.py
Changes applied before test
commit 0cde0300fbbd0832a8dcca52ea1e04597e75f423
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
commit ca45d40f2a62d4a0f200cabe760ad3a0cda00f89
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 15:31:55 2021 +0100
Filter origins by visit type when scheduling the next visits
We have separate task queues and workers for each visit type, so it
makes sense to split this endpoint along these lines too, at least for
now.
commit 59b4cb3f1c7a081e0d28b11d15888d38a9de151e
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 15:25:56 2021 +0100
Reorganize ListedOrigin fixtures to generate multiple visit_types
commit 4f5338f2aba360fed2e524cbcdd23b11bacfb79d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 12 17:10:39 2021 +0100
Introduce a `swh scheduler origin schedule-next` cli
This creates one-shot tasks in the classic scheduler for the next visits
to run according to the visit scheduling policy.
commit 3dd1d5f28d329620a65ee00749d24401b6d8cf00
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 12 17:28:33 2021 +0100
Rename test task types to names that match real tasks
The success of tests using these task types would depend on the test run
order, because these task types are (currently) being created by
swh/scheduler/sql/50-data.sql, but the table is truncated after the
first test completes.
commit 5d7b002ac403565e348ac8fe4dd56d015cf29cae
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 12 16:16:31 2021 +0100
Introduce a `swh scheduler origin grab-next` cli
This returns, as CSV, the next origins to be visited according to the
passed scheduling policy.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/121/ for more details.
Lots of iterative improvements:
- Introduce scaffolding for a scheduler simulator
- simulator: Use datetimes instead of a floating point simulated time
- simulator: Move scheduler into the simulation environment object
- simulator: generate OriginVisitStatus objects in modeled visits
- simulator: simulate the scheduler journal client
- simulator: Remove some debug statements and lower log level
- simulator: refine origin model to follow an exponential distribution
- simulator: add support for a basic SimulationReport
- simulator: add typing for Environment.scheduler
- simulator: generate more origins in fill_data
- simulator: tweak simulation environment constants
- simulator: Make the run time a CLI argument
- simulator: Split into smaller files in the same package
- simulator: also fill data for the task-based scheduler
- simulator: wrap tasks and task events in typechecked objects
- simulator: Replace attrs with dataclasses for consistency
- Move the simulator cli to the main cli module
- simulator: implement a simulator for the "old" task-based scheduler
Build is green
Patch application report for D4856 (id=17281)
Could not rebase; Attempt merge onto a5fb291703...
Updating a5fb291..a4bbd6b
Fast-forward
 .pre-commit-config.yaml | 1 +
 docs/index.rst | 1 +
 docs/simulator.rst | 55 +++++++++++++
 mypy.ini | 6 ++
 requirements-simulator.txt | 2 +
 setup.py | 5 +-
 sql/updates/23.sql | 71 ++++++++++++++++
 swh/scheduler/cli/__init__.py | 2 +-
 swh/scheduler/cli/simulator.py | 57 +++++++++++++
 swh/scheduler/simulator/__init__.py | 123 ++++++++++++++++++++++++++++
 swh/scheduler/simulator/common.py | 102 +++++++++++++++++++++++
 swh/scheduler/simulator/origin_scheduler.py | 69 ++++++++++++++++
 swh/scheduler/simulator/origins.py | 119 +++++++++++++++++++++++++++
 swh/scheduler/simulator/task_scheduler.py | 77 +++++++++++++++++
 swh/scheduler/sql/30-schema.sql | 2 +-
 swh/scheduler/sql/40-func.sql | 6 +-
 16 files changed, 692 insertions(+), 6 deletions(-)
 create mode 100644 docs/simulator.rst
 create mode 100644 requirements-simulator.txt
 create mode 100644 sql/updates/23.sql
 create mode 100644 swh/scheduler/cli/simulator.py
 create mode 100644 swh/scheduler/simulator/__init__.py
 create mode 100644 swh/scheduler/simulator/common.py
 create mode 100644 swh/scheduler/simulator/origin_scheduler.py
 create mode 100644 swh/scheduler/simulator/origins.py
 create mode 100644 swh/scheduler/simulator/task_scheduler.py
Changes applied before test
commit a4bbd6bd914d5854be0830a034f855d05970b009
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
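As a rough illustration of the idea in this commit message (not the code from the diff), a task wrapper could carry an autogenerated uuid that later events refer back to. The names `Task`, `backend_id`, `lifetimes` and `record` below are hypothetical and chosen only for this sketch.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Task:
    """Hypothetical simulated task; only the uuid field matters here."""

    visit_type: str
    origin: str
    # Generated automatically for each task, so two tasks for the same
    # origin can still be told apart.
    backend_id: uuid.UUID = field(default_factory=uuid.uuid4)


# Hypothetical lifetime log, keyed by the task's uuid.
lifetimes: Dict[uuid.UUID, List[str]] = {}


def record(task: Task, event: str) -> None:
    """Append an event to the lifetime of the given task."""
    lifetimes.setdefault(task.backend_id, []).append(event)


first = Task("git", "https://example.org/repo.git")
second = Task("git", "https://example.org/repo.git")
record(first, "created")
record(first, "visit status generated")
record(second, "created")
```

Because the identifier is generated per instance, the two tasks above stay distinguishable even though they target the same origin.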
commit c9cf37ac6290783ec1f043833a887c7e76a0eb9d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit b7e09ab024ac33ca4730d83f2a289b669dc784d2
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit cc734124c942af9498c0cb11799613ac04d17047
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit c9e915ca69a7e614979172a694149afa361ec88c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit a5d0d0aa521819abf46f34ba265975ca1c806222
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit 8c0c94afe407df82be55867a4550350772934aae
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit f49f8c488ef420890e0c94940fd08a8ccf7b5fe4
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit 3e2f46120c7c220d347232d788f27cc7cfaaafd7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit be5375b59621189138847ada4d5c6ee71d82e554
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit 24cd33ea564fd215c0eda55bfe479e3f1374feca
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
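A minimal sketch of what such a report could look like, assuming it only needs to accumulate task run times and render a summary at the end. The class layout, the `record_visit` and `format` method names, and the output wording are assumptions for illustration, not the implementation in this diff.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SimulationReport:
    """Hypothetical report: collects the run time of each completed task."""

    total_visits: int = 0
    visit_runtimes: List[float] = field(default_factory=list)

    def record_visit(self, duration: float) -> None:
        """Register one completed task and its run time, in seconds."""
        self.total_visits += 1
        self.visit_runtimes.append(duration)

    def format(self) -> str:
        """Render the summary to print once the simulation is over."""
        if not self.visit_runtimes:
            return "Total visits: 0"
        mean = sum(self.visit_runtimes) / len(self.visit_runtimes)
        return (
            f"Total visits: {self.total_visits}\n"
            f"Mean run time: {mean:.1f} s\n"
            f"Longest run time: {max(self.visit_runtimes):.1f} s"
        )


report = SimulationReport()
for runtime in (2.0, 4.0):
    report.record_visit(runtime)
summary = report.format()
```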
commit 34ebf6af90537ec7864fd1b0d2bb5133a9db4f15
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
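One plausible reading of "an exponential distribution between 1 second and 10 years" is that the exponent is spread uniformly, so the characteristic time ranges over many orders of magnitude. The sketch below follows that reading and derives the value from a hash of the origin URL so that it stays consistent across calls. The function names, the hashing scheme, and the `expected_new_commits` helper are assumptions for illustration only.

```python
import hashlib

TEN_YEARS = 10 * 365 * 24 * 3600  # upper bound, in seconds


def seconds_between_commits(origin_url: str) -> float:
    """Characteristic time between commits for an origin (hypothetical).

    The same URL always maps to the same value, which lies between
    1 second (inclusive) and 10 years (exclusive).
    """
    n_bytes = 2
    num_buckets = 2 ** (8 * n_bytes)
    digest = hashlib.md5(origin_url.encode()).digest()
    bucket = int.from_bytes(digest[:n_bytes], "little")
    # bucket / num_buckets is in [0, 1), so the result is in [1, TEN_YEARS).
    return TEN_YEARS ** (bucket / num_buckets)


def expected_new_commits(origin_url: str, seconds_since_last_visit: float) -> float:
    """Rough number of commits expected since the previous visit."""
    return seconds_since_last_visit / seconds_between_commits(origin_url)


period = seconds_between_commits("https://example.org/repo.git")
backlog = expected_new_commits("https://example.org/repo.git", 7 * 24 * 3600)
```

A simulated visit could then derive its run time and status from `backlog`, for example treating a backlog close to zero as an uneventful visit.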
commit efdd500aca9b886fa5031533cc159a9c469edf75
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit bff6576d4d1effd0f81380ffce74d9973f5e054f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit 0e894915ad70fcc294b91c15e3f678f2f54c3f8a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
commit 315fb880e35361652a2277fcb7d5544e5ae81067
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by a lot of the simulated actors, it makes sense
to share it all the time.
commit 6efb445060018326e5164b8f3bc6d137c6800fe5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit 54d42dd92f2c40cd7fdeda136ab33e2c1423682f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
commit d3afd144af1d3fa511cd2ae4cc76a25cc0856cc6
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:10:44 2021 +0100
Use the recorded task end time for the task scheduler feedback loop
This allows us to run "time-warping" simulations without interference
from the real wall clock time.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/141/ for more details.
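To illustrate why the recorded end time matters for "time-warping" simulations, here is a small sketch under assumed names (`FinishedTask` and `reschedule` are hypothetical and do not reflect the actual scheduler code). The next run is computed from the end time stored with the task, so a simulated date far from the real clock still yields a coherent schedule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class FinishedTask:
    """Hypothetical record of a completed task."""

    ended: datetime  # end time recorded with the task (possibly simulated)
    interval: timedelta  # current delay between two runs of the task


def reschedule(task: FinishedTask) -> datetime:
    """Compute the next run from the recorded end time.

    Deliberately avoids datetime.now(): the wall clock is unrelated to the
    simulated time and would distort the schedule.
    """
    return task.ended + task.interval


simulated_end = datetime(2030, 1, 1, tzinfo=timezone.utc)
task = FinishedTask(ended=simulated_end, interval=timedelta(days=7))
next_run = reschedule(task)
```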
Build is green
Patch application report for D4856 (id=17306)
Rebasing onto d3afd144af...
Current branch diff-target is up to date.
Changes applied before test
commit 5a3c8d9bbea4f5ba62c61e98faa8d8d769f8a835
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Mon Jan 18 13:51:35 2021 +0100
simulator: add basic tests for fill_test_data and run
commit a4bbd6bd914d5854be0830a034f855d05970b009
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
commit c9cf37ac6290783ec1f043833a887c7e76a0eb9d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit b7e09ab024ac33ca4730d83f2a289b669dc784d2
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit cc734124c942af9498c0cb11799613ac04d17047
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit c9e915ca69a7e614979172a694149afa361ec88c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit a5d0d0aa521819abf46f34ba265975ca1c806222
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit 8c0c94afe407df82be55867a4550350772934aae
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit f49f8c488ef420890e0c94940fd08a8ccf7b5fe4
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit 3e2f46120c7c220d347232d788f27cc7cfaaafd7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit be5375b59621189138847ada4d5c6ee71d82e554
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit 24cd33ea564fd215c0eda55bfe479e3f1374feca
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
commit 34ebf6af90537ec7864fd1b0d2bb5133a9db4f15
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
commit efdd500aca9b886fa5031533cc159a9c469edf75
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit bff6576d4d1effd0f81380ffce74d9973f5e054f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit 0e894915ad70fcc294b91c15e3f678f2f54c3f8a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
commit 315fb880e35361652a2277fcb7d5544e5ae81067
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by a lot of the simulated actors, it makes sense
to share it all the time.
commit 6efb445060018326e5164b8f3bc6d137c6800fe5
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit 54d42dd92f2c40cd7fdeda136ab33e2c1423682f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/142/ for more details.
Build is green
Patch application report for D4856 (id=17336)
Rebasing onto 5e609d5205...
Current branch diff-target is up to date.
Changes applied before test
commit 77362633b7485bfe3944d8c278d509eb60f0d664
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Mon Jan 18 13:51:35 2021 +0100
simulator: add basic tests for fill_test_data and run
commit eb7676ea2e8dcc5fa92067ad7858e5069ccc8db1
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
commit 687e6f007cb4943ef19ff87b87953607c6f206b7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit c3f520abc55c7355dbef0d2fed1102cc30040176
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit 58042267faeaaab656c1e459b14fcfa24f300795
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit b58eb740e9d407b95a1df632eaef91bbe6c3ff8b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit 85df218106a0c29dd79900321572e87a7c90a5bd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit 2bc5187c76657b00d54b61f993aeeb2de25acf18
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit edf406dc961c8d0a77a34e73b0e19fcd511bd27d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit e7d60a996249b6827332e17e2977bec1b69eab83
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit b663e5414a09bc0b5a22c111894433d71c77f42c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit c3e8380e1aa140c8823ef76ba6d384474f160c9b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
commit b4b20ad406d8925cb4aa96828dbc5af14e0bda8d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
commit cce6ce250ee0e73cc2b486c32cae8c05265a9974
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit 7e5f99837487c3785dfa96ed28ce9fecdf25bad8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit 63c3beea168e2f41ff0cbd71fe53af95e062748a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
commit dd06b1bd428c15cf8ebb89873f24ee372ff363eb
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by a lot of the simulated actors, it makes sense
to share it all the time.
commit 524ec4a50a60eb45815faf49d8d675a86756955b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit 11263f58a02c9f1aa485df5ea4ac5131998f3d69
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/155/ for more details.
| docs/simulator.rst | ||
|---|---|---|
| 17 | This list of items is not very clear. It could be rephrased for better clarity, especially the last item on the feedback loop. | |

| swh/scheduler/cli/simulator.py | ||
|---|---|---|
| 38 | It is unclear what this "scheduler" option actually refers to. Is "task_scheduler" the current ("legacy") one, and "origin_scheduler" the first simple implementation recently added? | |
| 45 | How does the "policy" option interact with the "scheduler" option above? | |
| swh/scheduler/simulator/__init__.py | ||
|---|---|---|
| 63 | ok now I see... | |
| swh/scheduler/simulator/__init__.py | ||
|---|---|---|
| 21 | it would probably be nice to add a docstring/comment that gives an overall description of how this simulator works | |
| swh/scheduler/simulator/origins.py | ||
|---|---|---|
| 37 | I'm not sure I get how this method is supposed to be called. Is it called once and only once, or is it called each time a "next commit date for this origin" event is triggered (if that makes sense)? The method name suggests it gives a definitive mean time between commits. Is that it? | |
| 42 | This is not so easy to read and understand (for someone like me at least). I'd really appreciate a more comprehensive, explanatory comment here. | |

Overall this looks good to me, but it would benefit from more comments and explanations. It is not easy to get into as is.
| swh/scheduler/simulator/task_scheduler.py | ||
|---|---|---|
| 26 | why the 10 factor? | |
| swh/scheduler/simulator/task_scheduler.py | ||
|---|---|---|
| 26 | it's completely arbitrary | |
| swh/scheduler/simulator/task_scheduler.py | ||
|---|---|---|
| 26 | then add a comment about it | |
Address @douardda's comments:
- simulator: Make min_batch_size a parameter defined in the setup.
- simulator: Add documentation.
| swh/scheduler/simulator/task_scheduler.py | ||
|---|---|---|
| 26 | better yet: I'll make it a constant somewhere else | |
Build is green
Patch application report for D4856 (id=17348)
Rebasing onto 5e609d5205...
Current branch diff-target is up to date.
Changes applied before test
commit 7af4bebc7a4503964f9bd61ac101c54fc42ca474
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:32:27 2021 +0100
simulator: Add documentation.
commit 1967379c3251f407e7e5128efbbceafe293e3704
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:17:24 2021 +0100
simulator: Make min_batch_size a parameter defined in the setup.
commit 77362633b7485bfe3944d8c278d509eb60f0d664
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Mon Jan 18 13:51:35 2021 +0100
simulator: add basic tests for fill_test_data and run
commit eb7676ea2e8dcc5fa92067ad7858e5069ccc8db1
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
commit 687e6f007cb4943ef19ff87b87953607c6f206b7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit c3f520abc55c7355dbef0d2fed1102cc30040176
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit 58042267faeaaab656c1e459b14fcfa24f300795
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit b58eb740e9d407b95a1df632eaef91bbe6c3ff8b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit 85df218106a0c29dd79900321572e87a7c90a5bd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit 2bc5187c76657b00d54b61f993aeeb2de25acf18
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit edf406dc961c8d0a77a34e73b0e19fcd511bd27d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit e7d60a996249b6827332e17e2977bec1b69eab83
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit b663e5414a09bc0b5a22c111894433d71c77f42c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit c3e8380e1aa140c8823ef76ba6d384474f160c9b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
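A report of this kind could look roughly like the sketch below. This is an assumption-laden illustration, not the SimulationReport from the patch: the class name `RuntimeReport`, its fields and the output format are all hypothetical.

```python
import statistics
from dataclasses import dataclass, field
from typing import List


@dataclass
class RuntimeReport:
    """Hypothetical, simplified take on a simulation report."""

    total_visits: int = 0
    # Run time of each completed task, in (simulated) seconds.
    visit_runtimes: List[float] = field(default_factory=list)

    def record_visit(self, duration: float) -> None:
        self.total_visits += 1
        self.visit_runtimes.append(duration)

    def format(self) -> str:
        # Summary meant to be printed once, at the end of the simulation.
        if not self.visit_runtimes:
            return "Total visits: 0"
        mean = statistics.mean(self.visit_runtimes)
        longest = max(self.visit_runtimes)
        return (
            f"Total visits: {self.total_visits}\n"
            f"Mean run time: {mean:.1f}s, longest: {longest:.1f}s"
        )
```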
commit b4b20ad406d8925cb4aa96828dbc5af14e0bda8d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
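One possible reading of this origin model is sketched below. It assumes that "exponential distribution between 1 second and 10 years" means the characteristic time is spread uniformly on a logarithmic scale, and that "consistent" means it is derived deterministically from the origin URL. Both are assumptions, and the function names and the choice of hash are hypothetical.

```python
import hashlib
from datetime import timedelta

TEN_YEARS = 10 * 365 * 24 * 3600  # upper bound, in seconds


def seconds_between_commits(origin_url: str, n_bytes: int = 2) -> float:
    """Map an origin to a stable characteristic time in [1 s, 10 years]."""
    num_buckets = 2 ** (8 * n_bytes)
    digest = hashlib.sha256(origin_url.encode()).digest()
    bucket = int.from_bytes(digest[:n_bytes], "little")
    # bucket == 0 gives 1 second; bucket == num_buckets - 1 gives 10 years.
    return float(TEN_YEARS) ** (bucket / (num_buckets - 1))


def expected_new_commits(origin_url: str, since_last_visit: timedelta) -> int:
    """Number of commits expected since the previous visit of this origin."""
    return int(since_last_visit.total_seconds() / seconds_between_commits(origin_url))
```

With such a helper, a modeled visit that finds zero expected new commits would be a natural candidate for an uneventful visit, while a large count would suggest a longer run time.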
commit cce6ce250ee0e73cc2b486c32cae8c05265a9974
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit 7e5f99837487c3785dfa96ed28ce9fecdf25bad8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit 63c3beea168e2f41ff0cbd71fe53af95e062748a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
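The idea can be sketched as follows, assuming that a visit counts as uneventful when it yields the same snapshot as the previous visit of that origin. `FakeVisitCache` is a hypothetical in-memory stand-in used only to keep the example self-contained; in the patch the lookup goes through the scheduler's visit cache instead.

```python
from typing import Dict, Optional


class FakeVisitCache:
    """Hypothetical stand-in for the scheduler's per-origin visit cache."""

    def __init__(self) -> None:
        self._last_snapshot: Dict[str, bytes] = {}

    def last_snapshot(self, origin: str) -> Optional[bytes]:
        return self._last_snapshot.get(origin)

    def record(self, origin: str, snapshot: bytes) -> None:
        self._last_snapshot[origin] = snapshot


def model_visit(cache: FakeVisitCache, origin: str, snapshot: bytes) -> bool:
    """Record a visit and return True when it was eventful."""
    # Eventful means the snapshot differs from the last one seen (or none was seen).
    eventful = cache.last_snapshot(origin) != snapshot
    cache.record(origin, snapshot)
    return eventful
```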
commit dd06b1bd428c15cf8ebb89873f24ee372ff363eb
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by a lot of the simulated actors, it makes sense
to share it all the time.
commit 524ec4a50a60eb45815faf49d8d675a86756955b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit 11263f58a02c9f1aa485df5ea4ac5131998f3d69
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/161/ for more details.
Build is green
Patch application report for D4856 (id=17350)
Rebasing onto 5e609d5205...
Current branch diff-target is up to date.
Changes applied before test
commit b594c847826699608f49afe4153c2f2b3ef99657
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:32:27 2021 +0100
simulator: Add documentation.
commit 1967379c3251f407e7e5128efbbceafe293e3704
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:17:24 2021 +0100
simulator: Make min_batch_size a parameter defined in the setup.
commit 77362633b7485bfe3944d8c278d509eb60f0d664
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Mon Jan 18 13:51:35 2021 +0100
simulator: add basic tests for fill_test_data and run
commit eb7676ea2e8dcc5fa92067ad7858e5069ccc8db1
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
commit 687e6f007cb4943ef19ff87b87953607c6f206b7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit c3f520abc55c7355dbef0d2fed1102cc30040176
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit 58042267faeaaab656c1e459b14fcfa24f300795
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit b58eb740e9d407b95a1df632eaef91bbe6c3ff8b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit 85df218106a0c29dd79900321572e87a7c90a5bd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit 2bc5187c76657b00d54b61f993aeeb2de25acf18
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit edf406dc961c8d0a77a34e73b0e19fcd511bd27d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit e7d60a996249b6827332e17e2977bec1b69eab83
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit b663e5414a09bc0b5a22c111894433d71c77f42c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit c3e8380e1aa140c8823ef76ba6d384474f160c9b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
commit b4b20ad406d8925cb4aa96828dbc5af14e0bda8d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
commit cce6ce250ee0e73cc2b486c32cae8c05265a9974
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit 7e5f99837487c3785dfa96ed28ce9fecdf25bad8
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit 63c3beea168e2f41ff0cbd71fe53af95e062748a
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
commit dd06b1bd428c15cf8ebb89873f24ee372ff363eb
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by a lot of the simulated actors, it makes sense
to share it all the time.
commit 524ec4a50a60eb45815faf49d8d675a86756955b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit 11263f58a02c9f1aa485df5ea4ac5131998f3d69
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/162/ for more details.
Build is green
Patch application report for D4856 (id=17356)
Could not rebase; Attempt merge onto 0a32a31195...
Auto-merging setup.py
Merge made by the 'recursive' strategy.
 .pre-commit-config.yaml                     |   1 +
 docs/index.rst                              |   1 +
 docs/simulator.rst                          |  65 +++++++++++++
 mypy.ini                                    |   6 ++
 requirements-simulator.txt                  |   2 +
 setup.py                                    |  34 +++----
 swh/scheduler/cli/__init__.py               |   2 +-
 swh/scheduler/cli/simulator.py              |  61 ++++++++++++
 swh/scheduler/simulator/__init__.py         | 144 ++++++++++++++++++++++++++++
 swh/scheduler/simulator/common.py           | 102 ++++++++++++++++++++
 swh/scheduler/simulator/origin_scheduler.py |  68 +++++++++++++
 swh/scheduler/simulator/origins.py          | 128 +++++++++++++++++++++++++
 swh/scheduler/simulator/task_scheduler.py   |  76 +++++++++++++++
 swh/scheduler/tests/test_simulator.py       |  45 +++++++++
 14 files changed, 718 insertions(+), 17 deletions(-)
 create mode 100644 docs/simulator.rst
 create mode 100644 requirements-simulator.txt
 create mode 100644 swh/scheduler/cli/simulator.py
 create mode 100644 swh/scheduler/simulator/__init__.py
 create mode 100644 swh/scheduler/simulator/common.py
 create mode 100644 swh/scheduler/simulator/origin_scheduler.py
 create mode 100644 swh/scheduler/simulator/origins.py
 create mode 100644 swh/scheduler/simulator/task_scheduler.py
 create mode 100644 swh/scheduler/tests/test_simulator.py
Changes applied before test
commit 69877d3f987eaba00dcc97359b48e1fc8a677298
Merge: 0a32a31 ed04441
Author: Jenkins user <jenkins@localhost>
Date: Tue Jan 19 17:06:09 2021 +0000
Merge branch 'diff-target' into HEAD
commit ed044415e625080cb4bc67b2656743d92ed4c884
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:32:27 2021 +0100
simulator: Add documentation.
commit 186aebeb12905dc98cc370b360a8b3f5c4db3186
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:17:24 2021 +0100
simulator: Make min_batch_size a parameter defined in the setup.
commit 6150c764616d3c25ee13eb08bea6b4c9d1c2bc0d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Mon Jan 18 13:51:35 2021 +0100
simulator: add basic tests for fill_test_data and run
commit 5bee207dca74bb2c70611b3308c93bc522d48247
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
commit 6ec79c18b7b0e8be7b086aa79e87de81a8dbd06a
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit b25874a7066c95460f7d24c132f32f4dabf055a7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit cb0bc27be55cf384c68b834ae3c89dd93434fbba
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit 947aecb14cdb4c6dd2da178f53599b4a41c8245b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit ff6cd0669e0d75afbd2c63424db66bf8d1e91bee
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit b232135cb982f4fc8e5fb6242a88012d732e252d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit 24e93d8aa72107bf953f884df4c9b15ea9cbeb2c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit 9885d12cd708a26878cd9aa70ab590223589e8d7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit a1d80fec0f5760d136857fb893232b1baec35b64
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit 6a9ec5f38133fe232da1ca98ff30ef44b12a4c12
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
commit 5d0e2aee4182df9476934349ad20da5dafc8b61f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
commit 7934c2f90191615db69b50dc27744ec73704f896
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit d0ed751eca9f2ff0464b795edb9e9bb2a0305649
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit c9b0728955e683748f8b03a22f91d501b64aad67
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
commit 56c8d1dd66d8a993c8bc7c7bcc4e3fb3704f6864
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by a lot of the simulated actors, it makes sense
to share it all the time.
commit 1659aa17fe0510030fb24d3b7867d2c4a366b5dd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit ef241dd84c400f9be0d92396867587d47216e385
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
commit 49a14792b0329049b51cbc6ed9c48006e9ff1a73
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 19 17:56:44 2021 +0100
Import the journal subcommand in the main swh.scheduler cli
This issue was masked by tox.ini using pytest with --doctest-modules,
which imports all modules during test collection, and therefore executing
the side-effects of swh.scheduler.cli.journal.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/165/ for more details.
Build is green
Patch application report for D4856 (id=17362)
Could not rebase; Attempt merge onto 0a32a31195...
Auto-merging setup.py
Merge made by the 'recursive' strategy.
 .pre-commit-config.yaml                     |   1 +
 docs/index.rst                              |   1 +
 docs/simulator.rst                          |  65 ++++++++++++
 mypy.ini                                    |   6 ++
 requirements-simulator.txt                  |   2 +
 setup.py                                    |  34 ++++---
 swh/scheduler/cli/__init__.py               |   2 +-
 swh/scheduler/cli/simulator.py              |  68 +++++++++++++
 swh/scheduler/simulator/__init__.py         | 147 ++++++++++++++++++++++++++++
 swh/scheduler/simulator/common.py           | 102 +++++++++++++++++++
 swh/scheduler/simulator/origin_scheduler.py |  68 +++++++++++++
 swh/scheduler/simulator/origins.py          | 128 ++++++++++++++++++++++++
 swh/scheduler/simulator/task_scheduler.py   |  76 ++++++++++++++
 swh/scheduler/tests/test_simulator.py       |  53 ++++++++++
 14 files changed, 736 insertions(+), 17 deletions(-)
 create mode 100644 docs/simulator.rst
 create mode 100644 requirements-simulator.txt
 create mode 100644 swh/scheduler/cli/simulator.py
 create mode 100644 swh/scheduler/simulator/__init__.py
 create mode 100644 swh/scheduler/simulator/common.py
 create mode 100644 swh/scheduler/simulator/origin_scheduler.py
 create mode 100644 swh/scheduler/simulator/origins.py
 create mode 100644 swh/scheduler/simulator/task_scheduler.py
 create mode 100644 swh/scheduler/tests/test_simulator.py
Changes applied before test
commit c4641ac0f67e92b7d6ffe885f9fc7a410a547e63
Merge: 0a32a31 e12a4f1
Author: Jenkins user <jenkins@localhost>
Date: Tue Jan 19 17:42:37 2021 +0000
Merge branch 'diff-target' into HEAD
commit e12a4f13386cdb25d366f5e2ee81044cb8e30169
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 18:36:53 2021 +0100
simulator: stop using get_scheduler directly
This reuses the scheduler instantiated by the cli instead of hardcoding
our own using the PG* variables.
commit ed044415e625080cb4bc67b2656743d92ed4c884
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:32:27 2021 +0100
simulator: Add documentation.
commit 186aebeb12905dc98cc370b360a8b3f5c4db3186
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:17:24 2021 +0100
simulator: Make min_batch_size a parameter defined in the setup.
commit 6150c764616d3c25ee13eb08bea6b4c9d1c2bc0d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Mon Jan 18 13:51:35 2021 +0100
simulator: add basic tests for fill_test_data and run
commit 5bee207dca74bb2c70611b3308c93bc522d48247
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
commit 6ec79c18b7b0e8be7b086aa79e87de81a8dbd06a
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit b25874a7066c95460f7d24c132f32f4dabf055a7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit cb0bc27be55cf384c68b834ae3c89dd93434fbba
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit 947aecb14cdb4c6dd2da178f53599b4a41c8245b
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit ff6cd0669e0d75afbd2c63424db66bf8d1e91bee
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit b232135cb982f4fc8e5fb6242a88012d732e252d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit 24e93d8aa72107bf953f884df4c9b15ea9cbeb2c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit 9885d12cd708a26878cd9aa70ab590223589e8d7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit a1d80fec0f5760d136857fb893232b1baec35b64
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit 6a9ec5f38133fe232da1ca98ff30ef44b12a4c12
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
commit 5d0e2aee4182df9476934349ad20da5dafc8b61f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
commit 7934c2f90191615db69b50dc27744ec73704f896
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit d0ed751eca9f2ff0464b795edb9e9bb2a0305649
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit c9b0728955e683748f8b03a22f91d501b64aad67
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
commit 56c8d1dd66d8a993c8bc7c7bcc4e3fb3704f6864
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by a lot of the simulated actors, it makes sense
to share it all the time.
commit 1659aa17fe0510030fb24d3b7867d2c4a366b5dd
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit ef241dd84c400f9be0d92396867587d47216e385
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
commit 49a14792b0329049b51cbc6ed9c48006e9ff1a73
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 19 17:56:44 2021 +0100
Import the journal subcommand in the main swh.scheduler cli
This issue was masked by tox.ini using pytest with --doctest-modules,
which imports all modules during test collection, and therefore executing
the side-effects of swh.scheduler.cli.journal.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/171/ for more details.
Build is green
Patch application report for D4856 (id=17381)
Could not rebase; Attempt merge onto 98526539a8...
Updating 9852653..89b8839
Fast-forward
 .pre-commit-config.yaml                     |   1 +
 docs/index.rst                              |   1 +
 docs/simulator.rst                          |  65 ++++++++++
 mypy.ini                                    |   6 +
 requirements-simulator.txt                  |   2 +
 setup.py                                    |  34 +++---
 sql/updates/25.sql                          |  64 ++++++++++
 swh/scheduler/backend.py                    |  87 +++++++++++++
 swh/scheduler/cli/__init__.py               |   2 +-
 swh/scheduler/cli/origin.py                 |  40 ++++++
 swh/scheduler/cli/simulator.py              |  68 +++++++++++
 swh/scheduler/interface.py                  |  40 ++++++
 swh/scheduler/model.py                      |  32 +++++
 swh/scheduler/simulator/__init__.py         | 147 ++++++++++++++++++++++
 swh/scheduler/simulator/common.py           | 102 ++++++++++++++++
 swh/scheduler/simulator/origin_scheduler.py |  68 +++++++++++
 swh/scheduler/simulator/origins.py          | 128 ++++++++++++++++++++
 swh/scheduler/simulator/task_scheduler.py   |  76 ++++++++++++
 swh/scheduler/sql/30-schema.sql             |  24 +++-
 swh/scheduler/sql/40-func.sql               |  40 ++++++
 swh/scheduler/tests/test_api_client.py      |   3 +
 swh/scheduler/tests/test_cli_origin.py      |  11 ++
 swh/scheduler/tests/test_scheduler.py       | 181 +++++++++++++++++++++++++++-
 swh/scheduler/tests/test_simulator.py       |  53 ++++++++
 24 files changed, 1255 insertions(+), 20 deletions(-)
 create mode 100644 docs/simulator.rst
 create mode 100644 requirements-simulator.txt
 create mode 100644 sql/updates/25.sql
 create mode 100644 swh/scheduler/cli/simulator.py
 create mode 100644 swh/scheduler/simulator/__init__.py
 create mode 100644 swh/scheduler/simulator/common.py
 create mode 100644 swh/scheduler/simulator/origin_scheduler.py
 create mode 100644 swh/scheduler/simulator/origins.py
 create mode 100644 swh/scheduler/simulator/task_scheduler.py
 create mode 100644 swh/scheduler/tests/test_simulator.py
Changes applied before test
commit 89b8839ce5d1e5db2bb9b69c96dbc943d1172ff0
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 18:36:53 2021 +0100
simulator: stop using get_scheduler directly
This reuses the scheduler instantiated by the cli instead of hardcoding
our own using the PG* variables.
commit f9f28ece9b78957a7dac050c9d21fe0b0c64ad95
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:32:27 2021 +0100
simulator: Add documentation.
commit 1b335be22b7ad25eede2ac605f86d2fd80a61b4d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:17:24 2021 +0100
simulator: Make min_batch_size a parameter defined in the setup.
commit 403e97c5599934aed746f9301845c9e6f0d7d933
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Mon Jan 18 13:51:35 2021 +0100
simulator: add basic tests for fill_test_data and run
commit cb9b2c1ddb6cf641b3b23fedf7a36269cc4ced6d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
commit 44407d7fd413c62070d85f6ee1de2268a87e2906
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit f951490ccfbdf30c4ef57d0b41651f6f43278873
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit b9b3defd03f87febb5c06c50ac2b7c9d37e918d5
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit b4b83f6e15476f93c51d68adbbfdbbb10d71d444
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit 029c95f8887cac6d0eeabb4516812371375dbd28
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit 131994c324502f455080603fb8ebda0e77feba22
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit 4e54c277a3a3faa3399b615b096bcb7149a5ff78
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit 631955aaccdb8a6f2cbdc2881ce70553c1d437e0
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit f8bdbec28238cdf9c487ae7ed1cc24cbfbdffdb3
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit b97bb855576e1edf23be70993b6df54dc0f16a6f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
commit d50da6b64b1242d226dafbfc032184c8e5fb1c9f
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
commit fd44eb75447aba9a03b43621b88f140d8dc15ec1
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit 393313a7b5530a3f123e9ca7e92fe9d61038d829
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit 1f1e6c5d5157ee8f30b8c56a1cf130ac5ef4e953
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
commit 62870d9d11e3f598130f2562181dc8a59b7e2e2d
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by a lot of the simulated actors, it makes sense
to share it all the time.
commit 89c76bd7e776f5dadf8b3ff13b9bd5d5cc42f208
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit ccf03c4e1f9bd3b1e46a1de0bfc7c7e4b055284d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
commit 53b034cb8d09efa0c9b448d29fb70d727bc6a066
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 18:39:21 2021 +0100
Add a cli for the scheduler metrics update endpoint
commit 737d12e5b9e694b22bef291c625090fb3aee2afc
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 19 17:48:31 2021 +0100
Introduce a new lister_get endpoint
commit 114ed952e513c7ad3dbb038a640e80bf079d0780
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Tue Jan 19 14:23:32 2021 +0100
Implement some basic aggregated metrics on listed origins
Metrics are computed and cached database-side by the `update_metrics`
function. The `get_metrics` function only retrieves the cached data.
The metrics are aggregated for each lister instance and visit type
(allowing complete reaggregation by visit type for cross-cutting statistics).
The following metrics have been implemented:
- number of known origins overall
- number of enabled origins (origins seen in the last listing)
- number of enabled origins that have never been successfully visited
- number of enabled origins with known activity since our last successful visit
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/181/ for more details.
Build is green
Patch application report for D4856 (id=17407)
Rebasing onto 7905a6bea4...
Current branch diff-target is up to date.
Changes applied before test
commit 898820fac52cf6fcfb5d2770aad49f131370a5a6
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 20 12:11:05 2021 +0100
simulator: collect and plot scheduler metrics over time
For now, only plot the known_origins and origins_never_visited metrics.
commit 9ce68f8d0e0ea69bd6672a50687079b5b1ea460c
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 18:36:53 2021 +0100
simulator: stop using get_scheduler directly
This reuses the scheduler instantiated by the cli instead of hardcoding
our own using the PG* variables.
commit 88e0b42805011bc3886f77ce5c91b3450351a16f
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:32:27 2021 +0100
simulator: Add documentation.
commit 62c6d90867bccb17ae076e1b5ee4db6fd350ad1b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Tue Jan 19 16:17:24 2021 +0100
simulator: Make min_batch_size a parameter defined in the setup.
commit 9468bb9384f14e5fa0548b7d985f66fb3e36c85a
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Mon Jan 18 13:51:35 2021 +0100
simulator: add basic tests for fill_test_data and run
commit ead7b347db9d8852b4c347729d7e6d32b72d9058
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:33:43 2021 +0100
simulator: implement a simulator for the "old" task-based scheduler
We extend the Task object with an autogenerated uuid allowing us to
track the task lifetime between its creation and the generation of visit
statuses, as the task-based scheduler does.
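As a rough illustration of the idea in this commit message, the sketch below attaches an autogenerated uuid to a task so that a later event can be matched back to it. The class and field names (`Task`, `TaskEvent`, `task_id`) are assumptions made for this example and may not match the simulator's actual definitions.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Task:
    """Hypothetical simulated task; field names are illustrative."""

    visit_type: str
    origin: str
    # Autogenerated identifier, distinct for every task instance
    task_id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class TaskEvent:
    """Hypothetical event emitted when a simulated task finishes."""

    task: Task
    status: str


# Tasks in flight, indexed by their uuid
in_flight: Dict[uuid.UUID, Task] = {}


def start(task: Task) -> None:
    in_flight[task.task_id] = task


def finish(event: TaskEvent) -> Task:
    """Match the event back to the task that was started earlier."""
    return in_flight.pop(event.task.task_id)
```

Using `default_factory` means callers never pass the identifier explicitly, which keeps task creation sites unchanged.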
commit aecd27eee06aaa46d350e9d5b3f86ccc36a5446c
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 16:31:42 2021 +0100
Move the simulator cli to the main cli module
commit 05067e3ecc888271507505112b48ebc9f755f5e7
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:37:59 2021 +0100
simulator: Replace attrs with dataclasses for consistency
commit 24922fe2d995ca3ffa6c3c5a19c1f5f5531db4c8
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 15:31:41 2021 +0100
simulator: wrap tasks and task events in typechecked objects
This allows us to extend these objects without redefining a bunch of
type annotations.
commit d5318aea0a93a94c80f8d743ce1de63592161f5a
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 14:47:33 2021 +0100
simulator: also fill data for the task-based scheduler
commit 22ebb7a9a4bc6639e6f52d71c2b727537baf5019
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Fri Jan 15 14:41:05 2021 +0100
simulator: Split into smaller files in the same package
commit ad7bfbe731da64cc6d1ddaa3f5ae1ef1e3350f60
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:50:00 2021 +0100
simulator: Make the run time a CLI argument
commit df34db0bfc61df418f00338345b4b46a86340f62
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:40:16 2021 +0100
simulator: tweak simulation environment constants
commit 21ce2c88dddce081bfd525d08454ca09bbf521c6
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:37:00 2021 +0100
simulator: generate more origins in fill_data
commit 29204199774b40bea4d3d23ffe9407a5d090f8fa
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:35:01 2021 +0100
simulator: add typing for Environment.scheduler
commit 6433266106dda007d1e5304a0dcb01706c8acb42
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 12:00:21 2021 +0100
simulator: add support for a basic SimulationReport
For now, this collects the runtime of tasks that have run, and gets
printed at the end of the simulation.
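A report of this kind could look roughly like the following sketch, which only records task run times and formats a summary for printing once the simulation ends. The method names `record_visit` and `format` and the summary layout are assumptions for illustration, not necessarily the simulator's interface.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SimulationReport:
    """Hypothetical minimal report: collects run times of completed tasks."""

    runtimes: List[float] = field(default_factory=list)

    def record_visit(self, duration: float) -> None:
        """Store the run time (in seconds) of one completed task."""
        self.runtimes.append(duration)

    def format(self) -> str:
        """Return a short text summary, meant to be printed at the end."""
        if not self.runtimes:
            return "No visits completed"
        total = sum(self.runtimes)
        mean = total / len(self.runtimes)
        return (
            f"Completed visits: {len(self.runtimes)}\n"
            f"Total run time: {total:.1f}s\n"
            f"Mean run time: {mean:.1f}s"
        )
```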
commit c474a825336a4e4132e83982e180451b02d8f54d
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:45:23 2021 +0100
simulator: refine origin model to follow an exponential distribution
This models origins using a consistent characteristic "time between
commits" that follows an exponential distribution between 1 second and
10 years.
From this characteristic time, and feedback from the OriginVisitStats,
we can generate the expected run time and output status of the next
visit of that origin.
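One possible reading of this origin model is sketched below: each origin gets a deterministic characteristic time derived from a hash of its URL, mapped exponentially onto the range from 1 second to 10 years. The hashing scheme, the bucket count and the helper `expected_new_commits` are assumptions made for this example and are not taken from the commit itself.

```python
import hashlib

TEN_YEARS = 10 * 365 * 24 * 3600  # in seconds


def seconds_between_commits(origin_url: str) -> float:
    """Deterministic characteristic time for an origin, in [1 s, 10 years)."""
    n_bytes = 2
    num_buckets = 2 ** (8 * n_bytes)
    # Hash the URL so that the same origin always gets the same value
    digest = hashlib.sha256(origin_url.encode()).digest()
    bucket = int.from_bytes(digest[:n_bytes], "little")
    # Exponential mapping: bucket 0 gives 1 second, the highest bucket
    # gives just under 10 years
    return TEN_YEARS ** (bucket / num_buckets)


def expected_new_commits(origin_url: str, seconds_since_last_visit: float) -> int:
    """Estimate how many commits accumulated since the last visit."""
    return int(seconds_since_last_visit / seconds_between_commits(origin_url))
```

Under this sketch, a visit that finds zero expected new commits would be modeled as uneventful, and the run time of a visit could grow with the number of new commits.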
commit 2459badf0c05bf2cb663e66b9deabf1150638bb1
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Fri Jan 15 11:43:20 2021 +0100
simulator: Remove some debug statements and lower log level
commit cb12449e8f57e59ec4c7953a3c4a52c9193d202e
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:17:11 2021 +0100
simulator: simulate the scheduler journal client
commit 20b7f9c68f831839f4be1cae4b9ae2dce0fc2d96
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:12:38 2021 +0100
simulator: generate OriginVisitStatus objects in modeled visits
To be able to generate uneventful visits, we would need to store
the last snapshot seen for a given origin. Instead of storing this
within the simulator, which would be a concern for large scale
simulations, we use the scheduler visit cache directly.
commit 39ad47de2e753033c4b7114a64b5c3144b6ea821
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:09:58 2021 +0100
simulator: Move scheduler into the simulation environment object
The scheduler is used by many of the simulated actors, so it makes sense
to share it at all times.
commit 31967fa850c3afe29fc37e41cfcd53ff5408e7b9
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Thu Jan 14 15:07:56 2021 +0100
simulator: Use datetimes instead of a floating point simulated time
commit fc3f06bd1d77c76bfba4c05efcd62abcb5c46eea
Author: Nicolas Dandrimont <nicolas@dandrimont.eu>
Date: Wed Jan 13 16:13:01 2021 +0100
Introduce scaffolding for a scheduler simulator
This simulator will allow us to compare the behavior of the old and new
schedulers, as well as to test the impact of scheduler policies and their
parameters on the performance of the Software Heritage archival
infrastructure as a whole.
See https://jenkins.softwareheritage.org/job/DSCH/job/tests-on-diff/194/ for more details.