In D4923#125398, @douardda wrote:

In D4923#125371, @vlorentz wrote:

(requesting changes to get it out of my review queue)

That's not a valid reason! A valid reason is "I agree with olasd's comments, fix them (plz)"...

Jan 29 2021, 5:45 PM

douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

typo

Jan 29 2021, 5:37 PM

douardda added inline comments to D4923: Simulation: allow to export results in a csv file.

Jan 29 2021, 5:34 PM

douardda added a comment to D4923: Simulation: allow to export results in a csv file.

In D4923#125371, @vlorentz wrote:

(requesting changes to get it out of my review queue)

Jan 29 2021, 5:29 PM

douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

rebase

Jan 29 2021, 5:23 PM

douardda updated the diff for D4921: Make plotting optional in simulator cli command.

rebase

Jan 29 2021, 5:23 PM

Jan 27 2021

douardda requested changes to D4931: Add mapping of definitions and harvests.

few requests, please:

add tests with this commit; every introduced function should have at least one test,
add docstrings to your new functions,
improve the commit message (see https://chris.beams.io/posts/git-commit/ ); with the current one, I have no idea what exactly is done in this commit and, more importantly, why it is needed.

Jan 27 2021, 3:54 PM

douardda accepted D4914: simulator: stop using the database as a cache for origin data.

Jan 27 2021, 3:48 PM

douardda closed T2970: Make swh-journal tests not depend on swh-model any more as Resolved.

Let's consider it as done.

Jan 27 2021, 3:47 PM · Journal

douardda closed D4951: Remove tests' journal_data.py in favor of the version in swh-model.

Jan 27 2021, 3:38 PM

douardda committed rDJNL9703864ef366: Remove tests' journal_data.py in favor of the version in swh-model (authored by douardda).

Remove tests' journal_data.py in favor of the version in swh-model

Jan 27 2021, 3:38 PM

douardda updated the diff for D4951: Remove tests' journal_data.py in favor of the version in swh-model.

set the DeprecationWarning category in journal_data
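For context, a minimal sketch of what setting the warning category looks like with the standard `warnings` module. The function name and payload are hypothetical stand-ins, not the actual content of journal_data; the only point is that `warnings.warn` defaults to `UserWarning` unless a category is passed explicitly.

```python
import warnings


def load_legacy_test_data():
    """Hypothetical stand-in for a helper kept at its old import location."""
    warnings.warn(
        "this location is deprecated; import the data set from its new home",
        category=DeprecationWarning,  # without this, the category is UserWarning
        stacklevel=2,  # attribute the warning to the caller, not to this shim
    )
    return {"origin": []}  # placeholder payload, for the sketch only


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    data = load_legacy_test_data()
```

With an explicit category, downstream packages can filter or escalate the warning selectively instead of breaking on import.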

Jan 27 2021, 10:32 AM

douardda updated the diff for D4951: Remove tests' journal_data.py in favor of the version in swh-model.

remove stuff added mistakenly, and properly deprecate journal_data instead of breaking swh-storage

Jan 27 2021, 10:27 AM

douardda added inline comments to D4951: Remove tests' journal_data.py in favor of the version in swh-model.

Jan 27 2021, 10:00 AM

douardda added inline comments to D4951: Remove tests' journal_data.py in favor of the version in swh-model.

Jan 27 2021, 9:57 AM

douardda added inline comments to D4951: Remove tests' journal_data.py in favor of the version in swh-model.

Jan 27 2021, 9:54 AM

Jan 26 2021

douardda requested review of D4951: Remove tests' journal_data.py in favor of the version in swh-model.

Jan 26 2021, 5:32 PM

douardda closed D4950: Add swh-journal's model-related test data set in swh-model.

Jan 26 2021, 5:21 PM

douardda committed rDMODcad940dc8c07: Add swh-journal's model-related test data set in swh-model (authored by douardda).

Add swh-journal's model-related test data set in swh-model

Jan 26 2021, 5:21 PM

douardda added a revision to T2970: Make swh-journal tests not depend on swh-model any more: D4951: Remove tests' journal_data.py in favor of the version in swh-model.

Jan 26 2021, 5:09 PM · Journal

douardda requested review of D4950: Add swh-journal's model-related test data set in swh-model.

Jan 26 2021, 4:47 PM

douardda added a revision to T2970: Make swh-journal tests not depend on swh-model any more: D4950: Add swh-journal's model-related test data set in swh-model.

Jan 26 2021, 4:45 PM · Journal

douardda added a comment to T2970: Make swh-journal tests not depend on swh-model any more.

Back on this, the plan is now to make swh-journal not depend on the actual model definition, which is currently mostly due to the presence of journal_data.py in swh-journal. So the plan is to move this file into swh-model so it's kept up to date with swh-model, even if it's mostly used for testing other packages (like swh-journal).

Jan 26 2021, 4:41 PM · Journal

douardda added a comment to D4914: simulator: stop using the database as a cache for origin data.

And once again, this "cache" behavior makes the simulator unable to run "forever" (it will eat RAM). Maybe it's an assumed design choice, but please document it somewhere.

Jan 26 2021, 9:50 AM

douardda added a comment to D4914: simulator: stop using the database as a cache for origin data.

Something I don't understand: why do you need to keep both _visit_times and latest_snapshots in "caches" when a snapshot is derived from this visit time (and visit type and origin)?

Jan 26 2021, 9:48 AM

douardda added a comment to D4909: simulator: add lister simulation.

Isn't there some inherent limitation with this lister_process (gradually eating RAM) that should be documented (maybe)?

Jan 26 2021, 9:33 AM

douardda added a comment to D4909: simulator: add lister simulation.

Note that I still think there should be something in docs/simulator.rst also...

Jan 26 2021, 9:29 AM

douardda accepted D4909: simulator: add lister simulation.

In D4909#123949, @vlorentz wrote:

We're not claiming this is a realistic model. We only tried to do something that isn't completely naive, and exercises simple edge cases. Making it realistic is hard, and will probably be most of @olasd's work this week.

Jan 26 2021, 9:27 AM

Jan 25 2021

douardda added a comment to D4909: simulator: add lister simulation.

In D4909#123805, @vlorentz wrote:

Yes, but that's not inconsistent as we can discover origins that we didn't know about.

Jan 25 2021, 12:23 PM

douardda added a comment to D4909: simulator: add lister simulation.

I'm really not sure I understand what the simulated model looks like in the end. Do I get it right that, including this diff:

Jan 25 2021, 12:04 PM

Jan 22 2021

douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

rebased

Jan 22 2021, 4:20 PM

douardda updated the diff for D4921: Make plotting optional in simulator cli command.

s/-H/-P/

Jan 22 2021, 4:20 PM

douardda added inline comments to D4921: Make plotting optional in simulator cli command.

Jan 22 2021, 4:19 PM

douardda added a comment to D4927: lister.docs: add a lister template for the new API.

Thanks. However, I think that, given its purpose, this example code should be heavily commented: each constant (e.g. MyPageType) and each method should be commented (not docstrings, but comments explaining what the method/variable is used for).

Jan 22 2021, 4:15 PM · Sprint 2021 01, Lister

douardda accepted D4912: grab_next_visits: don't re-schedule visits too fast.

Not very fond of this "one week => dead" embedded in there, but meh.

Jan 22 2021, 3:37 PM

douardda accepted D4916: Run simulator tests on all known scheduling policies.

Jan 22 2021, 3:33 PM

douardda accepted D4915: simulator: record visit metrics alongside scheduler metrics.

Jan 22 2021, 3:33 PM

douardda accepted D4910: Construct grab_next_visits query arguments incrementally.

ok, but it would have been nice to have an explanation of why this is necessary in the commit message.

Jan 22 2021, 3:29 PM

douardda accepted D4911: Allow overriding the timestamp of grab_next_visits.

Jan 22 2021, 3:27 PM

douardda added a comment to D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

In D4920#123571, @vlorentz wrote:

In D4920#123533, @douardda wrote:

why not (cli option), but why (keep it deterministic)?

reproducibility, so we can run the simulator twice with different code, and be sure that differences in behavior are not caused by randomness

Jan 22 2021, 2:27 PM

douardda closed D4919: Add a --num-origins option to the fill-test-data cli command.

Jan 22 2021, 2:12 PM

douardda committed rDSCH86b255544c5d: Add a --num-origins option to the fill-test-data cli command (authored by douardda).

Add a --num-origins option to the fill-test-data cli command

Jan 22 2021, 2:12 PM

douardda updated the diff for D4919: Add a --num-origins option to the fill-test-data cli command.

rebased

Jan 22 2021, 2:12 PM

douardda closed D4922: Simulation: log at infol level recorded metrics.

Jan 22 2021, 2:10 PM

douardda committed rDSCHabb513ca7d09: Simulation: log at info level recorded metrics (authored by douardda).

Simulation: log at info level recorded metrics

Jan 22 2021, 2:10 PM

douardda updated the diff for D4922: Simulation: log at infol level recorded metrics.

rebased

Jan 22 2021, 2:10 PM

douardda added a comment to D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

In D4920#123548, @douardda wrote:

In D4920#123533, @douardda wrote:

In D4920#123465, @vlorentz wrote:

I'd like to keep the simulator deterministic. What about adding a CLI option with a seed?

why not (cli option), but why (keep it deterministic)?

Also, a given seed will not be enough here: there is also the maxts = int(utcnow().timestamp()) that will kill the deterministic property...
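To illustrate the point, a rough sketch (not the actual fill_test_data code): the helper name, its parameters and the one-year window are assumptions made up for the example. It shows that a seeded generator only gives reproducible timestamps if the upper bound is pinned as well, rather than read from the wall clock.

```python
import random
from datetime import datetime, timezone
from typing import List, Optional


def random_last_updates(
    count: int,
    seed: int,
    now: Optional[datetime] = None,
    max_age: int = 365 * 86400,
) -> List[datetime]:
    """Hypothetical helper: draw `count` pseudo-random timestamps before `now`."""
    rng = random.Random(seed)  # private generator, independent of global state
    if now is None:
        # Wall-clock upper bound: the output then changes from one run to the next,
        # even with the same seed.
        now = datetime.now(tz=timezone.utc)
    maxts = int(now.timestamp())
    return [
        datetime.fromtimestamp(rng.randint(maxts - max_age, maxts), tz=timezone.utc)
        for _ in range(count)
    ]


# Pinning both the seed and the reference time makes two runs identical.
reference = datetime(2021, 1, 22, tzinfo=timezone.utc)
first = random_last_updates(5, seed=42, now=reference)
second = random_last_updates(5, seed=42, now=reference)
```

So a CLI option for the seed would probably need a companion option (or a fixed default) for the reference time.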

Jan 22 2021, 12:28 PM

douardda added a comment to D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

In D4920#123533, @douardda wrote:

In D4920#123465, @vlorentz wrote:

I'd like to keep the simulator deterministic. What about adding a CLI option with a seed?

why not (cli option), but why (keep it deterministic)?

Jan 22 2021, 12:27 PM

douardda added a comment to D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

In D4920#123465, @vlorentz wrote:

I'd like to keep the simulator deterministic. What about adding a CLI option with a seed?

Jan 22 2021, 12:25 PM

douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

rebased

Jan 22 2021, 12:23 PM

douardda updated the summary of D4923: Simulation: allow to export results in a csv file.

Jan 22 2021, 12:23 PM

douardda updated the summary of D4923: Simulation: allow to export results in a csv file.

Jan 22 2021, 12:22 PM

douardda retitled D4921: Make plotting optional in simulator cli command from Make plotting histograms optional in simulator cli command to Make plotting optional in simulator cli command.

Jan 22 2021, 12:21 PM

douardda updated the diff for D4921: Make plotting optional in simulator cli command.

rebase on D4916

Jan 22 2021, 12:21 PM

douardda requested review of D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

Jan 22 2021, 11:40 AM

douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

rebased

Jan 22 2021, 11:36 AM

douardda updated the diff for D4921: Make plotting optional in simulator cli command.

kill unneeded dependency on D4920

Jan 22 2021, 11:27 AM

douardda updated the summary of D4921: Make plotting optional in simulator cli command.

Jan 22 2021, 11:26 AM

douardda updated the diff for D4922: Simulation: log at infol level recorded metrics.

with the commit...

Jan 22 2021, 11:22 AM

douardda updated the diff for D4922: Simulation: log at infol level recorded metrics.

typo + vlorentz' comment

Jan 22 2021, 11:21 AM

douardda updated the summary of D4922: Simulation: log at infol level recorded metrics.

Jan 22 2021, 11:19 AM

douardda accepted D4877: npm: Reimplement lister using new Lister API.

Jan 22 2021, 11:08 AM

douardda added inline comments to D4909: simulator: add lister simulation.

Jan 22 2021, 11:06 AM

douardda added inline comments to D4909: simulator: add lister simulation.

Jan 22 2021, 11:05 AM

douardda requested review of D4923: Simulation: allow to export results in a csv file.

Jan 22 2021, 11:01 AM

douardda requested review of D4922: Simulation: log at infol level recorded metrics.

Jan 22 2021, 10:59 AM

douardda accepted D4899: Add scheduling policy for already visited origins with known last update.

lgtm

Jan 22 2021, 10:58 AM

douardda requested review of D4921: Make plotting optional in simulator cli command.

Jan 22 2021, 10:57 AM

douardda requested review of D4919: Add a --num-origins option to the fill-test-data cli command.

Jan 22 2021, 10:52 AM

Jan 21 2021

douardda added inline comments to D4895: Add a successive_visits counter to OriginVisitStats.

Jan 21 2021, 9:57 AM

douardda closed D4894: Simplify journal client tests.

Jan 21 2021, 9:55 AM

douardda committed rDSCHffe2aed2fa32: Simplify journal client tests (authored by douardda).

Simplify journal client tests

Jan 21 2021, 9:55 AM

Jan 20 2021

douardda updated the diff for D4895: Add a successive_visits counter to OriginVisitStats.

rebased

Jan 20 2021, 6:04 PM

douardda updated the diff for D4894: Simplify journal client tests.

rebased

Jan 20 2021, 6:03 PM

douardda added a reverting change for rDSCHb03d978241a6: Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py: rDSCHc7b740cafa64: Revert "Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py".

Jan 20 2021, 6:02 PM

douardda committed rDSCHc7b740cafa64: Revert "Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py" (authored by douardda).

Revert "Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py"

Jan 20 2021, 6:02 PM

douardda closed D4893: Make the max_date() helper function accept *dates as argument.

Jan 20 2021, 5:29 PM

douardda committed rDSCHc386fdf3b9fc: Make the max_date() helper function accept *dates as argument (authored by douardda).

Make the max_date() helper function accept *dates as argument
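As an illustration only, a variadic helper of this kind could look like the sketch below. The body is an assumption (in particular, ignoring `None` values and raising on empty input are guesses at reasonable behavior), not the code from the commit.

```python
from datetime import datetime, timezone
from typing import Optional


def max_date(*dates: Optional[datetime]) -> datetime:
    """Return the latest of the given dates, ignoring None values (assumed behavior)."""
    known = [d for d in dates if d is not None]
    if not known:
        raise ValueError("at least one date must be a datetime")
    return max(known)


older = datetime(2021, 1, 19, tzinfo=timezone.utc)
newer = datetime(2021, 1, 20, tzinfo=timezone.utc)
latest = max_date(older, None, newer)
```

Taking `*dates` lets callers compare any number of candidates in one call instead of nesting two-argument calls.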

Jan 20 2021, 5:29 PM

douardda added inline comments to D4877: npm: Reimplement lister using new Lister API.

Jan 20 2021, 3:37 PM

douardda requested review of D4895: Add a successive_visits counter to OriginVisitStats.

Jan 20 2021, 12:49 PM

douardda added a comment to D4891: model: Allow new status values not_found and failed to OriginVisitStatus.

In D4891#122658, @ardumont wrote:

This makes me wonder if we shouldn't add an explicit failed status too, while we're at it, for explicit failures that couldn't generate a partial snapshot.

I'd be fine with that indeed.

I think we entertain the idea with @douardda and @vsellier

Jan 20 2021, 12:48 PM

douardda requested review of D4894: Simplify journal client tests.

Jan 20 2021, 12:46 PM

douardda accepted D4889: Add a cli for the scheduler metrics update endpoint.

lgtm

Jan 20 2021, 12:45 PM

douardda requested review of D4893: Make the max_date() helper function accept *dates as argument.

Jan 20 2021, 12:44 PM

douardda committed rDSCHb03d978241a6: Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py (authored by douardda).

Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py

Jan 20 2021, 12:20 PM

douardda accepted D4880: Implement some basic aggregated metrics on listed origins.

Looks ok to me. I'd like however to have a description of implemented metrics in the commit message (and in the documentation, but this may come later)

Jan 20 2021, 10:31 AM

douardda closed D4881: Move the `last_scheduled` ts from ListedOrigin to OriginVisitStatus.

Jan 20 2021, 10:05 AM

douardda committed rDSCHf8627a96fed6: Move the `last_scheduled` ts from ListedOrigin to OriginVisitStatus (authored by douardda).

Move the `last_scheduled` ts from ListedOrigin to OriginVisitStatus

Jan 20 2021, 10:05 AM