Feb 1 2021

douardda added a comment to T2799: Add support for SWHID as source of repository for jupyterhub.

Note that the PR for repo2docker has been merged, and another PR for binderhub is currently in progress.

Feb 1 2021, 4:20 PM
douardda closed D4923: Simulation: allow to export results in a csv file.
Feb 1 2021, 3:57 PM
douardda committed rDSCHaaffff2631a7: Simulator: allow to export results in a csv file (authored by douardda).
Simulator: allow to export results in a csv file
Feb 1 2021, 3:57 PM
douardda closed D4984: Add minimal tests for the SimulationReport.format() method.
Feb 1 2021, 3:56 PM
douardda committed rDSCH9fce3f6f2c73: Add minimal tests for the SimulationReport.format() method (authored by douardda).
Add minimal tests for the SimulationReport.format() method
Feb 1 2021, 3:56 PM
douardda requested review of D4984: Add minimal tests for the SimulationReport.format() method.
Feb 1 2021, 3:40 PM
douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

add minimal test

Feb 1 2021, 3:39 PM
douardda closed D4921: Make plotting optional in simulator cli command.
Feb 1 2021, 3:11 PM
douardda committed rDSCHaaf7dd6f1d82: Make plottings optional in simulator cli output (authored by douardda).
Make plottings optional in simulator cli output
Feb 1 2021, 3:11 PM

Jan 29 2021

douardda added a comment to D4923: Simulation: allow to export results in a csv file.

(requesting changes to get it out of my review queue)

That's not a valid reason! A valid reason is "I agree with olasd's comments, fix them (plz)"...

Jan 29 2021, 5:45 PM
douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

typo

Jan 29 2021, 5:37 PM
douardda added inline comments to D4923: Simulation: allow to export results in a csv file.
Jan 29 2021, 5:34 PM
douardda added a comment to D4923: Simulation: allow to export results in a csv file.

(requesting changes to get it out of my review queue)

Jan 29 2021, 5:29 PM
douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

rebase

Jan 29 2021, 5:23 PM
douardda updated the diff for D4921: Make plotting optional in simulator cli command.

rebase

Jan 29 2021, 5:23 PM

Jan 27 2021

douardda requested changes to D4931: Add mapping of definitions and harvests.

few requests, please:

  • add tests with this commit; every introduced function should have at least one test,
  • add docstrings to your new functions,
  • improve the commit message (see https://chris.beams.io/posts/git-commit/ ); with the current one, I have no idea what exactly is done in this commit and, more importantly, why it is needed.
Jan 27 2021, 3:54 PM
douardda accepted D4914: simulator: stop using the database as a cache for origin data.
Jan 27 2021, 3:48 PM
douardda closed T2970: Make swh-journal tests not depend on swh-model any more as Resolved.

Let's consider it as done.

Jan 27 2021, 3:47 PM · Journal
douardda closed D4951: Remove tests' journal_data.py in favor of the version in swh-model.
Jan 27 2021, 3:38 PM
douardda committed rDJNL9703864ef366: Remove tests' journal_data.py in favor of the version in swh-model (authored by douardda).
Remove tests' journal_data.py in favor of the version in swh-model
Jan 27 2021, 3:38 PM
douardda updated the diff for D4951: Remove tests' journal_data.py in favor of the version in swh-model.

set the DeprecationWarning category in journal_data
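
The deprecation approach mentioned here can be sketched roughly as follows. This is only an illustration, not the actual swh-journal code: the function name `load_legacy_journal_data` and the returned placeholder data are hypothetical. The point is that the old entry point keeps working but emits a warning with an explicit `DeprecationWarning` category.

```python
import warnings


def load_legacy_journal_data():
    """Hypothetical stand-in for the deprecated journal_data entry point."""
    warnings.warn(
        "journal_data is deprecated; use the version shipped in swh-model",
        category=DeprecationWarning,  # explicit category, as in the update above
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )
    # Keep returning usable data so that existing callers do not break.
    return {"origin": [], "snapshot": []}


if __name__ == "__main__":
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        load_legacy_journal_data()
    print(caught[0].category.__name__)
```

Without an explicit category, `warnings.warn` defaults to `UserWarning`, which test runners and filters treat differently from deprecations.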

Jan 27 2021, 10:32 AM
douardda updated the diff for D4951: Remove tests' journal_data.py in favor of the version in swh-model.

remove stuff added mistakenly, and properly deprecate journal_data instead of breaking swh-storage

Jan 27 2021, 10:27 AM
douardda added inline comments to D4951: Remove tests' journal_data.py in favor of the version in swh-model.
Jan 27 2021, 10:00 AM
douardda added inline comments to D4951: Remove tests' journal_data.py in favor of the version in swh-model.
Jan 27 2021, 9:57 AM
douardda added inline comments to D4951: Remove tests' journal_data.py in favor of the version in swh-model.
Jan 27 2021, 9:54 AM

Jan 26 2021

douardda requested review of D4951: Remove tests' journal_data.py in favor of the version in swh-model.
Jan 26 2021, 5:32 PM
douardda closed D4950: Add swh-journal's model-related test data set in swh-model.
Jan 26 2021, 5:21 PM
douardda committed rDMODcad940dc8c07: Add swh-journal's model-related test data set in swh-model (authored by douardda).
Add swh-journal's model-related test data set in swh-model
Jan 26 2021, 5:21 PM
douardda added a revision to T2970: Make swh-journal tests not depend on swh-model any more: D4951: Remove tests' journal_data.py in favor of the version in swh-model.
Jan 26 2021, 5:09 PM · Journal
douardda requested review of D4950: Add swh-journal's model-related test data set in swh-model.
Jan 26 2021, 4:47 PM
douardda added a revision to T2970: Make swh-journal tests not depend on swh-model any more: D4950: Add swh-journal's model-related test data set in swh-model.
Jan 26 2021, 4:45 PM · Journal
douardda added a comment to T2970: Make swh-journal tests not depend on swh-model any more.

Back on this, the plan is now to make swh-journal not depend on the actual model definition, which is currently mostly due to the presence of the journal_data.py in swh-journal. So the plan is to move this file in swh-model so it's kept up to date with swh-model, even if it's mostly used for testing other packages (like swh-journal).

Jan 26 2021, 4:41 PM · Journal
douardda added a comment to D4914: simulator: stop using the database as a cache for origin data.

And once again, this "cache" behavior makes the simulator unable to run "forever" (it will eat RAM). Maybe it's an assumed design choice, but please document it somewhere.
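
For illustration only, one way to keep such an in-memory cache from growing without bound is to cap its size and evict the least recently used entries. The `BoundedCache` class below is a hypothetical sketch, not part of the simulator; whether eviction is acceptable depends on whether the simulation can tolerate losing state for evicted origins.

```python
from collections import OrderedDict


class BoundedCache:
    """Tiny LRU mapping: evicts the least recently used entry once full."""

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self.max_size = max_size
        self._data = OrderedDict()

    def __setitem__(self, key, value):
        if key in self._data:
            # Refresh the position of an existing key before overwriting it.
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.max_size:
            # Drop the oldest (least recently used) entry.
            self._data.popitem(last=False)

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # Reading a key marks it as recently used.
        self._data.move_to_end(key)
        return self._data[key]

    def __len__(self):
        return len(self._data)
```

With such a cap, memory use stays proportional to `max_size` rather than to the number of origins seen during the run.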

Jan 26 2021, 9:50 AM
douardda added a comment to D4914: simulator: stop using the database as a cache for origin data.

Something I don't understand: why do you need to keep both _visit_times and latest_snapshots in "caches" when a snapshot is derived from this visit time (and visit type and origin)?

Jan 26 2021, 9:48 AM
douardda added a comment to D4909: simulator: add lister simulation.

Isn't there some inherent limitation with this lister_process (gradually eating RAM) that should be documented (maybe)?

Jan 26 2021, 9:33 AM
douardda added a comment to D4909: simulator: add lister simulation.

Note that I still think there should be something in docs/simulator.rst also...

Jan 26 2021, 9:29 AM
douardda accepted D4909: simulator: add lister simulation.

We're not claiming this is a realistic model. We only tried to do something that isn't completely naive, and exercises simple edge cases. Making it realistic is hard, and will probably be most of @olasd's work this week.

Jan 26 2021, 9:27 AM

Jan 25 2021

douardda added a comment to D4909: simulator: add lister simulation.

Yes, but that's not inconsistent as we can discover origins that we didn't know about.

Jan 25 2021, 12:23 PM
douardda added a comment to D4909: simulator: add lister simulation.

I'm really not sure I understand what the simulated model looks like in the end. Do I get it right that, including this diff:

Jan 25 2021, 12:04 PM

Jan 22 2021

douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

rebased

Jan 22 2021, 4:20 PM
douardda updated the diff for D4921: Make plotting optional in simulator cli command.

s/-H/-P/

Jan 22 2021, 4:20 PM
douardda added inline comments to D4921: Make plotting optional in simulator cli command.
Jan 22 2021, 4:19 PM
douardda added a comment to D4927: lister.docs: add a lister template for the new API.

Thanks. I think, however, that given its purpose, this example code should be heavily commented: each constant (e.g. MyPageType) and each method should be commented (not docstrings, but comments explaining what the method/variable is used for).

Jan 22 2021, 4:15 PM · Sprint 2021 01, Lister
douardda accepted D4912: grab_next_visits: don't re-schedule visits too fast.

Not very fond of this "one week => dead" embedded in there, but meh.

Jan 22 2021, 3:37 PM
douardda accepted D4916: Run simulator tests on all known scheduling policies.
Jan 22 2021, 3:33 PM
douardda accepted D4915: simulator: record visit metrics alongside scheduler metrics.
Jan 22 2021, 3:33 PM
douardda accepted D4910: Construct grab_next_visits query arguments incrementally.

ok, but it would have been nice to have an explanation of why this is necessary in the commit message.

Jan 22 2021, 3:29 PM
douardda accepted D4911: Allow overriding the timestamp of grab_next_visits.
Jan 22 2021, 3:27 PM
douardda added a comment to D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

why not (cli option), but why (keep it deterministic)?

  1. reproducibility, so we can run the simulator twice with different code, and be sure that differences in behavior are not caused by randomness
Jan 22 2021, 2:27 PM
douardda closed D4919: Add a --num-origins option to the fill-test-data cli command.
Jan 22 2021, 2:12 PM
douardda committed rDSCH86b255544c5d: Add a --num-origins option to the fill-test-data cli command (authored by douardda).
Add a --num-origins option to the fill-test-data cli command
Jan 22 2021, 2:12 PM
douardda updated the diff for D4919: Add a --num-origins option to the fill-test-data cli command.

rebased

Jan 22 2021, 2:12 PM
douardda closed D4922: Simulation: log at info level recorded metrics.
Jan 22 2021, 2:10 PM
douardda committed rDSCHabb513ca7d09: Simulation: log at info level recorded metrics (authored by douardda).
Simulation: log at info level recorded metrics
Jan 22 2021, 2:10 PM
douardda updated the diff for D4922: Simulation: log at info level recorded metrics.

rebased

Jan 22 2021, 2:10 PM
douardda added a comment to D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

I'd like to keep the simulator deterministic. What about adding a CLI option with a seed?

why not (cli option), but why (keep it deterministic)?

Also, a given seed will not be enough here: there is also the maxts = int(utcnow().timestamp()) that will kill the deterministic property...
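
The point about the seed and the wall-clock timestamp can be sketched as follows. This is a hypothetical helper, not the actual `fill_test_data` code: the function name `random_last_updates`, the one-year window, and the parameters are assumptions. To be reproducible, both the random seed and the reference time have to be inputs.

```python
import random
from datetime import datetime, timedelta, timezone


def random_last_updates(num_origins, seed, now):
    """Return `num_origins` pseudo-random timestamps no later than `now`."""
    rng = random.Random(seed)  # private RNG, independent of global state
    # The upper bound comes from the caller, not from the current time.
    max_ts = int(now.timestamp())
    min_ts = max_ts - int(timedelta(days=365).total_seconds())
    return [
        datetime.fromtimestamp(rng.randint(min_ts, max_ts), tz=timezone.utc)
        for _ in range(num_origins)
    ]
```

Calling it twice with the same `seed` and the same `now` yields identical lists, whereas deriving the upper bound from the current time inside the function would make the output drift between runs even with a fixed seed.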

Jan 22 2021, 12:28 PM
douardda added a comment to D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

I'd like to keep the simulator deterministic. What about adding a CLI option with a seed?

why not (cli option), but why (keep it deterministic)?

Jan 22 2021, 12:27 PM
douardda added a comment to D4920: Randomize last_update in generated ListedOrigins in fill_test_data.

I'd like to keep the simulator deterministic. What about adding a CLI option with a seed?

Jan 22 2021, 12:25 PM
douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

rebased

Jan 22 2021, 12:23 PM
douardda updated the summary of D4923: Simulation: allow to export results in a csv file.
Jan 22 2021, 12:23 PM
douardda updated the summary of D4923: Simulation: allow to export results in a csv file.
Jan 22 2021, 12:22 PM
douardda retitled D4921: Make plotting optional in simulator cli command from Make plotting histograms optional in simulator cli command to Make plotting optional in simulator cli command.
Jan 22 2021, 12:21 PM
douardda updated the diff for D4921: Make plotting optional in simulator cli command.

rebase on D4916

Jan 22 2021, 12:21 PM
douardda requested review of D4920: Randomize last_update in generated ListedOrigins in fill_test_data.
Jan 22 2021, 11:40 AM
douardda updated the diff for D4923: Simulation: allow to export results in a csv file.

rebased

Jan 22 2021, 11:36 AM
douardda updated the diff for D4921: Make plotting optional in simulator cli command.

kill unneeded dependency on D4920

Jan 22 2021, 11:27 AM
douardda updated the summary of D4921: Make plotting optional in simulator cli command.
Jan 22 2021, 11:26 AM
douardda updated the diff for D4922: Simulation: log at info level recorded metrics.

with the commit...

Jan 22 2021, 11:22 AM
douardda updated the diff for D4922: Simulation: log at info level recorded metrics.

typo + vlorentz' comment

Jan 22 2021, 11:21 AM
douardda updated the summary of D4922: Simulation: log at info level recorded metrics.
Jan 22 2021, 11:19 AM
douardda accepted D4877: npm: Reimplement lister using new Lister API.
Jan 22 2021, 11:08 AM
douardda added inline comments to D4909: simulator: add lister simulation.
Jan 22 2021, 11:06 AM
douardda added inline comments to D4909: simulator: add lister simulation.
Jan 22 2021, 11:05 AM
douardda requested review of D4923: Simulation: allow to export results in a csv file.
Jan 22 2021, 11:01 AM
douardda requested review of D4922: Simulation: log at info level recorded metrics.
Jan 22 2021, 10:59 AM
douardda accepted D4899: Add scheduling policy for already visited origins with known last update.

lgtm

Jan 22 2021, 10:58 AM
douardda requested review of D4921: Make plotting optional in simulator cli command.
Jan 22 2021, 10:57 AM
douardda requested review of D4919: Add a --num-origins option to the fill-test-data cli command.
Jan 22 2021, 10:52 AM

Jan 21 2021

douardda added inline comments to D4895: Add a successive_visits counter to OriginVisitStats.
Jan 21 2021, 9:57 AM
douardda closed D4894: Simplify journal client tests.
Jan 21 2021, 9:55 AM
douardda committed rDSCHffe2aed2fa32: Simplify journal client tests (authored by douardda).
Simplify journal client tests
Jan 21 2021, 9:55 AM

Jan 20 2021

douardda updated the diff for D4895: Add a successive_visits counter to OriginVisitStats.

rebased

Jan 20 2021, 6:04 PM
douardda updated the diff for D4894: Simplify journal client tests.

rebased

Jan 20 2021, 6:03 PM
douardda added a reverting change for rDSCHb03d978241a6: Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py: rDSCHc7b740cafa64: Revert "Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py".
Jan 20 2021, 6:02 PM
douardda committed rDSCHc7b740cafa64: Revert "Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py" (authored by douardda).
Revert "Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py"
Jan 20 2021, 6:02 PM
douardda closed D4893: Make the max_date() helper function accept *dates as argument.
Jan 20 2021, 5:29 PM
douardda committed rDSCHc386fdf3b9fc: Make the max_date() helper function accept *dates as argument (authored by douardda).
Make the max_date() helper function accept *dates as argument
Jan 20 2021, 5:29 PM
douardda added inline comments to D4877: npm: Reimplement lister using new Lister API.
Jan 20 2021, 3:37 PM
douardda requested review of D4895: Add a successive_visits counter to OriginVisitStats.
Jan 20 2021, 12:49 PM
douardda added a comment to D4891: model: Allow new status values not_found and failed to OriginVisitStatus.

This makes me wonder if we shouldn't add an explicit failed status too, while we're at it, for explicit failures that couldn't generate a partial snapshot.

I'd be fine with that indeed.

I think we entertain the idea with @douardda and @vsellier

Jan 20 2021, 12:48 PM
douardda requested review of D4894: Simplify journal client tests.
Jan 20 2021, 12:46 PM
douardda accepted D4889: Add a cli for the scheduler metrics update endpoint.

lgtm

Jan 20 2021, 12:45 PM
douardda requested review of D4893: Make the max_date() helper function accept *dates as argument.
Jan 20 2021, 12:44 PM
douardda committed rDSCHb03d978241a6: Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py (authored by douardda).
Make sure swh.scheduler.cli.journal is loaded in test_cli_journal.py
Jan 20 2021, 12:20 PM
douardda accepted D4880: Implement some basic aggregated metrics on listed origins.

Looks ok to me. I'd like however to have a description of implemented metrics in the commit message (and in the documentation, but this may come later)

Jan 20 2021, 10:31 AM
douardda closed D4881: Move the `last_scheduled` ts from ListedOrigin to OriginVisitStatus.
Jan 20 2021, 10:05 AM
douardda committed rDSCHf8627a96fed6: Move the `last_scheduled` ts from ListedOrigin to OriginVisitStatus (authored by douardda).
Move the `last_scheduled` ts from ListedOrigin to OriginVisitStatus
Jan 20 2021, 10:05 AM

Jan 19 2021

douardda updated the diff for D4881: Move the `last_scheduled` ts from ListedOrigin to OriginVisitStatus.

rebased

Jan 19 2021, 5:49 PM
douardda closed D4885: Make the journal-client cli subcommand automagically loaded.
Jan 19 2021, 5:48 PM
douardda committed rDSCH0a32a31195f1: Make the journal-client cli subcommand automagically loaded (authored by douardda).
Make the journal-client cli subcommand automagically loaded
Jan 19 2021, 5:48 PM