- User Since
- Jul 10 2018, 12:38 PM (120 w, 2 d)
So the test fails on jenkins because they use the hg command from the system (since mercurial is oddly enough, not a dependency of swh-loader-mercurial) and on stretch, mercurial is 4.8
Using mercurial 5.5 is ok.
Tue, Oct 27
Define and use the SWH acronym
fixes and improvements suggested by maoranegg (big thx)
typos (thx ardumont)
I would expect the commit message to be a bot more explanatory: either this new test case adds some tested aspects that were not tested before, and it should mention it, or it does not, and it should also mention it explaining this new test is the base for futures extended ones in a more manageable way (what's the "updatable" stands for, if I get this right).
Mon, Oct 26
should be ok now (even if via ImmutableDict :-) )
See also T2706
Fri, Oct 23
closed by 2b869aa7d30d099ed6146d9f8dc667cd7a8eefc3
Also the commit message should give a bit more information on what this new script is needed for, maybe with a usage example.
This defines a bunch of commands. When and how should "I" use them?
please do not put the "depends on Dxxx" line in the git commit message.
ok on the diff itself, but why is this new example repo needed for? This should be explained in the commit message. (the "why"! always insist on the "why" rather than (or in addition to) the "what" in your commit messages, please.
ok but please properly document arguments in docstrings.
Thu, Oct 22
globally ok, but please add a README file as suggested in the previous comment
Would be nice to have a README file in tests/data explaining what these json files are and how to produce them.
Wouldn't it make a bit easier to name the generic version of the journal writer something like GenericKafkaJournalWriter and have KafkaJournalWriter = GenericKafkaJournalWriter[BaseModel] ? (for bw compat)
This globally LGTM but there is this path encoding issue. The 2 new functions in from_disk.py should take a bytes argument instead of a str one.
Wed, Oct 21
Mon, Oct 19
Fri, Oct 16
Same as before but with 1M (fresh) sha1s:
Since the results on uffizi above did suffer from a few caveats, I've made a few more tests:
- a first result has been obtained with a dataset that had only objects stored on the XFS part of the objstorage
- a second dataset has been created (with the order by sha256 part to spread the sha1s)
- but results are a mix hot/cold cache tests
Thu, Oct 15
Current benchmarck scenario:
Wed, Oct 14
Tue, Oct 13
I'm mostly OK with this now, so I'll make it "accepted", but please refactor a bit the cli_run_[n]ok() helper functions before landing it.
Fri, Oct 9
oh so much yes!
Thu, Oct 8
Overall ok, but I would have preferred the renaming be in a dedicated revision, separated from type annotation fixes/additions.
otherwise fine with me
does this requires the plugin's entrypoint in swh.core be removed ? (eg. because of swh.core.pytest_plugin being loaded twice or something like that) or is it safe to apply and use with a swh.core that still declates its pytest_plugin an entrypoint?
Since this "migration problem" also concerns cassandra, maybe an simple approach would be to add a Final version attribute to all model entities (a simple monotonic integer).
The split in 2 revisions is not mandatory, just sayin' for good measure.
looks good (did not even notice toDict() is not even a recursive method! so this dict_nodes really makes no sense at all).
maybe stupid question, but why using dict as unique key (in many model classes)? Why not use a tuple? I mean it seems to me that such a UID should be usable as dict keys or in a set directly.
dates are not unique (ie. multiple visits can share a date, and they
do in practice); and visit statuses already use visit ids in their
can you please remove the "noise" added by arc in the commit message? An update it (still the previous option name in there).
Wed, Oct 7
Fri, Oct 2
Maybe starting a pad/hackmd document would be easier at this point?
this is debatable, but it does "normalize" the given url, so it does something. I agress the https:// auto-add prefix is strange, but the trailing / still brings value. For example there are listers that do not implement this, so if you create a listing task with url=https://somehere.org/api/v1 it will fail because it will forge invalid urls (missing the trailing /).
Thu, Oct 1
Wed, Sep 30
Listed (oneshot full + recurring incremental) and loaded (as far as I can tell).
Sep 29 2020
(just updated my swh-env, now I see where this diff comes from :-) )