fine for me
Sep 24 2021
Sep 23 2021
Sep 22 2021
In D6316#163800, @douardda wrote: In D6316#163794, @douardda wrote: You may use fcntl.flock for this
I mean using an empty (lock) file in the opam_root directory.
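A minimal sketch of that suggestion, assuming a POSIX system: an exclusive `fcntl.flock` is held on an empty file inside the opam root while opam commands run. The helper name `opam_root_lock` and the lock file name `.lock-swh` are hypothetical and not taken from the diff.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def opam_root_lock(opam_root, lock_name=".lock-swh"):
    """Hold an exclusive advisory lock on an empty file inside opam_root.

    Both the function name and the lock file name are illustrative assumptions.
    """
    os.makedirs(opam_root, exist_ok=True)
    lock_path = os.path.join(opam_root, lock_name)
    # Append mode creates the file if needed and never writes to it.
    with open(lock_path, "a") as lock_file:
        # Blocks until any other holder releases the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield lock_path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Any code that runs opam commands against the shared root would then be wrapped in `with opam_root_lock(opam_root): ...`. The lock is advisory, so it only protects against processes that take the same lock.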
In D6316#163794, @douardda wrote: You may use fcntl.flock for this
In D6316#163766, @ardumont wrote: Following what i said in the loader diff, i'm actually closing this.
Ack on the lock folder but i won't attend to it immediately.
[1] D6318
As i was wrong in my implementation of the loader, as @aleo made me realize, i've fixed it.
So now this lister diff becomes relevant again, so i claimed it back.
I think there was already a problem before, but since we now have more chances to hit it, I'd really like the opam_init process to lock the directory when running opam commands.
It's a great idea but i've no idea how to actually do that though.
Maybe adding --safe flag [1] during the command that actually list the packages would be enough instead.
I've actually added that for the loader [2] (for the command that also reads information).
[1] --safe, --readonly Make sure nothing will be automatically updated or rewritten. Useful for calling from completion scripts, for example. Will fail whenever such an operation is needed ; also avoids waiting for locks, skips interactive questions and overrides the $OPAMDEBUG variable. This is equivalent to set environment variable $OPAMSAFE.
[2] D6318
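As a rough illustration of the `--safe` idea, the listing command could be built as below. This is a sketch only: the helper name `opam_list_argv` is hypothetical, and the exact opam subcommand and options used by the lister are assumptions, not taken from the diff.

```python
import subprocess


def opam_list_argv(opam_root, safe=True):
    """Build the argv for a read-only package listing (hypothetical helper)."""
    argv = ["opam", "list", "--root", opam_root]
    if safe:
        # Per the opam help text quoted above: nothing is updated or
        # rewritten, and opam does not wait for locks.
        argv.append("--safe")
    return argv


def list_packages(opam_root):
    """Run the listing and return opam's raw stdout (requires opam installed)."""
    result = subprocess.run(
        opam_list_argv(opam_root), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Since `--safe` makes opam fail rather than wait when a lock is held, callers would still need to handle a non-zero exit status.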
LGTM (not checked everything is accurate nor there are obvious missing services, but it's a huge improvement as is, thx)
In T1805#45984, @vlorentz wrote: Items 5, 6, 7 aka pagination, auth and batches - I believe these come naturally with item 4 (specification wise)
They don't. OpenAPI is a specification to describe APIs, and it contains absolutely nothing about pagination or batches.
I think there was already a problem before, but since we now have more chances to hit it, I'd really like the opam_init process to lock the directory when running opam commands.
It would be nice to have a README file in swh/lister/maven/tests/data explaining what the data files are, where they come from, how they have been generated, etc.
In D6165#163629, @vlorentz wrote: What is the reason for this change? Is it more efficient to assign requests to workers based on ID rather than randomly?
Sep 21 2021
some more :-)
Sep 20 2021
LGTM, but how is the new opam_root option expected to be set (in production I mean)?
I'm not done yet but here is a first review on my side.
not useful as a dedicated task, see T1805 for the main discussion on this subject
I don't understand what exactly is (not) tested here. What does "anomad-d" stand for BTW?
Sep 16 2021
fix indentation (tab->ws) and a few typos
Sep 14 2021
closed by 94be817f869409c64415b181824071d2998e33d5
closed by a3c1f39013bae1a6982140d51d8bb443dc1b5c9c
Keep port 5092 exposed on host
Sep 13 2021
Add a bit of documentation in the README file on how to consume kafka from the host
It's not worth the trouble, and there is a better solution (server-side)
In D6234#161606, @vlorentz wrote: You could also add a command in swh-dataset's entrypoint.sh that calls whatever Kafka's script does
In D6234#161506, @vlorentz wrote: In D6234#161491, @douardda wrote: So either I kill this diff or it stays "intricate" with the setup of the consumer (so the whole journalprocessor.py)
Note: this feature is mainly useful for testing purposes IMHO, so I suppose it's not that critical to keep it, I just find it handy when "playing" with swh dataset export
Meh. How much easier does it make testing, compared to using Kafka's CLI (from the linked comment)?
rebase
in favor of D6247 because phab/arcanist won't let me update this later any more (sorry)
attempt to trick phab/arcanist
rebase
Rebase (remove D6234 from dependencies)
In D6234#161331, @douardda wrote: In D6234#161233, @vlorentz wrote: Can we keep the reset stuff outside the journalprocessor.py logic? It's already complex enough
I'll give it a try
Sep 10 2021
rebase, fix typos, squash revisions
rebase and fix --reset help message
rebase
Add an explicit "skipped" message if nothing is to be consumed for a topic
In D6234#161233, @vlorentz wrote: Can we keep the reset stuff outside the journalprocessor.py logic? It's already complex enough
In D6235#161311, @vlorentz wrote: lags reported by cmak was completely inconsistent
only because you have a small dataset, right?
With a larger one, the last batch of each partition should have a negligible size.
In D6235#161236, @vlorentz wrote: There's a bunch of typos in your commit/diff msg: "wich", "oef", "ony", "ALL offsets that needs to be", "stash" -> "squash"
this is necessary to ensure these messages are committed in kafka,
otherwise, since the (considered) empty partition is unsubscribed from,
it never gets committed in JournalClient.handle_messages() (since the
latter only commits assigned partitions).
Why is this a problem?
Sep 9 2021
add forgotten revision: Reduce the size of the progress bar
Please use imperative style in the git commit message
https://chris.beams.io/posts/git-commit/
Sep 3 2021
Sep 1 2021
do we need the "list of forks" if we keep the "fork of what"? I mean these are the 2 ends of the fork relation, right?
Aug 30 2021
yes the idea is to have a beefy enough machine to perform full-size experiments on, that can then be (part of) the production infrastructure dedicated to the provenance index.
Aug 13 2021
In D6084#157322, @aeviso wrote:For the fix of revision_get, there should be a test.
The test is coming later from @jayeshv mongodb branch.
Please don't mix fixes with codestyling/renaming revisions in a single diff, it makes the review much harder.
And we could also use zfs-backed thin provisioning for the / of workers to save storage space (and possibly help to ensure consistency of deployed workers... not extra convinced of this latter point)
In T3444#68653, @vlorentz wrote: but that requires some more storage on hypervisors we currently don't have
Don't the hypervisors also serve as OSDs? We could just get a disk per hypervisor (partially?) out of the ceph cluster and use it for the workers' /tmp, or even their whole disk.
but anyway, it looks fine to me
one other improvement may be to modify a bit the profile of the workers (to reduce the load on the ceph cluster):
- lower the replication factor for workers' volumes (or even use local storage, but that requires some more storage on hypervisors we currently don't have),
- (probably not very relevant but) stop having swap on workers (since this swap ends up being on the ceph volume, so replicated etc.) (oh this has been done already, good)
Aug 12 2021
In D6071#157080, @aeviso wrote:
- the use of the newly introduced as_dict() methods seems unrelated here; unless I'm mistaken, the purpose of this change is better assertion reports by pytest on failure; if so, it should be presented as such in a dedicated revision
This method is only used for test purposes but it doesn't make sense without the refactoring (the complete HistoryGraph class was not even present prior to the refactoring),