Sep 14 2021
closed by 94be817f869409c64415b181824071d2998e33d5
closed by a3c1f39013bae1a6982140d51d8bb443dc1b5c9c
Keep port 5092 exposed on host
Sep 13 2021
Add a bit of documentation in the README file on how to consume kafka from the host
It's not worth the trouble, and there is a better solution (server-side)
In D6234#161606, @vlorentz wrote: You could also add a command in swh-dataset's entrypoint.sh that calls whatever Kafka's script does
In D6234#161506, @vlorentz wrote: In D6234#161491, @douardda wrote: So either I kill this diff or it stays "intricate" with the setup of the consumer (so the whole journalprocessor.py)
Note: this feature is mainly useful for testing purposes IMHO, so I suppose it's not that critical to keep it, I just find it handy when "playing" with swh dataset export
Meh. How much easier does it make testing, compared to using Kafka's CLI (from the linked comment)?
rebase
in favor of D6247 because phab/arcanist won't let me update this later any more (sorry)
attempt to trick phab/arcanist
rebase
Rebase (remove D6234 from dependencies)
In D6234#161331, @douardda wrote: In D6234#161233, @vlorentz wrote: Can we keep the reset stuff outside the journalprocessor.py logic? It's already complex enough
I'll give it a try
Sep 10 2021
rebase, fix typos, squash revisions
rebase and fix --reset help message
rebase
Add an explicit "skipped" message if nothing is to be consumed for a topic
In D6234#161233, @vlorentz wrote: Can we keep the reset stuff outside the journalprocessor.py logic? It's already complex enough
In D6235#161311, @vlorentz wrote: lags reported by cmak was completely inconsistent
only because you have a small dataset, right?
With a larger one, the last batch of each partition should have a negligible size.
In D6235#161236, @vlorentz wrote: There's a bunch of typos in your commit/diff msg: "wich", "oef", "ony", "ALL offsets that needs to be", "stash" -> "squash"
this is necessary to ensure these messages are committed in kafka,
otherwise, since the (considered) empty partition is unsubscribed from,
it never gets committed in JournalClient.handle_messages() (since this
latter only commits assigned partitions).
Why is this a problem?
Sep 9 2021
add forgotten revision: Reduce the size of the progress bar
Please use imperative style in the git commit message
https://chris.beams.io/posts/git-commit/
Sep 3 2021
Sep 1 2021
do we need the "list of forks" if we keep the "fork of what"? I mean these are the 2 ends of the fork relation, right?
Aug 30 2021
yes the idea is to have a beefy enough machine to perform full-size experiments on, that can then be (part of) the production infrastructure dedicated to the provenance index.
Aug 13 2021
In D6084#157322, @aeviso wrote: For the fix of revision_get, there should be a test.
The test is coming later from @jayeshv's mongodb branch.
Please don't mix fixes with codestyling/renaming revisions in a single diff, it makes the review much harder.
And we could also use zfs-backed thin provisioning for the / of workers to save storage space (and possibly help to ensure consistency of deployed workers... not extra convinced of this latter point)
In T3444#68653, @vlorentz wrote: but that requires some more storage on hypervisors we currently don't have
Don't the hypervisors also serve as OSDs? We could just get a disk per hypervisor (partially?) out of the ceph cluster and use it for the workers' /tmp, or even their whole disk.
but anyway, it looks fine to me
one other improvement may be to slightly modify the profile of the workers (to reduce the load on the ceph cluster):
- lower the replication factor for workers' volumes (or even use local storage, but that requires some more storage on hypervisors we currently don't have),
- (probably not very relevant but) stop having swap on workers (since this swap ends up being on the ceph volume, so replicated etc.) (oh this has been done already, good)
Aug 12 2021
In D6071#157080, @aeviso wrote:
- the use of the newly introduced as_dict() methods seems unrelated here; unless I'm mistaken, the purpose of this change is better assertion reports by pytest on failure; if so, it should be presented as such in a dedicated revision
This method is only used for test purposes but it doesn't make sense without the refactoring (the complete HistoryGraph class was not even present prior to the refactoring),
Aug 11 2021
A few remarks:
Aug 10 2021
well this task should be closed, and a new subtask could be added for the alerting
unless I'm mistaken, this task can be closed now, it looks to have reached a steady state where the lag is near 0
Aug 9 2021
Aug 6 2021
I've been thinking a bit about the refactoring of the ProvenanceStorageServer as described in the doc, with a series of queues between the public API and the backend database.
typos
Aug 5 2021
there is a typo in the commit message
overall ok, but I'd like to see the comments about fixtures addressed first.
nice job, thx
Aug 2 2021
FTR I've tried to investigate a bit to find clues of what the origin of the outage was, but I did not find any obvious culprit.
Jul 30 2021
ok then
why not something like:
return itertools.chain([res], stream_results(f, page_token=res.page_token, **kwargs))
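A minimal sketch of how that itertools.chain suggestion could look in a complete function. Everything here except the chain expression itself is an assumption made for illustration: the Page dataclass, the fetch function, the name stream_pages, and the convention that page_token is None on the last page are hypothetical and are not taken from the reviewed code.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class Page:
    """Hypothetical page of results; page_token points at the next page."""
    items: List[int]
    page_token: Optional[str] = None


def stream_pages(f: Callable[..., Page], page_token: Optional[str] = None,
                 **kwargs) -> Iterator[Page]:
    """Return an iterator over all pages produced by the paginated callable f."""
    res = f(page_token=page_token, **kwargs)
    if res.page_token is None:
        # Last page: nothing left to chain.
        return iter([res])
    # Chain the current page with the pages that follow it.
    return itertools.chain(
        [res], stream_pages(f, page_token=res.page_token, **kwargs)
    )


# Hypothetical paginated endpoint over a small in-memory dataset.
DATA = list(range(7))


def fetch(page_token: Optional[str] = None, limit: int = 3) -> Page:
    start = int(page_token) if page_token is not None else 0
    end = start + limit
    next_token = str(end) if end < len(DATA) else None
    return Page(items=DATA[start:end], page_token=next_token)


pages = list(stream_pages(fetch, limit=3))
items = [x for page in pages for x in page.items]
```

Note that in this form the recursive call runs before the function returns, so all pages are fetched eagerly and the recursion depth grows with the number of pages; a generator with a plain loop would be lazier.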
Jul 28 2021
rebase
rebase
rebase