A note to send to the #swh-devel ml has been drafted [1].
It is open as a draft for review first.
Oct 2 2021
Oct 1 2021
- "nonce" header is *after* gpgsig
- double "author" field in the original, and another commit with three "committer"....
- "mergetag" headers with an extra newline at the end (current versions of the loader strip it, looks like older ones didn't)
- "author xxx <yyy@gmail.com> <type 'int'> -0200" in original commit (dulwich obviously can't parse this)
[3] Another idea, only discussed so far, would be to make sure we first ingest the tag references in order (under the assumption that we will then ingest the repository mostly in its natural order), and only then handle the remaining references (since, if we start with HEAD and/or master first, there is a high probability we end up loading the whole repository at once).
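Purely as an illustration of that ordering, a small Python sketch, assuming a plain mapping of reference names to targets (the ref names and targets below are hypothetical):

```
def order_refs_for_ingestion(refs: dict[str, str]) -> list[str]:
    """Tags first, in lexicographic order (which often approximates release
    order; a version-aware sort would do better for vX.Y.Z tags), then HEAD
    and the branches."""
    tags = sorted(name for name in refs if name.startswith("refs/tags/"))
    others = sorted(name for name in refs if not name.startswith("refs/tags/"))
    return tags + others

# Hypothetical refs mapping (targets elided).
refs = {
    "HEAD": "<target>",
    "refs/heads/master": "<target>",
    "refs/tags/v0.2.0": "<target>",
    "refs/tags/v0.1.0": "<target>",
}
print(order_refs_for_ingestion(refs))
# ['refs/tags/v0.1.0', 'refs/tags/v0.2.0', 'HEAD', 'refs/heads/master']
```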
D6377 actually increased the memory footprint to the point of getting the ingestion killed quickly. So, closed!
It resolved itself; it's green again.
SWH, I guess: I don't see much difference between embedding it in swh-objstorage, winery, or a dedicated package.
Build is green
Thanks!
Intermediate status:
- the bench lab is easily deployable on g5k on several workers to distribute the load [1]
- it's working well when the load is not too high. When the number of workers is increased, the workers seem to have some issues talking to rabbitmq (a configuration sketch follows the log excerpts below):
[loaders-77cdd444df-flcv9 loaders] [2021-09-30 23:46:55,449: INFO/MainProcess] missed heartbeat from celery@loaders-77cdd444df-p9ds5
[loaders-77cdd444df-flcv9 loaders] [2021-09-30 23:46:55,449: INFO/MainProcess] missed heartbeat from celery@loaders-77cdd444df-n6pvm
[loaders-77cdd444df-flcv9 loaders] [2021-09-30 23:46:55,449: INFO/MainProcess] missed heartbeat from celery@loaders-77cdd444df-mrcjj
[loaders-77cdd444df-flcv9 loaders] [2021-09-30 23:46:55,449: INFO/MainProcess] missed heartbeat from celery@loaders-77cdd444df-7bn4s
[loaders-77cdd444df-flcv9 loaders] [2021-09-30 23:46:55,449: INFO/MainProcess] missed heartbeat from celery@loaders-77cdd444df-lg2bd
and also an unexplained time drift:
[loaders-77cdd444df-flcv9 loaders] [2021-09-30 23:46:55,447: WARNING/MainProcess] Substantial drift from celery@loaders-77cdd444df-lxjpl may mean clocks are out of sync. Current drift is
[loaders-77cdd444df-flcv9 loaders] 356 seconds. [orig: 2021-09-30 23:46:55.447181 recv: 2021-09-30 23:40:59.633444]
[loaders-77cdd444df-flcv9 loaders]
[loaders-77cdd444df-flcv9 loaders] [2021-09-30 23:46:55,447: WARNING/MainProcess] Substantial drift from celery@loaders-77cdd444df-jd6v9 may mean clocks are out of sync. Current drift is
[loaders-77cdd444df-flcv9 loaders] 355 seconds. [orig: 2021-09-30 23:46:55.447552 recv: 2021-09-30 23:41:00.723983]
[loaders-77cdd444df-flcv9 loaders]
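For reference, a minimal, hypothetical sketch of where the heartbeat is configured on the Celery side; the app name, broker URL and value below are placeholders for illustration, not the actual bench settings.

```
from celery import Celery

# Hypothetical app name and broker URL, for illustration only.
app = Celery("bench_loaders", broker="amqp://guest:guest@rabbitmq:5672//")

# broker_heartbeat sets the AMQP heartbeat negotiated with RabbitMQ;
# the "missed heartbeat from celery@..." lines above come from the event
# heartbeats workers exchange with each other (gossip), which a saturated
# worker can fail to emit in time.
app.conf.broker_heartbeat = 60
```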
Intermediate status:
- We have successfully run loaders in staging using the helm chart we wrote [1] and a hardcoded number of workers. This also makes it possible to perform rolling upgrades, for example.
- We have tried the built-in horizontal pod autoscaler [2]. It works pretty well but is not suited to our worker scenario: it decides whether the number of running pods must be scaled up or down based on the pods' CPU consumption (in our test [3], but other metrics can be used). It can be very useful to manage classical load, e.g. a gunicorn container, but not for long-running tasks (see the sketch after this list).
- Kubernetes also has some functionality to reduce the pressure on a node when certain limits are reached, but it looks more like an emergency mechanism than proper scaling management. It is configured at the kubelet level and is not dynamic at all [4]. We tested it quickly, but we lost the node to an OOM before node eviction started.
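As a rough illustration of why CPU-based autoscaling fits poorly with long-running tasks, here is a sketch of the scaling rule the horizontal pod autoscaler applies (the documented formula with made-up numbers; nothing here reflects our actual configuration):

```
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float) -> int:
    """Kubernetes HPA scaling rule (simplified):
    desired = ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

# A long-running loader task keeps CPU low while waiting on I/O, so the
# autoscaler happily scales the deployment *down* even though the pods
# are busy (numbers below are made up).
print(desired_replicas(current_replicas=10, current_metric=0.2, target_metric=0.7))  # 3
```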
Build is green
Build is green
rebased patch
Rebase
In D6390#165819, @vlorentz wrote: Oh, I forgot we already had the code for this. Thanks :)
Oh, I forgot we already had the code for this. Thanks :)
In T3104#71460, @dachary wrote: Wouldn't it make sense to put the cffi-based cmph wrapper in a dedicated python module/project (not necessarily under the swh namespace)?
It would, but who would maintain it in the long run?
IMHO, this diff should be squashed into D6165 (it's really part of the work adding the rabbitmq-based backend).
As @olasd said, it should be squashed, but meh.
Build is green
In D6372#165729, @jayeshv wrote: Fixed the issue in timestamp comparison
Build is green
Looks to me like this open/close interface really should come with a context manager.
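To illustrate, a minimal sketch of what that could look like, assuming a hypothetical storage-like object exposing open() and close() (the names are placeholders, not the actual interface under review):

```
from contextlib import contextmanager

@contextmanager
def opened(storage):
    """Open the (hypothetical) storage, yield it, and guarantee close()
    is called even if the body raises."""
    storage.open()
    try:
        yield storage
    finally:
        storage.close()

# Hypothetical usage:
# with opened(Storage(...)) as s:
#     s.write(...)
```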
Updated requirements
I still think it's best to use the wrapped function name as "method", but meh.
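A small illustration of what I mean, purely hypothetical: the decorator and the report() helper below are stand-ins, not the code under review; the point is only that the "method" label comes from the wrapped function's own name.

```
import functools

def report(metric: str, method: str) -> None:
    # Stand-in for whatever actually records the metric (statsd, logging, ...).
    print(f"{metric} method={method}")

def counted(f):
    """Report a metric tagged with the *wrapped* function's name as "method"."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        result = f(*args, **kwargs)
        report("call_count", method=f.__name__)
        return result
    return wrapper

@counted
def lookup_revision(rev_id):
    return rev_id

lookup_revision("abc")  # reports: call_count method=lookup_revision
```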
Fixed the issue in timestamp comparison
Looks good to me!
Build is green
Adapt according to the sound suggestion
Looks good to me, I added two nitpick comments for small improvements.
- September 2021 update
Fine with me. @anlambert, @vlorentz, thoughts?
Build is green
Build is green
Rebase
Update from_disk module as well