status:
- scheduler v0.17.1 deployed on production [1] (db migrated) and staging.
- the swh-scheduler-journal-client service was then restarted.
status:
I bumped the priority since the next-gen scheduler runners depend on the results of the journal client (as do the scheduler metrics).
I've duplicated the credentials for the relevant forges, and updated the following instance names:
- Refactor the journal client a bit to update a docstring and inline one function (done, those are the 2 previous commits mentioned just below this comment ^).
- Deactivate failing visits (delegating to the listers the job of reactivating origins that come back to life). I have diffs dealing with this that need a rebase and some rework to match the latest changes (I need to get back to them) [1].
- Deploy the current scheduler implementation (master) once the previous point is done. (That's the goal I want to reach before a vacation break.)
including the next-gen scheduler runner not yet puppetized [4]
All got done except this part ^.
This needs first the following:
- D5809 to be rebased on latest master branch (v0.17)
- the saatchi venv (in swhscheduler home) to be updated with it
including the next-gen scheduler runner not yet puppetized [4]
Following actions in order:
Ensure the journal client is doing its new job
Deactivate failing visits (delegating to the listers the job of reactivating
origins that come back to life). I have diffs dealing with this that need a
rebase and some rework to match the latest changes (I need to get back to them) [1].
(^ for a while ;)
Status on this after the recent refactoring done with @olasd to simplify the actual
implementation (backend and journal client). What remains to be done:
It seems the remaining lister instances to process are the phabricator ones that also need credentials.
This is what we currently have in the listers table in scheduler database.
I've updated the listers with no credentials:
Updated stats in descending order on the no_last_update column:
Build is green
Address the major part of the review (thx)
In D5977#154053, @ardumont wrote:
Looks good, thanks a lot!
Build is green
Running through docker, I actually need to change a few things:
Build is green
Rebase
Build is green
Adapt according to review:
Status on the latest development for this task: the "Baseline for the recurrence of origin
visits" chapter has been implemented in the following stacked diffs (in review):
From my quick testing, I have noticed that the changelog methods only return a limited number of results, so we would need to iterate over several calls.
I think the properly paginated way of doing this with the current PyPI XMLRPC API is to call:
- changelog_last_serial() to get the highest serial, to be used as a termination condition. Currently returns 2168587
- changelog_since_serial(<current_serial>) in a loop until the last serial returned is higher than the one set as termination condition. Looks like this returns 50k results per call.
(this will make us miss the last few updates that happened since the lister started, but this is probably marginal).
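For illustration, the loop described above could be sketched as follows. This is only a minimal sketch, not the lister's actual code: `iter_changelog` and `start_serial` are hypothetical names, and it assumes each changelog entry is a `(name, version, timestamp, action, serial)` tuple, with the serial in last position.

```python
from typing import Any, Iterator, Tuple

# Assumed shape of one changelog entry:
# (name, version, timestamp, action, serial)
Event = Tuple[Any, ...]


def iter_changelog(client: Any, start_serial: int = 0) -> Iterator[Event]:
    """Yield changelog entries page by page (hypothetical helper).

    `client` is any object exposing changelog_last_serial() and
    changelog_since_serial(serial), for instance (assumption)
    xmlrpc.client.ServerProxy("https://pypi.org/pypi").
    """
    # Highest serial known when we start: the termination condition.
    last_serial = client.changelog_last_serial()
    current = start_serial
    while current < last_serial:
        events = client.changelog_since_serial(current)
        if not events:
            # Nothing newer than `current`: stop rather than loop forever.
            break
        for event in events:
            yield event
        # Resume the next call from the highest serial seen so far.
        current = max(event[-1] for event in events)
```

Stopping as soon as the termination serial is reached means updates published after the start of the run are left to the next run, which matches the caveat above.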
In D5977#153851, @olasd wrote: Looking at the size of the changelog (2-ish million entries at 50k-ish entries per page means 50-ish requests), I /think/ the lister could always be running in incremental mode, rather than having to maintain two modes in the long run.
Nice, that lister should perform better once that feature is deployed to production.
Build is green
Adapt according to judicious remarks (thanks ;)
Thanks!
Nice, that lister should perform better once that feature is deployed to production. I added a first batch of inline comments.
Build is green
Summary of the data available in the listed_origins table, broken down by lister and "known state" of origins:
That's all been working consistently for months now, closing!