The test of the new version v0.1.4, which includes the fix for the range split, the uid change and the incremental task fix, is ok.
Sep 10 2020
LGTM
Tested in the docker environment: the problem no longer reproduces with 5 concurrent listers.
Sep 9 2020
The concurrency issue was reproduced locally in the docker environment with a concurrency of 5.
I tried to create a list-gitea-incremental task, but it also fails, this time with a different exception related to an unexpected "sort" parameter : https://sentry.softwareheritage.org/share/issue/b0119b56f24347bcb58ac28c68685c62/
The configuration is deployed and the listers were restarted.
For info, on my desktop with the docker environment and a limit of 100, the lister takes under 4s (3.7s) to list the complete codeberg forge :
swh-lister_1 | [2020-09-08 18:33:19,259: INFO/ForkPoolWorker-1] Task swh.lister.gitea.tasks.RangeGiteaLister[363e0b30-b13a-4f62-bd31-9847dfe62450] succeeded in 3.7196799100056523s: {'status': 'eventful'}
The task ran in about 31 minutes (1887s):
Sep 08 13:45:34 worker1 python3[237586]: [2020-09-08 13:45:34,851: INFO/ForkPoolWorker-4] Task swh.lister.launchpad.tasks.FullLaunchpadLister[73e298be-aeda-4882-b52d-dfe5a2ec316c] succeeded in 1887.75128286588s: {'status': 'eventful'}
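To compare run times like the two above without reading the raw logs, the duration can be pulled out of the "succeeded in" line. This is a minimal sketch, assuming the worker log keeps the format shown in these entries; `SUCCESS_RE`, `TaskRun` and `parse_success_line` are hypothetical helper names, not part of swh.lister or Celery.

```python
import re
from typing import NamedTuple, Optional

# Assumption: success lines look like the ones quoted in this log, i.e.
# "Task <dotted.name>[<uuid>] succeeded in <float>s: {...}".
SUCCESS_RE = re.compile(
    r"Task (?P<name>[\w.]+)\[(?P<task_id>[0-9a-f-]+)\] "
    r"succeeded in (?P<seconds>\d+(?:\.\d+)?)s"
)


class TaskRun(NamedTuple):
    """Hypothetical container for one successful task run."""
    name: str
    task_id: str
    seconds: float


def parse_success_line(line: str) -> Optional[TaskRun]:
    """Return the task name, id and duration, or None if the line does not match."""
    match = SUCCESS_RE.search(line)
    if match is None:
        return None
    return TaskRun(match["name"], match["task_id"], float(match["seconds"]))


if __name__ == "__main__":
    line = (
        "Sep 08 13:45:34 worker1 python3[237586]: [2020-09-08 13:45:34,851: "
        "INFO/ForkPoolWorker-4] Task swh.lister.launchpad.tasks.FullLaunchpadLister"
        "[73e298be-aeda-4882-b52d-dfe5a2ec316c] succeeded in 1887.75128286588s: "
        "{'status': 'eventful'}"
    )
    run = parse_success_line(line)
    if run is not None:
        # Report the duration in minutes for easier reading.
        print(f"{run.name}: {run.seconds / 60:.1f} min")
```

The same function works on both the journald-prefixed line from worker1 and the docker-compose-prefixed line, since it searches for the pattern anywhere in the line.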
- The data model doesn't need to be created because it was already done in T2358
- The task is created :
swhscheduler@scheduler0:~$ swh scheduler --config-file /etc/softwareheritage/scheduler.yml task add --policy oneshot list-gitea-full url=https://codeberg.org/api/v1/ limit=100
WARNING:swh.core.cli:Could not load subcommand storage: No module named 'swh.journal'
INFO:swh.core.config:Loading config file /etc/softwareheritage/scheduler.yml
Created 1 tasks
Sep 8 2020
- task-type registered :
swhscheduler@scheduler0:/etc/softwareheritage/backend$ swh scheduler --config-file /etc/softwareheritage/scheduler.yml task-type register -p lister.gitea
WARNING:swh.core.cli:Could not load subcommand storage: No module named 'swh.journal'
INFO:swh.core.config:Loading config file /etc/softwareheritage/scheduler.yml
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.gitea
INFO:swh.scheduler.cli.task_type:Create task type list-gitea-full in scheduler
INFO:swh.scheduler.cli.task_type:Create task type list-gitea-incremental in scheduler
fix mix with launchpad tasks
The launchpad lister (v0.1.2) is deployed and running on staging
A parameter was missing in the call :
swhscheduler@scheduler0:~$ swh scheduler --config-file /etc/softwareheritage/scheduler.yml task-type list
Octocatalog-diff test :
➜ puppet-environment git:(master) ✗ bin/octocatalog-diff --octocatalog-diff-args --no-truncate-details --to arcpatch_D3884 worker0.internal.staging.swh.network
Found host worker0.internal.staging.swh.network
WARN -> Environment "arcpatch-D3884" contained non-word characters, correcting name to arcpatch_D3884
Cloning into '/tmp/swh-ocd.e8lQMoZ0/environments/production/data/private'...
done.
Cloning into '/tmp/swh-ocd.e8lQMoZ0/environments/arcpatch_D3884/data/private'...
done.
*** Running octocatalog-diff on host worker0.internal.staging.swh.network
I, [2020-09-08T10:45:12.720125 #4652] INFO -- : Catalogs compiled for worker0.internal.staging.swh.network
I, [2020-09-08T10:45:13.707209 #4652] INFO -- : Diffs computed for worker0.internal.staging.swh.network
diff origin/production/worker0.internal.staging.swh.network current/worker0.internal.staging.swh.network
*******************************************
File[/etc/softwareheritage/lister.yml] =>
 parameters =>
 content =>
@@ -24,4 +24,6 @@
   - swh.lister.gitlab.tasks.FullGitLabRelister
   - swh.lister.gnu.tasks.GNUListerTask
+  - swh.lister.launchpad.tasks.FullLaunchpadLister
+  - swh.lister.launchpad.tasks.IncrementalLaunchpadLister
   - swh.lister.npm.tasks.NpmListerTask
   - swh.lister.phabricator.tasks.FullPhabricatorLister
*******************************************
*** End octocatalog-diff on worker0.internal.staging.swh.network
Sep 4 2020
Wikimedia uses netbox as the source of truth for their infrastructure, and puppet configures the facts from it. This is not exactly our use case, as we would like netbox to be provisioned automatically.
and their documentation : https://wikitech.wikimedia.org/wiki/Netbox
A docker-compose is available to easily test netbox : https://github.com/netbox-community/netbox-docker
This is the puppet configuration used at wikimedia : https://gerrit.wikimedia.org/r/c/operations/puppet/+/387880/
Sep 2 2020
Thanks @ardumont , I don't have a task for this one but it's good to know ;)
fix a typo