Advanced Search
Dec 7 2021
The dependency is already declared in the docker-compose file; it only ensures the container is started before swh-web-cron is launched, not that the service is fully initialized and responding.
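To illustrate the gap between "container started" and "service responding", here is a minimal sketch of a readiness wait that a dependent job could run before doing its work. The names `wait_until_ready` and `http_probe` are hypothetical and are not part of swh-web or docker-compose; the probe URL would have to be chosen by whoever wires this in.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers without an HTTP error (hypothetical helper)."""
    # urlopen raises URLError/HTTPError (both OSError subclasses) on failure.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return 200 <= response.status < 400


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `probe` repeatedly until it returns True or `timeout` seconds elapse."""
    deadline = clock() + timeout
    while True:
        try:
            if probe():
                return True
        except OSError:
            # Connection refused, timeout, HTTP error: the service is not ready yet.
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller would pass something like `lambda: http_probe(some_url)` as the probe and abort if the function returns False.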
thanks @anlambert for the tips, it works as expected
It's related to this change because if the level is left at DEBUG, the access log is logged twice.
LGTM thanks
LGTM
on saatchi:
root@saatchi:/etc/systemd/system# rm -v swh-scheduler-updater* ssh-ghtorrent.service
removed 'swh-scheduler-updater-consumer-ghtorrent.service'
removed 'swh-scheduler-updater-writer.service'
removed 'swh-scheduler-updater-writer.timer'
removed 'ssh-ghtorrent.service'
on scheduler0:
root@scheduler0:/etc/systemd/system# systemctl stop ssh-ghtorrent
root@scheduler0:/etc/systemd/system# systemctl disable ssh-ghtorrent
root@scheduler0:/etc/systemd/system# systemctl stop swh-scheduler-updater*
root@scheduler0:/etc/systemd/system# systemctl disable swh-scheduler-updater*
root@scheduler0:/etc/systemd/system# rm -v swh-scheduler-updater*
removed 'swh-scheduler-updater-consumer-ghtorrent.service'
removed 'swh-scheduler-updater-writer.service'
removed 'swh-scheduler-updater-writer.timer'
root@scheduler0:/etc/systemd/system# rm ssh-ghtorrent.service
removed 'ssh-ghtorrent.service'
root@scheduler0:/etc/systemd/system# systemctl reset-failed
Version v0.21.0 deployed in staging:
root@scheduler0:~# apt list --upgradable 2>/dev/null | grep swh | cut -f1 -d'/' | xargs -t apt install
apt install python3-swh.core python3-swh.counters python3-swh.journal python3-swh.lister python3-swh.loader.core python3-swh.model python3-swh.objstorage python3-swh.scheduler python3-swh.storage
...
root@scheduler0:~# systemctl reload gunicorn-swh-scheduler.service
The problem was reproduced in staging before the deployment:
swhworker@worker1:~$ swh lister -C /etc/softwareheritage/lister.yml run -l npm
Traceback (most recent call last):
  File "/usr/bin/swh", line 11, in <module>
    load_entry_point('swh.core==0.15.0', 'console_scripts', 'swh')()
  File "/usr/lib/python3/dist-packages/swh/core/cli/__init__.py", line 185, in main
    return swh(auto_envvar_prefix="SWH")
  File "/usr/lib/python3/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/lister/cli.py", line 65, in run
    get_lister(lister, **config).run()
  File "/usr/lib/python3/dist-packages/swh/lister/pattern.py", line 130, in run
    full_stats.origins += self.send_origins(origins)
  File "/usr/lib/python3/dist-packages/swh/lister/pattern.py", line 234, in send_origins
    ret = self.scheduler.record_listed_origins(batch_origins)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 181, in meth_
    return self.post(meth._endpoint_path, post_data)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 278, in post
    return self._decode_response(response)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 354, in _decode_response
    self.raise_for_status(response)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 344, in raise_for_status
    raise exception from None
swh.core.api.RemoteException: <RemoteException 500 CardinalityViolation: ['ON CONFLICT DO UPDATE command cannot affect row a second time\nHINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.\n']>
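The hint in the error says that one batch contained two rows with the same constrained values. A minimal sketch of one way to avoid that is to deduplicate each batch on the conflict key before sending it. The key fields used here (`lister_id`, `url`, `visit_type`) and the function name are assumptions for illustration; this is not necessarily the fix that was applied in swh-lister or swh-scheduler.

```python
from typing import Dict, Iterable, List, Tuple


def deduplicate_origins(origins: Iterable[dict]) -> List[dict]:
    """Keep one origin per (lister_id, url, visit_type); the last occurrence wins.

    The key tuple is an assumed stand-in for the table's unique constraint.
    """
    unique: Dict[Tuple, dict] = {
        (o["lister_id"], o["url"], o["visit_type"]): o for o in origins
    }
    return list(unique.values())


batch = [
    {"lister_id": 1, "url": "https://example.org/a", "visit_type": "npm", "rev": 1},
    {"lister_id": 1, "url": "https://example.org/b", "visit_type": "npm", "rev": 1},
    {"lister_id": 1, "url": "https://example.org/a", "visit_type": "npm", "rev": 2},
]
deduplicated = deduplicate_origins(batch)
```

With a dict comprehension, a repeated key keeps its first position but takes the value seen last, so `deduplicated` holds two entries and the one for `.../a` carries `rev` 2.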
Dec 6 2021
keep the dict comprehension version
The last run failed with this error:
Unstuck the task scheduling:
softwareheritage-scheduler=> begin; update task set next_run=now(), status='next_run_not_scheduled' where id=153874548;
BEGIN
UPDATE 1
softwareheritage-scheduler=*> commit;
COMMIT
Dec 2 2021
The slides of the retrospective of the experiment are available at: https://hedgedoc.softwareheritage.org/VOP9qh1MTqm4DjPQfFgNbQ
It was not easy to tell whether this comes from a large number of calls or from long-running calls, because the data is sampled at regular intervals and we don't have that granularity.
Dec 1 2021
Nov 29 2021
LGTM
Nov 26 2021
All the production nodes (`search-esnode[4-6]`) are upgraded to bullseye
All the nodes are updated
- search-esnode0 updated without errors
Nov 25 2021
The upgrade by itself was made with the same command as explained previously.
Shard allocation was disabled during the process to avoid unnecessary shard movements in the cluster
- The server was updated using the same procedure used for logstash0
- there is no error detected when puppet is running
- all the services are correctly started
- logstash0 upgrade
Nov 24 2021
Production nodes are upgraded:
- stop the journal clients:
root@search1:~# systemctl stop swh-search-journal-client@indexed
root@search1:~# systemctl stop swh-search-journal-client@objects
- flush the index to speed up the recovery
curl -XPOST http://search-esnode4:9200/_flush
For each node:
- disable shard allocation:
cat > /tmp/shard_allocation.json <<EOF
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
EOF
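The log does not show how this JSON file was sent to the cluster. As a rough sketch, assuming it goes to Elasticsearch's cluster settings endpoint (`PUT /_cluster/settings`), the same payload could be built and sent like this. The helper name `shard_allocation_request` is hypothetical, and the base URL reuses the `search-esnode4:9200` host from the flush command above.

```python
import json
import urllib.request

# Values accepted by the cluster.routing.allocation.enable setting.
ALLOWED_MODES = ("all", "primaries", "new_primaries", "none")


def shard_allocation_request(base_url: str, mode: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that sets the shard allocation mode."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported allocation mode: {mode!r}")
    body = {"persistent": {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Same payload as the heredoc above: only primary shards may be allocated.
request = shard_allocation_request("http://search-esnode4:9200", "primaries")
```

Sending it would be `urllib.request.urlopen(request)`. Presumably the setting is put back to `"all"` once the node has rejoined the cluster, although that step is not shown in the log.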
rebase
fix a typo in the commit message