Oct 19 2022
Oct 18 2022
Apr 29 2022
Fixed in D7718
Apr 27 2022
No longer happens with a more recent stack
Mar 3 2022
Feb 7 2022
In T3890#78234, @vlorentz wrote:
Feb 3 2022
Looks like we are going to keep the status quo in the short term, i.e. a numeric offset for old objects and offset_bytes for new objects, without renaming.
Jan 27 2022
yeah!
See T3893 instead.
Jan 26 2022
Nov 22 2021
There is no azure kafka cluster anymore...
Sep 8 2021
metadata searches are now done in Elasticsearch since the deployment of T3433
Sep 3 2021
Aug 30 2021
Aug 26 2021
I think so, thanks
@vlorentz should we close this one?
Aug 25 2021
status.io incident closed
Save code now requests rescheduled:
swh-web=> select * from save_origin_request where loading_task_status='scheduled' limit 100; ... <output lost due to the psql pager :( ...
softwareheritage-scheduler=> select * from task where id in (398244739, 398244740, 398244742, 398244744, 398244745, 398244748, 398095676, 397470401, 397470402, 397470404, 397470399);
few minutes later:
swh-web=> select * from save_origin_request where loading_task_status='scheduled' limit 100;
 id | request_date | visit_type | origin_url | status | loading_task_id | visit_date | loading_task_status | visit_status | user_ids
----+--------------+------------+------------+--------+-----------------+------------+---------------------+--------------+----------
(0 rows)
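The check above (list the requests still in `scheduled`, then look up their loading tasks in the scheduler) can be sketched in Python. This is only an illustration against an in-memory SQLite stand-in: the table is reduced to a few of the columns shown in the output, the sample rows and the `succeeded` status value are made up, and `scheduled_task_ids` and `scheduler_query` are hypothetical helper names.

```python
import sqlite3


def scheduled_task_ids(conn):
    """Return loading_task_id for every request still in 'scheduled'."""
    rows = conn.execute(
        "SELECT loading_task_id FROM save_origin_request "
        "WHERE loading_task_status = 'scheduled' "
        "ORDER BY loading_task_id LIMIT 100"
    ).fetchall()
    return [row[0] for row in rows]


def scheduler_query(task_ids):
    """Build the follow-up query to run against the scheduler database."""
    if not task_ids:
        return None  # nothing is stuck anymore
    id_list = ", ".join(str(int(task_id)) for task_id in task_ids)
    return f"select * from task where id in ({id_list});"


# Stand-in database with made-up rows (NOT the real swh-web schema or data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE save_origin_request ("
    "id INTEGER PRIMARY KEY, origin_url TEXT, "
    "loading_task_id INTEGER, loading_task_status TEXT)"
)
conn.executemany(
    "INSERT INTO save_origin_request VALUES (?, ?, ?, ?)",
    [
        (1, "https://example.org/a.git", 101, "scheduled"),
        (2, "https://example.org/b.git", 102, "succeeded"),
        (3, "https://example.org/c.git", 103, "scheduled"),
    ],
)

stuck = scheduled_task_ids(conn)
print(scheduler_query(stuck))
```

Once the first query returns no rows, as in the `(0 rows)` output, `scheduler_query` returns `None` and there is nothing left to look up.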
- all the workers are restarted
- Several save code now requests look stuck in the scheduled status; currently looking at how to unblock them
D6130 landed and was applied one kafka server at a time
ok roger that :).
I will increase to 524288 in the diff
The kafka servers are only running kafka and zookeeper, so the limit of open files isn't that critical. I think we can bump the limit more substantially than just x2 (maybe go directly with x8?), as I expect we'll still be adding more topics in the future.
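To make the arithmetic behind these numbers explicit, here is a small sketch. It starts from the original `LimitNOFILE=65536`: x2 gives the manually applied 131072 and x8 gives the 524288 proposed for the diff. `scaled_limit` and `current_nofile_limits` are hypothetical helper names, and the second one only reports the limits of the Python process running it (Unix only), not those of the kafka service.

```python
import resource

ORIGINAL_LIMIT = 65536  # LimitNOFILE value before the incident


def scaled_limit(base, factor):
    """Open-file limit obtained by multiplying the base limit by factor."""
    return base * factor


def current_nofile_limits():
    """(soft, hard) open-file limits of the current process (Unix only)."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


for factor in (2, 8):
    print(f"x{factor}: LimitNOFILE={scaled_limit(ORIGINAL_LIMIT, factor)}")

soft, hard = current_nofile_limits()
print(f"this process: soft={soft} hard={hard}")
```

On Linux, the limit actually applied to a running service can be read from `/proc/<pid>/limits`.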
all the loaders are restarted on worker01 and worker02, it seems the cluster is ok.
The open file limit was manually increased to stabilize the cluster:
# puppet agent --disable T3501
# diff -U3 /tmp/kafka.service kafka.service
--- /tmp/kafka.service 2021-08-25 07:32:28.068928972 +0000
+++ kafka.service 2021-08-25 07:32:31.384955246 +0000
@@ -15,7 +15,7 @@
 Environment='LOG_DIR=/var/log/kafka'
 Type=simple
 ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
-LimitNOFILE=65536
+LimitNOFILE=131072
- Incident created on status.io
- Loader disabled:
root@pergamon:~# clush -b -w @swh-workers 'puppet agent --disable "Kafka incident T3501"; systemctl stop cron; cd /etc/systemd/system/multi-user.target.wants; for unit in swh-worker@loader_*; do systemctl disable $unit; done; systemctl stop "swh-worker@loader_*"'
Jun 11 2021
Jun 8 2021
May 4 2021
If you face this issue, try restarting the containers using docker-compose down and docker-compose up.
Apr 21 2021
Note that none of their parent revisions can be found in the archive either (I suppose one invalid revision in a set of ingested revisions prevents any of them from being inserted in the database, but at that point they are already inserted in kafka).
Apr 20 2021
If we replaced the whole code with just this:
Apr 19 2021
Do you need some more tests, or can this task be declared resolved?
So D5246 has landed a while ago. The s3 object copy process has now caught up on some partitions and I can confirm that the copy of the latest added objects happens without any race condition.
Apr 6 2021
Pass an object without `unique_key` and check it does raise an exception
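A test along these lines could look like the sketch below. `FakeWriter` is a hypothetical stand-in that only mimics the idea of calling `unique_key()` on the object being written; it is not the swh-journal in-memory writer, and the exception type raised by the real code may differ.

```python
import unittest


class FakeWriter:
    """Hypothetical stand-in for a journal writer (not the swh-journal class)."""

    def __init__(self):
        self.objects = []

    def write_addition(self, object_type, object_):
        # Calling unique_key() raises AttributeError when the method is missing.
        key = object_.unique_key()
        self.objects.append((object_type, key, object_))


class WithKey:
    """Dummy object that implements unique_key()."""

    def unique_key(self):
        return {"id": 1}


class WithoutKey:
    """Dummy object that does not implement unique_key()."""


class WriteAdditionTest(unittest.TestCase):
    def test_object_with_unique_key_is_accepted(self):
        writer = FakeWriter()
        writer.write_addition("origin", WithKey())
        self.assertEqual(len(writer.objects), 1)

    def test_object_without_unique_key_raises(self):
        writer = FakeWriter()
        with self.assertRaises(AttributeError):
            writer.write_addition("origin", WithoutKey())
        # Nothing must have been recorded for the rejected object.
        self.assertEqual(writer.objects, [])
```

Saved to a file, the two cases can be run with `python -m unittest <file>`.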
Apr 4 2021
Hey @vlorentz
How do I check https://forge.softwareheritage.org/source/swh-journal/browse/master/swh/journal/writer/inmemory.py$31? Do I have to pass dummy content, raw_extrinsic_metadata, origin_visit, et cetera as the object_ to the write_addition function and, before passing them, verify that they implement the unique_key function?
Apr 1 2021
The journal client supports dynamic configuration via kwargs, so no, there is no need to improve it.
Mar 24 2021
Mar 15 2021
Mar 5 2021
I forgot one step: removing the previous alias origin -> origin_production, which is no longer needed:
vsellier@search-esnode1 ~ % curl -s http://$ES_SERVER/_cat/indices\?v && echo && curl -s http://$ES_SERVER/_cat/aliases\?v && echo && curl -s http://$ES_SERVER/_cat/health\?v
health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin-production hZfuv0lVRImjOjO_rYgDzg  90   1  153130652     26701625    273.4gb        137.3gb
awesome
The new configuration is deployed; swh-search is now using the aliases, which should help with future upgrades
Deployment in production:
- puppet stopped
- configuration updated to declare the index; this has to be done first so that swh-search initializes the aliases before the journal clients start (not guaranteed with a puppet apply)
- package updated
- gunicorn-swh-search service restarted:
Mar 05 09:08:46 search1 python3[1881743]: 2021-03-05 09:08:46 [1881743] gunicorn.error:INFO Starting gunicorn 19.9.0
Mar 05 09:08:46 search1 python3[1881743]: 2021-03-05 09:08:46 [1881743] gunicorn.error:INFO Listening at: unix:/run/gunicorn/swh-search/gunicorn.sock (1881743)
Mar 05 09:08:46 search1 python3[1881743]: 2021-03-05 09:08:46 [1881743] gunicorn.error:INFO Using worker: sync
Mar 05 09:08:46 search1 python3[1881748]: 2021-03-05 09:08:46 [1881748] gunicorn.error:INFO Booting worker with pid: 1881748
Mar 05 09:08:46 search1 python3[1881749]: 2021-03-05 09:08:46 [1881749] gunicorn.error:INFO Booting worker with pid: 1881749
Mar 05 09:08:46 search1 python3[1881750]: 2021-03-05 09:08:46 [1881750] gunicorn.error:INFO Booting worker with pid: 1881750
Mar 05 09:08:46 search1 python3[1881751]: 2021-03-05 09:08:46 [1881751] gunicorn.error:INFO Booting worker with pid: 1881751
Mar 05 09:08:53 search1 python3[1881750]: 2021-03-05 09:08:53 [1881750] swh.search.api.server:INFO Initializing indexes with configuration:
Mar 05 09:08:53 search1 python3[1881750]: 2021-03-05 09:08:53 [1881750] elasticsearch:INFO HEAD http://search-esnode2.internal.softwareheritage.org:9200/origin-production [status:200 request:0.023s]
Mar 05 09:08:54 search1 python3[1881750]: 2021-03-05 09:08:54 [1881750] elasticsearch:INFO PUT http://search-esnode1.internal.softwareheritage.org:9200/origin-production/_alias/origin-read [status:200 request:0.487s]
Mar 05 09:08:54 search1 python3[1881750]: 2021-03-05 09:08:54 [1881750] elasticsearch:INFO PUT http://search-esnode3.internal.softwareheritage.org:9200/origin-production/_alias/origin-write [status:200 request:0.152s]
Mar 05 09:08:54 search1 python3[1881750]: 2021-03-05 09:08:54 [1881750] elasticsearch:INFO PUT http://search-esnode1.internal.softwareheritage.org:9200/origin-production/_mapping [status:200 request:0.009s]
vsellier@search-esnode1 ~ % curl -s http://$ES_SERVER/_cat/indices\?v && echo && curl -s http://$ES_SERVER/_cat/aliases\?v && echo && curl -s http://$ES_SERVER/_cat/health\?v
health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin-production hZfuv0lVRImjOjO_rYgDzg  90   1  153097672    144224208    288.1gb          149gb
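The alias setup visible in the gunicorn log above (one `PUT <index>/_alias/<alias>` request per alias) can be summarized with a small sketch. It only builds the request lines and sends nothing to Elasticsearch; `alias_requests` is a hypothetical helper, not part of swh-search, and the `read`/`write` suffixes are simply taken from the alias names that appear in the log.

```python
def alias_requests(index, alias_prefix, suffixes=("read", "write")):
    """Return the (method, path) pairs that attach the aliases to index.

    Uses the Elasticsearch REST form PUT /<index>/_alias/<alias>,
    the same form as the requests shown in the log.
    """
    return [
        ("PUT", f"/{index}/_alias/{alias_prefix}-{suffix}")
        for suffix in suffixes
    ]


for method, path in alias_requests("origin-production", "origin"):
    print(method, path)
# PUT /origin-production/_alias/origin-read
# PUT /origin-production/_alias/origin-write
```

Because clients only ever address the aliases, a future upgrade can point them at a new index without changing the client configuration.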
Mar 4 2021
swh-search:v0.7.1 deployed in staging according to the defined plan.
The aliases are correctly created and used by the services
vsellier@search-esnode0 ~ % curl -XGET -H "Content-Type: application/json" http://192.168.130.80:9200/_cat/indices
green open  origin                      HthJj42xT5uO7w3Aoxzppw 80 0 929692 137147 4gb 4gb
green close origin-backup-20210209-1736 P1CKjXW0QiWM5zlzX46-fg 80 0
green close origin-v0.5.0               SGplSaqPR_O9cPYU4ZsmdQ 80 0
vsellier@search-esnode0 ~ % curl -XGET -H "Content-Type: application/json" http://192.168.130.80:9200/_cat/aliases
origin-read  origin - - - -
origin-write origin - - - -
Journal clients:
Mar 04 16:22:40 search0 swh[3598137]: INFO:elasticsearch:POST http://search-esnode0.internal.staging.swh.network:9200/origin-write/_bulk [status:200 request:0.013s]
Mar 04 16:22:41 search0 swh[3598137]: INFO:elasticsearch:POST http://search-esnode0.internal.staging.swh.network:9200/origin-write/_bulk [status:200 request:0.012s]
Search:
Mar 04 15:40:20 search0 python3[3598040]: 2021-03-04 15:40:20 [3598040] swh.search.api.server:INFO Initializing indexes with configuration:
Mar 04 15:40:20 search0 python3[3598040]: 2021-03-04 15:40:20 [3598040] elasticsearch:INFO HEAD http://search-esnode0.internal.staging.swh.network:9200/origin [status:200 request:0.005s]
Mar 04 15:40:20 search0 python3[3598040]: 2021-03-04 15:40:20 [3598040] elasticsearch:INFO HEAD http://search-esnode0.internal.staging.swh.network:9200/origin-read/_alias [status:200 request:0.001s]
Mar 04 15:40:20 search0 python3[3598040]: 2021-03-04 15:40:20 [3598040] elasticsearch:INFO HEAD http://search-esnode0.internal.staging.swh.network:9200/origin-write/_alias [status:200 request:0.001s]
Mar 04 15:40:20 search0 python3[3598040]: 2021-03-04 15:40:20 [3598040] elasticsearch:INFO PUT http://search-esnode0.internal.staging.swh.network:9200/origin/_mapping [status:200 request:0.006s]
Mar 04 16:19:27 search0 python3[3598042]: 2021-03-04 16:19:27 [3598042] elasticsearch:INFO GET http://search-esnode0.internal.staging.swh.network:9200/origin-read/_search?size=100 [status:200 request:0.076s]