Priority set to normal as it happens only in staging and is probably related to the elasticsearch configuration in this environment.
Sentry link: https://sentry.softwareheritage.org/share/issue/bb5a04156b8b4b1696a50cf8e24349d2/
Looks like the server is short on heap:
[2022-02-17T15:26:30,847][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965188] overhead, spent [408ms] collecting in the last [1s]
[2022-02-17T15:27:08,154][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965225] overhead, spent [296ms] collecting in the last [1s]
[2022-02-17T15:29:31,383][WARN ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][young][5965368][3283] duration [1s], collections [1]/[1.1s], total [1s]/[5.8m], memory [8.2gb]->[5.4gb]/[16gb], all_pools {[young] [2.8gb]->[0b]/[0b]}{[old] [4.7gb]->[5.3gb]/[16gb]}{[survivor] [652mb]->[184mb]/[0b]}
[2022-02-17T15:29:31,384][WARN ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965368] overhead, spent [1s] collecting in the last [1.1s]
[2022-02-17T15:31:49,449][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965506] overhead, spent [260ms] collecting in the last [1s]
[2022-02-17T15:33:46,505][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965623] overhead, spent [256ms] collecting in the last [1s]
[2022-02-17T15:37:11,728][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5965828] overhead, spent [372ms] collecting in the last [1s]
[2022-02-17T15:47:19,087][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5966435] overhead, spent [289ms] collecting in the last [1s]
[2022-02-17T15:49:56,439][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5966592] overhead, spent [315ms] collecting in the last [1.1s]
[2022-02-17T15:55:40,579][INFO ][o.e.m.j.JvmGcMonitorService] [search-esnode0] [gc][5966936] overhead, spent [274ms] collecting in the last [1s]
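To confirm the heap pressure, the per-node heap usage can be checked through the _cat/nodes API; a sketch, assuming the same 192.168.130.80:9200 endpoint used in the commands below:

curl "http://192.168.130.80:9200/_cat/nodes?v&h=name,heap.current,heap.percent,heap.max"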
First, clean up the unused resources, even if it will not free a lot of space:
vsellier@search-esnode0 ~ % export ES_SERVER=192.168.130.80:9200
vsellier@search-esnode0 ~ % curl -XGET http://$ES_SERVER/_cat/aliases
origin-read         origin-v0.11  - - - -
origin-write        origin-v0.11  - - - -
origin-v0.9.0-read  origin-v0.9.0 - - - -
origin-v0.9.0-write origin-v0.9.0 - - - -
vsellier@search-esnode0 ~ % curl -XDELETE http://$ES_SERVER/origin-v0.9.0/_alias/origin-v0.9.0-read
{"acknowledged":true}%
vsellier@search-esnode0 ~ % curl -XDELETE -H "Content-Type: application/json" http://$ES_SERVER/origin-v0.9.0/_alias/origin-v0.9.0-write
{"acknowledged":true}%
vsellier@search-esnode0 ~ % curl http://$ES_SERVER/_cat/indices
green open  origin-v0.11                HljzsdD9SmKI7-8ekB_q3Q 80 0 4206243 569646 4.2gb 4.2gb
green close origin                      HthJj42xT5uO7w3Aoxzppw 80 0
green close origin-v0.9.0               o7FiYJWnTkOViKiAdCXCuA 80 0
green close origin-v0.10.0              -fvf4hK9QDeN8qYTJBBlxQ 80 0
green close origin-backup-20210209-1736 P1CKjXW0QiWM5zlzX46-fg 80 0
green close origin-v0.5.0               SGplSaqPR_O9cPYU4ZsmdQ 80 0
search-esnode0 ~ % df -h /srv/elasticsearch
Filesystem                           Size  Used Avail Use% Mounted on
/dev/mapper/base--template--vg-root   17G  5.2G   11G  33% /
tmpfs                                 16G     0   16G   0% /dev/shm
tmpfs                                5.0M     0  5.0M   0% /run/lock
/dev/vda1                            236M  123M  101M  55% /boot
elasticsearch-volume                 194G   12G  182G   7% /srv/elasticsearch
tmpfs                                3.1G     0  3.1G   0% /run/user/1025
vsellier@search-esnode0 ~ % curl -XDELETE http://$ES_SERVER/origin
{"acknowledged":true}%
vsellier@search-esnode0 ~ % curl -XDELETE http://$ES_SERVER/origin-v0.9.0
{"acknowledged":true}%
vsellier@search-esnode0 ~ % curl -XDELETE http://$ES_SERVER/origin-v0.10.0
{"acknowledged":true}%
vsellier@search-esnode0 ~ % curl -XDELETE http://$ES_SERVER/origin-backup-20210209-1736
{"acknowledged":true}%
vsellier@search-esnode0 ~ % curl -XDELETE http://$ES_SERVER/origin-v0.5.0
{"acknowledged":true}%
search-esnode0 ~ % df -h /srv/elasticsearch
Filesystem            Size  Used Avail Use% Mounted on
elasticsearch-volume  194G  4.4G  189G   3% /srv/elasticsearch
Elasticsearch was restarted and the Sentry issues were closed.
Let's monitor whether the GCs come back.
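A minimal way to keep an eye on this is to grep the node logs for the JvmGcMonitorService messages shown above; a sketch, assuming the default /var/log/elasticsearch/ log location on this node:

grep "JvmGcMonitorService" /var/log/elasticsearch/*.log | grep WARN | tail
# count of all GC overhead messages, to watch the trend over time
grep -c "JvmGcMonitorService" /var/log/elasticsearch/*.log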
The index managed by this server is not that big (~4.2GB), so the 16GB of the VM should be enough.
That's much less than the production cluster (~330GB per node), which doesn't have this GC problem.
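If the GC overhead does come back, a first check would be how the JVM heap is currently sized on this node; a sketch, assuming the default Debian packaging paths (the jvm.options.d directory may be empty, and any values found there are environment-specific, not a recommendation):

grep -E "^-Xm[sx]" /etc/elasticsearch/jvm.options /etc/elasticsearch/jvm.options.d/*.options 2>/dev/null
# -Xms and -Xmx should match; Elasticsearch's guidance is to keep the heap at or below ~50% of the VM RAM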
After the Elasticsearch restart, there are no more messages related to GC overhead in the logs, but there were a couple of timeouts during the night.
Further investigation is needed.
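A few starting points for that investigation, reusing the $ES_SERVER variable from above (rejections or a growing queue on the search thread pool could point at the cause of the timeouts):

curl "http://$ES_SERVER/_cluster/health?pretty"
curl "http://$ES_SERVER/_nodes/stats/jvm?pretty"
curl "http://$ES_SERVER/_cat/thread_pool/search?v&h=name,active,queue,rejected"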