May 25 2021
May 20 2021
The basic installation with helm is simple for a single-server installation: https://rancher.com/docs/rancher/v2.5/en/installation/install-rancher-on-k8s/#install-the-rancher-helm-chart
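For reference, a minimal sketch of that single-server installation following the linked documentation (the hostname is a placeholder; the cert-manager / TLS part covered by the documentation is omitted here):
# add the rancher chart repository and install the chart in its own namespace
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
kubectl create namespace cattle-system
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.example.org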
From the status.swh.org point of view, status.io provides some API endpoints to push metrics. It should be possible to add a few metrics (up to 10 with our plan) to expose the behavior of the platform (daily/weekly/monthly statistics).
As a first step, we could expose the number of pending save code now requests and the number of origin visits, to have some live data. An example of a status page with metrics: https://status.docker.com/
I'm working on a code snippet to test the integration feasibility/complexity.
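A rough sketch of the kind of call this involves, assuming the status.io metric update endpoint; the exact URL, headers and payload fields are assumptions still to be validated against their API documentation, and all ids/keys below are placeholders:
# hypothetical push of a datapoint to a status.io metric (endpoint and fields to be confirmed)
curl -s -X POST https://api.status.io/v2/status/metric/update \
  -H 'Content-Type: application/json' \
  -H 'x-api-id: <API_ID>' \
  -H 'x-api-key: <API_KEY>' \
  -d '{"statuspage_id": "<STATUSPAGE_ID>", "metric_id": "<METRIC_ID>", "day_avg": 42, "day_values": []}'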
great simplification! thanks
LGTM
May 19 2021
After some hard times with vagrant internals and the pergamon configuration, we finally have a working puppet master.
The collected resources are correctly detected and applied, for example here with logstash0's icinga resources:
Notice: /Stage[main]/Profile::Icinga2::Master/Icinga2::Object::Host[pergamon.softwareheritage.org]/Icinga2::Object[icinga2::object::Host::pergamon.softwareheritage.org]/Concat[/etc/icinga2/zones.d/master/pergamon.softwareheritage.org.conf]/File[/etc/icinga2/zones.d/master/pergamon.softwareheritage.org.conf]/ensure: defined content as '{md5}e98c7cafc5300df8101f591d1c7a708b'
Info: Concat[/etc/icinga2/zones.d/master/pergamon.softwareheritage.org.conf]: Scheduling refresh of Class[Icinga2::Service]
Notice: /Stage[main]/Profile::Grafana::Vhost/Icinga2::Object::Service[grafana http redirect on pergamon.softwareheritage.org]/Icinga2::Object[icinga2::object::Service::grafana http redirect on pergamon.softwareheritage.org]/Concat[/etc/icinga2/zones.d/master/exported-checks.conf]/File[/etc/icinga2/zones.d/master/exported-checks.conf]/content:
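As a quick sanity check on the master side (a sketch, using the paths from the log above), the exported checks should now be present in the master zone and the generated configuration should validate before icinga2 is reloaded:
ls /etc/icinga2/zones.d/master/
icinga2 daemon -C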
May 18 2021
and the build is green ;)
thanks for investigating that
May 17 2021
May 12 2021
May 11 2021
May 10 2021
These errors will be caught by the alert created in T3222
The check is now active.
An alert will be raised by icinga if:
- logstash is not responding to the API call
- at least one error is detected when the logs are sent to elasticsearch (ES responding, but an error is detected when the log is stored in the index).
update the commit message
fix indentation
May 8 2021
May 7 2021
According to the API (TIL the catalog can be requested like that), journal0 doesn't have the new plugins declared, so the check should be disabled, as the filter is using this field.
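For the record, a sketch of the kind of PuppetDB catalog query this refers to; the host, port, certname and response shape are assumptions to adapt to the actual setup:
# list the resource types declared in journal0's compiled catalog
curl -s http://pergamon.internal.softwareheritage.org:8080/pdb/query/v4/catalogs/journal0.internal.softwareheritage.org \
  | jq '.resources.data[].type' | sort | uniq -c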
The new probe is deployed but nothing is displayed in icinga. Let's start a configuration debug session.
- fix inconsistency in check command naming
- remove the unnecessary set option on the check script
network schema: reactivate the description of the firewalls' group
- rebase
- network schema:
- change the vlan order to be able to use only one gateway
- Adapt several labels
- rst: adapt according to the feedback
May 6 2021
After searching how it can be integrated with the icinga checks, the simplest way I have found is to create a script that periodically queries logstash to get the statistics and returns a status according to these cases:
- GREEN: none of the non_retryable_failures, with_errors or failures fields found in the JSON
- WARNING: failures field found
- CRITICAL: non_retryable_failures or with_errors field found
I have simulated different situations locally in the vagrant environment:
root@logstash0:~# curl -s http://localhost:9600/_node/stats/pipelines | jq '.pipelines.main.plugins.outputs'
[
  {
    "id": "c49a6902391a456022af4c89f0972781900d01d70cd5f312b292cb20c0d345eb",
    "documents": {
      "non_retryable_failures": 112,
      "successes": 103692
    },
    "events": {
      "out": 103804,
      "in": 103804,
      "duration_in_millis": 3529049
    },
    "name": "elasticsearch",
    "bulk_requests": {
      "responses": {
        "200": 2028
      },
      "failures": 3,
      "with_errors": 110,
      "successes": 1918
    }
  }
]
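A minimal sketch of what such a check script could look like (bash + jq; only the endpoint and the field names come from the output above, the rest is illustrative):
#!/bin/bash
# query the logstash pipeline stats and map the error counters to nagios/icinga exit codes
STATS=$(curl -sf http://localhost:9600/_node/stats/pipelines) || { echo "CRITICAL - logstash API is not responding"; exit 2; }
OUTPUTS=$(echo "$STATS" | jq '.pipelines.main.plugins.outputs[0]')
NON_RETRYABLE=$(echo "$OUTPUTS" | jq '.documents.non_retryable_failures // 0')
WITH_ERRORS=$(echo "$OUTPUTS" | jq '.bulk_requests.with_errors // 0')
FAILURES=$(echo "$OUTPUTS" | jq '.bulk_requests.failures // 0')
if [ "$NON_RETRYABLE" -gt 0 ] || [ "$WITH_ERRORS" -gt 0 ]; then
    echo "CRITICAL - non_retryable_failures=$NON_RETRYABLE with_errors=$WITH_ERRORS"
    exit 2
elif [ "$FAILURES" -gt 0 ]; then
    echo "WARNING - failures=$FAILURES"
    exit 1
fi
echo "OK - no indexation errors reported by logstash"
exit 0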
- split procedure per firewall
- clarify the initial status before each section
- fix minor typos
upgrade done without any problem:
- CARP maintenance activated on pushkin -> glyptotek elected as primary
- pushkin upgrade done
- CARP maintenance deactivated on pushkin -> pushkin re-elected as primary
- nothing wrong detected after a safety period of 1 hour
- CARP maintenance mode activated on glyptotek to avoid an unexpected rebalance during the upgrade
- glyptotek upgrade done
- CARP maintenance mode deactivated on glyptotek
Actions performed:
- wwn-0x5000c500d5de652a (sdb): new -> spare
- wwn-0x5000c500a22eed6f (sdh): spare -> mirror
- wwn-0x5000c500d5dda886 (sdc): new -> mirror
The checks ran without detecting any bad blocks on the disks.
They can be added to the ZFS pool again.
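For the record, a sketch of the corresponding zpool operations (the pool name, device roles and old device are placeholders, not the exact commands run):
# re-add one re-tested disk as a hot spare of the pool
zpool add data spare /dev/disk/by-id/wwn-0x5000c500d5de652a
# use the other one to replace a previously removed mirror member
zpool replace data <old-device> /dev/disk/by-id/wwn-0x5000c500d5dda886
# then follow the resilvering
zpool status data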
May 5 2021
changelog for the 21.1.5 version
Good day everyone,
rebase
And I think that I found how to fix the last remaining warning cited above.
good news ;)
thanks
completely remove the -e option
@anlambert I didn't manage to make it work with links to the modules, but I found another way with a standard installation (i.e. without the -e flag), which we can force with a flag on the CI (cf. D5681).
I'm really not sure whether it's the right approach or not.
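To make the difference concrete (paths are illustrative): an editable install only leaves a link in site-packages pointing back at the source checkout, while a standard install actually copies the modules under site-packages, where sphinx-apidoc can find them.
# editable install: site-packages only gets a link back to the checkout
pip install -e /home/jenkins/workspace/swh-environment/swh-core
# standard install: the swh.core modules are copied under site-packages
pip install /home/jenkins/workspace/swh-environment/swh-core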
A full badblocks test is launched on both disks:
root@storage1:~# badblocks -v -w -B -s -b 4096 /dev/sdb
root@storage1:~# badblocks -v -w -B -s -b 4096 /dev/sdc
thanks a lot @anlambert, I will look in that direction
Apparently, the modules are correctly installed, but sphinx-apidoc is not detecting them due to the local installation:
jenkins@4e5220b923d8:~/workspace/swh-environment/swh-docs$ .tox/sphinx-dev/bin/python3 -m pip list | grep swh
swh.auth 0.5.4 /home/jenkins/workspace/swh-environment/swh-auth
swh.core 0.13.2.dev1+g7d42035 /home/jenkins/workspace/swh-environment/swh-core
swh.counters 0.7.1.dev1+g6a44a84 /home/jenkins/workspace/swh-environment/swh-counters
swh.deposit 0.13.6 /home/jenkins/workspace/swh-environment/swh-deposit
swh.docs 0.0.1.dev334+g044cb9b.d20210505
swh.fuse 1.0.3 /home/jenkins/workspace/swh-environment/swh-fuse
swh.graph 0.3.2.dev3+g62c2fd3 /home/jenkins/workspace/swh-environment/swh-graph
swh.icinga-plugins 0.3.1.dev1+g8878925 /home/jenkins/workspace/swh-environment/swh-icinga-plugins
swh.indexer 0.7.1.dev4+g8f1fb0f /home/jenkins/workspace/swh-environment/swh-indexer
swh.journal 0.7.2.dev8+g2972c7a /home/jenkins/workspace/swh-environment/swh-journal
swh.lister 1.1.0 /home/jenkins/workspace/swh-environment/swh-lister
swh.loader.core 0.22.1.dev2+g0e4bb4b /home/jenkins/workspace/swh-environment/swh-loader-core
swh.loader.git 0.9.2.dev1+g15e12fa /home/jenkins/workspace/swh-environment/swh-loader-git
swh.loader.mercurial 0.5.1.dev4+g8884714 /home/jenkins/workspace/swh-environment/swh-loader-mercurial
swh.loader.svn 0.7.1 /home/jenkins/workspace/swh-environment/swh-loader-svn
swh.model 2.4.2.dev1+gdf036ef /home/jenkins/workspace/swh-environment/swh-model
swh.objstorage 0.2.3 /home/jenkins/workspace/swh-environment/swh-objstorage
swh.objstorage.replayer 0.2.2 /home/jenkins/workspace/swh-environment/swh-objstorage-replayer
swh.scanner 0.4.2.dev1+g30b40cc /home/jenkins/workspace/swh-environment/swh-scanner
swh.scheduler 0.13.1.dev5+gbab557e /home/jenkins/workspace/swh-environment/swh-scheduler
swh.search 0.8.1 /home/jenkins/workspace/swh-environment/swh-search
swh.storage 0.27.5.dev4+g051b7715 /home/jenkins/workspace/swh-environment/swh-storage
swh.vault 0.5.2.dev2+gf87dd54 /home/jenkins/workspace/swh-environment/swh-vault
swh.web 0.0.307.dev3+gf8c750b6 /home/jenkins/workspace/swh-environment/swh-web
swh.web.client 0.3.1.dev1+g4b610ad /home/jenkins/workspace/swh-environment/swh-web-client
sphinx-apidoc \
  --implicit-namespaces \
  --templatedir=../swh/docs/templates/ \
  --maxdepth=3 \
  --ext-viewcode --separate \
  -o apidoc \
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh \
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/tests
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/tests/*
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/*/tests/*
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/*/*/tests/*
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/migrations
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/migrations/*
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/*/migrations/*
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/*/*/migrations/*
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/wsgi.py
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/*/wsgi.py
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/*/*/wsgi.py
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/*/*/*/wsgi.py
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/deposit/settings/*
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/web/settings/*
  /home/jenkins/workspace/swh-environment/swh-docs/.tox/sphinx-dev/lib/python3.7/site-packages/swh/dataset/*
Creating file apidoc/swh.rst.
Creating file apidoc/swh.docs.rst.
Creating file apidoc/swh.docs.django_settings.rst.
Creating file apidoc/swh.docs.sphinx.rst.
Creating file apidoc/swh.docs.sphinx.conf.rst.
Creating file apidoc/swh.docs.sphinx.view_in_phabricator.rst.
Creating file apidoc/modules.rst.
The disks were replaced by Christophe.
Apparently, the LED of one of the disks is still on, so they need to be switched off:
root@storage1:~# ls /dev/sd* | grep -e "[a-z]$" | xargs -n1 -t -i{} ledctl normal={}
ledctl normal=/dev/sda
ledctl normal=/dev/sdb
ledctl normal=/dev/sdc
ledctl normal=/dev/sdd
ledctl normal=/dev/sde
ledctl normal=/dev/sdf
ledctl normal=/dev/sdg
ledctl normal=/dev/sdh
ledctl normal=/dev/sdi
ledctl normal=/dev/sdj
ledctl normal=/dev/sdk
ledctl normal=/dev/sdl
ledctl normal=/dev/sdm
ledctl normal=/dev/sdn
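For future reference, the opposite operation, lighting a disk's locate LED to find it in the enclosure, should be (device is a placeholder):
ledctl locate=/dev/sdb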