Build has FAILED
All Stories
Dec 8 2018
Fix docstring typo
Are you sure these tests pass when the other Diff is merged?
Build has FAILED
- utils: Simplify random_block according to latest api change
Build is green
See https://jenkins.softwareheritage.org/job/DCORE/job/tox/98/ for more details.
- utils.grouper: Rename fillvalue to stop_value and fix docstring
- utils.grouper: Avoid potential bug of input data matching stop_value
Are you sure these tests pass when the other Diff is merged?
Dec 7 2018
Build has FAILED
Build is green
See https://jenkins.softwareheritage.org/job/DCORE/job/tox/64/ for more details.
Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/153/ for more details.
- rebase
Build has FAILED
- rebase
Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/151/ for more details.
- rebase
Build has FAILED
- squash
Build has FAILED
Munin metric | Comment | Prometheus metric combination | Prometheus comment |
Disk | | | |
I/Os per device | | node_disk_reads_completed_total; node_disk_writes_completed_total | Add derivative to get IOPS |
Disk usage in percent (space) | | (node_filesystem_size_bytes - node_filesystem_{avail,free}_bytes) / node_filesystem_size_bytes | avail = available to non-root, free = available to root (tune2fs -m / reserved-blocks-percentage) |
Disk usage in percent (inodes) | | (node_filesystem_files - node_filesystem_files_free) / node_filesystem_files | |
Utilization per device | is this real ? it could be useful to see if a storage subsystem is overloaded | node_disk_io_time_seconds_total | total time spent in seconds doing IO on the specified device; AFAICT the derivative of this counter is what munin calls "utilization per device" |
 | | node_disk_io_time_weighted_seconds_total | counts the number of seconds spent doing IO multiplied by the number of concurrent IO requests; maybe more relevant ? Docs: https://www.kernel.org/doc/Documentation/iostats.txt |
Disk usage in absolute human values. | percentages are meaningless if we resize filesystems | node_filesystem_size_bytes - node_filesystem_{avail,free}_bytes | avail = available to non-root, free = available to root |
Networking | | | |
eth0 traffic | | node_network_receive_bytes_total; node_network_transmit_bytes_total | derivative for bytes per second |
 | | node_network_receive_packets_total; node_network_transmit_packets_total | derivative for packets per second |
 | | node_network_receive_errs_total; node_network_transmit_errs_total | alert if non-zero |
Database | | | |
implemented with prometheus-sql-exporter | | | |
Postgres replication lag | | sql_pg_stat_replication{col=~'(send_lag_bytes,flush_lag_bytes,replay_lag_bytes)'} | replace commas with pipes... |
Postgres database size | | sql_pg_stat_database{col="dbsize"} | |
Postgres oldest transaction | | sql_pg_stat_activity{col="max_tx_duration"} | |
Postgres oldest query | | ? | |
Postgres scan types (sequential / indexed) | | sql_pg_stat_user_tables; sql_pg_statio_user_tables | |
Postgres wal segments | | sql_archive_ready; sql_pg_stat_archiver | use derivative of sql_pg_stat_archiver values to get archival rates |
Postgres nb. of transactions | | sql_txid | derivative to get tps |
System | | | |
CPU usage | | node_cpu_seconds_total | use derivative for CPU usage |
load average | | node_load{1,5,15} | |
Memory usage | | node_memory_* | |
Pending packages | | XXX | needs to be implemented with the textfile collector (see /usr/share/doc/prometheus-node-exporter/examples/text_collector_examples/apt.sh) |
Swap in/out | | node_vmstat_pswpin; node_vmstat_pswpout | unit ?? probably absolute number of pages |
Uptime | | time() - node_boot_time_seconds | |
RabbitMQ | | | |
use https://github.com/kbudde/rabbitmq_exporter or https://github.com/deadtrickster/prometheus_rabbitmq_exporter | | | |
Consumers | | | |
Memory used by queue | | | |
Unacknowledged messages | | | |
Nb. of connections | | | |
Softwareheritage (prado) | | | |
Almost everything | | integrate to sql-exporter configuration | |
Most importantly Software Heritage Objects | | | |
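Several rows above say to take the "derivative" of a counter (IOPS, bytes per second, tps) or to divide used space by total space. The following is a minimal Python sketch of that arithmetic, not the way Prometheus itself is queried; in PromQL the derivative would normally be obtained with a function such as rate(). The function names counter_rate and disk_usage_ratio are hypothetical, and treating a decreasing counter as a reset to zero is an assumption.

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time):
    """Per-second increase of a monotonically increasing counter
    (e.g. node_disk_reads_completed_total) between two samples."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        # Assumption: the counter went backwards because it was reset
        # to zero (e.g. after a reboot), so count from zero.
        delta = curr_value
    return delta / elapsed


def disk_usage_ratio(size_bytes, avail_bytes):
    """Fraction of the filesystem in use: (size - avail) / size.
    Pass the 'avail' or the 'free' value depending on whether space
    reserved for root should count as used."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return (size_bytes - avail_bytes) / size_bytes
```

For instance, a counter that moves from 1000 to 1600 over 60 seconds gives 10 operations per second, and a filesystem of 100 bytes with 25 available is 75% used.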
Build has FAILED
Build has FAILED
Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/146/ for more details.
- rebase
- Merge lines.
- add TODO
Build has FAILED
- remove useless function
Build has FAILED
- rebase