
Dec 13 2018

vlorentz added a comment to T1420: Identify easy tasks.

A search query to show candidates for this label: https://forge.softwareheritage.org/maniphest/query/Z8t6ZkjPI7JV/ (i.e., it filters out high-priority tasks, tasks that already have an assignee, and sysadmin tasks)

Dec 13 2018, 1:46 PM · Documentation, Sprint 2018 12
vlorentz renamed T1420: Identify easy tasks from easy hacks to Identify easy tasks.
Dec 13 2018, 1:45 PM · Documentation, Sprint 2018 12
vlorentz added a project to T1428: Create an inventory of useful Munin metrics: Metrics/monitoring.
Dec 13 2018, 1:37 PM · Metrics/monitoring, Sprint 2018 12
vlorentz added a project to T1435: Improve swh-scheduler prometheus metrics: Metrics/monitoring.
Dec 13 2018, 1:37 PM · Metrics/monitoring, Sprint 2018 12
vlorentz added a project to T1438: Add labels to prometheus metrics to help queries: Metrics/monitoring.
Dec 13 2018, 1:36 PM · Metrics/monitoring, Sprint 2018 12
vlorentz added a project to T1408: More/better Metrics: Metrics/monitoring.
Dec 13 2018, 1:36 PM · Metrics/monitoring, Sprint 2018 12

Dec 12 2018

seirl added a comment to T1417: Mirroring.

Feature request from epol on IRC: being able to selectively listen only for the graph and not the object changes. (I guess this will be the case anyway, I'm just mentioning that feedback on the task.)

Dec 12 2018, 7:19 PM · Sprint 2018 12
ardumont added a member for Sprint 2018 12: ardumont.
Dec 12 2018, 3:46 PM
zack added a project to T1430: Add tests for the Debian loader: Debian loader.
Dec 12 2018, 3:39 PM · Debian loader, Sprint 2018 12
zack added a project to T1419: hg/svn support in save code now: Web app.
Dec 12 2018, 3:38 PM · Web app, Sprint 2018 12
zack renamed T1411: reach a minimum of 80% SLOC coverage across all components from 80% SLOC coverage to at least 80% SLOC coverage in all components.
Dec 12 2018, 3:38 PM · Development environment, Sprint 2018 12
zack moved T1436: Integrate swh-storage metrics in prometheus from Backlog to deployed on the Sprint 2018 12 board.
Dec 12 2018, 3:37 PM · Metrics/monitoring, Sprint 2018 12
zack moved T1422: drop swh-storage mocking everywhere from Backlog to deployed on the Sprint 2018 12 board.
Dec 12 2018, 3:37 PM · Sprint 2018 12
ardumont moved T1439: loader-mercurial: Improve coverage from done to deployed on the Sprint 2018 12 board.
Dec 12 2018, 2:58 PM · Sprint 2018 12
ardumont moved T1431: Add tests for the Tar loader. from done to deployed on the Sprint 2018 12 board.
Dec 12 2018, 2:58 PM · Sprint 2018 12
ardumont moved T1439: loader-mercurial: Improve coverage from in progress to done on the Sprint 2018 12 board.
Dec 12 2018, 2:58 PM · Sprint 2018 12
ardumont updated the task description for T1411: reach a minimum of 80% SLOC coverage across all components.
Dec 12 2018, 2:58 PM · Development environment, Sprint 2018 12
ardumont closed T1439: loader-mercurial: Improve coverage as Resolved.
Dec 12 2018, 2:53 PM · Sprint 2018 12
ardumont closed T1439: loader-mercurial: Improve coverage, a subtask of T1411: reach a minimum of 80% SLOC coverage across all components, as Resolved.
Dec 12 2018, 2:53 PM · Development environment, Sprint 2018 12
ardumont claimed T1439: loader-mercurial: Improve coverage.
Dec 12 2018, 10:49 AM · Sprint 2018 12
ardumont raised the priority of T1439: loader-mercurial: Improve coverage from Normal to High.
Dec 12 2018, 10:49 AM · Sprint 2018 12
ardumont triaged T1439: loader-mercurial: Improve coverage as Normal priority.
Dec 12 2018, 10:48 AM · Sprint 2018 12

Dec 11 2018

olasd added a comment to T1436: Integrate swh-storage metrics in prometheus.

Added draft dashboard: https://grafana.softwareheritage.org/d/3SAW_JEmk/software-heritage-archive-counters

Dec 11 2018, 5:45 PM · Metrics/monitoring, Sprint 2018 12
olasd closed T1436: Integrate swh-storage metrics in prometheus as Resolved by committing rSPSITE51b1d1c267c4: Use the proper column name for the value of the counters.
Dec 11 2018, 4:32 PM · Metrics/monitoring, Sprint 2018 12
olasd closed T1436: Integrate swh-storage metrics in prometheus, a subtask of T1408: More/better Metrics, as Resolved.
Dec 11 2018, 4:32 PM · Metrics/monitoring, Sprint 2018 12
olasd added a parent task for T1436: Integrate swh-storage metrics in prometheus: T1437: Rewrite the munin stats export for the website to use prometheus.
Dec 11 2018, 4:31 PM · Metrics/monitoring, Sprint 2018 12
olasd removed a subtask for T1436: Integrate swh-storage metrics in prometheus: T1437: Rewrite the munin stats export for the website to use prometheus.
Dec 11 2018, 4:31 PM · Metrics/monitoring, Sprint 2018 12
olasd triaged T1438: Add labels to prometheus metrics to help queries as High priority.
Dec 11 2018, 2:27 PM · Metrics/monitoring, Sprint 2018 12
olasd added a comment to T1414: Set up an inventory app.

As software is mostly declared in puppet, I think the main areas that could be improved are:

  • hardware inventory
  • network topology
  • puppet reports integration
Dec 11 2018, 2:11 PM · System administration, Sprint 2018 12
olasd added a parent task for T1436: Integrate swh-storage metrics in prometheus: T1434: Refactor prometheus SQL exporter configuration generation to use latest version from elephant shed.
Dec 11 2018, 1:42 PM · Metrics/monitoring, Sprint 2018 12
olasd added a subtask for T1436: Integrate swh-storage metrics in prometheus: T1437: Rewrite the munin stats export for the website to use prometheus.
Dec 11 2018, 1:17 PM · Metrics/monitoring, Sprint 2018 12
olasd added a parent task for T1436: Integrate swh-storage metrics in prometheus: T1355: Move the object counter from munin to prometheus.
Dec 11 2018, 1:16 PM · Metrics/monitoring, Sprint 2018 12
olasd triaged T1436: Integrate swh-storage metrics in prometheus as High priority.
Dec 11 2018, 1:15 PM · Metrics/monitoring, Sprint 2018 12
olasd triaged T1435: Improve swh-scheduler prometheus metrics as High priority.
Dec 11 2018, 1:14 PM · Metrics/monitoring, Sprint 2018 12

Dec 10 2018

olasd added a subtask for T1356: Kill munin: T1355: Move the object counter from munin to prometheus.
Dec 10 2018, 10:58 PM · Sprint 2018 12, System administration
ardumont updated the task description for T1411: reach a minimum of 80% SLOC coverage across all components.
Dec 10 2018, 2:01 PM · Development environment, Sprint 2018 12
ardumont closed T1431: Add tests for the Tar loader. as Resolved.
Dec 10 2018, 1:56 PM · Sprint 2018 12
ardumont closed T1431: Add tests for the Tar loader., a subtask of T1411: reach a minimum of 80% SLOC coverage across all components, as Resolved.
Dec 10 2018, 1:56 PM · Development environment, Sprint 2018 12
ardumont moved T1431: Add tests for the Tar loader. from in progress to done on the Sprint 2018 12 board.
Dec 10 2018, 1:55 PM · Sprint 2018 12
ardumont added a comment to T1431: Add tests for the Tar loader..

ardumont raised the priority of this task from Normal to High.

Dec 10 2018, 1:33 PM · Sprint 2018 12
ardumont claimed T1431: Add tests for the Tar loader..
Dec 10 2018, 1:19 PM · Sprint 2018 12
ardumont raised the priority of T1431: Add tests for the Tar loader. from Normal to High.
Dec 10 2018, 1:19 PM · Sprint 2018 12
ardumont added a revision to T1431: Add tests for the Tar loader.: D796: loader.tar: Cover the remote use cases in tests.
Dec 10 2018, 1:17 PM · Sprint 2018 12
vlorentz added a comment to T1415: Display Diff events on an IRC chan.

That won't capture all events, but the easiest solution is to use Jenkins' RSS: https://jenkins.softwareheritage.org/view/swh/rssAll

Dec 10 2018, 9:38 AM · Sprint 2018 12
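To illustrate the feed-based approach suggested above, here is a minimal sketch that turns feed entries into one-line messages suitable for a chat channel. It assumes the feed is served in Atom format; the function name `irc_lines` and the `limit` parameter are hypothetical, and fetching the URL and posting to IRC are deliberately left out.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by Atom feeds (assumption: the feed is Atom, not RSS 2.0).
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def irc_lines(feed_xml: str, limit: int = 5) -> List[str]:
    """Turn the first `limit` Atom entries into one-line messages."""
    root = ET.fromstring(feed_xml)
    lines = []
    for entry in root.findall("atom:entry", ATOM)[:limit]:
        title = entry.findtext("atom:title", default="(no title)", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        # Entries without a link element produce a title-only message.
        href = link.get("href", "") if link is not None else ""
        lines.append(f"{title} {href}".strip())
    return lines
```

A bot would call `irc_lines` on each freshly downloaded copy of the feed and post only the lines it has not seen before.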

Dec 7 2018

vlorentz moved T1421: drop swh-storage mocking everywhere from done to in progress on the Sprint 2018 12 board.
Dec 7 2018, 6:37 PM · Sprint 2018 12
ardumont updated the task description for T1411: reach a minimum of 80% SLOC coverage across all components.
Dec 7 2018, 6:36 PM · Development environment, Sprint 2018 12
ardumont updated the task description for T1411: reach a minimum of 80% SLOC coverage across all components.
Dec 7 2018, 6:35 PM · Development environment, Sprint 2018 12
vlorentz claimed T1426: Select a tool + define an architecture to run locally a full SWH instance.
Dec 7 2018, 6:20 PM · Sprint 2018 12
vlorentz moved T1426: Select a tool + define an architecture to run locally a full SWH instance from Backlog to in progress on the Sprint 2018 12 board.
Dec 7 2018, 6:19 PM · Sprint 2018 12
vlorentz moved T1428: Create an inventory of useful Munin metrics from Backlog to in progress on the Sprint 2018 12 board.
Dec 7 2018, 6:19 PM · Metrics/monitoring, Sprint 2018 12
vlorentz moved T1405: Make it easy to run a complete swh instance from Backlog to in progress on the Sprint 2018 12 board.
Dec 7 2018, 6:19 PM · Docker environment, Sprint 2018 12
vlorentz moved T1421: drop swh-storage mocking everywhere from in progress to done on the Sprint 2018 12 board.
Dec 7 2018, 6:19 PM · Sprint 2018 12
olasd added a comment to T1428: Create an inventory of useful Munin metrics.

| Munin metric | Comment | Prometheus metric combination | Prometheus comment |
| --- | --- | --- | --- |
| Disk | | | |
| I/Os per device | | node_disk_reads_completed_total; node_disk_writes_completed_total | Add derivative to get IOPS |
| Disk usage in percent (space) | | (node_filesystem_size_bytes - node_filesystem_{avail,free}_bytes) / node_filesystem_size_bytes | avail = available to non-root, free = available to root (tune2fs -m / reserved-blocks-percentage) |
| Disk usage in percent (inodes) | | (node_filesystem_files - node_filesystem_files_free) / node_filesystem_files | |
| Utilization per device | is this real ? it could be useful to see if a storage subsystem is overloaded | node_disk_io_time_seconds_total | total time spent in seconds doing IO on the specified device; AFAICT the derivative of this counter is what munin calls "utilization per device" |
| | | node_disk_io_time_weighted_seconds_total | counts the number of seconds spent doing IO multiplied by the number of concurrent IO requests; maybe more relevant ? Docs: https://www.kernel.org/doc/Documentation/iostats.txt |
| Disk usage in absolute human values. | percentages are meaningless if we resize filesystems | node_filesystem_size_bytes - node_filesystem_{avail,free}_bytes | avail = available to non-root, free = available to root |
| Networking | | | |
| eth0 traffic | | node_network_receive_bytes_total; node_network_transmit_bytes_total | derivative for bytes per second |
| | | node_network_receive_packets_total; node_network_transmit_packets_total | derivative for packets per second |
| | | node_network_receive_errs_total; node_network_transmit_errs_total | alert if non-zero |
| Database | | | |
| implemented with prometheus-sql-exporter | | | |
| Postgres replication lag | | sql_pg_stat_replication{col=~'(send_lag_bytes,flush_lag_bytes,replay_lag_bytes)'} | replace commas with pipes... |
| Postgres database size | | sql_pg_stat_database{col="dbsize"} | |
| Postgres oldest transaction | | sql_pg_stat_activity{col="max_tx_duration"} | |
| Postgres oldest query | | ? | |
| Postgres scan types (sequential / indexed) | | sql_pg_stat_user_tables; sql_pg_statio_user_tables | |
| Postgres wal segments | | sql_archive_ready; sql_pg_stat_archiver | use derivative of sql_pg_stat_archiver values to get archival rates |
| Postgres nb. of transactions | | sql_txid | derivative to get tps |
| System | | | |
| CPU usage | | node_cpu_seconds_total | use derivative for CPU usage |
| load average | | node_load{1,5,15} | |
| Memory usage | | node_memory_* | |
| Pending packages | | XXX | needs to be implemented with the textfile collector (see /usr/share/doc/prometheus-node-exporter/examples/text_collector_examples/apt.sh) |
| Swap in/out | | node_vmstat_pswpin; node_vmstat_pswpout | unit ?? probably absolute number of pages |
| Uptime | | time() - node_boot_time_seconds | |
| RabbitMQ | | | |
| use https://github.com/kbudde/rabbitmq_exporter or https://github.com/deadtrickster/prometheus_rabbitmq_exporter | | | |
| Consumers | | | |
| Memory used by queue | | | |
| Unacknowledged messages | | | |
| Nb. of connections | | | |
| Softwareheritage (prado) | | | |
| Almost everything | | | integrate to sql-exporter configuration |
| Most importantly Software Heritage Objects | | | |

Dec 7 2018, 3:29 PM · Metrics/monitoring, Sprint 2018 12
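Several rows in the table above rely on the same three operations: taking the derivative of a counter, computing a usage ratio from two gauges, and joining label alternatives with pipes instead of commas. The following is a rough sketch of that arithmetic in plain Python. The `Sample` type and all function names are hypothetical helpers for illustration, not part of Prometheus or of any exporter, and the counter-reset handling is a simplification.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    """One scrape of a monotonically increasing counter (hypothetical helper type)."""
    timestamp: float  # seconds since the epoch
    value: float      # counter value at that time


def rate_per_second(older: Sample, newer: Sample) -> float:
    """Approximate the derivative of a counter between two scrapes."""
    elapsed = newer.timestamp - older.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = newer.value - older.value
    if delta < 0:
        # The counter went backwards (e.g. after a reboot):
        # treat it as having restarted from zero.
        delta = newer.value
    return delta / elapsed


def disk_usage_ratio(size_bytes: float, avail_bytes: float) -> float:
    """(size - avail) / size, as in the "Disk usage in percent (space)" row."""
    if size_bytes <= 0:
        raise ValueError("filesystem size must be positive")
    return (size_bytes - avail_bytes) / size_bytes


def regex_label_matcher(label: str, values: Sequence[str]) -> str:
    """Build a label=~'(a|b|c)' matcher, joining the alternatives with pipes."""
    alternatives = "|".join(values)
    return f"{label}=~'({alternatives})'"
```

For example, two scrapes of node_disk_reads_completed_total taken 10 seconds apart with values 100 and 150 give `rate_per_second(Sample(0, 100), Sample(10, 150))`, i.e. 5 reads per second.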

Dec 6 2018

zack added a comment to T1414: Set up an inventory app.

What kind of inventory do we want to do with this? Hardware? Software? Both?

Dec 6 2018, 1:47 AM · System administration, Sprint 2018 12

Dec 5 2018

vlorentz moved T1421: drop swh-storage mocking everywhere from Backlog to in progress on the Sprint 2018 12 board.
Dec 5 2018, 5:20 PM · Sprint 2018 12
vlorentz triaged T1405: Make it easy to run a complete swh instance as High priority.
Dec 5 2018, 5:12 PM · Docker environment, Sprint 2018 12
vlorentz triaged T1406: Documentation/tutorial for using public datasets (Athena/AWS) as Normal priority.
Dec 5 2018, 5:11 PM · Documentation, Sprint 2018 12
vlorentz triaged T1409: Publish swh v1.0.0 as Normal priority.
Dec 5 2018, 5:11 PM · Sprint 2018 12
vlorentz triaged T1407: Internal documentation (meta task) as Normal priority.
Dec 5 2018, 5:11 PM · Documentation, Sprint 2018 12
vlorentz triaged T1413: swh-docker-dev: Refactor/improve provisionning step as Normal priority.
Dec 5 2018, 5:11 PM · Development environment, System administration, Sprint 2018 12
vlorentz triaged T1412: refactor systemd swh services (puppet) as Normal priority.
Dec 5 2018, 5:11 PM · System administration, Sprint 2018 12
vlorentz triaged T1410: Kill implicit configuration: new configuration scheme as Normal priority.
Dec 5 2018, 5:11 PM · Core & foundations
vlorentz triaged T1414: Set up an inventory app as Normal priority.
Dec 5 2018, 5:11 PM · System administration, Sprint 2018 12
vlorentz triaged T1415: Display Diff events on an IRC chan as Normal priority.
Dec 5 2018, 5:11 PM · Sprint 2018 12
vlorentz triaged T1418: Loaders as Normal priority.
Dec 5 2018, 5:11 PM · Sprint 2018 12
vlorentz triaged T1417: Mirroring as Normal priority.
Dec 5 2018, 5:11 PM · Sprint 2018 12
vlorentz raised the priority of T1428: Create an inventory of useful Munin metrics from Normal to High.
Dec 5 2018, 5:10 PM · Metrics/monitoring, Sprint 2018 12
vlorentz raised the priority of T1356: Kill munin from Normal to High.
Dec 5 2018, 5:10 PM · Sprint 2018 12, System administration
vlorentz triaged T1419: hg/svn support in save code now as High priority.
Dec 5 2018, 5:10 PM · Web app, Sprint 2018 12
vlorentz triaged T1408: More/better Metrics as High priority.
Dec 5 2018, 5:10 PM · Metrics/monitoring, Sprint 2018 12
vlorentz triaged T1420: Identify easy tasks as High priority.
Dec 5 2018, 5:10 PM · Documentation, Sprint 2018 12
vlorentz triaged T1421: drop swh-storage mocking everywhere as High priority.
Dec 5 2018, 5:08 PM · Sprint 2018 12
ardumont added a comment to T1411: reach a minimum of 80% SLOC coverage across all components.

For the Mercurial loader, there is a module, swh.loader.mercurial.loader_verifier, which is not production code.
It's there to test the loader manually, so it could probably either be moved into the tests and turned into a proper test, or removed altogether.

Dec 5 2018, 5:01 PM · Development environment, Sprint 2018 12
vlorentz triaged T1411: reach a minimum of 80% SLOC coverage across all components as Normal priority.
Dec 5 2018, 3:27 PM · Development environment, Sprint 2018 12
vlorentz triaged T1431: Add tests for the Tar loader. as Normal priority.
Dec 5 2018, 3:24 PM · Sprint 2018 12
vlorentz triaged T1430: Add tests for the Debian loader as Normal priority.
Dec 5 2018, 3:24 PM · Debian loader, Sprint 2018 12
vlorentz closed T1393: swh/loader/core/loader.py has no proper test coverage, a subtask of T1411: reach a minimum of 80% SLOC coverage across all components, as Resolved.
Dec 5 2018, 3:23 PM · Development environment, Sprint 2018 12
vlorentz updated subscribers of T1421: drop swh-storage mocking everywhere.
Dec 5 2018, 11:19 AM · Sprint 2018 12
vlorentz updated subscribers of T1405: Make it easy to run a complete swh instance.
Dec 5 2018, 11:19 AM · Docker environment, Sprint 2018 12
vlorentz updated subscribers of T1426: Select a tool + define an architecture to run locally a full SWH instance.
Dec 5 2018, 11:18 AM · Sprint 2018 12

Dec 4 2018

ftigeot changed the status of T1428: Create an inventory of useful Munin metrics from Open to Work in Progress.

Disk

  • I/Os per device
  • Disk usage in percent
  • Utilization per device: is this real? It could be useful to see if a storage subsystem is overloaded
  • Disk usage in absolute human values: percentages are meaningless if we resize filesystems
Dec 4 2018, 4:11 PM · Metrics/monitoring, Sprint 2018 12
ftigeot changed the status of T1428: Create an inventory of useful Munin metrics, a subtask of T1408: More/better Metrics, from Open to Work in Progress.
Dec 4 2018, 4:11 PM · Metrics/monitoring, Sprint 2018 12
vlorentz added a comment to T1411: reach a minimum of 80% SLOC coverage across all components.

Tip: after running Tox in a repo, run coverage report -m to show which lines are not covered.

Dec 4 2018, 3:57 PM · Development environment, Sprint 2018 12
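As a complement to the tip above, the text printed by coverage report -m can be post-processed to list only the files that still have uncovered lines. This is a minimal sketch that assumes the default column layout (Name, Stmts, Miss, Cover, Missing) without branch coverage and file names without spaces. The function name `uncovered_lines` is hypothetical.

```python
import re
from typing import Dict, List

# Matches one data row of the report: name, statements, misses, cover %, missing.
# Header and separator lines do not match and are skipped.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<stmts>\d+)\s+(?P<miss>\d+)\s+(?P<cover>\d+)%\s*(?P<missing>.*)$"
)


def uncovered_lines(report: str) -> Dict[str, List[str]]:
    """Map each file with misses to the entries of its "Missing" column."""
    result = {}
    for line in report.splitlines():
        match = ROW.match(line.strip())
        if not match or match["name"] == "TOTAL":
            continue
        missing = [part.strip() for part in match["missing"].split(",") if part.strip()]
        if missing:
            result[match["name"]] = missing
    return result
```

The report text can be captured by redirecting the command's output to a file, or by running it through `subprocess.run` with `capture_output=True`.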
ftigeot updated subscribers of T1428: Create an inventory of useful Munin metrics.
Dec 4 2018, 2:46 PM · Metrics/monitoring, Sprint 2018 12
ftigeot triaged T1428: Create an inventory of useful Munin metrics as Normal priority.
Dec 4 2018, 2:45 PM · Metrics/monitoring, Sprint 2018 12
ardumont added a subtask for T1425: refactor the loader stack for package managers: T1379: npm loader.
Dec 4 2018, 1:18 PM · Sprint 2018 12
vlorentz triaged T1426: Select a tool + define an architecture to run locally a full SWH instance as High priority.
Dec 4 2018, 12:09 PM · Sprint 2018 12
vlorentz triaged T1425: refactor the loader stack for package managers as Normal priority.
Dec 4 2018, 12:05 PM · Sprint 2018 12
vlorentz triaged T1424: Add crates.io (Rust) lister as Normal priority.
Dec 4 2018, 12:04 PM · Crates lister, Archive coverage, Restricted Project, Sprint 2018 12
vlorentz triaged T1423: Add .crate (Rust) loader as Normal priority.
Dec 4 2018, 12:04 PM · Crates loader, Archive coverage, Sprint 2018 12, Restricted Project
vlorentz added a subtask for T1418: Loaders: T1379: npm loader.
Dec 4 2018, 12:03 PM · Sprint 2018 12
ardumont updated the task description for T1411: reach a minimum of 80% SLOC coverage across all components.
Dec 4 2018, 11:54 AM · Development environment, Sprint 2018 12
vlorentz added a subtask for T1410: Kill implicit configuration: new configuration scheme: T826: Objects that implicitely load configuration are a nightmare to test.
Dec 4 2018, 11:33 AM · Core & foundations
vlorentz added a subtask for T1411: reach a minimum of 80% SLOC coverage across all components: T1393: swh/loader/core/loader.py has no proper test coverage.
Dec 4 2018, 11:32 AM · Development environment, Sprint 2018 12
vlorentz added a subtask for T1421: drop swh-storage mocking everywhere: T1307: Remove mock storages used in tests..
Dec 4 2018, 11:31 AM · Sprint 2018 12
vlorentz merged T1422: drop swh-storage mocking everywhere into T1421: drop swh-storage mocking everywhere.
Dec 4 2018, 11:31 AM · Sprint 2018 12
vlorentz merged task T1422: drop swh-storage mocking everywhere into T1421: drop swh-storage mocking everywhere.
Dec 4 2018, 11:31 AM · Sprint 2018 12
vlorentz closed T1422: drop swh-storage mocking everywhere as Invalid.

Duplicate of T1421

Dec 4 2018, 11:31 AM · Sprint 2018 12
vlorentz added a subtask for T1422: drop swh-storage mocking everywhere: T1307: Remove mock storages used in tests..
Dec 4 2018, 11:30 AM · Sprint 2018 12
vlorentz created T1422: drop swh-storage mocking everywhere.
Dec 4 2018, 11:30 AM · Sprint 2018 12