
ftigeot (François Tigeot)
User

User Details

User Since
Sep 6 2017, 1:06 PM (340 w, 5 d)

Recent Activity

Nov 27 2019

ftigeot committed rTGRAadbfb536e983: Grafanalib dashboards: Add a Debian packaging target (authored by ftigeot).
Grafanalib dashboards: Add a Debian packaging target
Nov 27 2019, 3:15 PM
ftigeot closed D2352: Grafanalib dashboards: Add a Debian packaging target.
Nov 27 2019, 3:15 PM
ftigeot added a comment to T1697: Deploy Grafanalib-based dashboards with Puppet.

Puppet changes added in 17b2b3041212aca9e0a9a35c510885de7bb78230.
Ideally the Debian package should now be added to the Software Heritage private repository.

Nov 27 2019, 11:22 AM · Sprint 2018 12, System administration

Nov 26 2019

ftigeot committed rSPSITE17b2b3041212: grafana: Auto-import grafanalib dashboards (authored by ftigeot).
grafana: Auto-import grafanalib dashboards
Nov 26 2019, 4:13 PM

Nov 25 2019

ftigeot added a comment to T1697: Deploy Grafanalib-based dashboards with Puppet.

Instructions to create Debian packages have been added in D2352.

Nov 25 2019, 4:45 PM · Sprint 2018 12, System administration
ftigeot created D2352: Grafanalib dashboards: Add a Debian packaging target.
Nov 25 2019, 3:44 PM

Nov 22 2019

ftigeot committed rTGRA088a08768bd6: Grafanalib dashboards: Add cpu temperature (authored by ftigeot).
Grafanalib dashboards: Add cpu temperature
Nov 22 2019, 11:43 AM
ftigeot committed rTGRA6e8d1ab524cd: Grafanalib dashboards: ignore compiled files (authored by ftigeot).
Grafanalib dashboards: ignore compiled files
Nov 22 2019, 11:43 AM

Nov 19 2019

ftigeot closed T1883: Do not generate hw related dashboards for VM/containers as Resolved.

Only the temperature data should be absent from non-physical machines.
This ticket should be fixed by d3ad9bda4c7b7fcc19c340c2b7ac559882d8f934.

Nov 19 2019, 4:30 PM · System administration
ftigeot committed R188:d3ad9bda4c7b: Grafanalib dashboards: Add cpu temperature (wip) (authored by ftigeot).
Grafanalib dashboards: Add cpu temperature (wip)
Nov 19 2019, 4:08 PM
ftigeot committed R188:e32d134f4353: Grafanalib dashboards: ignore compiled files (authored by ftigeot).
Grafanalib dashboards: ignore compiled files
Nov 19 2019, 4:08 PM
ftigeot abandoned D2240: swh-docs: Add storage sites documentation (v3).
Nov 19 2019, 2:34 PM
ftigeot changed the status of T1556: Document hardware architecture from Open to Work in Progress.

What is the purpose of this task?
How is it different from T1974?

Nov 19 2019, 2:18 PM · Documentation
ftigeot changed the status of T1556: Document hardware architecture, a subtask of T1407: Internal documentation (meta task), from Open to Work in Progress.
Nov 19 2019, 2:18 PM · Documentation, Sprint 2018 12
ftigeot closed T1974: Document low-level storage layers as Resolved.

Documentation pushed to swh-docs in bc863ec6a56d539f57079b0b60e616a625c84f81 and 66b3e07ed9d9dbde2333cefe0e3375742dc76231.

Nov 19 2019, 2:15 PM · Documentation
ftigeot added a comment to D2240: swh-docs: Add storage sites documentation (v3).

Initial commit pushed; review comments addressed in follow-up commit 66b3e07ed9d9dbde2333cefe0e3375742dc76231.

Nov 19 2019, 12:28 PM
ftigeot committed rDDOC66b3e07ed9d9: swh-docs: Add infrastructure comments from #D2240 (authored by ftigeot).
swh-docs: Add infrastructure comments from #D2240
Nov 19 2019, 12:26 PM
ftigeot committed rDDOCbc863ec6a56d: swh-docs: Add storage sites documentation (v3) (authored by ftigeot).
swh-docs: Add storage sites documentation (v3)
Nov 19 2019, 11:34 AM

Nov 18 2019

ftigeot changed the status of T1883: Do not generate hw related dashboards for VM/containers from Open to Work in Progress.

New work-in-progress dashboard visible here:
https://grafana.softwareheritage.org/d/QxkAwzbWk/cpu-temperatures-auto-generated?orgId=1&refresh=10s

Nov 18 2019, 4:11 PM · System administration

Nov 8 2019

ftigeot abandoned D2223: swh-docs: Add storage sites documentation (v2).

Thanks for this review, changes have been incorporated in D2240.

Nov 8 2019, 3:17 PM
ftigeot created D2240: swh-docs: Add storage sites documentation (v3).
Nov 8 2019, 3:16 PM
ftigeot abandoned D2140: swh-docs: Add storage sites documentation.

Thanks for this review. Changes added to D2223.

Nov 8 2019, 2:56 PM
ftigeot closed T1653: Prometheus rate functions considered unreliable, a subtask of T1356: Kill munin, as Wontfix.
Nov 8 2019, 11:43 AM · Sprint 2018 12, System administration
ftigeot closed T1653: Prometheus rate functions considered unreliable as Wontfix.

No relevant problem has been reported with our dataset/usage of Prometheus. Closing.

Nov 8 2019, 11:43 AM · Sprint 2018 12, System administration

Nov 6 2019

ftigeot closed T1442: Replace Munin graphs with Grafana/Prometheus dashboards, a subtask of T1356: Kill munin, as Resolved.
Nov 6 2019, 12:11 PM · Sprint 2018 12, System administration
ftigeot closed T1442: Replace Munin graphs with Grafana/Prometheus dashboards as Resolved.

I do not see any missing piece in the Grafana dashboard; the Munin graph service/VM can be shut down.

Nov 6 2019, 12:11 PM · Sprint 2018 12, System administration

Nov 5 2019

ftigeot created D2223: swh-docs: Add storage sites documentation (v2).
Nov 5 2019, 4:14 PM

Oct 15 2019

ftigeot added a comment to T1974: Document low-level storage layers.

Some work-in-progress Sphinxdoc documentation is visible in this Phabricator review: https://forge.softwareheritage.org/D2140 .

Oct 15 2019, 1:25 PM · Documentation
ftigeot created D2140: swh-docs: Add storage sites documentation.
Oct 15 2019, 12:02 PM

Oct 14 2019

ftigeot committed rDDOCdd17bc3c2fbb: swh-docs: Add Elasticsearch documentation (authored by ftigeot).
swh-docs: Add Elasticsearch documentation
Oct 14 2019, 12:17 PM

Oct 8 2019

ftigeot committed rDDOC776faec30cd3: swh-docs: Enable usage of embedded Graphviz graphs (authored by ftigeot).
swh-docs: Enable usage of embedded Graphviz graphs
Oct 8 2019, 12:06 PM

Oct 7 2019

ftigeot committed rDDOCc7e8fe88e9d9: swh-docs: Document additional software requirements (authored by ftigeot).
swh-docs: Document additional software requirements
Oct 7 2019, 4:07 PM

Sep 4 2019

ftigeot committed rSPPFIXd208e078c5dd: swh-postfix: Add check_client_access restrictions (authored by ftigeot).
swh-postfix: Add check_client_access restrictions
Sep 4 2019, 4:09 PM

Aug 29 2019

ftigeot changed the status of T1974: Document low-level storage layers from Open to Work in Progress.

Softwareheritage: low-level storage

Aug 29 2019, 4:33 PM · Documentation

Aug 28 2019

ftigeot closed T1965: forge@softwareheritage.org: Recipient address rejected: User unknown in virtual mailbox table as Resolved.
Aug 28 2019, 3:04 PM · System administration
ftigeot added a comment to T1965: forge@softwareheritage.org: Recipient address rejected: User unknown in virtual mailbox table.

ftigeot (François Tigeot) wrote:

ftigeot changed the task status from "Open" to "Work in Progress".
ftigeot added a comment.

Added a Gandi forwarding address to forward forge@softwareheritage.org mails to forge@forge.softwareheritage.org.

Aug 28 2019, 3:03 PM · System administration
ftigeot changed the status of T1965: forge@softwareheritage.org: Recipient address rejected: User unknown in virtual mailbox table from Open to Work in Progress.

Added a Gandi forwarding address to forward forge@softwareheritage.org mails to forge@forge.softwareheritage.org.

Aug 28 2019, 3:02 PM · System administration

Aug 27 2019

ftigeot created T1974: Document low-level storage layers.
Aug 27 2019, 2:29 PM · Documentation
ftigeot added a comment to T1973: Kernel oops in i40e driver on hypervisor3.

Some post-4.15 commits seem to fix this kind of issue.

Aug 27 2019, 12:33 PM · System administration

Aug 26 2019

ftigeot accepted D1841: stats_exporter: Refactor and add docstrings to graph data export script.

The changes look fine to me.
The important thing is to get the same output.
With regard to the data directory, share/swh-data seemed to be a logical place and compatible with the usual hier(7) filesystem hierarchy.

Aug 26 2019, 3:44 PM

Aug 9 2019

ftigeot committed rSPSITEac753a023734: Archive counters exporter: optimize Prometheus queries (authored by ftigeot).
Archive counters exporter: optimize Prometheus queries
Aug 9 2019, 6:11 PM
ftigeot closed T1949: Archive graphs: vertical axis scaling not optimal as Resolved.

The yaxis scale was explicitly forced to begin at zero.
Removing that constraint allows the graphs to scale and fill their allocated vertical space.
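
The effect of that constraint can be illustrated with a small calculation. This is only a sketch: the function name and the sample values are hypothetical, and real plotting libraries such as Flot also add margins around the data, which is ignored here.

```python
def axis_fill_ratio(values, force_zero_min):
    """Return the fraction of the y-axis height covered by the data range."""
    lo, hi = min(values), max(values)
    # With the constraint, the axis starts at zero even if the data never gets close to it.
    axis_min = min(0, lo) if force_zero_min else lo
    axis_span = hi - axis_min
    if axis_span == 0:
        return 1.0  # flat series: nothing to scale
    return (hi - lo) / axis_span


# Hypothetical counter values that grow slowly relative to their magnitude.
counts = [980, 990, 1000]

forced = axis_fill_ratio(counts, force_zero_min=True)
auto = axis_fill_ratio(counts, force_zero_min=False)
print(f"forced zero origin: {forced:.0%} of the height used")
print(f"automatic minimum:  {auto:.0%} of the height used")
```

For large, slowly growing counters, a zero origin leaves the curve squeezed into a thin band at the top of the graph, which matches the symptom described in this task.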

Aug 9 2019, 2:35 PM
ftigeot added a comment to T1949: Archive graphs: vertical axis scaling not optimal.

When resizing the browser window or when loading the page the first time after having pasted its name in the URL bar, it is obvious the "Source files" data is used for all graphs.

Aug 9 2019, 11:45 AM
ftigeot changed the status of T1949: Archive graphs: vertical axis scaling not optimal from Open to Work in Progress.

This could be caused by Flot (https://www.flotcharts.org/), the JavaScript framework used to create the graphs.

Aug 9 2019, 10:40 AM
ftigeot closed T1544: archive graphs stopped being updated a while ago as Resolved.

Dedicated task created as T1949.

Aug 9 2019, 10:39 AM · Website, Web app
ftigeot triaged T1949: Archive graphs: vertical axis scaling not optimal as High priority.
Aug 9 2019, 10:38 AM

Aug 8 2019

ftigeot closed T1437: Rewrite the munin stats export for the website to use prometheus as Resolved.
Aug 8 2019, 5:26 PM · System administration
ftigeot closed T1437: Rewrite the munin stats export for the website to use prometheus, a subtask of T1355: Move the object counter from munin to prometheus, as Resolved.
Aug 8 2019, 5:26 PM · System administration

Aug 7 2019

ftigeot added a comment to T1872: staging infra: New vlan.

Most of the relevant commits use the 192.168.128.0/24 address space.

Aug 7 2019, 2:02 PM · Staging environment, Staff, System administration

Aug 6 2019

ftigeot accepted D1820: defaults: Move gpg/certificate blocks to a dedicated config file.
Aug 6 2019, 10:49 AM · System administration

Aug 5 2019

ftigeot accepted D1815: pergamon: Add route to staging network to be able to check nodes.
Aug 5 2019, 12:27 PM
ftigeot accepted D1808: staging: Rework output to display a summary on nodes.
Aug 5 2019, 10:28 AM
ftigeot accepted D1807: staging: Add db0 node.
Aug 5 2019, 10:27 AM · System administration

Aug 2 2019

ftigeot accepted D1806: staging: Modularize node creation.

A bit too big to understand quickly but no real choice here.
Looks good.

Aug 2 2019, 10:09 PM · System administration
ftigeot accepted D1805: stats_web: Remove proxy request to munin.
Aug 2 2019, 4:42 PM
ftigeot accepted D1803: site.pp: Remove most desktops from puppet.
Aug 2 2019, 3:23 PM

Aug 1 2019

ftigeot accepted D1799: Docs: Update documentation to improve/clarify steps.
Aug 1 2019, 12:18 PM
ftigeot accepted D1797: staging: Bootstrap infrastructure with the gateway node.

Looks good.

Aug 1 2019, 11:45 AM
ftigeot accepted D1798: staging: Provision storage0 vm.
Aug 1 2019, 10:40 AM
ftigeot accepted D1796: init-template: Explain how to bootstrap a debian template image.
Aug 1 2019, 10:27 AM
ftigeot accepted D1795: README: Explain how to initialize and apply changes to infra.
Aug 1 2019, 10:25 AM
ftigeot accepted D1794: terraform: Prepare the workstation tools.

Typo on line 17:

- # Install so that terrafor actually sees the plugin
+ # Install so that terraform actually sees the plugin

Aug 1 2019, 9:49 AM

Jul 31 2019

ftigeot accepted D1792: network: Allow to override the ups/downs route for the network.

Some route definitions look unnecessary and could be cleaned up in a second pass.

Jul 31 2019, 4:24 PM · System administration

Jul 30 2019

ftigeot committed rSPSITE9825a780069b: Archive counters: update export path (authored by ftigeot).
Archive counters: update export path
Jul 30 2019, 2:40 PM
ftigeot committed rSPSITEb079718d907d: Archive counters: activate generation cron [3/3] (authored by ftigeot).
Archive counters: activate generation cron [3/3]
Jul 30 2019, 2:14 PM
ftigeot committed rSPSITEe3f36a47b2cd: Archive counters: activate generation cron [2/2] (authored by ftigeot).
Archive counters: activate generation cron [2/2]
Jul 30 2019, 2:06 PM
ftigeot committed rSPSITE291c9b16866c: Archive counters: activate generation cron (authored by ftigeot).
Archive counters: activate generation cron
Jul 30 2019, 1:29 PM

Jul 29 2019

ftigeot committed rSPSITE0680a4ff58dc: Archive counter exporter: deployment fix (authored by ftigeot).
Archive counter exporter: deployment fix
Jul 29 2019, 4:18 PM
ftigeot committed rSPSITE3cd214878f68: Website: Use Prometheus data to export archive counters (authored by ftigeot).
Website: Use Prometheus data to export archive counters
Jul 29 2019, 2:42 PM

Jul 26 2019

ftigeot accepted D1776: puppet/master: Add clean up certificate script.

lgtm

Jul 26 2019, 2:32 PM

Jul 24 2019

ftigeot accepted D1767: Reference provenance page in annex behind basic auth.

This technically looks good, but from a security point of view, why put the secret "private" and "provenance-index" directories in a publicly accessible location?

Jul 24 2019, 5:48 PM · System administration, Staff
ftigeot changed the status of T1931: scheduler's cron cleanup error when filtering tasks to archive from Open to Work in Progress.

FWIW, a manual connection to esnode1:9200 does not show this error.

Jul 24 2019, 4:36 PM · Scheduling utilities
ftigeot added a comment to T1437: Rewrite the munin stats export for the website to use prometheus.

Depending on Prometheus for all data is not a hard requirement.

Jul 24 2019, 10:42 AM · System administration
ftigeot closed T792: Make the elasticsearch logging cluster actually a cluster as Resolved.

Removed the T1017 Kafka subtask; it has no relation to whether the Elasticsearch cluster is a true cluster or not.

Jul 24 2019, 10:37 AM · System administration (Elasticsearch consolidation (W24/2018))
ftigeot removed a parent task for T1017: Estimate for Kafka cluster specifications: T792: Make the elasticsearch logging cluster actually a cluster.
Jul 24 2019, 10:35 AM · System administration
ftigeot removed a subtask for T792: Make the elasticsearch logging cluster actually a cluster: T1017: Estimate for Kafka cluster specifications.
Jul 24 2019, 10:35 AM · System administration (Elasticsearch consolidation (W24/2018))
ftigeot closed T1338: Change BBUs on orsay as Wontfix.

Hardware is too old / starting to fall apart for other reasons.
It would be more cost-effective to replace it.

Jul 24 2019, 10:33 AM · System administration
ftigeot closed T1282: Revisit backups as Resolved.

Wrote backup tools documentation in T1372.
No backup system changes wanted at this time.

Jul 24 2019, 10:31 AM · System administration

Jul 23 2019

ftigeot claimed T1437: Rewrite the munin stats export for the website to use prometheus.
Jul 23 2019, 5:38 PM · System administration

Jul 22 2019

ftigeot added a comment to T1338: Change BBUs on orsay.

Fwiw, I never got an answer from Dell on that topic.

Jul 22 2019, 4:48 PM · System administration

Jul 18 2019

ftigeot added a comment to T1437: Rewrite the munin stats export for the website to use prometheus.

For the "March 2019 problem", the JSON output generated by the Prometheus API itself is missing the more recent data points.

Jul 18 2019, 4:26 PM · System administration
ftigeot changed the status of T1437: Rewrite the munin stats export for the website to use prometheus from Open to Work in Progress.

Prometheus data has been exported to a JSON file similar to the format produced by the Munin/RRD-based toolchain.
Results are visible on https://www-dev.softwareheritage.org/archive/
(vs https://www.softwareheritage.org/archive/ for original graphs)
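
A conversion of this kind could look roughly like the sketch below. It assumes the standard response shape of the Prometheus HTTP API `query_range` endpoint (a `matrix` result with `[timestamp, "value"]` pairs). The `object_type` label, the millisecond `[timestamp, value]` output layout, and the sample payload are hypothetical; the actual exporter script and its file format are not shown here.

```python
import json


def matrix_to_series(response):
    """Convert a Prometheus query_range response into {label: [[ms, value], ...]}."""
    if response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    data = response["data"]
    if data["resultType"] != "matrix":
        raise ValueError("expected a range query (matrix) result")

    series = {}
    for result in data["result"]:
        # Hypothetical label name used to tell the counters apart.
        label = result["metric"].get("object_type", "unknown")
        # Prometheus returns Unix seconds and string-encoded sample values;
        # JavaScript plotting code usually expects milliseconds and numbers.
        series[label] = [
            [int(float(ts) * 1000), float(value)]
            for ts, value in result["values"]
        ]
    return series


# Hand-written sample payload, not real archive data.
sample = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {
                "metric": {"object_type": "content"},
                "values": [[1563458400, "100"], [1563462000, "150"]],
            }
        ],
    },
}

exported = matrix_to_series(sample)
print(json.dumps(exported))
```

Keeping the conversion separate from the HTTP request makes it easy to compare the generated file against the output of the older toolchain.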

Jul 18 2019, 4:10 PM · System administration
ftigeot changed the status of T1437: Rewrite the munin stats export for the website to use prometheus, a subtask of T1355: Move the object counter from munin to prometheus, from Open to Work in Progress.
Jul 18 2019, 4:10 PM · System administration

Jul 16 2019

ftigeot closed T1355: Move the object counter from munin to prometheus, a subtask of T1356: Kill munin, as Resolved.
Jul 16 2019, 3:20 PM · Sprint 2018 12, System administration
ftigeot closed T1355: Move the object counter from munin to prometheus as Resolved.

Even though it is not necessarily obvious, the object counter has been stored in Prometheus since December 2018.

Jul 16 2019, 3:20 PM · System administration
ftigeot closed T1882: Merge swh-sysadmin into existing swh-grafana-dashboards as Resolved.

Grafanalib dashboards imported into swh-grafana-dashboards in rTGRAee5d3074bf58.

Jul 16 2019, 11:48 AM · System administrators
ftigeot committed rTGRAd8c4f1ead4cf: Grafanalib dashboards: Add cpu temperature (wip) (authored by ftigeot).
Grafanalib dashboards: Add cpu temperature (wip)
Jul 16 2019, 11:42 AM
ftigeot committed rTGRAee5d3074bf58: Import existing Grafanalib dashboards (authored by ftigeot).
Import existing Grafanalib dashboards
Jul 16 2019, 11:42 AM

Jul 8 2019

ftigeot closed T1854: Backup Postgres "secondary" cluster as Resolved.

A backup script has been added to the Puppet environment in e93781ef32836396008e28599bf02d412c2184d3 and 26a74ad2178568398de8cf448cd79ba8c5320232.

Jul 8 2019, 4:44 PM · System administration
ftigeot committed R194:132245bfb826: Ansible VM deployment: Add disk and network information (authored by ftigeot).
Ansible VM deployment: Add disk and network information
Jul 8 2019, 3:44 PM
ftigeot committed R194:492d92fb0bde: First VM deployed (authored by ftigeot).
First VM deployed
Jul 8 2019, 3:44 PM
ftigeot committed R194:e8d6c770f84f: Ansible automation: Add a bootstrap playbook (authored by ftigeot).
Ansible automation: Add a bootstrap playbook
Jul 8 2019, 3:44 PM
ftigeot committed rSPSITE26a74ad21785: Puppet environment: Add a Postgres cluster backup script [2/2] (authored by ftigeot).
Puppet environment: Add a Postgres cluster backup script [2/2]
Jul 8 2019, 3:13 PM
ftigeot closed D1697: Puppet environment: Add a Postgres cluster backup script [2/2].
Jul 8 2019, 3:13 PM
Herald added a reviewer for D1697: Puppet environment: Add a Postgres cluster backup script [2/2]: Reviewers.
Jul 8 2019, 2:49 PM
ftigeot closed T1857: Backup MongoDB databases as Resolved.

Backup done, a full copy of the main MongoDB databases is now present on banco.

Jul 8 2019, 12:26 PM · System administration

Jul 5 2019

ftigeot committed rSPSITEe93781ef3283: Puppet environment: Add a Postgres cluster backup script (authored by ftigeot).
Puppet environment: Add a Postgres cluster backup script
Jul 5 2019, 10:27 AM

Jul 3 2019

ftigeot added a comment to T1857: Backup MongoDB databases.

Approximately 85% of the dump data has been copied so far.

Jul 3 2019, 2:57 PM · System administration

Jul 1 2019

ftigeot added a comment to T1857: Backup MongoDB databases.

Dumps of the six MongoDB databases have been created at the Paris office.
They are being copied to banco:/srv/storage/space/mongo_dumps at Rocquencourt.

Jul 1 2019, 4:40 PM · System administration
ftigeot committed rSPSITEeb7719a0e616: dar: Rename /srv/postgres-backups to db-backups (authored by ftigeot).
dar: Rename /srv/postgres-backups to db-backups
Jul 1 2019, 2:05 PM