May 4 2021

vsellier committed rDSNIPfdd1f2956958: add the diagram of the blog article on the counters (authored by vsellier).
add the diagram of the blog article on the counters
May 4 2021, 2:34 PM
vsellier updated the task description for T3306: Upgrade the firewalls to version 21.5.1.
May 4 2021, 1:06 PM · System administration
vsellier moved T3306: Upgrade the firewalls to version 21.5.1 from Backlog to in-progress on the System administration board.
May 4 2021, 1:06 PM · System administration
vsellier changed the status of T3306: Upgrade the firewalls to version 21.5.1 from Open to Work in Progress.
May 4 2021, 1:05 PM · System administration
vsellier changed the status of T3203: docs: Document the firewall installation and procedures, a subtask of T3194: Upgrade opnsense firewalls from 20.7.4 to 21.1.4, from Open to Work in Progress.
May 4 2021, 12:57 PM · System administration
vsellier changed the status of T3203: docs: Document the firewall installation and procedures from Open to Work in Progress.
May 4 2021, 12:57 PM · Documentation, System administration
vsellier moved T3300: Make the permissions of the swh services' configuration file uniform from in-progress to done on the System administration board.
May 4 2021, 12:56 PM · System administration
vsellier closed T3300: Make the permissions of the swh services' configuration file uniform as Resolved.

new permissions updated by puppet

May 4 2021, 12:56 PM · System administration

May 3 2021

vsellier committed rSPSITE45c68cc1bbad: fix indentation (authored by vsellier).
fix indentation
May 3 2021, 3:40 PM
vsellier closed D5662: Make the permissions of the swh configuration files consistent.
May 3 2021, 3:40 PM
vsellier committed rSPSITEd164e62259ab: Make the permissions of the swh configuration files consistent (authored by vsellier).
Make the permissions of the swh configuration files consistent
May 3 2021, 3:40 PM
vsellier updated the diff for D5662: Make the permissions of the swh configuration files consistent.

fix a typo in the commit message

May 3 2021, 3:37 PM
vsellier added a revision to T3300: Make the permissions of the swh services' configuration file uniform: D5662: Make the permissions of the swh configuration files consistent.
May 3 2021, 3:36 PM · System administration
vsellier updated the summary of D5662: Make the permissions of the swh configuration files consistent.
May 3 2021, 3:36 PM
vsellier requested review of D5662: Make the permissions of the swh configuration files consistent.
May 3 2021, 3:32 PM
vsellier moved T3300: Make the permissions of the swh services' configuration file uniform from Backlog to in-progress on the System administration board.
May 3 2021, 3:02 PM · System administration
vsellier changed the status of T3300: Make the permissions of the swh services' configuration file uniform from Open to Work in Progress.
May 3 2021, 2:54 PM · System administration
vsellier added a comment to T3243: Replace /dev/sdb and /dev/sdc on storage1.staging.

The replacement disks were delivered at Rocquencourt:

May 3 2021, 8:29 AM · System administration, Staging environment
vsellier added a comment to T3041: [production] Provision enough space for the search ES cluster to ingest all intrinsic metadata.

The order seems to have been delivered. I will check with the DSI how we can proceed with the installation.

May 3 2021, 8:23 AM · System administration, Archive search

Apr 23 2021

vsellier added a comment to T3222: Monitor daily indexes are present on the log cluster and logs are correctly ingested.

Logstash now exposes an API server[1], which seems to return some interesting metrics on plugin behavior.
For example, there is a section for the elasticsearch output plugin:

  "outputs": [
    {
      "id": "62d11c4234b8981da77a97955da92ac9de92b9a6dcd4582f407face31fd5c664",
      "events": {
        "duration_in_millis": 160089636,
        "in": 72818126,
        "out": 72818046
      },
      "bulk_requests": {
        "responses": {
          "200": 3860888
        },
        "successes": 3860888
      },
      "documents": {
        "successes": 72818046
      },
      "name": "elasticsearch"
    }
  ]
},

As a first step, I'll try to implement a small Python script that checks whether there are response codes other than 200, to identify the behavior.
It may also be interesting to check other properties such as the queue size:

"queue": {
  "type": "memory",
  "events_count": 0,
  "queue_size_in_bytes": 0,
  "max_queue_size_in_bytes": 0
},
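
A minimal sketch of what such a check could look like, using only the fields visible in the excerpts above. The URL is an assumption (the default local address of the Logstash monitoring API), and the helper names `fetch_stats`, `non_200_responses` and `pending_events` are hypothetical. The real script may need a different layout or thresholds.

```python
import json
from urllib.request import urlopen

# Assumption: Logstash monitoring API reachable on its default local port.
STATS_URL = "http://localhost:9600/_node/stats/pipelines"


def fetch_stats(url=STATS_URL):
    """Download and parse the node stats (defined but not called here)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def non_200_responses(outputs):
    """Map each non-200 bulk response code to its count.

    Only entries whose "name" is "elasticsearch" are inspected.
    """
    errors = {}
    for output in outputs:
        if output.get("name") != "elasticsearch":
            continue
        responses = output.get("bulk_requests", {}).get("responses", {})
        for code, count in responses.items():
            if code != "200" and count > 0:
                errors[code] = errors.get(code, 0) + count
    return errors


def pending_events(output):
    """Events received by the plugin but not yet sent out."""
    events = output.get("events", {})
    return events.get("in", 0) - events.get("out", 0)


# Values copied from the "outputs" excerpt in the comment above.
sample_output = {
    "events": {"duration_in_millis": 160089636, "in": 72818126, "out": 72818046},
    "bulk_requests": {"responses": {"200": 3860888}, "successes": 3860888},
    "documents": {"successes": 72818046},
    "name": "elasticsearch",
}

print(non_200_responses([sample_output]))  # {} : only 200 responses so far
print(pending_events(sample_output))       # 80 events in flight
```

A monitoring check could then raise an alert when the first result is not empty, and use the second value or the queue's `events_count` as a secondary signal.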
Apr 23 2021, 5:16 PM · System administration
vsellier added a comment to T3222: Monitor daily indexes are present on the log cluster and logs are correctly ingested.

I checked the icinga_logstash plugin[1] to see if it could be helpful, but it is more oriented towards logstash instances used to ingest data from log files. There are no options to check the number of events received/sent, for example.

Apr 23 2021, 4:53 PM · System administration
vsellier committed rSENV035022b779a8: Replace clearly-defined vm by the mirror-test one (authored by vsellier).
Replace clearly-defined vm by the mirror-test one
Apr 23 2021, 4:46 PM
vsellier requested review of D5588: Activate swh-counters on all the webapps.
Apr 23 2021, 4:26 PM
vsellier added a revision to T2912: Next generation archive counters: D5588: Activate swh-counters on all the webapps.
Apr 23 2021, 4:26 PM · Roadmap 2021, System administration, Monitoring, Web app
vsellier changed the status of T3222: Monitor daily indexes are present on the log cluster and logs are correctly ingested, a subtask of T3219: No logs are ingested on elasticsearch since 2021-03-26, from Open to Work in Progress.
Apr 23 2021, 4:10 PM · System administrators
vsellier changed the status of T3222: Monitor daily indexes are present on the log cluster and logs are correctly ingested from Open to Work in Progress.
Apr 23 2021, 4:10 PM · System administration
vsellier edited projects for T3222: Monitor daily indexes are present on the log cluster and logs are correctly ingested, added: System administration; removed System administrators.
Apr 23 2021, 4:09 PM · System administration
vsellier added a comment to T3041: [production] Provision enough space for the search ES cluster to ingest all intrinsic metadata.

According to the tracking page, the order left the factory on Apr 22, 2021. The ETA is May 28, 2021*.

Apr 23 2021, 4:00 PM · System administration, Archive search
vsellier claimed T3041: [production] Provision enough space for the search ES cluster to ingest all intrinsic metadata.
Apr 23 2021, 3:57 PM · System administration, Archive search
vsellier claimed T3129: Reliable monitoring of services: for users and for admins .
Apr 23 2021, 3:13 PM · Roadmap 2022, Roadmap 2021, Monitoring, meta-task
vsellier closed D5542: Remove tenma's access.

closed by rSPSITEe749fd9a244c669b108def9f008009b2f5563811

Apr 23 2021, 2:59 PM
vsellier closed T3251: Count authors from revisions and releases, a subtask of T2912: Next generation archive counters, as Resolved.
Apr 23 2021, 1:03 PM · Roadmap 2021, System administration, Monitoring, Web app
vsellier closed T3251: Count authors from revisions and releases as Resolved.

and the authors are now displayed on staging and production (webapp1)

Apr 23 2021, 1:03 PM · Monitoring, Web app
vsellier added a comment to T3251: Count authors from revisions and releases.

The lag for the production can be followed here: https://grafana.softwareheritage.org/goto/Di2H3z9Gk
(staging has already recovered)

Apr 23 2021, 12:57 PM · Monitoring, Web app
vsellier added a comment to T3251: Count authors from revisions and releases.

swh-counters is deployed in production too:

  • upgrade swh-counters package and restart swh-counters backend and journal
root@counters1:~# apt dist-upgrade
...
Setting up python3-swh.counters (0.7.0-1~swh1~bpo10+1) ...
root@counters1:~# systemctl stop swh-counters-journal-client.service 
root@counters1:~# systemctl restart gunicorn-swh-counters.service 
root@counters1:~# systemctl start swh-counters-journal-client.service 
root@counters1:~# redis-cli pfcount person
(integer) 7

The person count has already started

  • stopping the journal-client to be able to reset the releases and revisions offsets
root@counters1:~# systemctl stop swh-counters-journal-client.service
  • reset the offsets
vsellier@kafka1 ~ % /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $SERVER --reset-offsets --all-topics --to-current --dry-run  --export --group swh.counters.journal_client 2>&1 > ~/counters_journal_client_offsets.csv
# revision reset
vsellier@kafka1 ~ % /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $SERVER --reset-offsets  --group swh.counters.journal_client --to-earliest --execute --topic swh.journal.objects.revision
# release reset
vsellier@kafka1 ~ %  /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $SERVER --reset-offsets  --group swh.counters.journal_client --to-earliest --execute --topic swh.journal.objects.release 
# checks
/opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $SERVER --reset-offsets --all-topics --to-current --dry-run  --export --group swh.counters.journal_client 2>&1 > ~/counters_journal_client_offsets-backfill.csv 
vsellier@kafka1 ~ % diff ~/counters_journal_client_offsets.csv ~/counters_journal_client_offsets-backfill.csv | less 
1c1
< "swh.journal.objects.revision",25,8275180
---
> "swh.journal.objects.revision",25,0
8c8
< "swh.journal.objects.release",128,78484
---
> "swh.journal.objects.release",128,0
16c16
...
  • journal client restarted
root@counters1:~# systemctl start swh-counters-journal-client.service
  • the person counter is growing quickly
root@counters1:~# date;redis-cli pfcount person
Fri 23 Apr 2021 10:55:54 AM UTC
(integer) 72358
root@counters1:~# date;redis-cli pfcount person
Fri 23 Apr 2021 10:55:57 AM UTC
(integer) 80618
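
The before/after comparison of the exported offsets done with `diff` above can also be sketched in Python. This is an illustration only: it assumes the export files contain `"topic",partition,offset` rows without a header, as in the diff output shown, and the helper names `load_offsets` and `changed_offsets` are hypothetical. The `origin` row in the sample is made up to show an unchanged partition.

```python
import csv
import io


def load_offsets(text):
    """Parse exported rows into {(topic, partition): offset}."""
    offsets = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # skip blank or unexpected lines
        topic, partition, offset = row
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def changed_offsets(before, after):
    """Return {(topic, partition): (old, new)} for offsets that differ."""
    return {
        key: (before[key], after[key])
        for key in before.keys() & after.keys()
        if before[key] != after[key]
    }


# Rows taken from the diff above, plus one hypothetical unchanged row.
BEFORE = (
    '"swh.journal.objects.revision",25,8275180\n'
    '"swh.journal.objects.release",128,78484\n'
    '"swh.journal.objects.origin",0,42\n'
)
AFTER = (
    '"swh.journal.objects.revision",25,0\n'
    '"swh.journal.objects.release",128,0\n'
    '"swh.journal.objects.origin",0,42\n'
)

for key, (old, new) in sorted(changed_offsets(load_offsets(BEFORE), load_offsets(AFTER)).items()):
    print(key, old, "->", new)
```

After the reset, only the partitions of the revision and release topics should appear in the result; any other topic showing up would mean the reset touched more than intended.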
Apr 23 2021, 12:56 PM · Monitoring, Web app
vsellier closed D5586: Activate the person's counter on the home page with swh-counters.
Apr 23 2021, 12:30 PM
vsellier committed rDWAPPSb9ff5a073f9f: Activate the person's counter on the home page with swh-counters (authored by vsellier).
Activate the person's counter on the home page with swh-counters
Apr 23 2021, 12:30 PM
vsellier closed D5573: Update the counters' journal clients configuration to count the persons.
Apr 23 2021, 12:29 PM
vsellier committed rDENVd2dac157b76b: Update the counters' journal clients configuration to count the persons (authored by vsellier).
Update the counters' journal clients configuration to count the persons
Apr 23 2021, 12:29 PM
vsellier retitled D5573: Update the counters' journal clients configuration to count the persons from Update th counters' journal clients configuration to count the persons to Update the counters' journal clients configuration to count the persons.
Apr 23 2021, 12:26 PM
vsellier updated the diff for D5573: Update the counters' journal clients configuration to count the persons.
  • just keep the topic configuration as the journal split is not needed anymore
  • fix the typo in the commit message
Apr 23 2021, 12:25 PM
vsellier added a comment to D5586: Activate the person's counter on the home page with swh-counters.

I hesitated to do it, but as it should not change anymore now that everything is reactivated, I chose to keep it as is.
We'll see if there are changes to this list in the near future.

Apr 23 2021, 12:23 PM
vsellier requested review of D5586: Activate the person's counter on the home page with swh-counters.
Apr 23 2021, 12:10 PM
vsellier added a revision to T3251: Count authors from revisions and releases: D5586: Activate the person's counter on the home page with swh-counters.
Apr 23 2021, 12:03 PM · Monitoring, Web app
vsellier committed rDENVab8702258dac: Revert "Add the counters-journal-client-messages deployment" (authored by vsellier).
Revert "Add the counters-journal-client-messages deployment"
Apr 23 2021, 11:46 AM
vsellier added a comment to T3251: Count authors from revisions and releases.
  • version 0.7.0 released with the latest improvement (D5576) by vlorentz (thanks)
  • deployment done in staging
  • the person counting has started on the live messages:
root@counters0:~# redis-cli
127.0.0.1:6379> pfcount person
(integer) 7
  • now let's reset the consumer offsets for the release and revision topics to backfill the person counter:
# offsets backup
/opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $SERVER --reset-offsets --all-topics --to-current --dry-run  --export --group swh.counters.journal_client 2>&1 > ~/counters_journal_client_offsets.csv
# revision reset
 /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $SERVER --reset-offsets  --group swh.counters.journal_client --to-earliest --execute --topic swh.journal.objects.revision
# release reset
 /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $SERVER --reset-offsets  --group swh.counters.journal_client --to-earliest --execute --topic swh.journal.objects.release
Apr 23 2021, 11:16 AM · Monitoring, Web app
vsellier closed T3283: Create a vm to test the mirror environment as Resolved.

The VM is configured and the new database schema has been created on the staging database.
psql is also configured with several services:

  • swh-mirror : read-only connection on the swh-mirror schema
  • admin-swh-mirror: r/w connection
  • swh: read-only connection on the archive database (staging)
Apr 23 2021, 10:00 AM · System administration
vsellier added a comment to D5576: Remove 'journal_type' argument from the CLI.

I suppose the message.value() is returning a copy of the content

It does not: https://github.com/confluentinc/confluent-kafka-python/blob/b7f8dce998ec254f54c15e514850d7404c6a71a3/src/confluent_kafka/src/confluent_kafka.c#L439-L445

(Py_INCREF only increments the reference counter)
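
A pure-Python illustration of the point, as a sketch only: `FakeMessage` is a hypothetical stand-in for the real message object, written so that `value()` hands back the same bytes object each time (which is what returning a new reference amounts to). Storing that value in a dict then adds a reference, not a second copy of the payload.

```python
class FakeMessage:
    """Hypothetical stand-in for a Kafka message object."""

    def __init__(self, value):
        self._value = value

    def value(self):
        # Returns a reference to the stored bytes, not a copy.
        return self._value


message = FakeMessage(bytes(1024))

# Keeping the value in a dict stores another reference to the same object.
kept = {"some-key": message.value()}

print(kept["some-key"] is message.value())  # True: one object, two references
```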

Apr 23 2021, 9:55 AM
vsellier committed rSPSITEe48733a0acc9: Add mirror-test vm and a dedicated schema on the staging database (authored by vsellier).
Add mirror-test vm and a dedicated schema on the staging database
Apr 23 2021, 9:27 AM
vsellier closed D5581: Add mirror-test vm and a dedicated schema on the staging database.
Apr 23 2021, 9:27 AM
vsellier updated the diff for D5581: Add mirror-test vm and a dedicated schema on the staging database.

limit the pre-configured databases to the main database + swh-mirror

Apr 23 2021, 9:26 AM
vsellier added inline comments to D5581: Add mirror-test vm and a dedicated schema on the staging database.
Apr 23 2021, 9:21 AM

Apr 22 2021

vsellier added a revision to T3283: Create a vm to test the mirror environment: D5581: Add mirror-test vm and a dedicated schema on the staging database.
Apr 22 2021, 8:24 PM · System administration
vsellier requested review of D5581: Add mirror-test vm and a dedicated schema on the staging database.
Apr 22 2021, 8:24 PM
vsellier committed rSPPRIVCbb5dcc9df026: Add password of swh-mirror database (authored by vsellier).
Add password of swh-mirror database
Apr 22 2021, 8:16 PM
vsellier accepted D5576: Remove 'journal_type' argument from the CLI.

I suppose this implementation will be less efficient in terms of memory consumption, as we will keep a copy of the message contents in the dict (I suppose the message.value() is returning a copy of the content)
It also makes the module aware of how the objects are serialized in kafka, which looks quite low-level.

Apr 22 2021, 6:46 PM
vsellier committed rSPRE7831daff5cbc: Create the mirror-test vm (authored by vsellier).
Create the mirror-test vm
Apr 22 2021, 5:57 PM
vsellier committed rSPRE5fa5a7f6106d: open the cicustom options (authored by vsellier).
open the cicustom options
Apr 22 2021, 5:57 PM
vsellier added a comment to T3283: Create a vm to test the mirror environment.

VM created by terraform:

mirror-tests_summary = 
hostname: mirror-test
fqdn: mirror-test.internal.staging.swh.network
network: ip=192.168.130.160/24,gw=192.168.130.1 macaddrs=E6:3C:8A:B7:26:5D
Apr 22 2021, 5:42 PM · System administration
vsellier added a comment to T3283: Create a vm to test the mirror environment.

VM declared in the inventory: https://inventory.internal.softwareheritage.org/virtualization/virtual-machines/103/
Future ip will be 192.168.130.160

Apr 22 2021, 5:31 PM · System administration
vsellier changed the status of T3283: Create a vm to test the mirror environment from Open to Work in Progress.
Apr 22 2021, 5:12 PM · System administration
vsellier triaged T3283: Create a vm to test the mirror environment as Normal priority.
Apr 22 2021, 4:49 PM · System administration
vsellier removed a project from T3165: Generate historical data from the new counters series: Roadmap 2021.
Apr 22 2021, 4:25 PM · System administration, Monitoring
vsellier closed D5572: Implement the jounal client counting an internal property of an object.
Apr 22 2021, 4:14 PM
vsellier committed rDCNT5cae9b7cfe30: Implement the jounal client counting an internal property of an object (authored by vsellier).
Implement the jounal client counting an internal property of an object
Apr 22 2021, 4:14 PM
vsellier added a comment to D5572: Implement the jounal client counting an internal property of an object.

Thanks for the diff you will propose. I will land this one in the meantime.

Apr 22 2021, 4:13 PM
vsellier added a comment to D5572: Implement the jounal client counting an internal property of an object.

It's for performance reasons only: for most of the counters, counting the keys is enough, as the key is the unique identifier in kafka.
The KeyOrientedJournalClient[1] bypasses object deserialization when a message is received, so a more classical client is needed for this specific Person case.
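
The difference between the two approaches can be sketched as follows. This is not the swh-counters implementation: Python sets stand in for the Redis HyperLogLog counters, JSON stands in for the real serialization format, and the `author`, `committer` and `fullname` field names, as well as both function names, are assumptions made for the illustration.

```python
import json


def count_keys(messages, seen):
    """Key-oriented counting: the message value is never decoded."""
    for key, _value in messages:
        seen.add(key)


def count_persons(messages, seen):
    """Property-oriented counting: each value must be deserialized first."""
    for _key, value in messages:
        obj = json.loads(value)
        for field in ("author", "committer"):
            person = obj.get(field)
            if person is not None:
                seen.add(person["fullname"])


# Two hypothetical revision messages as (key, serialized value) pairs.
messages = [
    (b"rev1", json.dumps({"author": {"fullname": "Alice"},
                          "committer": {"fullname": "Bob"}}).encode()),
    (b"rev2", json.dumps({"author": {"fullname": "Alice"},
                          "committer": {"fullname": "Alice"}}).encode()),
]

revisions, persons = set(), set()
count_keys(messages, revisions)
count_persons(messages, persons)
print(len(revisions), len(persons))  # 2 revisions, 2 distinct persons
```

The first function only touches the key bytes, which is why it is cheaper. The second pays the deserialization cost for every message, which is only justified for counters such as Person that are not identified by the message key.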

Apr 22 2021, 2:59 PM
vsellier updated the diff for D5572: Implement the jounal client counting an internal property of an object.

add missing doc strings

Apr 22 2021, 2:39 PM
vsellier updated the diff for D5572: Implement the jounal client counting an internal property of an object.

Update according to the review feedback

Apr 22 2021, 1:19 PM
vsellier added inline comments to D5572: Implement the jounal client counting an internal property of an object.
Apr 22 2021, 12:45 PM
vsellier added inline comments to D5572: Implement the jounal client counting an internal property of an object.
Apr 22 2021, 12:26 PM
vsellier requested review of D5573: Update the counters' journal clients configuration to count the persons.
Apr 22 2021, 12:08 PM
vsellier added a revision to T3251: Count authors from revisions and releases: D5573: Update the counters' journal clients configuration to count the persons.
Apr 22 2021, 12:08 PM · Monitoring, Web app
vsellier committed rDENV76232fa40ada: Add the counters-journal-client-messages deployment (authored by vsellier).
Add the counters-journal-client-messages deployment
Apr 22 2021, 11:06 AM
vsellier requested review of D5572: Implement the jounal client counting an internal property of an object.
Apr 22 2021, 10:37 AM
vsellier added a revision to T3251: Count authors from revisions and releases: D5572: Implement the jounal client counting an internal property of an object.
Apr 22 2021, 10:36 AM · Monitoring, Web app

Apr 21 2021

vsellier added a comment to T3242: Decommission ClearlyDefined resources.

puppet resources cleaned:

root@pergamon:~# /usr/local/sbin/swh-puppet-master-decomission clearly-defined.internal.staging.swh.network
+ puppet node deactivate clearly-defined.internal.staging.swh.network
Submitted 'deactivate node' for clearly-defined.internal.staging.swh.network with UUID 26eb9a73-add9-4745-b068-6106ab2b20b4
+ puppet node clean clearly-defined.internal.staging.swh.network
Notice: Revoked certificate with serial 256
Notice: Removing file Puppet::SSL::Certificate clearly-defined.internal.staging.swh.network at '/var/lib/puppet/ssl/ca/signed/clearly-defined.internal.staging.swh.network.pem'
clearly-defined.internal.staging.swh.network
+ puppet cert clean clearly-defined.internal.staging.swh.network
Warning: `puppet cert` is deprecated and will be removed in a future release.
   (location: /usr/lib/ruby/vendor_ruby/puppet/application.rb:370:in `run')
Notice: Revoked certificate with serial 256
+ systemctl restart apache2
Apr 21 2021, 5:17 PM · System administration
vsellier closed D5557: Remove clearly-defined resources.
Apr 21 2021, 4:40 PM
vsellier committed rSPSITE239a1337af3d: Remove clearly-defined resources (authored by vsellier).
Remove clearly-defined resources
Apr 21 2021, 4:40 PM
vsellier updated the diff for D5557: Remove clearly-defined resources.

rebase

Apr 21 2021, 4:40 PM
vsellier closed T3242: Decommission ClearlyDefined resources as Resolved.
  • vm destroyed
  • configuration removed for terraform
  • database schemas cleared:
    • before:
root@db1:~# zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
data  27.3T   623G  26.7T        -         -    16%     2%  1.00x    ONLINE  -
Apr 21 2021, 4:25 PM · System administration
vsellier committed rSPRE4ee025890a6a: Remove clearly-defined resources (authored by vsellier).
Remove clearly-defined resources
Apr 21 2021, 4:19 PM
vsellier accepted D5568: Update save code now monitoring checks.

lgtm

Apr 21 2021, 9:34 AM · Save Code Now

Apr 20 2021

vsellier added a comment to T3243: Replace /dev/sdb and /dev/sdc on storage1.staging.

The 2 disks were removed from the server and packaged to be sent to seagate.

Apr 20 2021, 5:32 PM · System administration, Staging environment
vsellier added a comment to T3041: [production] Provision enough space for the search ES cluster to ingest all intrinsic metadata.

The order was received and confirmed by Dell. ETA: May 28th.
The details were sent to the sysadm mailing list.

Apr 20 2021, 5:21 PM · System administration, Archive search
vsellier committed rDSNIP5f6a258f80f5: Add the demo content of the kubernetes 101's 5mn talk (authored by vsellier).
Add the demo content of the kubernetes 101's 5mn talk
Apr 20 2021, 4:49 PM
vsellier committed rDENV27704a539d9e: improve documentation (authored by vsellier).
improve documentation
Apr 20 2021, 4:20 PM
vsellier committed rDENVdc71eed6c7da: Add dedicated deposit workers (checker and loader) (authored by ardumont).
Add dedicated deposit workers (checker and loader)
Apr 20 2021, 4:20 PM
vsellier committed rDENV30134c180d73: Remove unused dockerfile (authored by vsellier).
Remove unused dockerfile
Apr 20 2021, 4:20 PM
vsellier committed rDENVf9dad6d11992: use swh-counters in the webapp (authored by vsellier).
use swh-counters in the webapp
Apr 20 2021, 4:20 PM
vsellier committed rDENV4fbdd63d5764: Increase directory slicing level (authored by vsellier).
Increase directory slicing level
Apr 20 2021, 4:20 PM
vsellier committed rDENV5542bf610f31: increase loader limits (authored by vsellier).
increase loader limits
Apr 20 2021, 4:20 PM
vsellier committed rDENV77a880ae9cfe: Activate metadata search via swh-search (authored by vsellier).
Activate metadata search via swh-search
Apr 20 2021, 4:20 PM
vsellier committed rDENVb8ed26f590b2: fix elasticsearch mount point (authored by vsellier).
fix elasticsearch mount point
Apr 20 2021, 4:20 PM
vsellier committed rDENVdb83366c3bd9: kafka: avoid oom on startup when data is growing (authored by vsellier).
kafka: avoid oom on startup when data is growing
Apr 20 2021, 4:20 PM
vsellier committed rDENV0870b274c6d0: webapp: Activate vault interaction (authored by ardumont).
webapp: Activate vault interaction
Apr 20 2021, 4:20 PM
vsellier committed rDENV6a4ce883c115: Add vault db service (authored by ardumont).
Add vault db service
Apr 20 2021, 4:20 PM
vsellier committed rDENVb239c1b8be4c: Add vault service (authored by ardumont).
Add vault service
Apr 20 2021, 4:20 PM
vsellier committed rDENV9767cf144961: Add vault cooker workers (authored by ardumont).
Add vault cooker workers
Apr 20 2021, 4:20 PM
vsellier committed rDENVf21713ee863a: Add indexer journal client (authored by ardumont).
Add indexer journal client
Apr 20 2021, 4:20 PM