
Jun 29 2021

vsellier triaged T3417: Cleanup the old counters environment as Normal priority.
Jun 29 2021, 11:20 AM · System administration, Monitoring
vsellier added a comment to T3394: cassandra - origin url hashing encoding issue.

Thanks, it was tested last night on grid5000; all the origins were correctly replayed without issues.

Jun 29 2021, 10:36 AM · System administration, Storage manager

Jun 28 2021

vsellier updated the task description for T3357: Perform some tests of the cassandra storage on Grid5000.
Jun 28 2021, 8:49 PM · System administration, Storage manager
vsellier committed rDSNIP51adac9ac3dc: grid5000/cassandra: 2021-06-28 run (authored by vsellier).
grid5000/cassandra: 2021-06-28 run
Jun 28 2021, 7:05 PM
vsellier added a comment to T3396: cassandra - allow to configure the consistency level used by the queries.

only one confirmation of the write is needed

It's not perfect though. If the server that confirmed the writes breaks before it replicates the write, then the write is lost.
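
The trade-off can be illustrated with a little arithmetic. This is only a sketch, assuming the usual Cassandra definitions: QUORUM needs floor(RF/2)+1 acknowledgements, and a read is guaranteed to see the latest acknowledged write when read acks plus write acks exceed the replication factor. The function names and the replication factor of 3 used in the note below are illustrative and do not come from the task.

```python
def acks_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge a request at the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when any set of read replicas overlaps any set of write replicas."""
    read_acks = acks_required(read_level, replication_factor)
    write_acks = acks_required(write_level, replication_factor)
    return read_acks + write_acks > replication_factor
```

With an assumed replication factor of 3, ONE/ONE gives 1 + 1 = 2, which does not exceed 3, so a read may miss a write that only one replica has confirmed. QUORUM/QUORUM gives 2 + 2 = 4, so the replica sets always overlap.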

Jun 28 2021, 5:54 PM · System administration, Storage manager
vsellier added a comment to T3396: cassandra - allow to configure the consistency level used by the queries.

IMO, we should first try to have a global configuration for all the read/write queries, and improve that later if needed for performance or if it creates some problems. At worst, it will be possible to use the default ONE values by configuration.
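
A rough sketch of what such a global setting could look like. The `consistency_level` key, the function name and the list of accepted levels are assumptions made for illustration; they are not the swh-storage configuration schema.

```python
from typing import Any, Mapping

# Subset of the Cassandra consistency levels, enough for the illustration.
KNOWN_LEVELS = {"ONE", "TWO", "THREE", "QUORUM", "LOCAL_ONE", "LOCAL_QUORUM", "ALL"}


def resolve_consistency_level(config: Mapping[str, Any], default: str = "ONE") -> str:
    """Return the single level applied to all read/write queries.

    Falls back to the default (ONE) when the hypothetical
    'consistency_level' key is absent from the configuration.
    """
    level = str(config.get("consistency_level", default)).upper()
    if level not in KNOWN_LEVELS:
        raise ValueError(f"unknown consistency level: {level}")
    return level
```

An empty configuration resolves to ONE, so the current behaviour stays reachable by configuration.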

Jun 28 2021, 4:48 PM · System administration, Storage manager
vsellier updated the task description for T3357: Perform some tests of the cassandra storage on Grid5000.
Jun 28 2021, 4:09 PM · System administration, Storage manager
vsellier updated the task description for T3357: Perform some tests of the cassandra storage on Grid5000.
Jun 28 2021, 4:09 PM · System administration, Storage manager
vsellier updated the task description for T3357: Perform some tests of the cassandra storage on Grid5000.
Jun 28 2021, 3:39 PM · System administration, Storage manager
vsellier added a comment to T3394: cassandra - origin url hashing encoding issue.

released in swh-storage:v0.32.0

Jun 28 2021, 3:37 PM · System administration, Storage manager
vsellier committed rDSNIPc80554db8de0: grid5000/cassandra: deploy some grafana dashboards (authored by vsellier).
grid5000/cassandra: deploy some grafana dashboards
Jun 28 2021, 12:27 PM
vsellier committed rDSNIPd940be8d5575: grid5000/cassandra: improve replayer tracability (authored by vsellier).
grid5000/cassandra: improve replayer tracability
Jun 28 2021, 12:27 PM
vsellier committed rDSNIPdabe01fc0194: grid5000/cassandra: Remove possible lvm volumes (authored by vsellier).
grid5000/cassandra: Remove possible lvm volumes
Jun 28 2021, 12:27 PM
vsellier committed rDSNIP75624f36dd16: grid5000/cassandra fix a type on the cassandra directories (authored by vsellier).
grid5000/cassandra fix a type on the cassandra directories
Jun 28 2021, 12:27 PM
vsellier committed rDSNIP7039064fcb13: grid5000/cassandra: don't fail the reservation if something goes wrong (authored by vsellier).
grid5000/cassandra: don't fail the reservation if something goes wrong
Jun 28 2021, 12:27 PM
vsellier committed rDSNIPc1eac7dd278a: grid5000/cassandra: Add a script to clean the cassandra data (authored by vsellier).
grid5000/cassandra: Add a script to clean the cassandra data
Jun 28 2021, 12:27 PM
vsellier accepted D5932: cassandra: Add support for non-ASCII origin 'URLs'..

LGTM, thank you for fixing that.

Jun 28 2021, 12:12 PM
vsellier updated the task description for T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation.
Jun 28 2021, 9:58 AM · System administration, Archive search
vsellier closed T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation, a subtask of T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields, as Resolved.
Jun 28 2021, 9:57 AM · System administration, Archive search
vsellier closed T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation as Resolved.

The lag on the topics has recovered.
The configuration update of moma will be tracked in T3373.

Jun 28 2021, 9:57 AM · System administration, Archive search
vsellier added a comment to T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields.

The lag has recovered, the search on webapp1[1] is now fully up-to-date and can be tested before changing the configuration on the main webapp.

Jun 28 2021, 9:50 AM · System administration, Archive search

Jun 25 2021

vsellier changed the status of T3408: Provide read-only access to production servers, a subtask of T1526: Install a new VPN endpoint at Rocquencourt, from Open to Work in Progress.
Jun 25 2021, 6:03 PM · System administration
vsellier changed the status of T3408: Provide read-only access to production servers from Open to Work in Progress.
Jun 25 2021, 6:03 PM · System administration
vsellier triaged T3408: Provide read-only access to production servers as High priority.
Jun 25 2021, 5:50 PM · System administration
vsellier updated the diff for D5929: network: clean old gateways and routes.

improve commit message

Jun 25 2021, 5:09 PM
vsellier requested review of D5929: network: clean old gateways and routes.
Jun 25 2021, 5:08 PM
vsellier added a revision to T1526: Install a new VPN endpoint at Rocquencourt: D5929: network: clean old gateways and routes.
Jun 25 2021, 5:08 PM · System administration
vsellier triaged T3407: Upgrade sphinx docker image to use a more recent version of plantuml as Normal priority.
Jun 25 2021, 3:44 PM · System administration, Documentation
vsellier committed rDDOC2c97beb78a57: Remove the unsupported 'description' group option on old versions of nwdiag (authored by vsellier).
Remove the unsupported 'description' group option on old versions of nwdiag
Jun 25 2021, 3:13 PM
vsellier accepted D5920: Drop search instance from swh_storage_cloud role.

lgtm

Jun 25 2021, 10:46 AM

Jun 23 2021

vsellier added a comment to T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields.

The metadata indexation is finished, https://webapp1.internal.softwareheritage.org can now search on them via elasticsearch without any issue.
Let's now wait for the lag on origin* topics to recover: https://grafana.softwareheritage.org/goto/iOvBK6gnk?orgId=1

Jun 23 2021, 4:12 PM · System administration, Archive search
vsellier closed D5911: swh-search: configure webapp1 to use the v0.9.0 index.
Jun 23 2021, 3:38 PM
vsellier committed rSPSITEf5d049c76829: swh-search: configure webapp1 to use the v0.9.0 index (authored by vsellier).
swh-search: configure webapp1 to use the v0.9.0 index
Jun 23 2021, 3:38 PM
vsellier updated the test plan for D5911: swh-search: configure webapp1 to use the v0.9.0 index.
Jun 23 2021, 1:01 PM
vsellier updated the diff for D5911: swh-search: configure webapp1 to use the v0.9.0 index.

Also update the index name. It's not necessary, as only the aliases are used for the search, but it's cleaner.

Jun 23 2021, 1:00 PM
vsellier requested review of D5911: swh-search: configure webapp1 to use the v0.9.0 index.
Jun 23 2021, 12:56 PM
vsellier added a revision to T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields: D5911: swh-search: configure webapp1 to use the v0.9.0 index.
Jun 23 2021, 12:56 PM · System administration, Archive search
vsellier added a comment to T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation.

About 1 day remains to consume the origin* topics.
The metadata were completely ingested, so the metadata search can be tested on webapp1 once the configuration is updated to use the new index.

Jun 23 2021, 12:06 PM · System administration, Archive search
vsellier added a comment to T1526: Install a new VPN endpoint at Rocquencourt.
  • The louvre's certificates are all created on opnsense
  • I performed some tests to access the firewall through a serial console. The ssh access can be done by using the serial console of one of the servers exposed on the IPMI network:
# ipmitool -I lanplus -H swh-ceph-mon1-adm.inria.fr -U XXX -P XXX sol activate
[SOL Session operational.  Use ~? for help]
Jun 23 2021, 11:19 AM · System administration
vsellier added a comment to T1526: Install a new VPN endpoint at Rocquencourt.

All the revoked certificates are imported on the opnsense crl.
I will also import the valid certificates so we will have them in case a revocation is needed

Jun 23 2021, 9:20 AM · System administration

Jun 22 2021

vsellier accepted D5909: tests/journal_client: Use pytest-redis fixture instead of mocks.

LGTM thanks it's obviously less complicated

Jun 22 2021, 5:18 PM
vsellier added a comment to T1526: Install a new VPN endpoint at Rocquencourt.

The certificates revoked by louvre can be imported into opnsense and revoked in an internal CRL.
It is simpler than importing louvre's current CRL, as an imported CRL needs to be managed externally and its raw content pasted into the UI.

Jun 22 2021, 4:47 PM · System administration
vsellier added a comment to T1526: Install a new VPN endpoint at Rocquencourt.
  • main vlan440 IPs changed to group the firewalls at the beginning of the range and have similar IPs on each vlan:
    • pushkin: 192.168.100.128 -> 192.168.100.2
    • glyptotek: 192.168.100.129 -> 192.168.100.3
  • next step is to try to import the current certificate revocation list of louvre
Jun 22 2021, 2:59 PM · System administration
vsellier closed D5906: fw: update the monitoring to use the new main ips on vlan440.
Jun 22 2021, 2:41 PM
vsellier committed rSPSITE4cc3b315224d: fw: update the monitoring to use the new main ips on vlan440 (authored by vsellier).
fw: update the monitoring to use the new main ips on vlan440
Jun 22 2021, 2:41 PM
vsellier added a comment to T1526: Install a new VPN endpoint at Rocquencourt.
  • new dns entry vpn.softwareheritage.org created:
vpn	A	3600	128.93.166.2
  • The address change of the firewalls is in preparation with D5906
Jun 22 2021, 12:47 PM · System administration
vsellier retitled D5906: fw: update the monitoring to use the new main ips on vlan440 from fw: update the monitoring to use the new main ips on vlan440 This is the first step to keep the monitoring green to fw: update the monitoring to use the new main ips on vlan440.
Jun 22 2021, 12:46 PM
vsellier requested review of D5906: fw: update the monitoring to use the new main ips on vlan440.
Jun 22 2021, 12:37 PM
vsellier added a revision to T1526: Install a new VPN endpoint at Rocquencourt: D5906: fw: update the monitoring to use the new main ips on vlan440.
Jun 22 2021, 12:37 PM · System administration
vsellier updated the task description for T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation.
Jun 22 2021, 12:17 PM · System administration, Archive search
vsellier added a comment to T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation.

the reindexation should be done by the end of the day

Jun 22 2021, 12:17 PM · System administration, Archive search
vsellier added a comment to T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation.
  • journal clients started:
root@search1:~/T3398# swh search --config-file journal_client_objects.yml journal-client objects
INFO:elasticsearch:POST http://search-esnode4.internal.softwareheritage.org:9200/origin-v0.9.0-write/_bulk [status:200 request:0.013s]
INFO:elasticsearch:POST http://search-esnode5.internal.softwareheritage.org:9200/origin-v0.9.0-write/_bulk [status:200 request:0.014s]
INFO:elasticsearch:POST http://search-esnode6.internal.softwareheritage.org:9200/origin-v0.9.0-write/_bulk [status:200 request:0.012s]
INFO:elasticsearch:POST http://search-esnode4.internal.softwareheritage.org:9200/origin-v0.9.0-write/_bulk [status:200 request:0.012s]
...
root@search1:~/T3398# swh search --config-file journal_client_indexed.yml journal-client objects
INFO:elasticsearch:POST http://search-esnode4.internal.softwareheritage.org:9200/origin-v0.9.0-write/_bulk [status:200 request:0.758s]
INFO:elasticsearch:POST http://search-esnode5.internal.softwareheritage.org:9200/origin-v0.9.0-write/_bulk [status:200 request:0.023s]
INFO:elasticsearch:POST http://search-esnode6.internal.softwareheritage.org:9200/origin-v0.9.0-write/_bulk [status:200 request:0.024s]
INFO:elasticsearch:POST http://search-esnode4.internal.softwareheritage.org:9200/origin-v0.9.0-write/_bulk [status:200 request:0.023s]
...
Jun 22 2021, 11:30 AM · System administration, Archive search
vsellier added a comment to T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation.
  • journal clients configuration prepared:
root@search1:~/T3398# diff -U3 /etc/softwareheritage/search/journal_client_objects.yml journal_client_objects.yml 
--- /etc/softwareheritage/search/journal_client_objects.yml     2021-06-10 08:08:19.555062808 +0000
+++ journal_client_objects.yml  2021-06-22 09:19:04.841898294 +0000
@@ -8,13 +8,18 @@
     port: 9200
   - host: search-esnode6.internal.softwareheritage.org
     port: 9200
+  indexes:
+    origin: 
+      index: origin-v0.9.0
+      read_alias: origin-v0.9.0-read
+      write_alias: origin-v0.9.0-write
 journal:
   brokers:
   - kafka1.internal.softwareheritage.org
   - kafka2.internal.softwareheritage.org
   - kafka3.internal.softwareheritage.org
   - kafka4.internal.softwareheritage.org
-  group_id: swh.search.journal_client
+  group_id: swh.search.journal_client-v0.9.0
   prefix: swh.journal.objects
   object_types:
   - origin
root@search1:~/T3398# diff -U3 /etc/softwareheritage/search/journal_client_indexed.yml journal_client_indexed.yml
--- /etc/softwareheritage/search/journal_client_indexed.yml     2021-06-10 09:34:00.980897650 +0000
+++ journal_client_indexed.yml  2021-06-22 09:27:18.507340257 +0000
@@ -8,13 +8,18 @@
     port: 9200
   - host: search-esnode6.internal.softwareheritage.org
     port: 9200
+  indexes:
+    origin:
+      index: origin-v0.9.0
+      read_alias: origin-v0.9.0-read
+      write_alias: origin-v0.9.0-write
 journal:
   brokers:
   - kafka1.internal.softwareheritage.org
   - kafka2.internal.softwareheritage.org
   - kafka3.internal.softwareheritage.org
   - kafka4.internal.softwareheritage.org
-  group_id: swh.search.journal_client.indexed
+  group_id: swh.search.journal_client.indexed-v0.9.0
   prefix: swh.journal.indexed
   object_types:
   - origin_intrinsic_metadata
  • new index initialized:
root@search1:~/T3398# diff -U3 /etc/softwareheritage/search/server.yml server.yml 
--- /etc/softwareheritage/search/server.yml     2021-06-10 08:08:17.819058015 +0000
+++ server.yml  2021-06-22 09:11:16.132518743 +0000
@@ -10,7 +10,7 @@
     port: 9200
   indexes:
     origin:
-      index: origin-production
-      read_alias: origin-read
-      write_alias: origin-write
+      index: origin-v0.9.0
+      read_alias: origin-v0.9.0-read
+      write_alias: origin-v0.9.0-write
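
The three diffs above follow one naming pattern: the index name carries the version, the aliases append `-read` and `-write`, and the consumer group gets the version as a suffix. A minimal sketch of that pattern follows; the helper names are hypothetical and only reproduce the values visible in the diffs.

```python
def versioned_index_config(version: str, base: str = "origin") -> dict:
    """Build the 'indexes' entry for one index, as seen in the diffs above."""
    index = f"{base}-{version}"
    return {
        "index": index,
        "read_alias": f"{index}-read",
        "write_alias": f"{index}-write",
    }


def versioned_group_id(group_id: str, version: str) -> str:
    """Suffix the Kafka consumer group so that it is distinct from the live one."""
    return f"{group_id}-{version}"
```

For example, `versioned_group_id("swh.search.journal_client.indexed", "v0.9.0")` yields the group id used in journal_client_indexed.yml. A separate group id keeps the offsets of the reindexation apart from those of the production journal clients.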
Jun 22 2021, 11:19 AM · System administration, Archive search
vsellier updated the task description for T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation.
Jun 22 2021, 10:37 AM · System administration, Archive search
vsellier added a comment to T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation.

On search1:

  • puppet disabled
  • swh-search / journal clients stopped
  • packages updated:
apt list --upgradable 2>/dev/null | grep python3-swh | cut -f1 -d'/' | xargs  apt install -V --dry-run
...
The following packages will be upgraded:
   python3-swh.core (0.13.0-1~swh1~bpo10+1 => 0.14.3-1~swh1~bpo10+1)
   python3-swh.indexer (0.7.0-1~swh1~bpo10+1 => 0.8.0-1~swh1~bpo10+1)
   python3-swh.indexer.storage (0.7.0-1~swh1~bpo10+1 => 0.8.0-1~swh1~bpo10+1)
   python3-swh.journal (0.7.1-1~swh1~bpo10+1 => 0.8.0-1~swh1~bpo10+1)
   python3-swh.model (2.3.0-1~swh1~bpo10+1 => 2.6.1-1~swh1~bpo10+1)
   python3-swh.objstorage (0.2.2-1~swh1~bpo10+1 => 0.2.3-1~swh1~bpo10+1)
   python3-swh.scheduler (0.10.0-1~swh1~bpo10+1 => 0.15.0-1~swh1~bpo10+1)
   python3-swh.search (0.8.0-1~swh1~bpo10+1 => 0.9.0-1~swh1~bpo10+1)
   python3-swh.storage (0.27.2-1~swh1~bpo10+1 => 0.30.1-1~swh1~bpo10+1)
9 upgraded, 0 newly installed, 0 to remove and 8 not upgraded.
...
  • Index initialisation done and swh-search restarted:
root@search1:~# swh search -C /etc/softwareheritage/search/server.yml initialize
INFO:elasticsearch:HEAD http://search-esnode6.internal.softwareheritage.org:9200/origin-production [status:200 request:0.025s]
INFO:elasticsearch:HEAD http://search-esnode4.internal.softwareheritage.org:9200/origin-read/_alias [status:200 request:0.018s]
INFO:elasticsearch:HEAD http://search-esnode5.internal.softwareheritage.org:9200/origin-write/_alias [status:200 request:0.003s]
INFO:elasticsearch:PUT http://search-esnode6.internal.softwareheritage.org:9200/origin-production/_mapping [status:200 request:0.102s]
Done.
root@search1:~# systemctl start gunicorn-swh-search.service
  • journal client restarted with no errors in the logs, the search is still working from the webapp
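
The package selection done by the `apt list --upgradable | grep python3-swh | cut -f1 -d'/'` pipeline in the "packages updated" step can also be sketched in Python, which keeps the old and new versions together. This is only an illustration: the function name is hypothetical and the regular expression assumes the usual `name/suite version arch [upgradable from: old]` line format.

```python
import re
from typing import Dict, Tuple

# name/suite new-version arch [upgradable from: old-version]
_UPGRADABLE = re.compile(
    r"^(?P<name>[^/\s]+)/\S+\s+(?P<new>\S+)\s+\S+\s+"
    r"\[upgradable from: (?P<old>[^\]]+)\]$"
)


def swh_upgrades(apt_output: str, prefix: str = "python3-swh") -> Dict[str, Tuple[str, str]]:
    """Map each upgradable package starting with `prefix` to (old, new) versions."""
    upgrades = {}
    for line in apt_output.splitlines():
        match = _UPGRADABLE.match(line.strip())
        if match and match.group("name").startswith(prefix):
            upgrades[match.group("name")] = (match.group("old"), match.group("new"))
    return upgrades
```

Lines that do not match the pattern, such as `Listing... Done`, and packages outside the prefix are ignored.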
Jun 22 2021, 10:36 AM · System administration, Archive search
vsellier changed the status of T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation from Open to Work in Progress.
Jun 22 2021, 9:48 AM · System administration, Archive search
vsellier changed the status of T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation, a subtask of T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields, from Open to Work in Progress.
Jun 22 2021, 9:48 AM · System administration, Archive search
vsellier added a comment to T3357: Perform some tests of the cassandra storage on Grid5000.

A table with the possible node counts relative to the replication factor was added to the hedgedoc document: https://hedgedoc.softwareheritage.org/m2MBUViUQl2r9dwcq3-_Nw?both

Jun 22 2021, 9:47 AM · System administration, Storage manager

Jun 21 2021

vsellier added a comment to T1526: Install a new VPN endpoint at Rocquencourt.

The first draft of the migration plan is available here: https://hedgedoc.softwareheritage.org/XBVc1QZhR_aVdbfSviqchg

Jun 21 2021, 5:14 PM · System administration
vsellier triaged T3398: [swh-search] Deploy v0.9.0 on production and execute a full origin and metadata reindexation as Normal priority.
Jun 21 2021, 11:50 AM · System administration, Archive search

Jun 18 2021

vsellier renamed T3396: cassandra - allow to configure the consistency level used by the queries from cassandra - allow to configure the consitency level used by the queries to cassandra - allow to configure the consistency level used by the queries.
Jun 18 2021, 7:24 PM · System administration, Storage manager
vsellier closed T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation, a subtask of T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields, as Resolved.
Jun 18 2021, 7:08 PM · System administration, Archive search
vsellier closed T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation as Resolved.

The backfill is done.

  • Puppet is reactivated
  • The index configuration set to the default:
root@search0:/etc/softwareheritage/search# export ES_SERVER=192.168.130.80:9200
root@search0:/etc/softwareheritage/search# export INDEX=origin-v0.9.0
root@search0:/etc/softwareheritage/search# cat >/tmp/config.json <<EOF
> {
>   "index" : {
>   "translog.sync_interval" : null,
>   "translog.durability": null,
>   "refresh_interval": null
>   }
> }
> EOF
root@search0:/etc/softwareheritage/search# curl -s -H "Content-Type: application/json" -XPUT http://${ES_SERVER}/${INDEX}/_settings -d @/tmp/config.json
{"acknowledged":true}
Jun 18 2021, 7:08 PM · System administration, Archive search
vsellier renamed T3396: cassandra - allow to configure the consistency level used by the queries from cassandra - allow to configure the consitency level used for the queries to cassandra - allow to configure the consitency level used by the queries.
Jun 18 2021, 5:22 PM · System administration, Storage manager
vsellier updated subscribers of T3396: cassandra - allow to configure the consistency level used by the queries.

@vlorentz If you have an idea on how to implement that, I'll take it ;). I'm not sure I haven't missed something.

Jun 18 2021, 5:22 PM · System administration, Storage manager
vsellier triaged T3396: cassandra - allow to configure the consistency level used by the queries as Normal priority.
Jun 18 2021, 5:19 PM · System administration, Storage manager
vsellier updated the task description for T3395: cassandra - Timeouts during revision import.
Jun 18 2021, 4:57 PM · System administration, Storage manager
vsellier triaged T3395: cassandra - Timeouts during revision import as Normal priority.
Jun 18 2021, 4:57 PM · System administration, Storage manager
vsellier updated the task description for T3394: cassandra - origin url hashing encoding issue.
Jun 18 2021, 4:49 PM · System administration, Storage manager
vsellier triaged T3394: cassandra - origin url hashing encoding issue as Normal priority.
Jun 18 2021, 4:49 PM · System administration, Storage manager
vsellier added a comment to T3357: Perform some tests of the cassandra storage on Grid5000.

Several tests were executed with cassandra nodes on the parasilo cluster [1]
The configuration was always the same to calibrate the runs:

  • ZFS is used to manage the datasets
  • the commitlogs on the 200 GB SSD drive
  • the data on the four 600 GB HDDs configured in RAID0
  • Default memory configuration (8 GB / default GC (not g1))
  • Cassandra configuration : [2]
Jun 18 2021, 4:44 PM · System administration, Storage manager
vsellier added a comment to T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation.

The configuration of the index was changed to increase the reindexation speed:

root@search-esnode0:~# export ES_SERVER=192.168.130.80:9200
root@search-esnode0:~# export INDEX=origin-v0.9.0
root@search-esnode0:~# cat >/tmp/config.json <<EOF
> {
>   "index" : {
> "translog.sync_interval" : "60s",
> "translog.durability": "async",
> "refresh_interval": "60s"
>   }
> }
> EOF
root@search-esnode0:~# curl -s -H "Content-Type: application/json" -XPUT http://${ES_SERVER}/${INDEX}/_settings -d @/tmp/config.json

The default settings will be restored once the journal_client has recovered.
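
A small sketch of how the two `_settings` bodies relate: the reset payload is the tuning payload with every value replaced by `null`, which makes Elasticsearch fall back to its default for that setting. The constant and helper names are hypothetical; the keys and values are the ones from the command above.

```python
import json

# Settings applied during the reindexation, copied from the command above.
BULK_INDEXING_SETTINGS = {
    "translog.sync_interval": "60s",
    "translog.durability": "async",
    "refresh_interval": "60s",
}


def settings_body(settings: dict) -> dict:
    """Body of the PUT /<index>/_settings request applying the tuning."""
    return {"index": dict(settings)}


def reset_body(settings: dict) -> dict:
    """Body restoring the defaults: the same keys, each set to None (JSON null)."""
    return {"index": {key: None for key in settings}}


if __name__ == "__main__":
    print(json.dumps(reset_body(BULK_INDEXING_SETTINGS), indent=2))
```

Deriving the reset body from the tuning body avoids forgetting a key when the defaults are restored.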

Jun 18 2021, 3:31 PM · System administration, Archive search
vsellier closed D5895: swh-search: use the index with the last mapping version.
Jun 18 2021, 3:23 PM
vsellier committed rSPSITE88b29885d3e6: swh-search: use the index with the last mapping version (authored by vsellier).
swh-search: use the index with the last mapping version
Jun 18 2021, 3:23 PM
vsellier added a comment to T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation.

Underlying index of the aliases changed to use the new index for read and write:

root@search0:~/T3391# curl -XPOST -H 'Content-Type: application/json' http://search-esnode0:9200/_aliases -d '
> {
>   "actions" : [
>     { "remove" : { "index" : "origin", "alias" : "origin-read" } },
>     { "remove" : { "index" : "origin", "alias" : "origin-write" } },
>     { "add" : { "index" : "origin-v0.9.0", "alias" : "origin-read" } },
>     { "add" : { "index" : "origin-v0.9.0", "alias" : "origin-write" } }
>   ]
> }'
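
The request body above can be generated rather than typed by hand. Elasticsearch applies all the actions of one `_aliases` call atomically, so the aliases never point at both indexes or at none. The function name below is hypothetical; the index and alias names are the ones from the command.

```python
from typing import Iterable


def alias_swap_actions(old_index: str, new_index: str, aliases: Iterable[str]) -> dict:
    """Build the POST /_aliases body moving each alias from old_index to new_index."""
    aliases = list(aliases)
    actions = [{"remove": {"index": old_index, "alias": alias}} for alias in aliases]
    actions += [{"add": {"index": new_index, "alias": alias}} for alias in aliases]
    return {"actions": actions}
```

Calling `alias_swap_actions("origin", "origin-v0.9.0", ["origin-read", "origin-write"])` reproduces the four actions sent above.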
Jun 18 2021, 2:45 PM · System administration, Archive search
vsellier updated the test plan for D5895: swh-search: use the index with the last mapping version.
Jun 18 2021, 2:31 PM
vsellier updated the diff for D5895: swh-search: use the index with the last mapping version.

Also change the consumer group of the journal clients to keep the current position
of the backfill

Jun 18 2021, 2:30 PM
vsellier requested review of D5895: swh-search: use the index with the last mapping version.
Jun 18 2021, 12:52 PM
vsellier added a revision to T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation: D5895: swh-search: use the index with the last mapping version.
Jun 18 2021, 12:52 PM · System administration, Archive search
vsellier closed T3392: [staging] Properly recreate the origin_intrinsic_metadata topic, a subtask of T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation, as Resolved.
Jun 18 2021, 12:18 PM · System administration, Archive search
vsellier closed T3392: [staging] Properly recreate the origin_intrinsic_metadata topic as Resolved.
Jun 18 2021, 12:18 PM · System administration
vsellier added a comment to T3392: [staging] Properly recreate the origin_intrinsic_metadata topic.

Indexation rescheduled as in https://forge.softwareheritage.org/T3037#58463:

swhscheduler@scheduler0:~$ /usr/bin/swh scheduler --config-file scheduler.yml task schedule_origins --storage-url http://storage1.internal.staging.swh.network:5002 index-origin-metadata 2>&1 | tee schedule_origins.logs
...
page_token: 79901
Scheduled 8000 tasks (80000 origins).
page_token: 80001
page_token: 80101
...

and counting...

Jun 18 2021, 12:14 PM · System administration
vsellier added a comment to T3392: [staging] Properly recreate the origin_intrinsic_metadata topic.
root@worker3:/etc/softwareheritage# puppet agent --disable 'recreate origin_intrinsic_metadata topic'
root@worker3:/etc/softwareheritage# systemctl stop swh-worker@indexer_origin_intrinsic_metadata.service
root@search0:~/T3391# puppet agent --disable 'recreate origin_intrinsic_metadata topic'
root@search0:~/T3391# systemctl stop swh-search-journal-client@indexed.service
vsellier@journal0 /var/log/kafka
 % /opt/kafka/bin/kafka-topics.sh  --bootstrap-server $SERVER --delete --topic swh.journal.indexed.origin_intrinsic_metadata
vsellier@journal0 /var/log/kafka
 % /opt/kafka/bin/kafka-topics.sh --bootstrap-server $SERVER --create --config cleanup.policy=compact --partitions 64 --replication-factor 1 --topic "swh.journal.indexed.origin_intrinsic_metadata"
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic swh.journal.indexed.origin_intrinsic_metadata.
 % /opt/kafka/bin/kafka-topics.sh  --bootstrap-server $SERVER --describe --topic swh.journal.indexed.origin_intrinsic_metadata                                                                      
Topic: swh.journal.indexed.origin_intrinsic_metadata	PartitionCount: 64	ReplicationFactor: 1	Configs: cleanup.policy=compact,max.message.bytes=104857600
	Topic: swh.journal.indexed.origin_intrinsic_metadata	Partition: 0	Leader: 1	Replicas: 1	Isr: 1
	Topic: swh.journal.indexed.origin_intrinsic_metadata	Partition: 1	Leader: 1	Replicas: 1	Isr: 1
	Topic: swh.journal.indexed.origin_intrinsic_metadata	Partition: 2	Leader: 1	Replicas: 1	Isr: 1
...
root@worker3:/etc/systemd/system# systemctl start swh-worker@indexer_origin_intrinsic_metadata.service
root@worker3:/etc/systemd/system# puppet agent --enable
root@search0:~/T3391# systemctl start swh-search-journal-client@indexed.service 
root@search0:~/T3391# puppet agent --enable
Jun 18 2021, 11:59 AM · System administration
vsellier moved T3392: [staging] Properly recreate the origin_intrinsic_metadata topic from Backlog to in-progress on the System administration board.
Jun 18 2021, 9:47 AM · System administration
vsellier changed the status of T3392: [staging] Properly recreate the origin_intrinsic_metadata topic from Open to Work in Progress.
Jun 18 2021, 9:47 AM · System administration
vsellier added a comment to T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation.

It seems the swh.journal.indexed.origin_intrinsic_metadata topic was automatically created, so the retention policy was not specified (and there is only one partition (!))

 % /opt/kafka/bin/kafka-topics.sh  --bootstrap-server $SERVER --describe --topic swh.journal.indexed.origin_intrinsic_metadata
Topic: swh.journal.indexed.origin_intrinsic_metadata	PartitionCount: 1	ReplicationFactor: 1	Configs: max.message.bytes=104857600
	Topic: swh.journal.indexed.origin_intrinsic_metadata	Partition: 0	Leader: 1	Replicas: 1	Isr: 1
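
This kind of check can be automated by parsing the summary line printed by `kafka-topics.sh --describe`. The sketch below is illustrative: the function names and the `min_partitions` threshold are assumptions, and only the first (summary) line of the output is parsed.

```python
import re


def parse_topic_summary(line: str) -> dict:
    """Extract partition count, replication factor and configs from the summary line."""
    partitions = re.search(r"PartitionCount:\s*(\d+)", line)
    replication = re.search(r"ReplicationFactor:\s*(\d+)", line)
    configs = re.search(r"Configs:\s*(\S*)", line)
    if not (partitions and replication):
        raise ValueError("not a topic summary line")
    pairs = configs.group(1).split(",") if configs and configs.group(1) else []
    return {
        "partitions": int(partitions.group(1)),
        "replication_factor": int(replication.group(1)),
        "configs": dict(pair.split("=", 1) for pair in pairs),
    }


def topic_problems(summary: dict, min_partitions: int = 2) -> list:
    """List the issues spotted here: missing compaction policy, too few partitions."""
    problems = []
    if summary["configs"].get("cleanup.policy") != "compact":
        problems.append("cleanup.policy is not compact")
    if summary["partitions"] < min_partitions:
        problems.append(f"only {summary['partitions']} partition(s)")
    return problems
```

Applied to the summary line above, both problems are reported; a topic created with `--config cleanup.policy=compact --partitions 64` yields an empty list.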
Jun 18 2021, 9:45 AM · System administration, Archive search

Jun 17 2021

vsellier added a comment to T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation.
  • Temporary configuration created:
    • objects:
root@search0:~/T3391# diff -U3 /etc/softwareheritage/search/journal_client_objects.yml journal_client_objects.yml 
--- /etc/softwareheritage/search/journal_client_objects.yml     2020-12-10 11:04:08.460777825 +0000
+++ journal_client_objects.yml  2021-06-17 16:48:56.006110527 +0000
@@ -4,10 +4,15 @@
   hosts:
   - host: search-esnode0.internal.staging.swh.network
     port: 9200
+  indexes:
+    origin:
+      index: origin-v0.9.0
+      read_alias: origin-v0.9.0-read
+      write_alias: origin-v0.9.0-write
 journal:
   brokers:
   - journal0.internal.staging.swh.network
-  group_id: swh.search.journal_client
+  group_id: swh.search.journal_client-v0.9.0
   prefix: swh.journal.objects
   object_types:
   - origin
    • indexed:
root@search0:~/T3391# diff -U3 /etc/softwareheritage/search/journal_client_indexed.yml journal_client_indexed.yml 
--- /etc/softwareheritage/search/journal_client_indexed.yml     2021-02-09 17:48:44.269681575 +0000
+++ journal_client_indexed.yml  2021-06-17 16:49:57.926120227 +0000
@@ -4,10 +4,15 @@
   hosts:
   - host: search-esnode0.internal.staging.swh.network
     port: 9200
+  indexes:
+    origin:
+      index: origin-v0.9.0
+      read_alias: origin-v0.9.0-read
+      write_alias: origin-v0.9.0-write
 journal:
   brokers:
   - journal0.internal.staging.swh.network
-  group_id: swh.search.journal_client.indexed
+  group_id: swh.search.journal_client.indexed-v0.9.0
   prefix: swh.journal.indexed
   object_types:
   - origin_intrinsic_metadata
  • upgrade packages:
root@search0:~/T3391# apt list --upgradable
Listing... Done
libnss-systemd/buster-backports 247.3-5~bpo10+1 amd64 [upgradable from: 247.3-3~bpo10+1]
libpam-systemd/buster-backports 247.3-5~bpo10+1 amd64 [upgradable from: 247.3-3~bpo10+1]
libsystemd0/buster-backports 247.3-5~bpo10+1 amd64 [upgradable from: 247.3-3~bpo10+1]
libudev1/buster-backports 247.3-5~bpo10+1 amd64 [upgradable from: 247.3-3~bpo10+1]
python3-swh.core/unknown 0.14.3-1~swh1~bpo10+1 all [upgradable from: 0.13.0-1~swh1~bpo10+1]
python3-swh.indexer.storage/unknown 0.8.0-1~swh1~bpo10+1 all [upgradable from: 0.7.0-1~swh1~bpo10+1]
python3-swh.indexer/unknown 0.8.0-1~swh1~bpo10+1 all [upgradable from: 0.7.0-1~swh1~bpo10+1]
python3-swh.model/unknown 2.6.1-1~swh1~bpo10+1 all [upgradable from: 2.3.0-1~swh1~bpo10+1]
python3-swh.objstorage/unknown 0.2.3-1~swh1~bpo10+1 all [upgradable from: 0.2.2-1~swh1~bpo10+1]
python3-swh.scheduler/unknown 0.15.0-1~swh1~bpo10+1 all [upgradable from: 0.10.0-1~swh1~bpo10+1]
python3-swh.search/unknown 0.9.0-1~swh1~bpo10+1 all [upgradable from: 0.8.0-1~swh1~bpo10+1]
python3-swh.storage/unknown 0.30.1-1~swh1~bpo10+1 all [upgradable from: 0.27.2-1~swh1~bpo10+1]
systemd-sysv/buster-backports 247.3-5~bpo10+1 amd64 [upgradable from: 247.3-3~bpo10+1]
systemd-timesyncd/buster-backports 247.3-5~bpo10+1 amd64 [upgradable from: 247.3-3~bpo10+1]
systemd/buster-backports 247.3-5~bpo10+1 amd64 [upgradable from: 247.3-3~bpo10+1]
udev/buster-backports 247.3-5~bpo10+1 amd64 [upgradable from: 247.3-3~bpo10+1]
  • Initialize the new index and aliases:
root@search0:~/T3391# swh search --config-file journal_client_objects.yml initialize
INFO:elasticsearch:PUT http://search-esnode0.internal.staging.swh.network:9200/origin-v0.9.0 [status:200 request:5.373s]
INFO:elasticsearch:PUT http://search-esnode0.internal.staging.swh.network:9200/origin-v0.9.0/_alias/origin-v0.9.0-read [status:200 request:0.052s]
INFO:elasticsearch:PUT http://search-esnode0.internal.staging.swh.network:9200/origin-v0.9.0/_alias/origin-v0.9.0-write [status:200 request:0.038s]
INFO:elasticsearch:PUT http://search-esnode0.internal.staging.swh.network:9200/origin-v0.9.0/_mapping [status:200 request:0.086s]
Done.
  • Start the origin* journal client:
root@search0:~/T3391# swh search --config-file journal_client_objects.yml journal-client objects
INFO:elasticsearch:POST http://search-esnode0.internal.staging.swh.network:9200/origin-v0.9.0-write/_bulk [status:200 request:0.661s]
INFO:elasticsearch:POST http://search-esnode0.internal.staging.swh.network:9200/origin-v0.9.0-write/_bulk [status:200 request:1.671s]
INFO:elasticsearch:POST http://search-esnode0.internal.staging.swh.network:9200/origin-v0.9.0-write/_bulk [status:200 request:2.047s]
INFO:elasticsearch:POST http://search-esnode0.internal.staging.swh.network:9200/origin-v0.9.0-write/_bulk [status:200 request:0.179s]
  • Start the indexed metadata journal client:
root@search0:~/T3391# swh search --config-file journal_client_indexed.yml journal-client objects
  • Index status:
root@search0:~/T3391# curl -s http://search-esnode0:9200/_cat/indices\?v
health status index                       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin                      HthJj42xT5uO7w3Aoxzppw  80   0    1320217       166798      1.7gb          1.7gb
green  open   origin-v0.9.0               o7FiYJWnTkOViKiAdCXCuA  80   0     263556          400    214.8mb        214.8mb
green  close  origin-backup-20210209-1736 P1CKjXW0QiWM5zlzX46-fg  80   0
green  close  origin-v0.5.0               SGplSaqPR_O9cPYU4ZsmdQ  80   0
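
The status table above can also be read programmatically to get a rough idea of how far the reindexation has progressed. The following is only a minimal sketch: the helper names `parse_cat_indices` and `docs_ratio` are hypothetical, the sample text is copied from the output above, and the ratio is a coarse indicator only, since the old and new indexes do not necessarily hold the same set of documents.

```python
# Sample copied from the `_cat/indices?v` output above.
SAMPLE = """\
health status index                       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin                      HthJj42xT5uO7w3Aoxzppw  80   0    1320217       166798      1.7gb          1.7gb
green  open   origin-v0.9.0               o7FiYJWnTkOViKiAdCXCuA  80   0     263556          400    214.8mb        214.8mb
green  close  origin-backup-20210209-1736 P1CKjXW0QiWM5zlzX46-fg  80   0
green  close  origin-v0.5.0               SGplSaqPR_O9cPYU4ZsmdQ  80   0
"""


def parse_cat_indices(text):
    """Parse whitespace-separated `_cat/indices?v` output into a dict keyed by index name.

    Closed indexes omit the trailing columns (document counts and sizes),
    so each row only contains the columns that are actually present.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = {}
    for line in lines[1:]:
        # zip() stops at the shortest sequence, which handles the short rows.
        row = dict(zip(header, line.split()))
        rows[row["index"]] = row
    return rows


def docs_ratio(rows, new_index, old_index):
    """Return docs.count of new_index divided by docs.count of old_index."""
    return int(rows[new_index]["docs.count"]) / int(rows[old_index]["docs.count"])


if __name__ == "__main__":
    rows = parse_cat_indices(SAMPLE)
    ratio = docs_ratio(rows, "origin-v0.9.0", "origin")
    print(f"origin-v0.9.0 holds {ratio:.1%} of the documents in origin")
```

With the figures above, the new index holds roughly a fifth of the document count of the old one.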

The origin_visits* topics have not started to be ingested yet, so the new fields are not yet present in the index.
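
For reference, the initialization log above shows four PUT requests: one creating the index, one per alias, and one for the mapping. The sketch below only rebuilds that sequence of URLs from the values in the `indexes.origin` configuration. It is not the swh-search implementation, the function name `initialization_requests` is hypothetical, and the request bodies (index settings, mapping) are left out because they do not appear in the log.

```python
from typing import List, Tuple


def initialization_requests(
    base_url: str, index: str, read_alias: str, write_alias: str
) -> List[Tuple[str, str]]:
    """Return the (method, url) pairs seen in the initialization log, in order."""
    base = base_url.rstrip("/")
    return [
        ("PUT", f"{base}/{index}"),                       # create the index
        ("PUT", f"{base}/{index}/_alias/{read_alias}"),   # alias used for reads
        ("PUT", f"{base}/{index}/_alias/{write_alias}"),  # alias used for writes
        ("PUT", f"{base}/{index}/_mapping"),              # install the mapping
    ]


if __name__ == "__main__":
    for method, url in initialization_requests(
        "http://search-esnode0.internal.staging.swh.network:9200",
        index="origin-v0.9.0",
        read_alias="origin-v0.9.0-read",
        write_alias="origin-v0.9.0-write",
    ):
        print(method, url)
```

The journal client then writes through the `-write` alias, as the `_bulk` requests in the log show, so the index behind that alias can be swapped without changing the client configuration.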

Jun 17 2021, 7:24 PM · System administration, Archive search
vsellier added a subtask for T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields: T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation.
Jun 17 2021, 6:44 PM · System administration, Archive search
vsellier added a parent task for T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation: T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields.
Jun 17 2021, 6:44 PM · System administration, Archive search
vsellier moved T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation from Backlog to in-progress on the System administration board.
Jun 17 2021, 6:40 PM · System administration, Archive search
vsellier changed the status of T3391: [swh-search] Deploy v0.9.0 on staging and execute a full origin and metadata reindexation from Open to Work in Progress.
Jun 17 2021, 6:40 PM · System administration, Archive search
vsellier added a comment to T3373: Metadata search is failing due to a boolean field in the mapping of the metadata fields.

The fix is released in version 'v0.9.0'.
I will deploy it on staging and launch a complete reindexation of the metadata (and of the origins, which is needed by other changes in this release).

Jun 17 2021, 4:58 PM · System administration, Archive search
vsellier added inline comments to D5883: Setup storage and store last revision/release date.
Jun 17 2021, 4:04 PM
vsellier accepted D5878: Store last_eventful_visit_date.

Thanks for the changes.

Jun 17 2021, 2:19 PM
vsellier committed rDSNIP14814a4d4a89: grid5000/cassandra: document how to launch the local monitoring (authored by vsellier).
grid5000/cassandra: document how to launch the local monitoring
Jun 17 2021, 1:00 PM
vsellier committed rDSNIP3e391214d4e1: grid5000/cassandra: Declare the locally replicated monitoring (authored by vsellier).
grid5000/cassandra: Declare the locally replicated monitoring
Jun 17 2021, 12:46 PM
vsellier committed rDSNIPbe2a6e739253: grid5000/cassandra: Performance improvement (authored by vsellier).
grid5000/cassandra: Performance improvement
Jun 17 2021, 10:29 AM
vsellier committed rDSNIP25ae125e37ea: grid5000/cassandra: Segment paravance cluster nodes (authored by vsellier).
grid5000/cassandra: Segment paravance cluster nodes
Jun 17 2021, 10:29 AM
vsellier committed rDSNIP31dee79c9c7c: gird5000/cassandra: declare different environment per type of run (authored by vsellier).
gird5000/cassandra: declare different environment per type of run
Jun 17 2021, 10:29 AM
vsellier updated subscribers of D5878: Store last_eventful_visit_date.
Jun 17 2021, 9:32 AM