
Archive search · Folder
Active · Public

Members

  • This project does not have any members.

Recent Activity

Wed, Nov 24

vsellier closed T3741: swh-search - upgrade elasticsearch backend as Resolved.
Wed, Nov 24, 6:11 PM · System administration, Archive search
vsellier added a comment to T3741: swh-search - upgrade elasticsearch backend.

The production nodes are upgraded.
For each node:

  • disable shard allocation:
cat > /tmp/shard_allocation.json <<EOF
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
EOF
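
The settings file above still has to be sent to the cluster. A minimal sketch of how that could look, assuming the Elasticsearch cluster settings API (`PUT /_cluster/settings`) is used and that ES_SERVER holds the cluster URL. The helper names and the default URL are hypothetical, not taken from the task:

```shell
#!/bin/bash
# Assumption: ES_SERVER points at one node of the cluster.
ES_SERVER="${ES_SERVER:-http://localhost:9200}"

# Hypothetical helper: write a persistent allocation setting to a file.
#   $1: value for cluster.routing.allocation.enable (e.g. "primaries" or "all")
#   $2: output file
write_allocation_settings() {
  cat > "$2" <<EOF
{
  "persistent": {
    "cluster.routing.allocation.enable": "$1"
  }
}
EOF
}

# Hypothetical helper: send a settings file to the cluster settings API.
#   $1: settings file
apply_cluster_settings() {
  curl -s -XPUT -H "Content-Type: application/json" \
    "${ES_SERVER}/_cluster/settings" -d @"$1"
}
```

Usage would be `write_allocation_settings primaries /tmp/shard_allocation.json` followed by `apply_cluster_settings /tmp/shard_allocation.json` before stopping a node; the same pair with `all` would turn allocation back on once the node has rejoined.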
Wed, Nov 24, 5:26 PM · System administration, Archive search
vsellier added a revision to T3741: swh-search - upgrade elasticsearch backend: D6685: swh-search: upgrade elasticsearch to 7.15.2.
Wed, Nov 24, 4:24 PM · System administration, Archive search
vsellier added a revision to T3741: swh-search - upgrade elasticsearch backend: D6682: swh-search: Upgrade elasticsearch to 7.15.2.
Wed, Nov 24, 11:53 AM · System administration, Archive search
vsellier added a comment to T3741: swh-search - upgrade elasticsearch backend.

The staging elasticsearch is migrated to 7.15.2, everything looks good.

Wed, Nov 24, 10:39 AM · System administration, Archive search

Tue, Nov 23

vsellier added a revision to T3741: swh-search - upgrade elasticsearch backend: D6677: staging: upgrade swh-search elasticsearch to 7.15.2.
Tue, Nov 23, 7:53 PM · System administration, Archive search

Fri, Nov 19

seirl closed T3742: yarn called in swh-search setup.py but not present in developer setup docs as Resolved.

Fixed in https://forge.softwareheritage.org/rDDOC55cdfd9ee957f57cf91b0f6932cc941d2887d933

Fri, Nov 19, 5:29 PM · Archive search
seirl triaged T3742: yarn called in swh-search setup.py but not present in developer setup docs as Normal priority.
Fri, Nov 19, 5:17 PM · Archive search
vsellier triaged T3741: swh-search - upgrade elasticsearch backend as Normal priority.
Fri, Nov 19, 5:01 PM · System administration, Archive search

Fri, Nov 5

vsellier triaged T3708: Upgrade swh-search elasticsearch version as Normal priority.
Fri, Nov 5, 2:31 PM · Archive search, System administration (Component upgrades)

Oct 19 2021

vsellier renamed T3671: staging - swh-search (metadata indexer) is unable to update a document due to an unparseable date from staging - swh-search unable to update a document due to an unparseable date to staging - swh-search (metadata indexer) is unable to update a document due to an unparseable date.
Oct 19 2021, 11:03 AM · Intrinsic metadata, Archive search
vsellier updated the task description for T3671: staging - swh-search (metadata indexer) is unable to update a document due to an unparseable date.
Oct 19 2021, 10:54 AM · Intrinsic metadata, Archive search
vsellier triaged T3671: staging - swh-search (metadata indexer) is unable to update a document due to an unparseable date as Normal priority.
Oct 19 2021, 10:48 AM · Intrinsic metadata, Archive search

Oct 1 2021

anlambert added a revision to T2254: textual search language for the Web UI: D6390: search: Add query language support for staff users.
Oct 1 2021, 3:03 PM · Archive search, Web app

Sep 29 2021

ardumont closed T3620: Deploy swh.search v0.11.6 as Resolved.
Sep 29 2021, 6:20 PM · System administration, Archive search
ardumont moved T3620: Deploy swh.search v0.11.6 from code-review/monitoring to deployed/landed on the System administration board.
Sep 29 2021, 6:20 PM · System administration, Archive search
ardumont moved T3620: Deploy swh.search v0.11.6 from in-progress to code-review/monitoring on the System administration board.
Sep 29 2021, 6:20 PM · System administration, Archive search
ardumont updated the task description for T3620: Deploy swh.search v0.11.6.
Sep 29 2021, 6:12 PM · System administration, Archive search
ardumont changed the status of T3620: Deploy swh.search v0.11.6 from Open to Work in Progress.
Sep 29 2021, 6:00 PM · System administration, Archive search
ardumont added projects to T3620: Deploy swh.search v0.11.6: Archive search, System administration.
Sep 29 2021, 5:06 PM · System administration, Archive search

Sep 23 2021

vlorentz triaged T3606: Document the swh-search design as Normal priority.
Sep 23 2021, 2:55 PM · Archive search, Documentation

Sep 20 2021

vsellier added a revision to T3433: Deploy swh.search v0.10/v0.11: D6303: swh-web: fix the metadata backend configuration in the swh-search override.
Sep 20 2021, 1:42 PM · System administration, Archive search

Sep 13 2021

vlorentz added a comment to T2073: Index extrinsic metadata from the journal in swh-search/Elasticsearch.

On the other hand, journal clients are sort of a resolution to T2063.

Sep 13 2021, 2:42 PM · Archive search, Metadata workflow
olasd added a comment to T2073: Index extrinsic metadata from the journal in swh-search/Elasticsearch.

I'm tempted to postpone this issue until we resolve T2063...

Sep 13 2021, 11:36 AM · Archive search, Metadata workflow

Sep 10 2021

vlorentz added a comment to T2073: Index extrinsic metadata from the journal in swh-search/Elasticsearch.

I'm tempted to postpone this issue until we resolve T2063...

Sep 10 2021, 4:05 PM · Archive search, Metadata workflow
anlambert closed T3441: Implement query to get origin visit types dynamically as Resolved.

This has been implemented and is now used by swh-web in production, closing this.

Sep 10 2021, 10:49 AM · Archive search

Sep 8 2021

anlambert added a revision to T3441: Implement query to get origin visit types dynamically: D6219: common/utils: Add function to get origin visit types dynamically.
Sep 8 2021, 5:42 PM · Archive search
vlorentz closed T2590: Finish the indexer -> swh-search pipeline, a subtask of T2182: Switch production swh-web to use swh-search instead of postgresql search., as Resolved.
Sep 8 2021, 3:35 PM · System administration, Archive search, Storage manager
vlorentz closed T2590: Finish the indexer -> swh-search pipeline as Resolved.
Sep 8 2021, 3:35 PM · Journal, Archive search
vsellier closed T3040: [production] Enable swh-search's journal-client for indexed objects, a subtask of T2590: Finish the indexer -> swh-search pipeline, as Resolved.
Sep 8 2021, 3:24 PM · Journal, Archive search
vsellier closed T3040: [production] Enable swh-search's journal-client for indexed objects as Resolved.

Metadata searches have been done in Elasticsearch since the deployment of T3433.

Sep 8 2021, 3:24 PM · System administration, Journal, Archive search
vsellier renamed T3433: Deploy swh.search v0.10/v0.11 from Deploy swh.search v0.10/v0.11 on staging to Deploy swh.search v0.10/v0.11.
Sep 8 2021, 3:21 PM · System administration, Archive search
vsellier closed T3433: Deploy swh.search v0.10/v0.11 as Resolved.

Everything is deployed and looks functional.

Sep 8 2021, 3:21 PM · System administration, Archive search

Sep 7 2021

vsellier added a revision to T3433: Deploy swh.search v0.10/v0.11: D6206: webapp: support new metadata search backend configuation.
Sep 7 2021, 4:08 PM · System administration, Archive search
vlorentz closed T3562: [swh-search] Document version conflict during parallel indexation as Resolved by committing rDSEA7479282c70db: Retry on concurrent conflicting updates.
Sep 7 2021, 3:31 PM · Archive search
vlorentz added a revision to T3562: [swh-search] Document version conflict during parallel indexation: D6203: Retry on concurrent conflicting updates.
Sep 7 2021, 2:53 PM · Archive search
vsellier added a revision to T3433: Deploy swh.search v0.10/v0.11: D6197: swh-search: use the consumer group used during the reindexation.
Sep 7 2021, 11:22 AM · System administration, Archive search

Sep 6 2021

vsellier triaged T3562: [swh-search] Document version conflict during parallel indexation as Normal priority.
Sep 6 2021, 2:52 PM · Archive search
vlorentz updated the task description for T3560: Polish the swh-search QL.
Sep 6 2021, 10:38 AM · Archive search, System administration
vlorentz removed a project from T3559: Enable the swh-search QL in staging: meta-task.
Sep 6 2021, 10:37 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz removed a project from T3558: Enable the swh-search QL in production: meta-task.
Sep 6 2021, 10:37 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz triaged T3560: Polish the swh-search QL as Normal priority.
Sep 6 2021, 10:37 AM · Archive search, System administration
vlorentz added a project to T3558: Enable the swh-search QL in production: Archive search.
Sep 6 2021, 10:36 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata
vlorentz triaged T3559: Enable the swh-search QL in staging as Normal priority.
Sep 6 2021, 10:36 AM · Archive search, System administration, Intrinsic metadata, Extrinsic metadata

Sep 3 2021

vsellier added a comment to T3433: Deploy swh.search v0.10/v0.11.

production deployment:

  • disable puppet
  • stop and disable the journal clients and the search backend
  • update the swh-search configuration to change the index name to origin-v0.11
root@search1:/etc/softwareheritage/search# diff -U3 /tmp/server.yml server.yml
--- /tmp/server.yml	2021-09-03 14:06:07.896137122 +0000
+++ server.yml	2021-09-03 14:05:47.072081879 +0000
@@ -10,7 +10,7 @@
     port: 9200
   indexes:
     origin:
-      index: origin-production
+      index: origin-v0.11
       read_alias: origin-read
       write_alias: origin-write
  • update the journal clients to use the group ids swh.search.journal_client-v0.11 (objects) and swh.search.journal_client.indexed-v0.11 (indexed)
root@search1:/etc/softwareheritage/search# diff -U3 /tmp/journal_client_objects.yml journal_client_objects.yml 
--- /tmp/journal_client_objects.yml	2021-09-03 14:06:52.660255797 +0000
+++ journal_client_objects.yml	2021-09-03 14:07:10.684303568 +0000
@@ -8,7 +8,7 @@
   - kafka2.internal.softwareheritage.org
   - kafka3.internal.softwareheritage.org
   - kafka4.internal.softwareheritage.org
-  group_id: swh.search.journal_client
+  group_id: swh.search.journal_client-v0.11
   prefix: swh.journal.objects
   object_types:
   - origin
root@search1:/etc/softwareheritage/search# diff -U3 /tmp/journal_client_indexed.yml journal_client_indexed.yml 
--- /tmp/journal_client_indexed.yml	2021-09-03 14:06:52.660255797 +0000
+++ journal_client_indexed.yml	2021-09-03 14:07:25.760343512 +0000
@@ -8,7 +8,7 @@
   - kafka2.internal.softwareheritage.org
   - kafka3.internal.softwareheritage.org
   - kafka4.internal.softwareheritage.org
-  group_id: swh.search.journal_client.indexed
+  group_id: swh.search.journal_client.indexed-v0.11
   prefix: swh.journal.indexed
   object_types:
   - origin_intrinsic_metadata
  • perform a system upgrade
root@search1:/etc/softwareheritage/search# apt dist-upgrade -V
...
The following NEW packages will be installed:
   python3-tree-sitter (0.19.0-1+swh1~bpo10+1)
The following packages will be upgraded:
   libnss-systemd (247.3-3~bpo10+1 => 247.3-6~bpo10+1)
   libpam-systemd (247.3-3~bpo10+1 => 247.3-6~bpo10+1)
   libsystemd0 (247.3-3~bpo10+1 => 247.3-6~bpo10+1)
   libudev1 (247.3-3~bpo10+1 => 247.3-6~bpo10+1)
   python3-swh.core (0.14.3-1~swh1~bpo10+1 => 0.14.5-1~swh1~bpo10+1)
   python3-swh.model (2.6.1-1~swh1~bpo10+1 => 2.8.0-1~swh1~bpo10+1)
   python3-swh.scheduler (0.15.0-1~swh1~bpo10+1 => 0.18.0-1~swh1~bpo10+1)
   python3-swh.search (0.9.0-1~swh1~bpo10+1 => 0.11.4-2~swh1~bpo10+1)
   python3-swh.storage (0.30.1-1~swh1~bpo10+1 => 0.36.0-1~swh1~bpo10+1)
   systemd (247.3-3~bpo10+1 => 247.3-6~bpo10+1)
   systemd-sysv (247.3-3~bpo10+1 => 247.3-6~bpo10+1)
   systemd-timesyncd (247.3-3~bpo10+1 => 247.3-6~bpo10+1)
   udev (247.3-3~bpo10+1 => 247.3-6~bpo10+1)
13 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
...

There is no need to reboot

  • enable and restart the swh-search backend
  • check the new index creation
root@search1:/etc/softwareheritage/search# curl ${ES_SERVER}/_cat/indices\?v
health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin-v0.11      XOUR_jKcTtWKjlPk_8EAlA  90   1          0            0     34.3kb         18.2kb
green  open   origin-v0.9.0     TH9xlECuS4CcJTDw0Fqieg  90   1  175001478     36494554      293gb        146.9gb
green  open   origin-production hZfuv0lVRImjOjO_rYgDzg  90   1  176722078     56232582      311gb        155.1gb
  • update the write index alias
root@search1:~/T3433# ./update-write-alias.sh 
{"acknowledged":true}{"acknowledged":true}root@search1:~/T3433# 
root@search1:~/T3433# curl ${ES_SERVER}/_cat/aliases\?v
alias               index             filter routing.index routing.search is_write_index
origin-write        origin-v0.11      -      -             -              -
origin-read-v0.9.0  origin-v0.9.0     -      -             -              -
origin-v0.9.0-read  origin-v0.9.0     -      -             -              -
origin-v0.9.0-write origin-v0.9.0     -      -             -              -
origin-write-v0.9.0 origin-v0.9.0     -      -             -              -
origin-read         origin-production -      -             -              -

All the v0.9.0 indexes and aliases will be cleaned up once the migration to v0.11 is done.

  • restart the journal clients
root@search1:~# systemctl enable swh-search-journal-client@objects
Created symlink /etc/systemd/system/multi-user.target.wants/swh-search-journal-client@objects.service → /etc/systemd/system/swh-search-journal-client@.service.
root@search1:~# systemctl enable swh-search-journal-client@indexed
Created symlink /etc/systemd/system/multi-user.target.wants/swh-search-journal-client@indexed.service → /etc/systemd/system/swh-search-journal-client@.service.
root@search1:~# systemctl start swh-search-journal-client@objects
root@search1:~# systemctl start swh-search-journal-client@indexed
  • wait for the lag to recover, create additional journal clients if necessary
  • update the read index alias
  • land D6182, D6183, D6197
  • Update swh-web configuration to support the new way to configure the metadata search backend (D6202)
  • deploy them on webapp1 and moma
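
The contents of update-write-alias.sh and the read alias update are not shown in this comment. As a rough sketch only, the "update the read index alias" step could be done with one atomic request to the Elasticsearch `_aliases` API, moving origin-read from origin-production to origin-v0.11. The helper names and the default ES_SERVER value are assumptions:

```shell
#!/bin/bash
# Assumption: ES_SERVER points at one node of the cluster.
ES_SERVER="${ES_SERVER:-http://localhost:9200}"

# Hypothetical helper: print an _aliases payload that moves an alias.
#   $1: alias name
#   $2: index currently behind the alias
#   $3: index the alias should point to
alias_swap_payload() {
  cat <<EOF
{
  "actions": [
    { "remove": { "index": "$2", "alias": "$1" } },
    { "add":    { "index": "$3", "alias": "$1" } }
  ]
}
EOF
}

# Hypothetical helper: apply the swap; both actions run in a single request,
# so the alias never points at zero indexes or at both of them.
swap_alias() {
  alias_swap_payload "$1" "$2" "$3" |
    curl -s -XPOST -H "Content-Type: application/json" \
      "${ES_SERVER}/_aliases" -d @-
}
```

For example, `swap_alias origin-read origin-production origin-v0.11`, then `curl ${ES_SERVER}/_cat/aliases\?v` to check the result as in the transcript above.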
Sep 3 2021, 4:03 PM · System administration, Archive search
vsellier added a revision to T3040: [production] Enable swh-search's journal-client for indexed objects: D6183: swh-search: activate metadata search all ES on the main webapp.
Sep 3 2021, 3:45 PM · System administration, Journal, Archive search
vsellier added a revision to T3433: Deploy swh.search v0.10/v0.11: D6182: swh-search: update the configuration for the deployment of v0.11.4.
Sep 3 2021, 3:44 PM · System administration, Archive search
vsellier added a comment to T3433: Deploy swh.search v0.10/v0.11.
  • puppet configuration deployed in staging
  • read index updated with this script:
#!/bin/bash
Sep 3 2021, 9:57 AM · System administration, Archive search
vsellier added a revision to T3433: Deploy swh.search v0.10/v0.11: D6176: swh-search: deploy v0.11.4 in staging.
Sep 3 2021, 8:42 AM · System administration, Archive search
vsellier added a comment to T3433: Deploy swh.search v0.10/v0.11.

The lag recovered in about 12 hours.
The content of the index looks good (I just spot-checked a couple of origins).

Sep 3 2021, 8:34 AM · System administration, Archive search