Dec 21 2020

vsellier added a comment to T2910: Sentry: Increase disk space.
  • before :
root@riverside:~# pvscan
  PV /dev/sda1   VG riverside-vg    lvm2 [<63.98 GiB / 0    free]
  Total: 1 [<63.98 GiB] / in use: 1 [<63.98 GiB] / in no VG: 0 [0   ]
root@riverside:~# df -h /
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/riverside--vg-root   60G   56G  1.4G  98% /

(2%: some cleanup seems to have occurred since the creation of the task :) )

  • disk extended by 16 GiB on proxmox
(extract of dmesg of riverside)
[350521.461023] sd 2:0:0:0: Capacity data has changed
[350521.461339] sd 2:0:0:0: [sda] 167772160 512-byte logical blocks: (85.9 GB/80.0 GiB)
[350521.461484] sda: detected capacity change from 68719476736 to 85899345920
  • partition resized :
root@riverside:~# parted /dev/sda
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print free                                                       
Model: QEMU QEMU HARDDISK (scsi)
Disk /dev/sda: 85.9GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:
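
As a sanity check on the dmesg extract above, the byte counts can be converted to GiB to confirm the size of the extension. This is a minimal sketch: the constants are copied from the log lines, while the helper name to_gib is only illustrative.

```python
# Values copied from the dmesg extract of riverside.
OLD_BYTES = 68_719_476_736
NEW_BYTES = 85_899_345_920
LOGICAL_BLOCKS = 167_772_160
BLOCK_SIZE = 512  # bytes per logical block, as reported by the kernel

GIB = 1024 ** 3


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (illustrative helper)."""
    return n_bytes / GIB


old_gib = to_gib(OLD_BYTES)
new_gib = to_gib(NEW_BYTES)
added_gib = new_gib - old_gib

if __name__ == "__main__":
    # The block count times the block size should match the new capacity.
    assert LOGICAL_BLOCKS * BLOCK_SIZE == NEW_BYTES
    print(f"{old_gib:.0f} GiB -> {new_gib:.0f} GiB (+{added_gib:.0f} GiB)")
```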
Dec 21 2020, 7:05 PM · Sentry, System administration
vsellier changed the status of T2910: Sentry: Increase disk space from Open to Work in Progress.
Dec 21 2020, 6:58 PM · Sentry, System administration
vsellier committed rSPRE6efec4e443b4: staging: Add objstorage0 node (authored by vsellier).
staging: Add objstorage0 node
Dec 21 2020, 6:51 PM
vsellier committed rSPSITE96bfcf42004d: [staging] Configure and expose to internet a read-only objstorage (authored by vsellier).
[staging] Configure and expose to internet a read-only objstorage
Dec 21 2020, 6:20 PM
vsellier closed D4776: [staging] Configure and expose to internet a read-only objstorage.
Dec 21 2020, 6:20 PM
vsellier committed rSENV661482496784: Add objstorage0.staging.swh.network node to expose a r/o objstorage node (authored by vsellier).
Add objstorage0.staging.swh.network node to expose a r/o objstorage node
Dec 21 2020, 6:18 PM
vsellier closed D4775: Add objstorage0.staging.swh.network node to expose a r/o objstorage node.
Dec 21 2020, 6:18 PM
vsellier created D4776: [staging] Configure and expose to internet a read-only objstorage.
Dec 21 2020, 6:01 PM
vsellier added a revision to T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage): D4776: [staging] Configure and expose to internet a read-only objstorage.
Dec 21 2020, 6:01 PM · Staging environment, System administration
vsellier updated the diff for D4775: Add objstorage0.staging.swh.network node to expose a r/o objstorage node.

Remove out of scope changes

Dec 21 2020, 4:53 PM
vsellier added a revision to T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage): D4775: Add objstorage0.staging.swh.network node to expose a r/o objstorage node.
Dec 21 2020, 4:48 PM · Staging environment, System administration
vsellier created D4775: Add objstorage0.staging.swh.network node to expose a r/o objstorage node.
Dec 21 2020, 4:48 PM
vsellier updated the task description for T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).
Dec 21 2020, 12:58 PM · Staging environment, System administration
vsellier added a comment to T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).

A user was correctly configured and a read test performed :

Dec 21 2020, 12:57 PM · Staging environment, System administration
vsellier updated the task description for T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).
Dec 21 2020, 12:38 PM · Staging environment, System administration
vsellier added a comment to T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).

The network configuration is done. The server is now accessible from the internet at broker0.journal.staging.swh.network:9093

Dec 21 2020, 12:25 PM · Staging environment, System administration
vsellier updated the task description for T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).
Dec 21 2020, 12:24 PM · Staging environment, System administration
vsellier requested changes to D4768: production: Add new hedgedoc instance.
Dec 21 2020, 12:15 PM
vsellier accepted rSENV8f6f8ee5aa62: Vagrantfile: Add hedgedoc instance.

LGTM

Dec 21 2020, 10:49 AM
vsellier accepted rSENV55386982d27a: vagrant: Generate certificate for bardo.softwareheritage.org.
Dec 21 2020, 10:49 AM
vsellier accepted D4770: production: Provision bardo instance to receive hedgedoc.

lgtm, I have a doubt on the numa activation but as it's also activated for kelvingrove, I assume it's correct

Dec 21 2020, 10:48 AM
vsellier raised the priority of T2906: Error on date parsing on the deposit from Normal to High.

Changing to high priority (@ardumont recommendation)

Dec 21 2020, 10:32 AM · SWORD deposit
vsellier triaged T2906: Error on date parsing on the deposit as Normal priority.
Dec 21 2020, 10:28 AM · SWORD deposit
vsellier triaged T2905: Deploy swh-search for production as Normal priority.
Dec 21 2020, 10:03 AM · System administration, Journal, Archive search
vsellier triaged T2904: Create a new production webapp using the frozen index on the staging ES as Normal priority.
Dec 21 2020, 9:59 AM · System administrators, Journal, Archive search
vsellier triaged T2903: Test different disk configuration on esnode1 as High priority.
Dec 21 2020, 9:30 AM · System administration

Dec 18 2020

vsellier updated the task description for T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).
Dec 18 2020, 4:59 PM · Staging environment, System administration
vsellier added a comment to T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).

The request to expose the journal to the internet was sent to the DSI this afternoon.

Dec 18 2020, 4:57 PM · Staging environment, System administration
vsellier added a comment to T2899: Sentry doesn't react to new errors.

To eliminate another possible root cause, a test was done in a temporary project with the latest version of the python library; it doesn't work either.

Dec 18 2020, 10:52 AM · Sentry, System administration

Dec 17 2020

vsellier accepted D4757: Decomission zookeeper instances used in the rocquencourt_legacy cluster.

lgtm

Dec 17 2020, 5:42 PM · System administration
vsellier added a comment to T2899: Sentry doesn't react to new errors.

We followed the event path through the consumer code without finding anything suspicious.
As a last resort, we fully rebooted the vm but, as expected, it changed nothing at all.

Dec 17 2020, 5:37 PM · Sentry, System administration
vsellier updated subscribers of T2899: Sentry doesn't react to new errors.

@olasd, if you have some details of the version upgrades you performed yesterday, they could help with the diagnosis.

Dec 17 2020, 3:22 PM · Sentry, System administration
vsellier changed the status of T2899: Sentry doesn't react to new errors from Open to Work in Progress.
Dec 17 2020, 2:59 PM · Sentry, System administration
vsellier accepted D4747: Decomission kafka from esnodes.

LGTM

Dec 17 2020, 12:14 PM
vsellier closed T2897: [staging] kafka data dir over 80%, a subtask of T2790: [staging] deploy the journal infrastructure, as Resolved.
Dec 17 2020, 10:00 AM · System administration, Staging environment
vsellier closed T2897: [staging] kafka data dir over 80% as Resolved.
Dec 17 2020, 10:00 AM · System administration, Staging environment
vsellier added a comment to T2897: [staging] kafka data dir over 80%.

After one week, the disk used by kafka was around 85% full.

root@journal0:/tmp# df -h /srv/kafka/logdir
Filesystem      Size  Used Avail Use% Mounted on
kafka-volume    481G  409G   73G  85% /srv/kafka/logdir

Unlike in production, compression was not activated on the zfs pool:

root@kafka1:~#  zfs get all data/kafka  | grep compress
data/kafka  compressratio         1.55x                  -
data/kafka  compression           lz4                    inherited from data
data/kafka  refcompressratio      1.55x                  -
root@journal0:/tmp# zfs get all  | grep compress
kafka-volume  compressratio         1.00x                  -
kafka-volume  compression           off                    default
kafka-volume  refcompressratio      1.00x                  -

So the compression was activated :

root@journal0:/tmp# zfs set compression=lz4 kafka-volume
root@journal0:/tmp# zfs get all  | grep compress
kafka-volume  compressratio         1.00x                  -
kafka-volume  compression           lz4                    local
kafka-volume  refcompressratio      1.00x                  -

As this parameter only applies to newly written data, we forced a compaction of the biggest topics: `directory`, `revision` and `content`

 % ./kafka-topics.sh --zookeeper $ZK  --alter --topic swh.journal.objects.revision --config min.cleanable.dirty.ratio=0.01
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
         Going forward, please use kafka-configs.sh for this functionality
Updated config for topic swh.journal.objects.revision.
vsellier@journal0 /opt/kafka/bin
 % ./kafka-topics.sh --zookeeper $ZK  --alter --topic swh.journal.objects_privileged.revision --config min.cleanable.dirty.ratio=0.01
WARNING: Altering topic configuration from this script has been deprecated and may be removed in future releases.
         Going forward, please use kafka-configs.sh for this functionality
Updated config for topic swh.journal.objects_privileged.revision.
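
A rough estimate of what the compression could bring once the data has been rewritten, as a minimal sketch. It assumes, without any evidence from staging, that the staging data would compress like the production pool (1.55x) and that all existing data ends up recompressed; the sizes come from the df output above and the helper name is illustrative.

```python
# Sizes (in G) copied from the df output on journal0.
SIZE_G = 481
USED_G = 409

# ASSUMPTION: staging compresses as well as production (compressratio 1.55x).
ASSUMED_RATIO = 1.55


def estimate_after_compression(used_g: float, ratio: float) -> float:
    """Estimated on-disk size if all the data were compressed at `ratio`."""
    return used_g / ratio


current_use_pct = 100 * USED_G / SIZE_G
estimated_used_g = estimate_after_compression(USED_G, ASSUMED_RATIO)
estimated_use_pct = 100 * estimated_used_g / SIZE_G

if __name__ == "__main__":
    print(f"current:   {USED_G}G used ({current_use_pct:.0f}%)")
    print(f"estimated: {estimated_used_g:.0f}G used ({estimated_use_pct:.0f}%)")
```

The real gain depends on the actual content of the topics, so this is only an order of magnitude.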
Dec 17 2020, 10:00 AM · System administration, Staging environment
vsellier changed the status of T2897: [staging] kafka data dir over 80% from Open to Work in Progress.
Dec 17 2020, 9:58 AM · System administration, Staging environment

Dec 16 2020

vsellier accepted D4754: staging: Add clearly-defined node.

LGTM

Dec 16 2020, 3:43 PM
vsellier closed T2629: Recycle ceph-mon1 as a hypervisor integrated in the proxmox cluster as Resolved.

changing the status to resolved as everything looks good \o/

Dec 16 2020, 3:34 PM · System administration
vsellier closed T2629: Recycle ceph-mon1 as a hypervisor integrated in the proxmox cluster, a subtask of T2501: Proxmox reliability improvements (Summer 2020), as Resolved.
Dec 16 2020, 3:34 PM · System administration
vsellier added a comment to T2629: Recycle ceph-mon1 as a hypervisor integrated in the proxmox cluster.

After a new test, a vm deployed on pompidou can reach the network without any issue.
There were some glitches (kernel dump) after the migration; perhaps a reboot after the first migration test would have fixed the network problem.

Dec 16 2020, 3:33 PM · System administration
vsellier merged T2868: Integrate former ceph-mon1 server to the proxmox cluster into T2629: Recycle ceph-mon1 as a hypervisor integrated in the proxmox cluster.
Dec 16 2020, 3:31 PM · System administration
vsellier merged task T2868: Integrate former ceph-mon1 server to the proxmox cluster into T2629: Recycle ceph-mon1 as a hypervisor integrated in the proxmox cluster.
Dec 16 2020, 3:31 PM · System administration
vsellier updated subscribers of D4747: Decomission kafka from esnodes.

We should also check with @olasd whether zookeeper[1-3] can be decommissioned once we remove these brokers

Dec 16 2020, 3:08 PM
vsellier accepted D4753: Add clearly-defined vm role to access the staging clearly defined db instance.

LGTM
as tested together, removing the deep option of lookup seems to result in a more predictable behavior :)

Dec 16 2020, 3:04 PM
vsellier updated the task description for T2888: Elasticsearch cluster failure during a rolling restart.
Dec 16 2020, 10:22 AM · System administration
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.
  • smartctl extended tests are running on all the esnode* disks to detect possible defects. The results will be available in a few hours

All the smartctl tests are done and no additional faulty disks were detected

Dec 16 2020, 10:22 AM · System administration

Dec 15 2020

vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.

A remark regarding the extension of the storage via the addition of a new data directory [1]; I'm not sure it's the best way to do it:

Dec 15 2020, 6:52 PM · System administration
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.
  • smartctl extended tests are running on all the esnode* disks to detect possible defects. The results will be available in a few hours
Dec 15 2020, 6:29 PM · System administration
vsellier updated the task description for T2888: Elasticsearch cluster failure during a rolling restart.
Dec 15 2020, 6:16 PM · System administration
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.

We tried to temporarily restart esnode1 to reallocate the shards of the red indices for which esnode1 was the primary.
Actions:

  • Mount/Remount the xfs partition to flush the xfs journal
  • Perform an xfs_repair to ensure the fs is ok
  • configure elasticsearch to deallocate the shards managed by esnode1
  • start esnode1
  • wait for the shards redistribution (swh_workers-2020.09.03 was quickly recovered, and the remaining systemlogs.2018 deleted)
  • stop esnode1
  • disable puppet to avoid a restart of elasticsearch on esnode1
Dec 15 2020, 6:16 PM · System administration
vsellier accepted D4745: staging: Add clearly-defined postgresql instance.

LGTM

Dec 15 2020, 5:18 PM
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.

The disk usage is again around ~85% on esnode3 (~79% on esnode2).
The systemlogs.*2020.01.* indices were removed.

Dec 15 2020, 3:28 PM · System administration
vsellier accepted D4743: Onboard Tushar Goel as tg1999.
Dec 15 2020, 2:59 PM
vsellier added a comment to D4743: Onboard Tushar Goel as tg1999.

LGTM

Dec 15 2020, 2:59 PM
vsellier updated the task description for T2888: Elasticsearch cluster failure during a rolling restart.
Dec 15 2020, 11:37 AM · System administration
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.

The shard allocation is reactivated; there should be enough free disk space to replicate all the shards on the 2 nodes.

Dec 15 2020, 11:35 AM · System administration
vsellier updated the task description for T2888: Elasticsearch cluster failure during a rolling restart.
Dec 15 2020, 11:32 AM · System administration
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.
  1. cleanup of systemlogs index before 2020 (2018/2019)
Dec 15 2020, 11:31 AM · System administration
vsellier raised the priority of T2888: Elasticsearch cluster failure during a rolling restart from Normal to Unbreak Now!.
Dec 15 2020, 10:54 AM · System administration
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.

Short term plan :

  • Remove systemlogs indexes older than 1 year to start, but we can go down to 3 months if necessary
  • reactivate the shard allocation to have 1 replica for all the shards in case of a second node failure
  • Launch a long smartctl test on all the disks of each esnode* server
  • Contact DELL support to arrange the replacement of the 2 failing disks (under warranty(?)) [1]
  • Try to recover the 16 red indexes if possible; if not, delete them as they are not critical
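
The first step of the plan can be sketched as a date filter on the index names. This is a minimal sketch, assuming the indexes follow the name-YYYY.MM.DD pattern seen elsewhere in this task; the function names are illustrative and the 2019 index name in the example is hypothetical.

```python
from datetime import date, timedelta
from typing import List, Optional


def index_date(name: str) -> Optional[date]:
    """Parse the trailing YYYY.MM.DD of an index name, or return None."""
    _, _, suffix = name.rpartition("-")
    try:
        year, month, day = (int(part) for part in suffix.split("."))
        return date(year, month, day)
    except ValueError:
        # Not a dated index name (wrong shape or not a valid date).
        return None


def older_than(names: List[str], today: date, max_age_days: int = 365) -> List[str]:
    """Return the dated index names strictly older than the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    selected = []
    for name in names:
        parsed = index_date(name)
        if parsed is not None and parsed < cutoff:
            selected.append(name)
    return selected


# "systemlogs-2019.11.02" is a hypothetical name used for illustration.
candidates = older_than(
    ["systemlogs-2019.11.02", "systemlogs-2020.08.30", "swh_workers-2020.09.03"],
    today=date(2020, 12, 15),
)
```

Lowering max_age_days to about 90 would cover the 3-month fallback mentioned in the plan.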
Dec 15 2020, 10:52 AM · System administration

Dec 14 2020

vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.

xfs has shut down the partition, so ES is lost.

Dec 14 2020, 10:49 PM · System administration
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.

It seems only a limited number of indices are impacted by the corruption:

❯ curl -s  http://${ES_NODE}/_cat/indices | grep red
red    open  systemlogs-2020.08.30               o_gpFSjQRBuQBvWqaqA_dA 1 1                                   
red    open  systemlogs-2020.08.27               U4fKujQhTXmbGsx7zzLiPw 1 1                                   
red    open  systemlogs-2020.08.28               JVz-yhe4SeSow1TQPT61Jg 1 1                                   
red    open  systemlogs-2020.08.29               6avrSP3bRW2ZiwSlTpN0tA 1 1                                   
red    open  systemlogs-2020.08.22               jY7nPiXDS6a6aBnTDHNd1A 1 1                                   
red    open  systemlogs-2020.08.16               AK8wyDFQQ2KOgbzIdLvPqQ 1 1                                   
red    open  systemlogs-2020.08.13               o6OowHj-TMCBSglETaTj4w 1 1                                   
red    open  systemlogs-2020.08.10               NN0H_eaXQJW_20lsIMmg0Q 1 1                                   
red    open  systemlogs-2020.08.08               pkJVICAdSbqn3JgHU1h5Yw 1 1                                   
red    open  systemlogs-2020.09.07               naRyJEkZRCeOY5h_2avRyg 1 1                                   
red    open  systemlogs-2020.09.03               wb0DMaeqT2-Lh4nx8rafgQ 1 1                                   
red    open  systemlogs-2020.09.01               jelq1Ij5SGWQAKDqdbCYlQ 1 1                                   
red    open  swh_workers-2020.09.03              c1ZiRR8HS9W44T3nVd7f9Q 2 1  2733325        0   1.6gb    1.6gb
red    open  systemlogs-2020.07.24               743a1usWSw-whONPLhcKrA 1 1                                   
red    open  systemlogs-2020.07.25               zFkfn6l5SA-sby3A0SOAtw 1 1                                   
red    open  systemlogs-2020.07.17               PxL7sBrUQ8SXtbOEG5v_3A 1 1
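
The same filtering can be done on a saved copy of the _cat/indices output, which also gives a count per index family. A minimal sketch: the first two sample lines are taken from the listing above, the green line is hypothetical, and the function names are illustrative.

```python
from collections import Counter
from typing import Dict, List

# Two lines from the listing above plus one hypothetical green index.
SAMPLE = """\
red    open  systemlogs-2020.08.30               o_gpFSjQRBuQBvWqaqA_dA 1 1
red    open  swh_workers-2020.09.03              c1ZiRR8HS9W44T3nVd7f9Q 2 1  2733325        0   1.6gb    1.6gb
green  open  example-2020.12.14                  hypothetical-uuid      1 1       10        0     1kb      1kb
"""


def red_indices(cat_output: str) -> List[str]:
    """Return the names of the indices whose health column is 'red'."""
    names = []
    for line in cat_output.splitlines():
        fields = line.split()
        # Columns start with: health, status, index name.
        if len(fields) >= 3 and fields[0] == "red":
            names.append(fields[2])
    return names


def count_by_family(names: List[str]) -> Dict[str, int]:
    """Count indices per family, i.e. the part of the name before the last '-'."""
    return dict(Counter(name.rpartition("-")[0] for name in names))


red = red_indices(SAMPLE)
families = count_by_family(red)
```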
Dec 14 2020, 10:21 PM · System administration
vsellier added a comment to T2888: Elasticsearch cluster failure during a rolling restart.

sdb and sdc on esnode1 have serious issues.
(there are no other disks with errors on the other servers)

root@esnode1:~# smartctl -a /dev/sdb
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-13-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
Dec 14 2020, 10:19 PM · System administration
vsellier changed the status of T2888: Elasticsearch cluster failure during a rolling restart from Open to Work in Progress.
Dec 14 2020, 10:15 PM · System administration
vsellier accepted D4732: Add admin tools to default packages.

LGTM no more apt install dstat \o/ :P

Dec 14 2020, 12:02 PM
vsellier added a comment to T2817: Enable the swh-search environment in staging.

With the "optimized" configuration, the import is noticeably faster:

root@search-esnode0:~# curl -XPOST -H "Content-Type: application/json" http://${ES_SERVER}/_reindex\?pretty\&refresh=true\&requests_per_second=-1\&\&wait_for_completion=true -d @/tmp/reindex-production.json    
{
  "took" : 10215280,
  "timed_out" : false,
  "total" : 91517657,
  "updated" : 0,
  "created" : 91517657,
  "deleted" : 0,
  "batches" : 91518,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

"took" : 10215280, => 2h50

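The took field of the reindex response is expressed in milliseconds. A minimal sketch of the conversion, with the two constants copied from the response above; the derived throughput is only an average over the whole run.

```python
from datetime import timedelta

# Values copied from the _reindex response.
TOOK_MS = 10_215_280
TOTAL_DOCS = 91_517_657

elapsed = timedelta(milliseconds=TOOK_MS)

# Split the whole seconds into hours / minutes / seconds.
hours, remainder = divmod(int(elapsed.total_seconds()), 3600)
minutes, seconds = divmod(remainder, 60)

# Average indexing rate over the whole run.
docs_per_second = TOTAL_DOCS / elapsed.total_seconds()

if __name__ == "__main__":
    print(f"{hours}h {minutes}m {seconds}s")  # 2h 50m 15s
    print(f"{docs_per_second:.0f} docs/s on average")
```
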
Dec 14 2020, 9:47 AM · System administrators, Staging environment, Journal, Archive search

Dec 11 2020

vsellier added a comment to T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage).
  • diff landed and applied on the server
  • VIP 128.93.166.40 configured on the firewall
  • NAT Port forward of port 9093 from public ip to internal journal0 declared on the firewall
  • DNS declaration of broker0.journal.staging.swh.network in gandi
  • Ask to DSI to apply the kafka firewall profile to 128.93.166.40
  • Configure a user to test the pipeline
Dec 11 2020, 6:11 PM · Staging environment, System administration
vsellier committed rSPSITE5c693c5cf08b: kafka: activate the authentication on the public network (authored by vsellier).
kafka: activate the authentication on the public network
Dec 11 2020, 5:19 PM
vsellier closed D4726: kafka: activate the authentication on the public network.
Dec 11 2020, 5:19 PM
vsellier updated the diff for D4726: kafka: activate the authentication on the public network.

rebase

Dec 11 2020, 5:18 PM
vsellier committed rSENV84531f26646d: Upgrade the journal0 to have the first CN matching broker1.journal.staging.swh. (authored by vsellier).
Upgrade the journal0 to have the first CN matching broker1.journal.staging.swh.
Dec 11 2020, 4:24 PM
vsellier created D4726: kafka: activate the authentication on the public network.
Dec 11 2020, 3:17 PM
vsellier added a revision to T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage): D4726: kafka: activate the authentication on the public network.
Dec 11 2020, 3:17 PM · Staging environment, System administration
vsellier committed rSPSITE5128dbb7b8d3: varnish: Correctly handle the vhost when the port number is included (authored by vsellier).
varnish: Correctly handle the vhost when the port number is included
Dec 11 2020, 2:36 PM
vsellier closed D4719: varnish: Correctly handle the vhost when the port number is included.
Dec 11 2020, 2:36 PM
vsellier added a comment to T2877: Investigate spurious deposit logs.

I agree about the default site, but several legitimate requests from the monitoring are not correctly routed, so the configuration needs to be adapted.

Dec 11 2020, 11:46 AM · System administration, Staging environment, SWORD deposit
vsellier added a revision to T2877: Investigate spurious deposit logs: D4719: varnish: Correctly handle the vhost when the port number is included.
Dec 11 2020, 11:42 AM · System administration, Staging environment, SWORD deposit
vsellier created D4719: varnish: Correctly handle the vhost when the port number is included.
Dec 11 2020, 11:42 AM
vsellier added a comment to T2817: Enable the swh-search environment in staging.

The production origin index was correctly copied from the production cluster, but apparently without the configuration meant to optimize the copy.
We will keep this one and try a new optimized copy to check whether the server still crashes with an OOM under the new cpu and memory settings.

Dec 11 2020, 10:15 AM · System administrators, Staging environment, Journal, Archive search

Dec 10 2020

vsellier changed the status of T2682: Deploy a small publicly available kafka server (with some content) on a staging (+ the related objstorage) from Open to Work in Progress.
Dec 10 2020, 5:41 PM · Staging environment, System administration
vsellier committed rSPRE56974c0407c2: staging: Increase cpu, memory and disk of search-esnode0 (authored by vsellier).
staging: Increase cpu, memory and disk of search-esnode0
Dec 10 2020, 3:59 PM
vsellier added a comment to T2817: Enable the swh-search environment in staging.

FYI: The origin index was recreated with the "official" mapping and a backfill was performed (necessary after the test of the flattened mapping)

Dec 10 2020, 3:42 PM · System administrators, Staging environment, Journal, Archive search
vsellier accepted D4716: Deactivate swh-search-journal-client@indexed service.
Dec 10 2020, 3:33 PM
vsellier closed T2817: Enable the swh-search environment in staging, a subtask of T2590: Finish the indexer -> swh-search pipeline, as Resolved.
Dec 10 2020, 3:29 PM · Journal, Archive search
vsellier closed T2817: Enable the swh-search environment in staging as Resolved.

The deployment manifests are ok and deployed in staging, so this task can be resolved.
We will work on reactivating search-journal-client for the metadata in another task once T2876 is resolved

Dec 10 2020, 3:29 PM · System administrators, Staging environment, Journal, Archive search
vsellier accepted D4712: staging: Increase elasticsearch jvm heap size to half its memory.
Dec 10 2020, 3:22 PM
vsellier updated the task description for T2817: Enable the swh-search environment in staging.
Dec 10 2020, 3:19 PM · System administrators, Staging environment, Journal, Archive search
vsellier added a comment to T2876: metadata indexation : ES' dynamic mapping creation fails for field values that are of varying types.

We tried to change the mapping type of the field intrinsic_metadata from nested to flattened as you suggested; we now have a new error related to the huge size of a description.
ES can be configured to accept bigger fields, but I'm not sure that makes sense given the content of the description field.

Dec 10 2020, 3:18 PM · Intrinsic metadata, Indexer, Archive search
vsellier added a subtask for T2590: Finish the indexer -> swh-search pipeline: T2876: metadata indexation : ES' dynamic mapping creation fails for field values that are of varying types.
Dec 10 2020, 12:31 PM · Journal, Archive search
vsellier added a parent task for T2876: metadata indexation : ES' dynamic mapping creation fails for field values that are of varying types: T2590: Finish the indexer -> swh-search pipeline.
Dec 10 2020, 12:31 PM · Intrinsic metadata, Indexer, Archive search
vsellier triaged T2876: metadata indexation : ES' dynamic mapping creation fails for field values that are of varying types as Normal priority.
Dec 10 2020, 12:31 PM · Intrinsic metadata, Indexer, Archive search
vsellier accepted D4699: search: Deploy multiple search journal client instances.
Dec 10 2020, 11:36 AM
vsellier added a comment to T2817: Enable the swh-search environment in staging.

The copy of the production index has been restarted.
To speed up the copy, the index was tuned to reduce the disk pressure (this is a temporary configuration and should not be used in normal operation, as it is not safe):

cat >/tmp/config.json <<EOF
{
  "index" : {
    "translog.sync_interval" : "60s",
    "translog.durability" : "async",
    "refresh_interval" : "60s"
  }
}
EOF
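
Since this configuration is temporary, it helps to prepare the payload that puts the index back on its defaults. A minimal sketch, assuming the settings are applied through the Elasticsearch index settings API, where a null value resets a setting to its default; how the file was actually applied is not shown in this comment, and the function name is illustrative.

```python
import json

# Same content as /tmp/config.json above.
TUNED = {
    "index": {
        "translog.sync_interval": "60s",
        "translog.durability": "async",
        "refresh_interval": "60s",
    }
}


def reset_payload(tuned: dict) -> dict:
    """Build a payload with the same keys and null values (back to defaults)."""
    return {
        section: {key: None for key in settings}
        for section, settings in tuned.items()
    }


RESET = reset_payload(TUNED)

if __name__ == "__main__":
    # None is serialized as JSON null.
    print(json.dumps(RESET, indent=2))
```

Deriving the reset payload from the tuned one keeps both in sync if a setting is added later.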
Dec 10 2020, 11:14 AM · System administrators, Staging environment, Journal, Archive search
vsellier added a comment to T2817: Enable the swh-search environment in staging.
  • Partition and memory extended with terraform.
  • The disk resize required some console actions to complete:
Dec 10 2020, 10:39 AM · System administrators, Staging environment, Journal, Archive search
vsellier added a comment to T2817: Enable the swh-search environment in staging.

The production index import failed because the 90% disk usage limit was reached at some point; the usage fell back to around 60G after a compaction.
The import had progressed to 80M of 91M documents.

Dec 10 2020, 9:59 AM · System administrators, Staging environment, Journal, Archive search
vsellier accepted D4711: test_journal_client: Migrate to pytest.
Dec 10 2020, 9:48 AM
vsellier accepted D4703: docker-compose.search.yml: Upgrade elasticsearch container.
Dec 10 2020, 9:46 AM
vsellier accepted D4702: docker-compose.search.yml: Specify the search journal client config.
Dec 10 2020, 9:46 AM