The visit_types indexing was added in swh-search:0.6.0.
The mapping of the production index needs to be updated (for visit_types + metadata date fields).
The deployment needs to be rehearsed with some tests first, in order to limit the search downtime.
Status | Assigned | Task
---|---|---
Migrated | gitlab-migration | T2869 web search: allow to filter by origin type
Migrated | gitlab-migration | T3061 swh-search: Deploy visit_types indexation in production
The mapping is correctly updated when the initialize command is called.
For example, for a migration tested in docker (with origins and metadata ingested):
❯ diff -U30 /tmp/mapping-0.5.0.json /tmp/mapping-0.6.1.json
--- /tmp/mapping-0.5.0.json 2021-02-19 15:06:31.222879537 +0100
+++ /tmp/mapping-0.6.1.json 2021-02-19 15:33:58.465084816 +0100
@@ -1,33 +1,34 @@
 {
   "origin" : {
     "mappings" : {
+      "date_detection" : false,
       "properties" : {
 ...
         },
         "http://schema" : {      <------ Automatic mappings are indeed present
           "properties" : {
             "org/author" : {
               "properties" : {
                 "@list" : {
                   "properties" : {
                     "@type" : {
                       "type" : "text",
                       "fields" : {
                         "keyword" : {
                           "type" : "keyword",
@@ -125,35 +126,38 @@
 ...
         "sha1" : {
           "type" : "keyword"
         },
         "url" : {
           "type" : "text",
           "fields" : {
             "as_you_type" : {
               "type" : "search_as_you_type",
               "doc_values" : false,
               "analyzer" : "simple",
               "max_shingle_size" : 3
             }
           },
           "analyzer" : "simple"
+        },
+        "visit_types" : {
+          "type" : "keyword"
         }
       }
     }
   }
 }
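The same comparison can be done programmatically, which helps when the mapping is too large to read as a diff. The following is a minimal sketch, assuming both mappings have been loaded as Python dictionaries (for instance with `json.load`). The helpers `leaf_paths` and `mapping_changes` are hypothetical names, not part of swh-search, and the two dictionaries are trimmed-down stand-ins for the real mappings.

```python
def leaf_paths(mapping, prefix=()):
    """Return the set of key paths (as tuples) leading to every leaf value."""
    paths = set()
    for key, value in mapping.items():
        path = prefix + (key,)
        if isinstance(value, dict) and value:
            # Non-empty dict: descend into it.
            paths |= leaf_paths(value, path)
        else:
            paths.add(path)
    return paths


def mapping_changes(old, new):
    """Return (added, removed) leaf paths between two mappings, sorted."""
    old_paths, new_paths = leaf_paths(old), leaf_paths(new)
    return sorted(new_paths - old_paths), sorted(old_paths - new_paths)


# Illustrative stand-ins only, not the full production mappings.
old = {
    "properties": {
        "sha1": {"type": "keyword"},
        "url": {"type": "text", "analyzer": "simple"},
    }
}
new = {
    "date_detection": False,
    "properties": {
        "sha1": {"type": "keyword"},
        "url": {"type": "text", "analyzer": "simple"},
        "visit_types": {"type": "keyword"},
    },
}

added, removed = mapping_changes(old, new)
for path in added:
    print("+", " > ".join(path))
for path in removed:
    print("-", " > ".join(path))
```

Paths are kept as tuples rather than joined strings because the automatically mapped keys (such as "http://schema") already contain separators like "/" and ":". An empty `removed` list is the property of interest here: the update only adds fields and leaves the existing ones in place.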
So the migration can be performed with the following actions:
root@search1:/etc/systemd/system# apt list --upgradable
Listing... Done
python3-swh.search/unknown 0.6.1-1~swh1~bpo10+1 all [upgradable from: 0.5.0-1~swh1~bpo10+1]
python3-swh.storage/unknown 0.23.2-1~swh1~bpo10+1 all [upgradable from: 0.23.1-1~swh1~bpo10+1]
root@search1:/etc/systemd/system# apt dist-upgrade
% curl -s http://${ES_SERVER}/origin/_mapping\?pretty | jq '.origin.mappings' > mapping-v0.5.0.json
swhstorage@search1:~$ /usr/bin/swh search --config-file /etc/softwareheritage/search/server.yml initialize
INFO:elasticsearch:HEAD http://search-esnode1.internal.softwareheritage.org:9200/origin [status:200 request:0.036s]
INFO:elasticsearch:PUT http://search-esnode2.internal.softwareheritage.org:9200/origin/_mapping [status:200 request:0.196s]
Done.
% curl -s http://${ES_SERVER}/origin/_mapping\?pretty | jq '.origin.mappings' > mapping-v0.6.1.json
% diff -U3 mapping-v0.5.0.json mapping-v0.6.1.json
--- mapping-v0.5.0.json 2021-02-19 15:10:23.336628008 +0000
+++ mapping-v0.6.1.json 2021-02-19 15:12:50.660635267 +0000
@@ -1,4 +1,5 @@
 {
+  "date_detection": false,
   "properties": {
     "has_visits": {
       "type": "boolean"
@@ -25,6 +26,9 @@
         }
       },
       "analyzer": "simple"
+    },
+    "visit_types": {
+      "type": "keyword"
     }
   }
 }
% /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server $SERVER --reset-offsets --topic swh.journal.objects.origin_visit --to-earliest --group swh.search.journal_client --execute
GROUP                      TOPIC                             PARTITION  NEW-OFFSET
swh.search.journal_client  swh.journal.objects.origin_visit  16         0
swh.search.journal_client  swh.journal.objects.origin_visit  10         0
swh.search.journal_client  swh.journal.objects.origin_visit  66         0
...
root@search1:/etc/systemd/system# systemctl start gunicorn-swh-search.service
root@search1:/etc/systemd/system# systemctl start swh-search-journal-client@objects.service
root@search1:/etc/systemd/system# puppet agent --enable
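The offset reset in the steps above covers many partitions, so it can be worth checking its output before the journal client is restarted. The following is a minimal sketch, assuming the output of kafka-consumer-groups.sh was captured as text in the four-column layout shown above. `parse_reset_output` is a hypothetical helper, and the sample contains only the three partitions visible in the transcript.

```python
def parse_reset_output(text):
    """Map (topic, partition) to the NEW-OFFSET reported by --reset-offsets."""
    offsets = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only data rows: four columns with numeric partition and offset.
        if len(fields) != 4:
            continue
        group, topic, partition, new_offset = fields
        if not (partition.isdigit() and new_offset.isdigit()):
            continue  # header row or unrelated output
        offsets[(topic, int(partition))] = int(new_offset)
    return offsets


# Sample limited to the rows shown in the transcript above.
sample = """\
GROUP                      TOPIC                             PARTITION  NEW-OFFSET
swh.search.journal_client  swh.journal.objects.origin_visit  16         0
swh.search.journal_client  swh.journal.objects.origin_visit  10         0
swh.search.journal_client  swh.journal.objects.origin_visit  66         0
"""

offsets = parse_reset_output(sample)
# Assumption: the earliest offset is 0, as in the output above.
not_at_zero = {key: value for key, value in offsets.items() if value != 0}
print(f"{len(offsets)} partitions parsed, {len(not_at_zero)} not at offset 0")
```

Comparing against 0 matches the output shown here. On a topic where older segments have already been deleted, the earliest available offset can be higher, so a non-zero value is not necessarily an error.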