
[swh-search] Improve the index/mapping migration process
Closed, Migrated


The current implementation uses the same index name for both indexing and searching.
The major drawback is that it makes maintenance operations on the index complicated and creates downtime on the public search service.
Now that this search is used in production, such downtime should be avoided as far as possible.

A proposal to make upgrades easier is to use two aliases, one for searching and one for indexing.

The current procedure to upgrade the origin index from version X to version Y is:

  • stop the indexing
  • create a new index originvX and copy the origin mapping to it
  • copy the origin content to originvX with a reindex operation
  • delete the origin index; the public search is impacted from this moment
  • recreate the origin index via the swh-search cli; the public search works again, but few or no results are returned as the new index is empty
  • copy the content of originvX to origin with a reindex operation; the public search returns more results as the reindex progresses (it can take several days)
  • delete the originvX index
  • restart the indexing
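
To make the downtime window concrete, here is a minimal sketch that replays the steps above on in-memory dictionaries standing in for the indices. It is only an illustration, not swh-search code: the function name `legacy_upgrade` and the sample documents are hypothetical, and a real reindex takes days rather than being instantaneous.

```python
def legacy_upgrade(docs):
    """Replay the in-place upgrade on in-memory stand-ins for the indices.

    Returns a list of (step, visible) pairs, where `visible` is the number of
    documents the public search can see after the step, or None when the
    "origin" index does not exist and the search fails.
    """
    indices = {"origin": dict(docs)}
    timeline = []

    def record(step):
        # The public search always reads the index named "origin".
        origin = indices.get("origin")
        timeline.append((step, None if origin is None else len(origin)))

    indices["originvX"] = {}                        # create the temporary index
    record("create originvX")
    indices["originvX"].update(indices["origin"])   # reindex origin -> originvX
    record("reindex origin -> originvX")
    del indices["origin"]                           # search starts failing here
    record("delete origin")
    indices["origin"] = {}                          # recreated empty, new mapping
    record("recreate origin")
    indices["origin"].update(indices["originvX"])   # reindex originvX -> origin
    record("reindex originvX -> origin")
    del indices["originvX"]
    record("delete originvX")
    return timeline


if __name__ == "__main__":
    for step, visible in legacy_upgrade({"o1": {}, "o2": {}, "o3": {}}):
        print(f"{step:28} search sees: {visible}")
```

The third and fourth entries of the timeline are the problematic ones: the search first has no index to query, then queries an empty one.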

With aliases, the process could be as follows.
Given a current index origin-vX:

  • stop the indexing
  • create a new index origin-vY and configure the write alias to use it
  • copy the content of origin-vX to origin-vY with a reindex operation (can take several days)
  • update the search alias to use origin-vY
  • delete the origin-vX index

The public search is not impacted, as a fully populated index is always available. Only new updates are delayed, for the duration of the reindex operation.
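
The alias-based steps above can be sketched as a plan of REST requests. This assumes an Elasticsearch-style API (index creation with `PUT /<index>`, `POST /_aliases` with `add`/`remove` actions, `POST /_reindex`). The function name `alias_migration_plan` and the alias names used in the example are hypothetical and not taken from swh-search. The sketch only builds the requests; it does not send them.

```python
import json


def alias_migration_plan(old_index, new_index, read_alias, write_alias, mapping):
    """Build the ordered (method, path, body) requests for the migration."""

    def move_alias(alias):
        # Both actions of one _aliases request are applied together, so the
        # alias never points at nothing.
        return {
            "actions": [
                {"remove": {"index": old_index, "alias": alias}},
                {"add": {"index": new_index, "alias": alias}},
            ]
        }

    return [
        # 1. create the new index with the new mapping
        ("PUT", f"/{new_index}", {"mappings": mapping}),
        # 2. point the write alias at the new index
        ("POST", "/_aliases", move_alias(write_alias)),
        # 3. copy the existing documents (the long-running step)
        ("POST", "/_reindex",
         {"source": {"index": old_index}, "dest": {"index": new_index}}),
        # 4. switch the search alias once the new index is fully populated
        ("POST", "/_aliases", move_alias(read_alias)),
        # 5. drop the old index
        ("DELETE", f"/{old_index}", None),
    ]


if __name__ == "__main__":
    plan = alias_migration_plan(
        "origin-vX", "origin-vY", "origin-read", "origin-write",
        {"properties": {"url": {"type": "text"}}},  # placeholder mapping
    )
    for method, path, body in plan:
        print(method, path, json.dumps(body) if body is not None else "")
```

Because the search alias is only moved in step 4, searches keep hitting the fully populated old index during the whole reindex.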

The changes to implement on swh-search should be:

  • explicitly name the index and the aliases to use in the configuration
  • In the initialization function:
    • test if the search alias exists; if not, create it and configure it to use the index, otherwise do nothing
    • test if the write alias exists; if not, create it and configure it to use the index, otherwise do nothing
    • test if the index exists; if not, create it and apply the mapping
  • Always call the initialization method during startup to ensure the index exists and the mapping is applied. This avoids starting to index with an auto-generated (and divergent) mapping
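
A minimal sketch of such an idempotent initialization, run against an in-memory stand-in for the cluster so that the logic can be read in isolation. The class `InMemoryCluster`, the function `initialize`, and the configuration keys and names are all hypothetical; this is not the swh-search implementation. The sketch creates the index before the aliases, on the assumption that an alias needs an existing index to point at.

```python
class InMemoryCluster:
    """Tiny stand-in for the search backend, for illustration only."""

    def __init__(self):
        self.indices = {}   # index name -> mapping
        self.aliases = {}   # alias name -> index name


def initialize(cluster, config, mapping):
    """Create the index and both aliases if missing; otherwise do nothing."""
    index = config["index"]

    # Create the index with the explicit mapping, so indexing never starts
    # against an auto-generated one.
    if index not in cluster.indices:
        cluster.indices[index] = mapping

    # An existing alias is left untouched: it may already point at another
    # index in the middle of a migration.
    for key in ("read_alias", "write_alias"):
        alias = config[key]
        if alias not in cluster.aliases:
            cluster.aliases[alias] = index


if __name__ == "__main__":
    config = {
        "index": "origin-v1",            # hypothetical names
        "read_alias": "origin-read",
        "write_alias": "origin-write",
    }
    cluster = InMemoryCluster()
    initialize(cluster, config, {"properties": {}})
    initialize(cluster, config, {"properties": {}})  # second call changes nothing
    print(cluster.indices, cluster.aliases)
```

Since every step is guarded by an existence check, calling the function at each startup is safe.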

Furthermore, this first step will allow an automatic migration to be implemented when needed.

Event Timeline

vsellier renamed this task from [use index aliases] to [swh-search] Improve the migration process. Mar 1 2021, 1:00 PM
vsellier renamed this task from [swh-search] Improve the migration process to [swh-search] Improve the index/mapping migration process.
vsellier triaged this task as Normal priority.
vsellier created this task.
vsellier changed the task status from Open to Work in Progress. Mar 1 2021, 3:54 PM

The new configuration is deployed. swh-search is now using the aliases, which should help with future upgrades.