
Fix heterogeneity of names in metadata tables
Closed, Resolved (Public)

Description

The tables revision_metadata and origin_intrinsic_metadata in the indexer storage contain mostly the same data. The only differences are that the latter has a from_revision field, which holds the id of a revision, and a metadata_tsvector field, which is a kind of cache for full-text search queries on the metadata.

However, naming between the two is fairly inconsistent. The following should be done:

  • The table revision_metadata should be renamed to revision_intrinsic_metadata,
  • The column revision_metadata.translated_metadata should be renamed to revision_metadata.metadata,
  • The column origin_intrinsic_metadata.origin_id should be renamed to id.

The indexer code (swh/indexer/metadata.py) and storage API (swh/indexer/storage/*.py) endpoints should be updated to reflect this, as well as their tests (swh/indexer/tests/).
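
As a rough illustration of what the code and test updates amount to, the sketch below remaps old key names to the new ones. It assumes rows are passed around as plain dictionaries keyed by column name; the helper rename_row_keys and the OLD_TO_NEW_KEYS table are hypothetical and not part of swh.indexer.

```python
# Hypothetical helper for illustration only; not part of swh.indexer.
# Assumption: rows are plain dicts whose keys match the column names.
OLD_TO_NEW_KEYS = {
    "revision_metadata": {"translated_metadata": "metadata"},
    "origin_intrinsic_metadata": {"origin_id": "id"},
}


def rename_row_keys(table, row):
    """Return a copy of `row` with the old key names replaced by the new ones."""
    mapping = OLD_TO_NEW_KEYS.get(table, {})
    # Keys without an entry in the mapping are kept unchanged.
    return {mapping.get(key, key): value for key, value in row.items()}


if __name__ == "__main__":
    old_row = {"id": "rev1", "translated_metadata": {"name": "example"}}
    print(rename_row_keys("revision_metadata", old_row))
```

Such a helper would only be useful during a transition; the end state described in this task is that callers and tests use the new names directly.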

A database migration script should also be written (sql/upgrades/).
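
A minimal sketch of the statements such a migration might contain, assuming a PostgreSQL database and covering only the three renames listed above. The function migration_statements is hypothetical; a real upgrade script would probably also need to handle indexes, constraints, stored functions and the schema version, none of which is covered here.

```python
# Hypothetical sketch: builds the SQL for the three renames in this task.
# Assumption: PostgreSQL syntax. Nothing is executed; only strings are built.
COLUMN_RENAMES = [
    ("revision_metadata", "translated_metadata", "metadata"),
    ("origin_intrinsic_metadata", "origin_id", "id"),
]
TABLE_RENAMES = [
    ("revision_metadata", "revision_intrinsic_metadata"),
]


def migration_statements():
    """Return the ALTER TABLE statements, column renames first."""
    statements = []
    # Rename columns while the tables still carry their old names.
    for table, old, new in COLUMN_RENAMES:
        statements.append(f"ALTER TABLE {table} RENAME COLUMN {old} TO {new};")
    # Then rename the table itself.
    for old, new in TABLE_RENAMES:
        statements.append(f"ALTER TABLE {old} RENAME TO {new};")
    return statements


if __name__ == "__main__":
    print("\n".join(migration_statements()))
```

The column renames are issued before the table rename so that each statement refers to the table names used in the description above.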

Event Timeline

vlorentz triaged this task as Low priority. (Mar 1 2019, 2:43 PM)
vlorentz created this task.
haltode closed this task as Resolved. (Mar 13 2019, 1:28 PM)
haltode claimed this task.
haltode added a subscriber: haltode.

Fixed in 4f6ab3c9ab17.