Rewrite indexers as journal clients when relevant
Closed, Resolved (Public)

Description

Currently, only the metadata indexer is implemented as a journal client, and even it is only used to create one-shot tasks, with an indirection through the scheduler.

Rewriting the indexers as journal clients would:

  • simplify the stack by removing moving parts (the scheduler, storage database access for the content indexers, ...)
  • allow better monitoring (as we already have a Grafana dashboard for journal clients)
  • allow indexing to be retried [1] on error
  • prevent a single indexing failure from failing the whole batch
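
The last two points (retry on error, and isolating a single failure from the rest of the batch) can be sketched as follows. This is only an illustration under stated assumptions, not the actual indexer code: `index_batch`, `index_one`, `max_attempts` and the batch shape (a mapping from object type to a list of dicts) are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical batch shape: object type -> list of message dicts,
# similar to what a journal-client-style callback might receive.
Batch = Dict[str, List[dict]]


def index_batch(
    batch: Batch,
    index_one: Callable[[dict], dict],
    max_attempts: int = 3,
) -> Tuple[List[dict], List[Tuple[dict, str]]]:
    """Index every object of the batch, isolating failures per object.

    Returns the successful results and the (object, error message) pairs
    for objects that still failed after max_attempts tries.
    """
    results: List[dict] = []
    failures: List[Tuple[dict, str]] = []
    for objects in batch.values():
        for obj in objects:
            for attempt in range(1, max_attempts + 1):
                try:
                    results.append(index_one(obj))
                    break  # success: move on to the next object
                except Exception as exc:
                    # Retry; only record the failure on the last attempt,
                    # so one bad object never aborts the whole batch.
                    if attempt == max_attempts:
                        failures.append((obj, str(exc)))
    return results, failures
```

With this shape, objects that keep failing are reported separately and could be logged or re-queued, while the rest of the batch is still indexed.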

Indexers:

  • D7899: origin intrinsic metadata
  • D8149: origin extrinsic metadata
  • D8147: mimetype (content indexer)
  • D8156: fossology-indexer (content indexer)

Event Timeline

vlorentz triaged this task as Normal priority. May 24 2022, 5:29 PM
vlorentz created this task.
ardumont updated the task description.
ardumont updated the task description.