
Full-text search on source code (prototype)
Open, Wishlist, Public

Description

We would like a prototype of full-text search over the contents of the archive.
An efficient way to do this is to exploit the Merkle graph, in which each distinct file content is stored only once, as follows:

  • index only the file contents (each content may appear in many different places)
  • use the swh-graph and/or provenance index to show the results in context
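The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed implementation: `ContentIndex`, `content_id`, `add_file` and `search` are hypothetical names, the in-memory `occurrences` map is merely a stand-in for what swh-graph or the provenance index would provide, and a plain SHA-1 of the bytes stands in for a real content identifier.

```python
import hashlib
import re
from collections import defaultdict


def content_id(data: bytes) -> str:
    """Hypothetical content identifier: SHA-1 of the raw bytes."""
    return hashlib.sha1(data).hexdigest()


class ContentIndex:
    """Toy inverted index over deduplicated file contents (assumption: fits in memory)."""

    def __init__(self) -> None:
        # token -> ids of the distinct contents containing that token
        self.postings = defaultdict(set)
        # content id -> every (origin, path) where that content occurs;
        # stand-in for a lookup in swh-graph or the provenance index
        self.occurrences = defaultdict(set)

    def add_file(self, origin: str, path: str, data: bytes) -> None:
        cid = content_id(data)
        first_time = cid not in self.occurrences
        self.occurrences[cid].add((origin, path))
        if first_time:
            # Tokenize and index the text only once per distinct content.
            text = data.decode("utf-8", errors="replace").lower()
            for token in set(re.findall(r"\w+", text)):
                self.postings[token].add(cid)

    def search(self, word: str) -> dict:
        """Return matching content ids, each with its occurrences as context."""
        hits = self.postings.get(word.lower(), set())
        return {cid: sorted(self.occurrences[cid]) for cid in hits}


# Usage: the same content found in two origins is indexed once but reported twice.
idx = ContentIndex()
data = b"int main(void) { return 0; }\n"
idx.add_file("https://example.org/a.git", "src/main.c", data)
idx.add_file("https://example.org/b.git", "main.c", data)
idx.add_file("https://example.org/b.git", "README", b"hello world\n")
results = idx.search("main")
```

The point of the split is that the cost of indexing grows with the number of distinct contents, while the (much larger) number of places where each content appears is only consulted at query time.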

Goal: a working prototype on a subset of at least 1% of the archive.

Related work:

Event Timeline

douardda renamed this task from Full-text search on a sizeable archive subset to Full-text search (prototype). Jan 20 2020, 2:15 PM
douardda created this task.
vlorentz triaged this task as Normal priority. Jan 22 2020, 4:19 PM
zack renamed this task from Full-text search (prototype) to Full-text search on source code (prototype). Sep 4 2020, 11:38 AM
zack updated the task description.
rdicosmo lowered the priority of this task from Normal to Wishlist. Mar 8 2021, 10:46 AM
rdicosmo edited projects, added Roadmap 2021; removed Roadmap 2020.