textual search language for the Web UI
Open, Normal, Public


The current Web UI search accepts a single string, which is used as a list of tokens to search for in either URLs or metadata.

We want a "search engine-like" sub-syntax that, without becoming too structured, allows users to express specific facets.

Here are facets that come to mind as already useful today:

  • loader: (or "visit type"; not sure what the best name for this would be) to filter on which loader has been used. Without this, we cannot easily select all PyPI, Debian, or CRAN packages today, for instance.
  • last_visited: (with some relational operator) to return only results that were last visited in a given time frame
  • metadata fields: our metadata indexing extracts a number of properties that would be great to filter on individually, rather than lumping them all together into a single full-text search (this requires some care to avoid clashes between the metadata key namespace and the search facet namespace)
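
To make the idea more concrete, here is a minimal sketch of how such a query string could be split into facets and plain search tokens. Nothing here is an agreed-upon syntax: the `key:value` form, the relational operators after the colon, the `meta.` prefix used to keep metadata keys apart from search facets, and the names `parse_query`, `Facet`, and `Query` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field

# Facet names taken from the list above; the exact names are still open.
KNOWN_FACETS = {"loader", "last_visited"}
# Hypothetical prefix that keeps metadata keys out of the facet namespace.
METADATA_PREFIX = "meta."

# key, a colon, an optional relational operator, then a non-empty value.
FACET_RE = re.compile(
    r"^(?P<key>[A-Za-z_][\w.]*):(?P<op>>=|<=|>|<|=)?(?P<value>.+)$"
)


@dataclass
class Facet:
    key: str
    op: str
    value: str


@dataclass
class Query:
    tokens: list = field(default_factory=list)  # plain full-text tokens
    facets: list = field(default_factory=list)  # recognized key:value facets


def parse_query(text: str) -> Query:
    """Split a search string into free-text tokens and facets."""
    query = Query()
    for word in text.split():
        match = FACET_RE.match(word)
        # Only known facets and prefixed metadata keys count as facets, so a
        # URL such as https://example.org stays an ordinary search token.
        if match and (
            match["key"] in KNOWN_FACETS
            or match["key"].startswith(METADATA_PREFIX)
        ):
            query.facets.append(
                Facet(match["key"], match["op"] or "=", match["value"])
            )
        else:
            query.tokens.append(word)
    return query


query = parse_query(
    "django loader:pypi last_visited:>=2020-01-01 meta.license:MIT"
)
```

In this sketch, a word with an unrecognized key falls back to being a plain token, which keeps today's behaviour for existing searches. Splitting on whitespace means quoted values containing spaces are not supported; a real implementation would need a proper tokenizer for that.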

Other stuff that might be added in the future, provided we have suitable backend indexing:

  • project size: filter on number of, e.g., commits
  • file content: filter on projects that contain a given substring. This one hints at the fact that we will probably want a selector determining which type of object is returned, e.g., origin (the only possibility today) vs. a specific type of Merkle DAG node

Event Timeline

zack triaged this task as Normal priority. (Jan 29 2020, 1:31 PM)
zack created this task.
zack updated the task description. (Jan 29 2020, 2:31 PM)