
Add various markdown variants to list of intrinsic metadata files to be indexed
Open, Normal, Public

Description

There is a steady trend of new projects using a markdown/rst/html version of the traditional text files that hold information about the project: one finds README.md, AUTHORS.rst, INSTALL.txt, even LICENSE.html, with all variants of uppercase/lowercase in the name. Sometimes these variants sit side by side with the plain text ones (README, AUTHORS, etc.), sometimes they just replace them.

The metadata indexing pipeline needs to look for all these variants to make sure we do not miss relevant metadata information.

This is related to T2064, but not limited to deposits.
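
As a rough illustration of the matching described above, here is a minimal sketch of how the filename variants could be recognised. It is not the indexer's actual code: the names `BASE_NAMES`, `EXTENSIONS` and `match_metadata_file` are hypothetical, and the lists only cover the examples mentioned in this task.

```python
import re

# Hypothetical lists, taken from the examples in this task; the real
# indexer configuration may use different names and extensions.
BASE_NAMES = ("README", "AUTHORS", "INSTALL", "LICENSE")
EXTENSIONS = ("md", "markdown", "rst", "txt", "html")

# Base name, optionally followed by one known extension, any letter case.
_PATTERN = re.compile(
    r"(?P<base>{})(?:\.(?P<ext>{}))?".format(
        "|".join(map(re.escape, BASE_NAMES)),
        "|".join(map(re.escape, EXTENSIONS)),
    ),
    re.IGNORECASE,
)


def match_metadata_file(filename):
    """Return (BASE, ext) for a recognised variant, or None otherwise.

    BASE is upper-cased and ext is lower-cased (None for plain files),
    so README, readme.md and ReadMe.MD all map to the same base name.
    """
    if isinstance(filename, bytes):
        # Assumption: directory entry names may arrive as raw bytes.
        filename = filename.decode("utf-8", "replace")
    m = _PATTERN.fullmatch(filename)
    if m is None:
        return None
    ext = m.group("ext")
    return (m.group("base").upper(), ext.lower() if ext else None)
```

Normalising to a (base, extension) pair makes it easy to notice when a plain text file and a markup variant sit side by side in the same directory, and to decide which one to index.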