
swh scanner db import does not validate SWHIDs
Closed, Migrated, Edits Locked

Description

As per title, swh scanner db import imports every line of the input file and happily stores it as a "SWHID" in the generated SQLite DB, whether or not the line is a well-formed SWHID. Instead, it should barf on invalid SWHIDs.

However, parse_swhid currently isn't great performance-wise (T2788), so it might be worth waiting for that to be fixed before using it to validate SWHIDs. (Or, alternatively, use a faster way to validate SWHIDs for db import, like a regex.)
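
For illustration, the regex alternative could look roughly like the sketch below. It assumes the input file holds one core SWHID per line (version 1, no qualifiers); the names is_valid_core_swhid and validated_swhids are hypothetical and not part of swh-scanner or swh-model.

```python
import re
from typing import Iterable, Iterator

# Core SWHID: "swh:1:<object type>:<40 lowercase hex digits>".
# Assumption: db import only needs core SWHIDs, so qualifiers are not accepted.
_CORE_SWHID_RE = re.compile(r"swh:1:(?:cnt|dir|rel|rev|snp):[0-9a-f]{40}")


def is_valid_core_swhid(candidate: str) -> bool:
    """Return True if the whole string is a syntactically valid core SWHID."""
    return _CORE_SWHID_RE.fullmatch(candidate) is not None


def validated_swhids(lines: Iterable[str]) -> Iterator[str]:
    """Yield SWHIDs from an iterable of lines, raising on the first invalid one."""
    for lineno, raw in enumerate(lines, start=1):
        swhid = raw.rstrip("\n")
        if not is_valid_core_swhid(swhid):
            raise ValueError(f"line {lineno}: invalid SWHID: {swhid!r}")
        yield swhid
```

The import loop would then iterate over validated_swhids(input_file) instead of the raw lines, so a malformed line aborts the import with its line number rather than ending up in the DB. This checks syntax only; it does not verify that the hash corresponds to an existing object.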