Test more edge cases of metadata indexer mappings
Closed, Migrated

Description

The metadata indexer crawls the archive's content to extract metadata files. Mappings are the part of this indexer responsible for translating the many metadata dialects into the CodeMeta format.

As they have to read data from all source code we know of, they may encounter many edge cases of these formats, which may or may not be valid.

Currently, only the most frequent edge cases are handled (see swh/indexer/tests/test_metadata.py and the diffs linked to this task). We should test more of them to check that the mappings behave correctly, even though "correct behavior" isn't always properly defined.

Event Timeline

vlorentz triaged this task as Normal priority. Jan 16 2019, 11:42 AM
vlorentz created this task.

"AttributeError on @id with a colon but less than two slashes" https://github.com/digitalbazaar/pyld/issues/91 -> T4436

vlorentz renamed this task from "Add more test for edge cases of indexer mappings." to "Add more tests for edge cases of indexer mappings.". Jan 21 2019, 12:37 PM
zack renamed this task from "Add more tests for edge cases of indexer mappings." to "Add more tests for edge cases of indexer mappings". May 25 2019, 5:31 PM
vlorentz renamed this task from "Add more tests for edge cases of indexer mappings" to "Test more edge cases of metadata indexer mappings". Jan 27 2020, 4:39 PM
vlorentz updated the task description.
vlorentz updated the task description.

"AttributeError on @id with a colon but less than two slashes" https://github.com/digitalbazaar/pyld/issues/91

This one might get fixed soon (based on this commit). So we need to look for other edge cases.

Sentry issue: SWH-INDEXER-HH

(URL is http:\\www.jeremiasandris.de; same bug as above)
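For illustration, a minimal sketch of how this class of input could be turned into a test case. It uses only the Python standard library; `looks_like_absolute_iri` and `EDGE_CASES` are hypothetical names invented for this example, not part of swh.indexer or pyld, and whether a mapping should drop or keep such values is still an open question.

```python
from urllib.parse import urlsplit

# Hypothetical helper, NOT part of swh.indexer or pyld: returns True only
# when the value has both a scheme and a network location ("scheme://host").
def looks_like_absolute_iri(value):
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:
        # urlsplit raises ValueError on some malformed inputs
        # (e.g. an unbalanced IPv6 bracket).
        return False
    return bool(parts.scheme) and bool(parts.netloc)

# (input, expected) pairs; the backslash URL is the one from the Sentry issue:
# it has a colon but no forward slashes at all.
EDGE_CASES = [
    ("https://example.org/project", True),
    (r"http:\\www.jeremiasandris.de", False),
    ("http:/example.org", False),
    ("", False),
    (None, False),
]

for value, expected in EDGE_CASES:
    assert looks_like_absolute_iri(value) is expected, value
```

New malformed values found in the archive can be added to the list of pairs as they show up.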

(forget this, I was trying to point to the wrong task)