Test more edge cases of metadata indexer mappings
Closed, Migrated. Edits Locked

Assigned To

Authored By

	vlorentz
	Jan 16 2019, 11:42 AM

Description

The metadata indexer crawls the archive's content to extract metadata files. Mappings are the part of this indexer responsible for translating the many metadata dialects into the CodeMeta format.

As they have to read data from all source code we know of, they may encounter many edge cases of these formats, which may or may not be valid.

Currently, only the most frequent edge cases are dealt with (see swh/indexer/tests/test_metadata.py and the diffs linked to this task). We should test more of them to check that the mappings behave correctly, even though "correct behavior" isn't always properly defined.
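The kind of test this task asks for can be sketched as a table of malformed inputs run against a mapping. This is a minimal sketch under stated assumptions: `translate_package_json`, `EDGE_CASES`, and `run_edge_cases` are hypothetical names, the mapping is deliberately simplified, and none of it is the actual swh.indexer code or API.

```python
import json
from typing import Any, Dict, Optional


def translate_package_json(raw: bytes) -> Optional[Dict[str, Any]]:
    """Hypothetical, simplified mapping (not the swh.indexer implementation).

    Returns a dict with the recognised string fields, or None when the
    file cannot be interpreted as a JSON object at all.
    """
    try:
        data = json.loads(raw.decode("utf-8"))
    except ValueError:
        # Covers both UnicodeDecodeError and json.JSONDecodeError,
        # which are subclasses of ValueError.
        return None
    if not isinstance(data, dict):
        return None  # e.g. a top-level list, string, or null
    result: Dict[str, Any] = {}
    for key in ("name", "description"):
        value = data.get(key)
        if isinstance(value, str):  # silently drop wrongly typed values
            result[key] = value
    return result


# (input bytes, expected output) pairs; each one is an edge case.
EDGE_CASES = [
    (b"", None),                                       # empty file
    (b"\xff\xfe", None),                               # not valid UTF-8
    (b"{", None),                                      # truncated JSON
    (b"null", None),                                   # valid JSON, not an object
    (b"[]", None),                                     # valid JSON, not an object
    (b"{}", {}),                                       # minimal object
    (b'{"name": ["a", "b"]}', {}),                     # wrong type for a field
    (b'{"name": "foo", "dict": 1}', {"name": "foo"}),  # unrecognised key
]


def run_edge_cases():
    """Return the inputs whose output differs from the expectation."""
    return [raw for raw, expected in EDGE_CASES
            if translate_package_json(raw) != expected]
```

Keeping the cases in a plain list makes it cheap to add a new one each time a crash shows up in production. The expected values here encode one possible definition of "correct behavior" (ignore what cannot be parsed, never raise), which is an assumption of this sketch.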

Revisions and Commits

rDCIDX Metadata indexer
	Closed	D1070 Prevent name clash when a metadata file has a key named 'dict'.
	Closed	D1071 Catch encoding errors when parsing pom.xml.
	Closed	D971 Fix parsing of the Description field in PKG-INFO.
	Closed	D965 Add more type checks to sanitize Mappings' input.
	Closed	D960 Maven mapping: fix crash on minimal pom.xml.
	Closed	D961 Maven mapping: fix crash on empty or invalid pom.xml.

Related Objects

Mentioned In: D956: Add gemspec mapping.

Event Timeline

vlorentz triaged this task as Normal priority. (Jan 16 2019, 11:42 AM)

vlorentz created this task.

vlorentz mentioned this in D956: Add gemspec mapping.

vlorentz added revisions: D961: Maven mapping: fix crash on empty or invalid pom.xml.; D960: Maven mapping: fix crash on minimal pom.xml. (Jan 16 2019, 12:08 PM)

"AttributeError on @id with a colon but less than two slashes" https://github.com/digitalbazaar/pyld/issues/91 -> T4436

vlorentz renamed this task from "Add more test for edge cases of indexer mappings." to "Add more tests for edge cases of indexer mappings." (Jan 21 2019, 12:37 PM)

vlorentz added revisions: D965: Add more type checks to sanitize Mappings' input.; D971: Fix parsing of the Description field in PKG-INFO.

vlorentz added revisions: D1071: Catch encoding errors when parsing pom.xml.; D1070: Prevent name clash when a metadata file has a key named 'dict'. (Feb 4 2019, 2:09 PM)

zack renamed this task from "Add more tests for edge cases of indexer mappings." to "Add more tests for edge cases of indexer mappings" (May 25 2019, 5:31 PM)

vlorentz removed vlorentz as the assignee of this task. (Jan 22 2020, 3:36 PM)

vlorentz added a project: Easy hack. (Jan 27 2020, 4:35 PM)

vlorentz renamed this task from "Add more tests for edge cases of indexer mappings" to "Test more edge cases of metadata indexer mappings" (Jan 27 2020, 4:39 PM)

vlorentz updated the task description.

KShivendu added a subscriber: KShivendu. (Feb 26 2021, 6:57 PM)

In T1475#27210, @vlorentz wrote:

"AttributeError on @id with a colon but less than two slashes" https://github.com/digitalbazaar/pyld/issues/91

This one might get fixed soon (based on this commit). So we need to look for other edge cases.

rohan-sachan added a subscriber: rohan-sachan. (Mar 25 2021, 7:12 AM)

Sentry issue: SWH-INDEXER-HH

(URL is http:\\www.jeremiasandris.de; same bug as above)
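An input like the one above (a scheme and a colon, but no `//`) could be screened before it is used as a JSON-LD `@id`. The following is only a sketch of such a guard, assuming a hypothetical helper named `looks_like_absolute_uri`. It is not the fix applied in swh-indexer or in pyld.

```python
from urllib.parse import urlsplit


def looks_like_absolute_uri(value: object) -> bool:
    """Hypothetical guard: True only for strings that have both a scheme
    and an authority part (i.e. "scheme://host...")."""
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. unbalanced brackets.
        return False
    return bool(parts.scheme and parts.netloc)


# The value reported in the Sentry issue: backslashes instead of slashes,
# so there is a scheme but no authority part.
suspicious = r"http:\\www.jeremiasandris.de"
```

With this check, `looks_like_absolute_uri(suspicious)` is false while an ordinary `https://` URL passes. The guard is deliberately strict: it would also reject legitimate identifiers without an authority, such as `mailto:` or `urn:` URIs, so whether that trade-off is acceptable depends on which fields it is applied to.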

Sentry issue: SWH-INDEXER-HH

(forget this, I was trying to point to the wrong task)

This task has been migrated to GitLab.
