Differential D956

Add gemspec mapping.
Closed · Public

Authored by vlorentz on Jan 15 2019, 8:19 PM.

Details

Reviewers

douardda
ardumont

Group Reviewers

Maniphest Tasks

T1328: Add Ruby/Gem metadata indexer

Commits

rDCIDX8147565ac8d8: Add gemspec mapping.

Summary

Diff Detail

Repository

rDCIDX Metadata indexer

Lint

Automatic diff as part of commit; lint not applicable.

Unit

Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vlorentz created this revision. Jan 15 2019, 8:19 PM

Herald added a reviewer: Reviewers. Jan 15 2019, 8:19 PM

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/221/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/221/console

Harbormaster failed remote builds in B3549: Diff 3046! Jan 15 2019, 8:22 PM

vlorentz added a child revision: D957: Factorize list merges in indexer mappings. Jan 15 2019, 8:22 PM

douardda requested changes to this revision. Jan 16 2019, 10:21 AM

douardda added a subscriber: douardda.

douardda added inline comments.

swh/indexer/metadata_dictionary.py
502	For all logging statements, wouldn't it make sense to include the id of the object being indexed? I haven't checked how indexers work, so I'm not sure whether this id makes sense and is available at this level, but IMHO having warnings and error messages in the logs that we cannot easily link to a known entity (an origin, a revision, etc.) is pretty frustrating.
508–510	Unless I'm mistaken, the `match` variable is not used, so why not `for line in lines: if self._re_spec_new.match(line): break`? Another solution is to use the `dropwhile` iterator (itertools), something like `lines = dropwhile(lambda x: not self._re_spec_new.match(x), lines.split('\n'))`, and move the emptiness detection to the end of the function.
538–539	I don't much like this kind of `if isinstance(str)`... What are the expected types for `s`? Can it be something other than str or None? If not, I prefer `if s is not None:`
542	Why forbid tuples or iterators?
swh/indexer/tests/test_metadata.py
737	When writing tests for this kind of feature, please also add edge cases. How about invalid spec files? What if the description (or any other field) is a (literal) multi-line string/list/dict? One test for the most basic and "friendly" data is not enough.
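
The `dropwhile` suggestion for lines 508–510 can be sketched as follows. This is only an illustration of the idea, not the code from the diff: the pattern `RE_SPEC_NEW` and the function name `lines_from_spec_start` are hypothetical stand-ins (the real `_re_spec_new` may differ), and the emptiness check is done by the caller on the returned list.

```python
import re
from itertools import dropwhile

# Hypothetical pattern standing in for `self._re_spec_new`; the regex used
# in the actual diff may be different.
RE_SPEC_NEW = re.compile(r".*Gem::Specification\.new\b")


def lines_from_spec_start(content: str) -> list:
    """Return the lines of `content` starting at the first line matching
    RE_SPEC_NEW, or an empty list if no line matches."""
    lines = content.split("\n")
    # dropwhile skips lines as long as the predicate is true, i.e. until
    # the first line that matches the pattern.
    return list(dropwhile(lambda line: not RE_SPEC_NEW.match(line), lines))


example = "# comment\nGem::Specification.new do |s|\n  s.name = 'example'\nend"
remaining = lines_from_spec_start(example)
if not remaining:
    # The "emptiness" detection mentioned in the comment would go here.
    print("no Gem::Specification.new found")
```

With this shape, a file that never opens a specification block simply yields an empty list, so the check for a missing block moves to a single place after the iteration.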

This revision now requires changes to proceed. Jan 16 2019, 10:21 AM

ardumont added a subscriber: ardumont. Jan 16 2019, 10:21 AM

ardumont added inline comments.

swh/indexer/metadata_dictionary.py
527	Maybe put this in a `parse_ruby_expression` method.
529	It'd be interesting to know what's not supported, so a log here would be nice (with the content's id).

Also, the build is failing for an import reason (swh.scheduler.conftest...); maybe a version bump is missing somewhere?

vlorentz marked 7 inline comments as done. Jan 16 2019, 11:42 AM

vlorentz added inline comments.

swh/indexer/metadata_dictionary.py
502	T1474
529	There's a lot of stuff that's not supported (most of which would need a real Ruby interpreter), so it would get very spammy.
538–539	Any basic data type (str, list, dict, bytes, int, float, tuple); we're parsing arbitrary and untrusted input.
542	Ruby does not have tuple literals, and `ast.literal_eval` never returns iterators.
swh/indexer/tests/test_metadata.py
737	T1475

Use itertools instead of a for loop.
Move Ruby evaluation to its own function.

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/225/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/225/console

Harbormaster failed remote builds in B3558: Diff 3052! Jan 16 2019, 11:45 AM

Sanitize the output of eval_ruby_expression.
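
One possible reading of this update, sketched under assumptions: evaluate the expression with `ast.literal_eval`, then keep the result only if it has an expected type. The function names below are hypothetical, and the accepted types (a string or a list of strings) are an assumption rather than something taken from the diff.

```python
import ast


def is_str_or_list_of_str(value) -> bool:
    """True for a str, or for a list whose items are all str."""
    if isinstance(value, str):
        return True
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def eval_then_sanitize(expr: str):
    """Evaluate `expr` as a literal, then discard any result whose type
    is not explicitly allowed. Returns None when rejected."""
    try:
        value = ast.literal_eval(expr)
    except (SyntaxError, ValueError, TypeError):
        # Not a literal at all (e.g. a name or a function call).
        return None
    # literal_eval can also return dicts, tuples, bytes, numbers, etc.,
    # so the type is checked after the fact.
    return value if is_str_or_list_of_str(value) else None
```

The weakness of this shape is that the whole literal is built before it is inspected, so every type `ast.literal_eval` can produce has to be anticipated by the check.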

Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/229/ for more details.

Harbormaster completed remote builds in B3571: Diff 3062. Jan 16 2019, 6:45 PM

Re-implement a safe subset of ast.literal_eval instead of checking the type of its output afterward.
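
A minimal sketch of the approach this update describes: parse the expression with `ast.parse` and walk the tree, accepting only a small allowlist of node types, instead of validating the output of `ast.literal_eval` afterward. This is not the committed implementation. The name `eval_literal_subset` is hypothetical, the allowlist (string constants and lists of them) is an assumption, and `ast.Constant` requires Python 3.8 or later.

```python
import ast


def eval_literal_subset(expr: str):
    """Evaluate `expr` only if it is a string literal or a (possibly
    nested) list of string literals. Returns None for anything else."""

    def _evaluate(node):
        # Allowlist: any node type not handled here is rejected.
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            return node.value
        if isinstance(node, ast.List):
            return [_evaluate(element) for element in node.elts]
        raise ValueError(f"unsupported node: {type(node).__name__}")

    try:
        tree = ast.parse(expr, mode="eval")
        return _evaluate(tree.body)
    except (SyntaxError, ValueError):
        return None
```

Because unsupported nodes are rejected during the walk, tuples, dicts, numbers, names, and calls never reach the caller, and no separate type check on the result is needed.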

Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/230/ for more details.

Harbormaster completed remote builds in B3576: Diff 3067. Jan 17 2019, 11:41 AM

olasd added a subscriber: olasd. Jan 17 2019, 4:03 PM

olasd added inline comments.

swh/indexer/metadata_dictionary.py
502	All logged events from tasks have some metadata items attached in the systemd journal (see `swh.core.logger.get_extra_data` and `JournalHandler`), notably the (Celery) task id.
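
As a generic illustration of the idea discussed for line 502 (attaching an identifier to each log record), the standard `logging` module accepts an `extra` mapping whose keys become record attributes. This sketch does not reproduce `swh.core.logger` or `JournalHandler`; the field name `object_id`, the logger name, and the identifier value are all hypothetical.

```python
import logging


class CollectingHandler(logging.Handler):
    """Keeps formatted records in memory so the example is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))


handler = CollectingHandler()
# `object_id` is supplied per call through `extra`; it is a hypothetical
# field name, not one defined by swh.core.logger.
handler.setFormatter(
    logging.Formatter("%(levelname)s [%(object_id)s] %(message)s")
)

logger = logging.getLogger("gemspec_example")
logger.addHandler(handler)
logger.setLevel(logging.WARNING)
logger.propagate = False

logger.warning(
    "Could not find Gem::Specification in file",
    extra={"object_id": "content:abc123"},  # placeholder identifier
)
```

A handler that forwards records to a structured backend can read the same attribute from the record instead of formatting it into the message.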

vlorentz added a task: T1328: Add Ruby/Gem metadata indexer. Jan 21 2019, 3:53 PM

Rebase on top of D989.

vlorentz added a parent revision: D989: Update codemeta crosswalk. Jan 23 2019, 2:40 PM

Harbormaster failed remote builds in B3658: Diff 3133! Jan 23 2019, 2:41 PM

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/249/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/249/console

Rebase

Build is green
See https://jenkins.softwareheritage.org/job/DCIDX/job/tox/264/ for more details.

Harbormaster completed remote builds in B3685: Diff 3159. Jan 24 2019, 3:09 PM

ardumont accepted this revision. Jan 28 2019, 3:21 PM

douardda accepted this revision. Jan 29 2019, 9:55 AM

This revision is now accepted and ready to land. Jan 29 2019, 9:55 AM

Rebase + squash

Closed by commit rDCIDX8147565ac8d8: Add gemspec mapping. (authored by vlorentz). Jan 29 2019, 10:14 AM

This revision was automatically updated to reflect the committed changes.

Harbormaster failed remote builds in B3770: Diff 3226! Jan 29 2019, 10:14 AM

Build has FAILED

Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/282/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/282/console

vlorentz mentioned this in T1328: Add Ruby/Gem metadata indexer. Jan 29 2019, 10:57 AM

Revision Contents
Changeset List

Path	Size
swh/indexer/codemeta.py	5 lines
swh/indexer/metadata_dictionary.py	112 lines
swh/indexer/tests/test_metadata.py	55 lines

Diff 3227

swh/indexer/codemeta.py

swh/indexer/metadata_dictionary.py

swh/indexer/tests/test_metadata.py
