make check
Diff Detail
- Repository
- rDCIDX Metadata indexer
- Branch
- bug/codespell
- Lint
No Linters Available
- Unit
No Unit Test Coverage
- Build Status
Buildable 7856
Build 11304: tox-on-jenkins Jenkins
Build 11303: arc lint + arc unit
Event Timeline
Build has FAILED
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/616/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/616/console
Build has FAILED
Link to build: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/621/
See console output for more information: https://jenkins.softwareheritage.org/job/DCIDX/job/tox/621/console
from the diff title:
fixing [codespell] false-positives due to single-quoted strings
The "problem", I believe, is that the closing quote of 'files' is interpreted as a trailing apostrophe. It's clearly a bug in codespell, but since it makes no difference whether we use single or double quotes, I think it's worth an exception to get a clean codespell run.
Indeed, I thought codespell was a linter instead of a spell-checker.
You can fix the false-positive by overriding codespell's regexp to ignore single quotes at the end of words:
find swh docs -name '*.py' -o -name '*.rst' | xargs -r codespell -r "[\\w\\-'’\`]*[\\w\\-’\`]"
(the default regexp is: https://github.com/codespell-project/codespell/blob/d7fa1e4/codespell_lib/_codespell.py#L29 )
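To see what the override changes, here is a minimal Python sketch. It assumes the default word regexp at the linked commit is `[\w\-'’`]+` (not verified here) and uses a made-up sample line; `tokens` is a hypothetical helper, not part of codespell, and the sketch only shows how each regexp splits a line, not what codespell then does with the words.

```python
import re

# Assumed default word regexp (see the link above); not checked against codespell itself.
DEFAULT_RE = re.compile(r"[\w\-'’`]+")
# Override from the command above, after the shell has removed its escaping.
OVERRIDE_RE = re.compile(r"[\w\-'’`]*[\w\-’`]")


def tokens(regex, line):
    """Return the words the given regexp extracts from a line (hypothetical helper)."""
    return regex.findall(line)


sample = "for name in ('files', 'dirs'):"
print(tokens(DEFAULT_RE, sample))   # both quotes stay attached: 'files'
print(tokens(OVERRIDE_RE, sample))  # the trailing quote is dropped: 'files
```

The second character class leaves out the straight single quote, so a match can no longer end on one; a leading quote is still kept.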
> (the default regexp is: https://github.com/codespell-project/codespell/blob/d7fa1e4/codespell_lib/_codespell.py#L29 )
Nice, thanks for checking, this is a much better solution indeed. (As a nitpick, I'll go for "[\\w\\-'’\`]+(?=[^'])" instead, but the idea is the same.)
I'll push it to swh-environment and also upstream to codespell.
Nice indeed. You may also want to patch it so it doesn't detect the quote in r'foo', but I don't see a good way to do it (many different characters can appear in front of a string...)
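For illustration, the prefix case can be reproduced with the override regexp from the command earlier in the thread. This is only a sketch of the remaining problem, not a fix; `tokens` is a hypothetical helper and the sample lines are made up.

```python
import re

# Override regexp from the earlier command, after shell unescaping.
OVERRIDE_RE = re.compile(r"[\w\-'’`]*[\w\-’`]")


def tokens(line):
    """Return the words the override regexp extracts from a line (hypothetical helper)."""
    return OVERRIDE_RE.findall(line)


# The string prefix and the opening quote end up glued to the word.
print(tokens("pattern = r'foo'"))
print(tokens("data = b'bar'"))
```

Because the opening quote is a valid character in the middle of a word, the prefix letter, the quote and the string content come out as a single token.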