tree-sitter returns byte indices, not char indices.
Resolves SWH-SEARCH-12
Differential D6217
translator: Fix parsing of multibyte characters
Authored by vlorentz on Sep 8 2021, 5:10 PM.
Tags: None
Subscribers: None
Details
tree-sitter returns byte indices, not char indices.
Resolves SWH-SEARCH-12
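To illustrate the bug class described above, here is a minimal sketch of why byte offsets cannot be used directly to slice a Python `str`. It does not reproduce the actual patch and does not call tree-sitter; the helper name `slice_by_byte_offsets` and the sample query are hypothetical, and the offsets are computed by hand to stand in for what a byte-oriented parser would report.

```python
def slice_by_byte_offsets(text: str, start_byte: int, end_byte: int) -> str:
    """Return the part of `text` covered by a byte range of its UTF-8 encoding.

    Hypothetical helper: assumes the offsets fall on character boundaries,
    as they would for token boundaries reported by a byte-oriented parser.
    """
    return text.encode("utf-8")[start_byte:end_byte].decode("utf-8")


# "é" is one character but two bytes in UTF-8, so every byte offset
# after it is shifted by one relative to the character offset.
query = "café = 'x'"

# Byte range of the token 'x' (quotes included) in the UTF-8 encoding.
start_byte = query.encode("utf-8").index(b"'x'")
end_byte = start_byte + len(b"'x'")

# Wrong: treats byte offsets as character offsets.
wrong = query[start_byte:end_byte]

# Right: slice the encoded bytes, then decode.
right = slice_by_byte_offsets(query, start_byte, end_byte)
```

With ASCII-only input both slices agree, which is why this kind of bug tends to surface only once a query contains a multibyte character.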
Diff Detail
Event Timeline
Build is green
Patch application report for D6217 (id=22496)
Could not rebase; Attempt merge onto 7479282c70...
Updating 7479282..e59807b
Fast-forward
 swh/search/tests/test_translator.py | 51 ++++++++++++++++++++++++++++++++++++-
 swh/search/translator.py | 6 ++---
 swh/search/utils.py | 11 ++++++--
 3 files changed, 62 insertions(+), 6 deletions(-)
Changes applied before test
commit e59807b5c78bed7547305a8887d6c52521ba3044
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Wed Sep 8 17:09:43 2021 +0200
translator: Fix parsing of multibyte characters
tree-sitter returns byte indices, not char indices.
commit 7f1f1be3f253e9ed59807491eb1043616a0bf4e3
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date: Wed Sep 8 17:08:46 2021 +0200
utils: Fix unescape() on non-ASCII strings.
'unicode_escape' assumes latin-1 as input.
See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/292/ for more details.
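The latin-1 assumption mentioned in the second commit can be seen in a short sketch. This is not the code from swh/search/utils.py; the function names `naive_unescape` and `unescape_sketch` are hypothetical, and the workaround shown is one common idiom, not necessarily the one the commit uses.

```python
def naive_unescape(s: str) -> str:
    # Encoding to UTF-8 first turns "é" into two bytes, which the
    # 'unicode_escape' codec then decodes as two latin-1 characters.
    return s.encode("utf-8").decode("unicode_escape")


def unescape_sketch(s: str) -> str:
    # Hypothetical workaround: encode to latin-1 so that each byte maps
    # back to the same character, and turn anything outside latin-1 into
    # a \uXXXX escape that 'unicode_escape' decodes correctly.
    return s.encode("latin-1", "backslashreplace").decode("unicode_escape")


# A string holding a non-ASCII character and a literal backslash-n pair.
escaped = "é\\n"

mangled = naive_unescape(escaped)   # non-ASCII character is corrupted
fixed = unescape_sketch(escaped)    # non-ASCII character is preserved
```

One caveat of this sketch: a literal `\u` or `\x` sequence already present in the input is interpreted as an escape in both versions, which may or may not be what a caller wants.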