Differential D6217
translator: Fix parsing of multibyte characters
Authored by vlorentz on Sep 8 2021, 5:10 PM.
Tags: None
Subscribers: None
Details
tree-sitter returns byte indices, not char indices. Resolves SWH-SEARCH-12
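To illustrate why byte indices and character indices diverge, here is a minimal sketch in Python. It does not use tree-sitter itself: the byte offsets are computed by hand to stand in for what a byte-oriented parser would report, and the helper name `slice_by_byte_offsets` and the sample query string are hypothetical, not taken from swh/search/translator.py.

```python
def slice_by_byte_offsets(text: str, start_byte: int, end_byte: int) -> str:
    """Return the part of `text` covered by UTF-8 byte offsets.

    Hypothetical helper: slice the encoded bytes, then decode the slice,
    instead of applying byte offsets directly to the str.
    """
    return text.encode("utf-8")[start_byte:end_byte].decode("utf-8")


# Sample input with one multibyte character ("é" is 2 bytes in UTF-8).
query = "origin = 'café' and visits > 2"

# Stand-in for the offsets a byte-oriented parser would report for "visits".
encoded = query.encode("utf-8")
start_byte = encoded.index(b"visits")
end_byte = start_byte + len(b"visits")

# Using byte offsets as character indices is off by one after the "é".
wrong = query[start_byte:end_byte]

# Slicing the encoded bytes gives the intended token.
right = slice_by_byte_offsets(query, start_byte, end_byte)
```

Every multibyte character before a token shifts its byte offset further from its character index, so the mismatch only shows up on non-ASCII input.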
Event Timeline

Build is green

Patch application report for D6217 (id=22496)

Could not rebase; Attempt merge onto 7479282c70...

    Updating 7479282..e59807b
    Fast-forward
     swh/search/tests/test_translator.py | 51 ++++++++++++++++++++++++++++++++++++-
     swh/search/translator.py            |  6 ++---
     swh/search/utils.py                 | 11 ++++++--
     3 files changed, 62 insertions(+), 6 deletions(-)

Changes applied before test

    commit e59807b5c78bed7547305a8887d6c52521ba3044
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Wed Sep 8 17:09:43 2021 +0200

        translator: Fix parsing of multibyte characters

        tree-sitter returns byte indices, not char indices.

    commit 7f1f1be3f253e9ed59807491eb1043616a0bf4e3
    Author: Valentin Lorentz <vlorentz@softwareheritage.org>
    Date:   Wed Sep 8 17:08:46 2021 +0200

        utils: Fix unescape() on non-ASCII strings.

        'unicode_escape' assumes latin-1 as input.

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/292/ for more details.
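The second commit message notes that 'unicode_escape' assumes latin-1 as input. The sketch below shows what that means in practice and one common workaround. It is an illustration under assumptions, not the code from swh/search/utils.py: the function names `naive_unescape` and `safe_unescape` are hypothetical, and the actual fix in the commit may take a different approach.

```python
def naive_unescape(s: str) -> str:
    """Process backslash escapes, but mangle non-ASCII input.

    The UTF-8 bytes of a non-ASCII character are each decoded as a
    separate latin-1 character by the 'unicode_escape' codec.
    """
    return s.encode("utf-8").decode("unicode_escape")


def safe_unescape(s: str) -> str:
    """Process backslash escapes while preserving non-ASCII characters.

    Encoding to latin-1 with 'backslashreplace' keeps latin-1 characters
    as single bytes and turns everything else into \\uXXXX or \\UXXXXXXXX
    escapes, which 'unicode_escape' then decodes back losslessly.
    """
    return s.encode("latin-1", "backslashreplace").decode("unicode_escape")


# "é" becomes two latin-1 characters with the naive version.
mangled = naive_unescape("café")

# The literal backslash-n is turned into a newline, and "é" survives.
preserved = safe_unescape("café\\n")
```

One caveat of the workaround: any backslash sequence already present in the input is interpreted as an escape, which is the intended behaviour for an unescape function but worth keeping in mind.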