
translator.py: Translate search query language to ES DSL
Closed · Public

Authored by KShivendu on Jul 26 2021, 8:53 PM.

Details

Summary

Translate swh search query language queries into Elasticsearch DSL

Depends on D6024
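
As a rough illustration of what such a translation involves, here is a minimal sketch. It is not the code from this diff: it uses a hypothetical tuple-based tree instead of the parse tree produced by the query language grammar, and the field names are taken verbatim from the test queries rather than from the real index mapping. Only the Elasticsearch constructs (bool/must/should, term, range) are standard.

```python
from typing import Any, Dict, Tuple

# Hypothetical mini-tree, assumed for illustration only:
#   ("and" | "or", left, right)   or   (field, operator, value)
Node = Tuple[Any, Any, Any]

# Query-language comparison operators mapped to ES range keys.
RANGE_OPS = {">": "gt", ">=": "gte", "<": "lt", "<=": "lte"}


def to_es(node: Node) -> Dict[str, Any]:
    """Translate the hypothetical tree into an Elasticsearch query dict."""
    head, middle, tail = node
    if head in ("and", "or"):
        # "and" becomes bool/must, "or" becomes bool/should.
        clause = "must" if head == "and" else "should"
        return {"bool": {clause: [to_es(middle), to_es(tail)]}}
    field, op, value = node
    if op == "=":
        return {"term": {field: value}}
    if op in RANGE_OPS:
        return {"range": {field: {RANGE_OPS[op]: value}}}
    raise NotImplementedError(f"operator {op!r} on field {field!r}")


# "(visited = false or visits > 2) and visits < 5", already parsed by hand.
tree = ("and", ("or", ("visited", "=", False), ("visits", ">", 2)), ("visits", "<", 5))
query = to_es(tree)
```

Operator precedence and parentheses are assumed to be resolved by the parser, so the translator only has to follow the shape of the tree.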

Diff Detail

Repository
rDSEA Archive search
Branch
translator
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 22755
Build 35486: Phabricator diff pipeline on jenkins (Jenkins console · Jenkins)
Build 35485: arc lint + arc unit

Unit Tests: Failed

Time  Test
1 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.search.tests.test_translator::test_conjunction_op_precedence_override
      def test_conjunction_op_precedence_override():
          query = "(visited = false or visits > 2) and visits < 5"
          expected = {
1 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.search.tests.test_translator::test_conjunction_operators
      def test_conjunction_operators():
          query = "visited = true or visits > 2 and visits < 5"
          expected = {
1 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.search.tests.test_translator::test_date_created_not_equal_to_filter
      def test_date_created_not_equal_to_filter():
          query = "created != 2020-01-01"
          expected = {
1 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.search.tests.test_translator::test_deeply_nested_filters
      def test_deeply_nested_filters():
          query = "(((visited = true and visits > 0)))"
          expected = {
1 ms  Jenkins > .tox.py3.lib.python3.7.site-packages.swh.search.tests.test_translator::test_keyword_filter
      def test_keyword_filter():
          query = 'keyword in [word1, "word2 \\" \' word3"]'
          expected = {
View Full Test Results (13 Failed · 137 Passed · 1 Skipped)

Event Timeline

Build has FAILED

Patch application report for D6025 (id=21782)

Could not rebase; Attempt merge onto 2edbbbe833...

Updating 2edbbbe..757b046
Fast-forward
 query_language/grammar.js                   |  28 +++--
 query_language/test/corpus/combinations.txt |  13 ++-
 swh/search/tests/test_translator.py         | 111 +++++++++++++++++++
 swh/search/translator.py                    | 161 ++++++++++++++++++++++++++++
 4 files changed, 300 insertions(+), 13 deletions(-)
 create mode 100644 swh/search/tests/test_translator.py
 create mode 100644 swh/search/translator.py
Changes applied before test
commit 757b0462d0bd11e8663c1e1d5f3de0b0ddc07773
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Summary:
    
    Translate swh search query language queries into Elasticsearch DSL
    
    Depends on D6024
    
    Test Plan:
    
    Reviewers:
    
    Subscribers:

commit c27cb9a0a4d6a657e8ba02ab0f705bb77cc9b6ee
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    grammar.js: Segregate sort_by and limit from filters
    
    Summary:
    The grammar should not allow using sort_by and limit more than once throughout
    the query.
    
    Unlike other filters, these two must not be concatenated by 'and' or 'or'
    
    Reviewers: #reviewers
    
    Differential Revision: https://forge.softwareheritage.org/D6024

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/236/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/236/console

Harbormaster returned this revision to the author for changes because remote builds failed. (Jul 26 2021, 8:57 PM)
Harbormaster failed remote builds in B22750: Diff 21782!

You should use raise NotImplementedError("blah blah blah") instead of return "TODO". It makes it clearer that you reached an unimplemented case, and which one.
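
A minimal sketch of the difference, with a hypothetical function and filter names that are not taken from translator.py:

```python
def translate_filter(name: str, value):
    """Translate one filter; hypothetical helper for illustration."""
    if name == "visited":
        return {"term": {"visited": value}}
    # Returning "TODO" here would hand a placeholder string back to the
    # caller. Raising stops immediately and names the missing case.
    raise NotImplementedError(f"translation of filter {name!r} is not implemented")
```

With the placeholder, the failure would only surface later, when the string ends up where a query dict is expected.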

  • translator.py: Translate search query language to ES DSL
  • translator.py: Complete all the filters and add new tests
  • translator.py: Import swh_ql.so using pkg_resources

Build has FAILED

Patch application report for D6025 (id=21787)

Rebasing onto 05efa5418c...

Current branch diff-target is up to date.
Changes applied before test
commit 715a2eb2206cd946bd3ad5fdc87b1cee3fb885ba
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 17:55:31 2021 +0530

    translator.py: Import swh_ql.so using pkg_resources

commit c3174265485ee9758507fd2fc41f52f46d35dd09
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:58:31 2021 +0530

    translator.py: Complete all the filters and add new tests

commit 3d030756b3f63082148253d44df06eb0ea1f447a
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Translate swh search query language queries into Elasticsearch DSL

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/238/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/238/console

Harbormaster returned this revision to the author for changes because remote builds failed. (Jul 27 2021, 2:30 PM)
Harbormaster failed remote builds in B22755: Diff 21787!
  • translator: Try fixing swh_ql.so path

Build has FAILED

Patch application report for D6025 (id=21803)

Rebasing onto 05efa5418c...

Current branch diff-target is up to date.
Changes applied before test
commit 16c254dc569b0d0cc8adb9ad1e6f70ad1ab39759
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 21:20:47 2021 +0530

    translator: Try fixing swh_ql.so path

commit 715a2eb2206cd946bd3ad5fdc87b1cee3fb885ba
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 17:55:31 2021 +0530

    translator.py: Import swh_ql.so using pkg_resources

commit c3174265485ee9758507fd2fc41f52f46d35dd09
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:58:31 2021 +0530

    translator.py: Complete all the filters and add new tests

commit 3d030756b3f63082148253d44df06eb0ea1f447a
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Translate swh search query language queries into Elasticsearch DSL

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/239/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/239/console

Harbormaster returned this revision to the author for changes because remote builds failed. (Jul 27 2021, 5:55 PM)
Harbormaster failed remote builds in B22771: Diff 21803!
  • translator.py: Fix resource path for tox environment

Build is green

Patch application report for D6025 (id=21805)

Rebasing onto 05efa5418c...

Current branch diff-target is up to date.
Changes applied before test
commit 905889ddf0f3f4676ed841c325774d3a6a72cef8
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 17:55:31 2021 +0530

    translator.py: Fix resource path for tox environment

commit c3174265485ee9758507fd2fc41f52f46d35dd09
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:58:31 2021 +0530

    translator.py: Complete all the filters and add new tests

commit 3d030756b3f63082148253d44df06eb0ea1f447a
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Translate swh search query language queries into Elasticsearch DSL

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/240/ for more details.

swh/search/translator.py
18

I know that this is bad and shouldn't be used.
I'll try to find a better approach in the next iteration of this diff.

If you've any suggestions please comment.

KShivendu added inline comments.
swh/search/translator.py
18

@anlambert if you've any suggestions please comment.

_traverse should raise errors instead of silently ignoring unexpected nodes.
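
One possible shape for this, sketched with a hypothetical Node class and node type names (the real _traverse walks the parser's tree, which is not reproduced here):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a parse tree node."""
    type: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def traverse(node: Node) -> str:
    if node.type == "group":
        if len(node.children) != 1:
            raise ValueError(
                f"'group' node expects 1 child, got {len(node.children)}"
            )
        return "(" + traverse(node.children[0]) + ")"
    if node.type == "value":
        return node.text
    # No silent fall-through: an unknown node type is reported by name.
    raise ValueError(f"unexpected node type {node.type!r}")
```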

swh/search/translator.py
18

You need to set it as a data file in setup.py to access it with pkg_resources. https://setuptools.readthedocs.io/en/latest/userguide/datafiles.html
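
A sketch of what this could look like. The package name and the static/ directory are assumptions made for illustration, not the layout settled on in this diff; package_data, include_package_data and pkg_resources.resource_filename are the standard setuptools names.

```python
# Hypothetical keyword arguments to pass to setuptools.setup() in setup.py.
# The "static/*.so" location under swh/search is assumed for illustration.
SETUP_KWARGS = {
    "include_package_data": True,
    "package_data": {"swh.search": ["static/*.so"]},
}


def ql_library_path() -> str:
    """Resolve the packaged shared library at runtime (assumed path)."""
    # Imported lazily so this sketch loads even where pkg_resources is absent.
    import pkg_resources

    return pkg_resources.resource_filename("swh.search", "static/swh_ql.so")
```

Once the file is declared as package data, the lookup no longer depends on whether the code runs from a source checkout or from a tox environment.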

27

for consistency with the existing codebase

62–65

Why "in" instead of "=="?

KShivendu marked 2 inline comments as done and 2 inline comments as done.
  • translator: Changes suggested by vlorentz
swh/search/translator.py
18

Build is green

Patch application report for D6025 (id=21821)

Rebasing onto 05efa5418c...

Current branch diff-target is up to date.
Changes applied before test
commit 3575aac1c04744a934ef5b2ff5309b1fa084e03b
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Wed Jul 28 14:44:00 2021 +0530

    translator: Changes suggested by vlorentz

commit 905889ddf0f3f4676ed841c325774d3a6a72cef8
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 17:55:31 2021 +0530

    translator.py: Fix resource path for tox environment

commit c3174265485ee9758507fd2fc41f52f46d35dd09
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:58:31 2021 +0530

    translator.py: Complete all the filters and add new tests

commit 3d030756b3f63082148253d44df06eb0ea1f447a
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Translate swh search query language queries into Elasticsearch DSL

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/241/ for more details.

Rebase:

  • translator.py: Translate search query language to ES DSL
  • translator.py: Complete all the filters and add new tests
  • translator.py: Fix resource path for tox environment
  • translator: Changes suggested by vlorentz
  • translator.py: Update list of possible ql paths (suggested by vlorentz)

Build is green

Patch application report for D6025 (id=21857)

Rebasing onto c34011b7c7...

Current branch diff-target is up to date.
Changes applied before test
commit 3dab4f5d16d6f40e8ee4d1052211ab799183e3af
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Thu Jul 29 14:12:10 2021 +0530

    translator.py: Update list of possible ql paths

commit 847c5fa366445bab38096c2979ab74a11f876f06
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Wed Jul 28 14:44:00 2021 +0530

    translator: Changes suggested by vlorentz

commit c0b78c26c12c085062240b0d65969f04c89f9f6b
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 17:55:31 2021 +0530

    translator.py: Fix resource path for tox environment

commit 82a2dec905ecc66d1201687e17abd12312a5b038
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:58:31 2021 +0530

    translator.py: Complete all the filters and add new tests

commit 7d8feb2ab0373c4f05e6fbaf1371fe817fbc2a9f
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Translate swh search query language queries into Elasticsearch DSL

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/243/ for more details.

swh/search/translator.py
88

same here

287

This can happen even with known categories.

You should add an else branch to every if.
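
A minimal sketch of the pattern, with hypothetical category and filter names: each if chain ends in an else that raises, so a known category with an unknown filter name is reported as well.

```python
def build_filter(category: str, name: str, value):
    """Hypothetical dispatcher; names are illustrative, not from the diff."""
    if category == "numeric":
        if name == "visits":
            return {"range": {"visits": {"gte": value}}}
        else:
            raise ValueError(f"unknown filter {name!r} in category {category!r}")
    elif category == "boolean":
        if name == "visited":
            return {"term": {"visited": value}}
        else:
            raise ValueError(f"unknown filter {name!r} in category {category!r}")
    else:
        raise ValueError(f"unknown filter category {category!r}")
```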

  • translator.py: Mention filter category and name in exception
  • Squash

Build is green

Patch application report for D6025 (id=21865)

Rebasing onto c34011b7c7...

Current branch diff-target is up to date.
Changes applied before test
commit f67cc169e66e1d30aaf41c3544b2d523480cc823
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 30 00:19:21 2021 +0530

    translator.py: Mention filter category and name in exception

commit 2d596405d7571ef15ba0f19f7ed3dfd6b88ad1cb
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Translate swh search query language queries into Elasticsearch DSL

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/244/ for more details.

swh/search/translator.py
88

or unexpected number of children

translator.py: Mention number of children while throwing exception in _traverse

Build is green

Patch application report for D6025 (id=21868)

Rebasing onto c34011b7c7...

Current branch diff-target is up to date.
Changes applied before test
commit 734370ade926508fa14d31595b27429794f2f52c
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Translate swh search query language queries into Elasticsearch DSL

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/245/ for more details.

swh/search/tests/test_translator.py
149–167

it's clearer this way IMO
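
For reference, the keyword filter query from the failed-test list can be written with a raw string so that the backslash and quotes appear exactly as they do in the query. The raw form below is an assumption about the kind of change meant, not a copy of the suggested patch:

```python
# Escaped form, as it appears in the test listing.
escaped = 'keyword in [word1, "word2 \\" \' word3"]'

# Raw triple-quoted form: no escaping needed for the backslash or the quotes.
raw = r'''keyword in [word1, "word2 \" ' word3"]'''

# Both literals denote the same query string.
same = escaped == raw
```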

swh/search/translator.py
28

that was an example, there should be a better error here

This revision is now accepted and ready to land. (Jul 30 2021, 12:08 PM)
  • Changes suggested by vlorentz
    • Improve keyword filter test using raw string
    • Improve swh_ql.so not found error message

Build is green

Patch application report for D6025 (id=21870)

Rebasing onto c34011b7c7...

Current branch diff-target is up to date.
Changes applied before test
commit 3428e71a042b6e8b78de5c34c2fcc8cb1cae68ea
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 27 00:18:11 2021 +0530

    translator.py: Translate search query language to ES DSL
    
    Translate swh search query language queries into Elasticsearch DSL

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/246/ for more details.