query_language: Setup tree-sitter and grammar.js
Closed · Public

Authored by KShivendu on Jul 13 2021, 1:00 PM.

Details

Summary

This revision introduces the grammar for the search query language and completes the setup required for smoother development of the grammar.

The parsers generated from the proposed grammar serve two different purposes:

  • Translation of search queries into the Elasticsearch DSL (or the query language of any other search backend that we may use in the future)
  • Autocompletion of the queries in the SWH Archive User Interface

tree-sitter is an excellent candidate for the task because it has bindings for Python (swh.search) as well as WebAssembly (swh.web).
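
As a rough illustration of the first purpose, the sketch below walks a small hand-written syntax tree and emits an Elasticsearch bool query. The node layout (type, field, op, value, left, right) and the field names visits and visit_type are assumptions made for this example; they are not taken from the grammar in this diff, and a real translator would walk the tree-sitter parse tree instead.

```python
from typing import Any, Dict

Node = Dict[str, Any]

# Hypothetical mapping from comparison operators to Elasticsearch range keys.
RANGE_OPS = {">": "gt", ">=": "gte", "<": "lt", "<=": "lte"}


def to_es(node: Node) -> Dict[str, Any]:
    """Translate a toy syntax tree into an Elasticsearch query dict."""
    kind = node["type"]
    if kind == "filter":
        field, op, value = node["field"], node["op"], node["value"]
        if op == "=":
            return {"term": {field: value}}
        if op in RANGE_OPS:
            return {"range": {field: {RANGE_OPS[op]: value}}}
        raise ValueError(f"unsupported operator: {op}")
    if kind in ("and", "or"):
        # "must" requires every clause to match, "should" at least one.
        clause = "must" if kind == "and" else "should"
        query = {"bool": {clause: [to_es(node["left"]), to_es(node["right"])]}}
        if kind == "or":
            query["bool"]["minimum_should_match"] = 1
        return query
    raise ValueError(f"unknown node type: {kind}")


# Hand-written tree standing in for a parsed query (assumed shape).
tree = {
    "type": "and",
    "left": {"type": "filter", "field": "visits", "op": ">", "value": 5},
    "right": {"type": "filter", "field": "visit_type", "op": "=", "value": "git"},
}
es_query = to_es(tree)
```

Keeping the translation as a pure function over the tree means the same parse result could later feed a different backend by swapping only this step.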

Diff Detail

Repository
rDSEA Archive search
Branch
setup-parser
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 22627
Build 35275: Phabricator diff pipeline on Jenkins
Build 35274: arc lint + arc unit

Event Timeline

Build has FAILED

Patch application report for D5990 (id=21654)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit de667a32243c6ad3e594e310193573ddb76597b3
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 11:09:07 2021 +0530

    fix typo

commit 723a20a4aaf427e5956cae589e14e9d9068801e1
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 11:07:04 2021 +0530

    code to inspect jenkins builds

commit 15cc5d592b6029de1bcf87ad82f92dc747855106
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 01:45:43 2021 +0530

    Improve build process for .so and .wasm files

commit 84952909e54031dcd9f7ff882d015bf96c38860d
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Thu Jul 15 18:28:28 2021 +0530

    Polish the code

commit 128ec9f103bb5691aa31a56aefbadf2d816a24bd
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Thu Jul 15 16:07:23 2021 +0530

    Install tree-sitter-cli (NodeJS) during builds

commit 186dacb1e51f5c42dd7bb6047bdccc07422fce03
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Thu Jul 15 16:03:42 2021 +0530

    Generate parser before building swh_ql.so

commit b9a3b8108f5495907ac405f8e7a3b6ed4a484125
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    parser: Setup TreeSitter with first draft for the grammar
    
    This is the first step towards implementing the search query language which
    can be directly translated to Elasticsearch (or any other search backend) queries
    and is also useful for introducing autocomplete in the swh archive.
    
    Also, we need a parser that can be used in swh.search backend as well as the
    swh.web interface so we've decided to go with TreeSitter which satisfies these
    conditions, is easier to write (written in JS) and is compatible
    with many languages.

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/207/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/207/console

Sorry for the mess (so many failed Jenkins builds); I couldn't reproduce the errors on my local machine.

It turns out that the main problem is that the emcc command fails with: emcc: error: src/parser.c: No such file or directory ("src/parser.c" was expected to be an input file, based on the commandline arguments provided)

To inspect the Jenkins build environment, I added pwd and ls commands to package.json. pwd prints:

/tmp/pip-req-build-2udgramk/query_language

and ls prints:

build.py
grammar.js
sample_query
src
swh_ql.so
test

  • This means src/parser.c must be present, yet emcc isn't able to use it.
  • tree-sitter build-wasm internally triggers docker pull emscripten/emsdk:2.0.24 and uses this image to build the .wasm file from src/parser.c.

I think that somehow this container is mounted at the wrong location (or the src/ folder didn't get copied for some reason), which leads to the build failures.

What should we do?
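
One way to narrow this down is to make the build fail early, with a clear message, when the generated sources are missing, and to report which toolchain would be used. The sketch below is only an illustration: the helper names missing_parser_sources and pick_wasm_toolchain are hypothetical, and the rule "prefer an emcc found on PATH, otherwise fall back to docker" is an assumption about how the build could be arranged, not a description of what tree-sitter itself does.

```python
import shutil
from pathlib import Path
from typing import List, Optional


def missing_parser_sources(grammar_dir: Path) -> List[str]:
    """Return the expected generated files that are absent from grammar_dir."""
    expected = ["src/parser.c"]  # the file emcc complained about
    return [rel for rel in expected if not (grammar_dir / rel).is_file()]


def pick_wasm_toolchain() -> Optional[str]:
    """Prefer an emcc found on PATH; otherwise use docker if available."""
    for tool in ("emcc", "docker"):
        if shutil.which(tool):
            return tool
    return None
```

Calling missing_parser_sources on the directory printed by pwd, right before the wasm step, would show whether the file is really absent or merely invisible to the container.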

search_language/grammar.js
38–51 ↗(On Diff #21643)

That's a good idea but needs some extra work. So I'm putting a TODO here and I'll revisit it later when I have more time. Fine with you?

  • Specify directory for build-wasm

Build has FAILED

Patch application report for D5990 (id=21655)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit 544c347ed2a7c9ea0b4cfa9e412be9733861f7c7
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 11:39:08 2021 +0530

    Specify directory for build-wasm

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/208/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/208/console

  • Use emsdk for building .wasm

Build has FAILED

Patch application report for D5990 (id=21656)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit d002f3f3961e52376c64a6fb0ce310127fde335c
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 12:07:00 2021 +0530

    Use emsdk for building .wasm

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/209/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/209/console

Build has FAILED

Patch application report for D5990 (id=21657)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit 002e9047c47697375db009b7f880940e797bf8bd
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 12:14:12 2021 +0530

    Use sh instead of source

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/210/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/210/console

search_language/grammar.js
38–51 ↗(On Diff #21643)

sure

Yes, that's a good point.
On the other hand, not including it means we install-depend on nodejs+yarn and a C compiler...

I think we can bundle only the C source files of the parser in the Python wheel / sdist, and thus depend only on a C compiler.

Can we have some documentation of the query language, included in this diff?
E.g., a file under docs/ which will then be rendered on docs.s.o as user documentation for how to use the query language.

It will also help people reviewing this diff get an idea of the proposed syntax, as the tree-sitter grammar is not really reader-friendly ;-)

In D5990#154613, @zack wrote:

Can we have some documentation of the query language, included in this diff?
E.g., a file under docs/ which will then be rendered on docs.s.o as user documentation for how to use the query language.

And if you're looking for an example of the style of such a document, Google's help page on how to refine web searches is quite slick.

I think we can bundle only the C source files of the parser in the Python wheel / sdist, and thus depend only on a C compiler.

Oh, great point. I think that's a good compromise; a C compiler is a common dependency for Python packages.
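
To make the compromise concrete: if the generated C sources ship in the sdist, installing needs nothing beyond a C compiler invocation along the lines sketched below. The function name shared_library_command, the cc default and the flag set are assumptions for a typical Unix toolchain; src/parser.c and swh_ql.so are the names mentioned earlier in this thread.

```python
import os
from pathlib import Path
from typing import List


def shared_library_command(grammar_dir: Path, output: str = "swh_ql.so") -> List[str]:
    """Build the compiler command line for a pre-generated parser."""
    src = grammar_dir / "src"
    compiler = os.environ.get("CC", "cc")  # honour CC if the user set it
    return [
        compiler,
        "-shared",
        "-fPIC",
        "-I", str(src),          # parser.c includes tree_sitter/parser.h from src/
        str(src / "parser.c"),
        "-o", str(grammar_dir / output),
    ]
```

The returned list can be handed to subprocess.run(cmd, check=True); no Node.js, yarn or tree-sitter CLI is involved at that point.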

Build has FAILED

Patch application report for D5990 (id=21668)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit 964ebec2ee6adca4317c0d328410a9d15aecd7bd
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 19:54:20 2021 +0530

    Use docker for emsdk

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/211/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/211/console

  • Inspect docker container inside builds

Build has FAILED

Patch application report for D5990 (id=21670)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit 3d26e880516a0ba1cfe21462a64408aae09c2cd2
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 20:14:25 2021 +0530

    Inspect docker container inside builds

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/212/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/212/console

  • Inspecting jenkins build with echo

Build has FAILED

Patch application report for D5990 (id=21672)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit f64c24c7dc76d8a2a2d055e75afc8eb03ef84c12
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 20:21:33 2021 +0530

    Inspecting jenkins build with echo

commit b0239b8eef4663a27bed2969aec10139711db422
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 20:21:07 2021 +0530

    Inspecting jenkins build with echo

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/213/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/213/console

  • Use locally installed emscripten instead of docker

Build is green

Patch application report for D5990 (id=21676)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit e3a535b5f56aebe5de3a4c729c54f5e99b7263ee
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Fri Jul 16 20:45:31 2021 +0530

    Use locally installed emscripten instead of docker

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/214/ for more details.

  • Improve Makefile and README
  • Improve tree-sitter's native test development workflow
KShivendu retitled this revision from "parser: Setup TreeSitter with first draft for the grammar" to "query_language: Setup tree-sitter and grammar.js". Jul 16 2021, 8:04 PM
KShivendu edited the summary of this revision.

Build has FAILED

Patch application report for D5990 (id=21678)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit 51f7a2a6f0c22b408e13b89f6b4d3fc96f1ddd69
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision introduces the grammar for the search query language and
    completes the setup required for smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the SWH Archive User Interface
    
    tree-sitter is an excellent candidate for the task because it has
    bindings for python (swh.search) as well as wasm (swh.web)

Link to build: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/215/
See console output for more information: https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/215/console

KShivendu edited the summary of this revision.
  • Add newline at the end of package.json and sample_query

Build is green

Patch application report for D5990 (id=21679)

Rebasing onto fe7640f710...

Current branch diff-target is up to date.
Changes applied before test
commit b252de2e92e864d600194e472f67211fbcaef7e0
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision introduces the grammar for the search query language and
    completes the setup required for a smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the SWH Archive User Interface
    
    tree-sitter is an excellent candidate for the task because it has
    bindings for python (swh.search) as well as wasm (swh.web)

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/216/ for more details.

Can we have some documentation of the query language, included in this diff?

@zack
Is it okay with you if I add it in the next diff? This one has become extremely long because of lots of build failures.

query_language/grammar.js
76–77

@vlorentz I'll cover this in the next diff.

@zack
Is it okay with you if I add it in the next diff? This one has become extremely long because of lots of build failures.

It's OK if you want to land the documentation in a separate diff. But I would still like to *see* that documentation before accepting this diff because, right now, I have no idea at all what the search language looks like.

Maybe you can submit a separate diff with just the documentation? It can even avoid depending on this one, because they will for sure not conflict.

After re-reading this diff, there are some major things I think you need to change:

  • Makefiles shouldn't install dependencies, especially not in /opt. Instead, just document the install instructions in the README.
  • Many of the build targets in the Makefile duplicate what is already in setup.py. Instead, the Makefile should just call setup.py (if you want to keep the steps separate, register new setup.py commands).
  • More of a minor issue: class custom_build should inherit from distutils.Command instead of distutils.command.build_py, because it's not building Python code, and it should not be the build_py cmdclass. Instead, it should be added as a command that the main "build" command depends on. (I'm not sure that's possible; please try, and we'll see what to do if it's not.)
  • query_language/grammar.js: Include Z in date regex
  • query_language: Add precedence and improve field names
  • query_language: Add support for freely using brackets
  • README: Add emsdk setup instructions
  • setup.py: Add commands for build steps
  • query_language/grammar.js: Remove redundancies using functions
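
The review request about setup.py commands can be sketched as follows. This is a minimal illustration, not the code from this diff: the command name build_parser is hypothetical, the cwd value is taken from the query_language/ paths mentioned in this thread, and the imports assume a setuptools recent enough to provide setuptools.command.build (with older versions the same build class lives in distutils.command.build).

```python
import subprocess

from setuptools import Command
from setuptools.command.build import build


class build_parser(Command):
    """Hypothetical command that regenerates the parser sources."""

    description = "generate src/parser.c from grammar.js"
    user_options = []  # this command takes no command-line options

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        # Assumes the tree-sitter CLI is installed and on PATH.
        subprocess.run(["tree-sitter", "generate"], cwd="query_language", check=True)


class custom_build(build):
    # Sub-commands run, in order, as part of "build"; None means "always run".
    sub_commands = [("build_parser", None)] + build.sub_commands


# Meant to be passed as setup(cmdclass=CMDCLASS) in setup.py.
CMDCLASS = {"build": custom_build, "build_parser": build_parser}
```

With this arrangement a Makefile target only needs to call python setup.py build_parser (or build), so the build logic lives in one place.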

Build is green

Patch application report for D5990 (id=21732)

Rebasing onto d58705a0eb...

First, rewinding head to replay your work on top of it...
Applying: query_language: Setup tree-sitter and grammar.js
Applying: query_language/grammar.js: Include Z in date regex
Applying: query_language: Add precedence and improve field names
Applying: query_language: Add support for freely using brackets
Applying: README: Add emsdk setup instructions
Applying: setup.py: Add commands for build steps
Applying: query_language/grammar.js: Remove redundancies using functions
Changes applied before test
commit 2b5c2a1a20205370687a95c06c58db1069486354
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Wed Jul 21 23:57:00 2021 +0530

    query_language/grammar.js: Remove redundancies using functions

commit ed4c29366f28dfef7b9db60b4b7a5ca31ba582f8
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Wed Jul 21 23:18:47 2021 +0530

    setup.py: Add commands for build steps

commit 8b6845a6baa28c6e268a72bdb421bcfb69dc08b0
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Wed Jul 21 21:18:04 2021 +0530

    README: Add emsdk setup instructions

commit 3f477cf36a14db94dc966f9e468aecee02ac716c
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Wed Jul 21 21:16:34 2021 +0530

    query_language: Add support for freely using brackets

commit d6fa863e6a67cbc5545398a0234c086d0d14f259
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Wed Jul 21 16:55:29 2021 +0530

    query_language: Add precedence and improve field names

commit 6360a80028f5cfbfe2ff245c89085de5042de5a2
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Sun Jul 18 01:04:41 2021 +0530

    query_language/grammar.js: Include Z in date regex

commit 5ce122cb7ab3a3aeb4413a3e75a1c814f527e0c1
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision introduces the grammar for the search query language and
    completes the setup required for a smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the SWH Archive User Interface
    
    tree-sitter is an excellent candidate for the task because it has
    bindings for python (swh.search) as well as wasm (swh.web)

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/224/ for more details.

@jayeshv might be interested in this diff, given the recent GraphQL discussion, btw

  • grammar.js: Add some comments to improve readability
  • Add support for escaping " and ' from the filter values
  • Improve and break bulky tree-sitter tests into smaller tests (for readability)
  • setup.py: Use super().run() instead of build.run(self)
  • Squash commits
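
For context on the setup.py item above, here is a minimal sketch of the super().run() pattern. The `build` class below is only a stand-in for the real build command class, and `custom_build` and the step names are hypothetical; this is not the code from the diff.

```python
class build:
    """Stand-in for the real `build` command class (assumption)."""

    def __init__(self):
        self.steps = []

    def run(self):
        self.steps.append("build")


class custom_build(build):
    """Hypothetical subclass that runs an extra step before the normal build."""

    def run(self):
        self.steps.append("generate parser")
        # super() resolves the parent through the MRO, so this call keeps
        # working if the base class is renamed or a mixin is added later.
        # The explicit form build.run(self) hard-codes the parent instead.
        super().run()


cmd = custom_build()
cmd.run()
```

With single inheritance both forms call the same method; super().run() is mainly the more maintainable spelling.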

Build is green

Patch application report for D5990 (id=21750)

Rebasing onto 4e453304ad...

Current branch diff-target is up to date.
Changes applied before test
commit 166796ab91a2fbefc960f31e519bbc3b1d77cb06
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision defines the grammar for the search query language and
    prepares swh.search for a smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL in swh.search (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the swh.web (Archive UI)
    
    tree-sitter has been selected for the task because it has bindings for
    python (swh.search) as well as wasm (swh.web).

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/228/ for more details.

  • query_language: Add tree-sitter tests for escaping keywords in filter values
    • Origins with ' and " inside filter values
    • Origins with 'and' and 'or' inside filter values
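
To illustrate the escaping cases listed above, here is a rough Python sketch of how a quoted filter value containing escaped quotes or keywords could be unquoted after parsing. The patterns and the `unquote` helper are assumptions made for illustration; they are not the rules defined in grammar.js.

```python
import re

# Hypothetical patterns: a quoted value in which any character,
# including the delimiting quote, may be escaped with a backslash.
DOUBLE_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')
SINGLE_QUOTED = re.compile(r"'((?:[^'\\]|\\.)*)'")


def unquote(value: str) -> str:
    """Strip the surrounding quotes and resolve backslash escapes."""
    for pattern in (DOUBLE_QUOTED, SINGLE_QUOTED):
        match = pattern.fullmatch(value)
        if match:
            # Drop the backslash in front of each escaped character.
            return re.sub(r"\\(.)", r"\1", match.group(1))
    raise ValueError(f"not a quoted value: {value!r}")
```

Because the whole value sits inside quotes, words such as and / or are treated as plain text here rather than as operators.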

Build is green

Patch application report for D5990 (id=21751)

Rebasing onto 4e453304ad...

Current branch diff-target is up to date.
Changes applied before test
commit e378f68307a808911a5f7e32cb61482a40d44198
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision defines the grammar for the search query language and
    prepares swh.search for a smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL in swh.search (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the swh.web (Archive UI)
    
    tree-sitter has been selected for the task because it has bindings for
    python (swh.search) as well as wasm (swh.web).

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/229/ for more details.

query_language/grammar.js: Improve function comments

Build is green

Patch application report for D5990 (id=21752)

Rebasing onto 4e453304ad...

Current branch diff-target is up to date.
Changes applied before test
commit 8be32c1684dd604616da883a38d520ef966ba670
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision defines the grammar for the search query language and
    prepares swh.search for a smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL in swh.search (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the swh.web (Archive UI)
    
    tree-sitter has been selected for the task because it has bindings for
    python (swh.search) as well as wasm (swh.web).

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/230/ for more details.

  • query_language: Segregate sort_by and limit from filters
  • Add tests for the change mentioned above

Build is green

Patch application report for D5990 (id=21774)

Rebasing onto 122d7caf65...

First, rewinding head to replay your work on top of it...
Applying: query_language: Setup tree-sitter and grammar.js
Changes applied before test
commit 168fcb448d2a78de7e0fc812584927e9f59b6edb
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision defines the grammar for the search query language and
    prepares swh.search for a smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL in swh.search (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the swh.web (Archive UI)
    
    tree-sitter has been selected for the task because it has bindings for
    python (swh.search) as well as wasm (swh.web).

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/232/ for more details.

grammar.js: Allow using '-' with sort_by options
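
As a sketch of how a translator might consume such options, the snippet below turns a list of sort_by options into an Elasticsearch-style sort clause. It assumes that a leading '-' means descending order, which this thread does not state, and the field names and the `sort_clause` helper are hypothetical.

```python
def sort_clause(options):
    """Build an Elasticsearch-style sort list from sort_by options.

    Assumption: a leading '-' requests descending order; anything else
    is sorted in ascending order.
    """
    clause = []
    for option in options:
        descending = option.startswith("-")
        field = option[1:] if descending else option
        clause.append({field: {"order": "desc" if descending else "asc"}})
    return clause


# Hypothetical field names, for illustration only.
example = sort_clause(["-visits", "last_visit"])
```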

Build is green

Patch application report for D5990 (id=21778)

Rebasing onto 122d7caf65...

First, rewinding head to replay your work on top of it...
Applying: query_language: Setup tree-sitter and grammar.js
Changes applied before test
commit 6738861b51ff23808a5d3fdaaf5eb8741b01a5ce
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision defines the grammar for the search query language and
    prepares swh.search for a smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL in swh.search (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the swh.web (Archive UI)
    
    tree-sitter has been selected for the task because it has bindings for
    python (swh.search) as well as wasm (swh.web).

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/233/ for more details.

This revision is now accepted and ready to land. Jul 26 2021, 5:11 PM

Build is green

Patch application report for D5990 (id=21779)

Rebasing onto 122d7caf65...

Current branch diff-target is up to date.
Changes applied before test
commit 2edbbbe833e9e3e1d77231396c54095839bbcc13
Author: KShivendu <shivendu@iitbhilai.ac.in>
Date:   Tue Jul 13 16:21:51 2021 +0530

    query_language: Setup tree-sitter and grammar.js
    
    This revision defines the grammar for the search query language and
    prepares swh.search for a smoother development of the grammar.
    
    The parsers generated from the proposed grammar serve two different purposes:
    - Translation of search queries into elasticsearch DSL in swh.search (or any
    other search backend that we may use in the future)
    - Autocompletion of the queries in the swh.web (Archive UI)
    
    tree-sitter has been selected for the task because it has bindings for
    python (swh.search) as well as wasm (swh.web).

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/234/ for more details.