translator: Fix 'visited = false' queries to actually return results.
Closed · Public

Authored by vlorentz on Feb 16 2022, 11:06 AM.

Details

Summary

Non-visited origins don't have a 'has_visits' field at all, so comparing
it to false never returns results.
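The behaviour described above can be sketched as follows. This is a minimal illustration, not the code from this diff: `visited_filter` and `matches` are hypothetical helpers, and `matches` is only a tiny in-memory stand-in for how Elasticsearch `term` and `bool`/`must_not` clauses treat a missing field. It assumes the fix amounts to negating a `has_visits: true` match instead of matching `has_visits: false`.

```python
from typing import Any, Dict


def visited_filter(visited: bool) -> Dict[str, Any]:
    """Hypothetical helper: build a filter clause for 'visited = <bool>'."""
    if visited:
        return {"term": {"has_visits": True}}
    # Negate the positive match, so documents where 'has_visits' is
    # false *or* absent are both selected.
    return {"bool": {"must_not": {"term": {"has_visits": True}}}}


def matches(doc: Dict[str, Any], clause: Dict[str, Any]) -> bool:
    """In-memory stand-in for evaluating one clause against one document."""
    if "term" in clause:
        field, expected = next(iter(clause["term"].items()))
        # A term clause never matches a document that lacks the field.
        return field in doc and doc[field] == expected
    if "bool" in clause:
        return not matches(doc, clause["bool"]["must_not"])
    raise ValueError(f"unsupported clause: {clause!r}")


# Illustrative documents: the second origin was never visited, so it
# has no 'has_visits' field at all.
origins = [
    {"url": "https://example.org/visited", "has_visits": True},
    {"url": "https://example.org/never-visited"},
]

# Naive clause: compares the field to false, which matches nothing here.
naive = {"term": {"has_visits": False}}
naive_hits = [o["url"] for o in origins if matches(o, naive)]

# Negated clause: selects the origin with no 'has_visits' field.
fixed_hits = [o["url"] for o in origins if matches(o, visited_filter(False))]
```

With these sample documents, the naive clause selects nothing, while the negated clause selects the never-visited origin.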

Diff Detail

Repository
rDSEA Archive search
Branch
master
Lint
No Linters Available
Unit
No Unit Test Coverage
Build Status
Buildable 26892
Build 42039: Phabricator diff pipeline on jenkins
Build 42038: arc lint + arc unit

Event Timeline

Build is green

Patch application report for D7184 (id=26023)

Could not rebase; Attempt merge onto 4e635b230a...

Updating 4e635b2..b35df43
Fast-forward
 docs/query-language.rst                            | 18 +++++------
 setup.py                                           | 20 +++++++++++-
 swh/search/elasticsearch.py                        |  4 +--
 swh/search/query_language/grammar.js               |  5 +--
 swh/search/query_language/sample_query             |  4 +--
 .../query_language/test/corpus/combinations.txt    | 20 ++++++------
 swh/search/query_language/tokens.js                |  2 ++
 swh/search/tests/test_elasticsearch.py             | 36 +++++++++++++++++++++-
 swh/search/tests/test_translator.py                | 29 ++++++++++++++---
 swh/search/translator.py                           | 18 ++++++++++-
 10 files changed, 124 insertions(+), 32 deletions(-)
Changes applied before test
commit b35df430b7e7edb920dc870469f63449e140a541
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Feb 16 11:06:09 2022 +0100

    translator: Fix 'visited = false' queries to actually return results.
    
    Non-visited origins don't have a 'has_visits' field at all, so comparing
    it to `false` never returns results.

commit 3107cad2bf837cdcc083cf856a1976dbd715559b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Feb 16 10:36:10 2022 +0100

    Use ':' for substring matching instead of '='
    
    I find it very confusing to use '=' for this operation.

commit 3eed4b99a764eeb5b53cc47e92cf4246856e4caa
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Feb 16 10:27:35 2022 +0100

    setup.py: Regenerate parser when sources were changed

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/310/ for more details.

anlambert added a subscriber: anlambert.

Looks good to me. Is it related to T3927?

This revision is now accepted and ready to land. Feb 16 2022, 11:50 AM

Not at all. I found this while trying to understand why the query `visited = false and metadata = "kubernetes" or origin = "minikube"` returns results in prod. (I still don't know.)