
translator: Fix 'visited = false' queries to actually return results.
Closed, Public

Authored by vlorentz on Feb 16 2022, 11:06 AM.

Details

Summary

Non-visited origins don't have a 'has_visits' field at all, so comparing
it to false never returns results.
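
The behavior described above can be illustrated with a minimal sketch. This is not the actual patch: `visited_filter` and `matches` are hypothetical helpers, and only the `has_visits` field name comes from this revision. It assumes Elasticsearch-style `term` and `bool`/`must_not` queries, and the small in-memory evaluator only mimics the one property that matters here, namely that a `term` query never matches a document that lacks the field.

```python
def visited_filter(value: bool) -> dict:
    """Build a filter for `visited = <value>` (hypothetical helper)."""
    has_visits = {"term": {"has_visits": True}}
    if value:
        return has_visits
    # Non-visited origins have no 'has_visits' field at all, so
    # {"term": {"has_visits": False}} would match nothing.
    # Negate the positive match instead.
    return {"bool": {"must_not": [has_visits]}}


def matches(doc: dict, query: dict) -> bool:
    """Evaluate the two query shapes used above (illustration only)."""
    if "term" in query:
        ((field, expected),) = query["term"].items()
        # A document without the field never matches a term query.
        return field in doc and doc[field] == expected
    if "bool" in query:
        return not any(matches(doc, q) for q in query["bool"]["must_not"])
    raise ValueError(f"unsupported query: {query!r}")


# Two made-up origins: one visited, one never visited (no field at all).
origins = [
    {"url": "https://example.org/visited", "has_visits": True},
    {"url": "https://example.org/never-visited"},
]

naive = {"term": {"has_visits": False}}
naive_hits = [o["url"] for o in origins if matches(o, naive)]
fixed_hits = [o["url"] for o in origins if matches(o, visited_filter(False))]
```

With these sample documents, `naive_hits` is empty while `fixed_hits` contains only the never-visited origin, which is the symptom and the intended result described in the summary.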

Diff Detail

Repository
rDSEA Archive search
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D7184 (id=26023)

Could not rebase; Attempt merge onto 4e635b230a...

Updating 4e635b2..b35df43
Fast-forward
 docs/query-language.rst                            | 18 +++++------
 setup.py                                           | 20 +++++++++++-
 swh/search/elasticsearch.py                        |  4 +--
 swh/search/query_language/grammar.js               |  5 +--
 swh/search/query_language/sample_query             |  4 +--
 .../query_language/test/corpus/combinations.txt    | 20 ++++++------
 swh/search/query_language/tokens.js                |  2 ++
 swh/search/tests/test_elasticsearch.py             | 36 +++++++++++++++++++++-
 swh/search/tests/test_translator.py                | 29 ++++++++++++++---
 swh/search/translator.py                           | 18 ++++++++++-
 10 files changed, 124 insertions(+), 32 deletions(-)
Changes applied before test
commit b35df430b7e7edb920dc870469f63449e140a541
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Feb 16 11:06:09 2022 +0100

    translator: Fix 'visited = false' queries to actually return results.
    
    Non-visited origins don't have a 'has_visits' field at all, so comparing
    it to `false` never returns results.

commit 3107cad2bf837cdcc083cf856a1976dbd715559b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Feb 16 10:36:10 2022 +0100

    Use ':' for substring matching instead of '='
    
    I find it very confusing to use '=' for this operation.

commit 3eed4b99a764eeb5b53cc47e92cf4246856e4caa
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Wed Feb 16 10:27:35 2022 +0100

    setup.py: Regenerate parser when sources were changed

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/310/ for more details.

anlambert added a subscriber: anlambert.

Looks good to me. It is related to T3927, right?

This revision is now accepted and ready to land. (Feb 16 2022, 11:50 AM)

Not at all. I found this while trying to understand why `visited = false and metadata = "kubernetes" or origin = "minikube"` returns results in prod. (I still don't know why.)