
Add typing to origin_update's argument and origin_search's return
ClosedPublic

Authored by vlorentz on Feb 12 2021, 1:04 PM.

Diff Detail

Repository
rDSEA Archive search
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Build is green

Patch application report for D5070 (id=18090)

Could not rebase; Attempt merge onto 47db624364...

Updating 47db624..e2bcb53
Fast-forward
 requirements.txt                        |  1 +
 swh/search/elasticsearch.py             | 41 ++++++++++++++++----
 swh/search/in_memory.py                 | 31 +++++++++++----
 swh/search/interface.py                 | 26 +++++++++++--
 swh/search/journal_client.py            |  3 +-
 swh/search/tests/test_cli.py            |  9 ++---
 swh/search/tests/test_journal_client.py |  6 +--
 swh/search/tests/test_search.py         | 69 ++++++++++++++++++++++++++++++++-
 8 files changed, 158 insertions(+), 28 deletions(-)
Changes applied before test
commit e2bcb5330b2068129ae1385cc085fa82b639026b
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 12 13:03:39 2021 +0100

    Add typing to origin_update's argument and origin_search's return

commit decec2d34102737d4e7634c4260de77d5882e9f2
Author: Antoine Lambert <antoine.lambert@inria.fr>
Date:   Fri Feb 12 10:31:50 2021 +0100

    Enable to filter searched origins by visit types
    
    Summary:
    That diff adds support to filter origins by visit types in `swh-search`.
    
    The following changes have been made:
    
    - Add a new `visit_types` field to elasticsearch document for origin.
    
    - Add a new optional `visit_types` parameter to origin_search method in SearchInterface.
    
    - Implement visit types filtering in search backends.
    
    - Send origin visit types to elasticsearch when processing origin visits in journal client.
    
    I have tested that I could populate a local elasticsearch instance with those new data
    using the following command.
```
$ swh --log-level DEBUG search -C ~/.config/swh/search.yml journal-client objects -o origin_visit
```

I used the following configuration file:
```lang=yaml, name=search.yml
search:
  cls: elasticsearch
  hosts:
    - http://localhost:9200

journal:
  brokers:
    - kafka1.internal.softwareheritage.org
    - kafka2.internal.softwareheritage.org
    - kafka3.internal.softwareheritage.org
    - kafka4.internal.softwareheritage.org

  prefix: swh.journal.objects
  group_id: anlambert.search
```

Related to T2869

Reviewers: #reviewers, vlorentz

Maniphest Tasks: T2869

Differential Revision: https://forge.softwareheritage.org/D5064
See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/78/ for more details.
This revision is now accepted and ready to land. Feb 18 2021, 4:42 PM
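
The D5064 commit message quoted above describes an optional `visit_types` parameter that filters searched origins. A minimal in-memory sketch of that idea follows. It is not the code from the diff: the function name `filter_by_visit_types`, the document layout (a dict with `url` and `visit_types` keys), and the choice to match an origin when it has at least one of the requested visit types are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_by_visit_types(
    documents: Iterable[Dict[str, Any]],
    visit_types: Optional[List[str]] = None,
) -> List[Dict[str, Any]]:
    """Keep documents that have at least one of the requested visit types.

    Hypothetical helper: when ``visit_types`` is None, nothing is filtered,
    which mirrors the "optional parameter" wording of the commit message.
    """
    if visit_types is None:
        return list(documents)
    wanted = set(visit_types)
    # A document without a "visit_types" key is treated as having none.
    return [
        doc for doc in documents
        if wanted & set(doc.get("visit_types", []))
    ]


# Illustrative data only; these URLs are placeholders.
docs = [
    {"url": "https://example.org/a", "visit_types": ["git"]},
    {"url": "https://example.org/b", "visit_types": ["svn", "hg"]},
    {"url": "https://example.org/c"},
]
git_only = filter_by_visit_types(docs, ["git"])
```

Treating a missing `visit_types` key as an empty list keeps documents indexed before the field existed from matching any filter, which may or may not be what the real backends do.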

Build is green

Patch application report for D5070 (id=18314)

Rebasing onto 2ba3565a17...

Current branch diff-target is up to date.
Changes applied before test
commit 04dadef9385273145a8d77eb3dee48fa17941afa
Author: Valentin Lorentz <vlorentz@softwareheritage.org>
Date:   Fri Feb 12 13:03:39 2021 +0100

    Add typing to origin_update's argument and origin_search's return

See https://jenkins.softwareheritage.org/job/DSEA/job/tests-on-diff/91/ for more details.
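
The revision title refers to typing `origin_update`'s argument and `origin_search`'s return value. One common way to express such types in Python is `TypedDict`; the sketch below shows the general idea, assuming Python 3.8 or later. The class names `OriginDocument`, `SearchResult` and `InMemorySearchSketch`, their fields, and the simplified method signatures are hypothetical and are not taken from the diff.

```python
from typing import Dict, Iterable, List, Optional, TypedDict


class OriginDocument(TypedDict, total=False):
    # Hypothetical document shape; the code below assumes "url" is always set.
    url: str
    visit_types: List[str]


class SearchResult(TypedDict):
    # Hypothetical return shape: matching documents plus an opaque page token.
    results: List[OriginDocument]
    next_page_token: Optional[str]


class InMemorySearchSketch:
    """Toy backend illustrating typed update and search methods."""

    def __init__(self) -> None:
        self._origins: Dict[str, OriginDocument] = {}

    def origin_update(self, documents: Iterable[OriginDocument]) -> None:
        for doc in documents:
            # Create the entry on first sight, then overwrite fields with new values.
            stored = self._origins.setdefault(doc["url"], {"url": doc["url"]})
            stored.update(doc)

    def origin_search(self, url_pattern: str) -> SearchResult:
        # Plain substring match, sorted by URL for deterministic output.
        matches = [
            doc for url, doc in sorted(self._origins.items())
            if url_pattern in url
        ]
        return {"results": matches, "next_page_token": None}


search = InMemorySearchSketch()
search.origin_update([
    {"url": "https://example.org/foo"},
    {"url": "https://example.org/bar", "visit_types": ["git"]},
])
search.origin_update([{"url": "https://example.org/foo", "visit_types": ["svn"]}])
result = search.origin_search("example.org")
```

With annotations like these, a type checker such as mypy can flag callers that pass documents with misspelled keys or that misread the shape of the search result.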