diff --git a/PKG-INFO b/PKG-INFO index b729b40..861b468 100644 --- a/PKG-INFO +++ b/PKG-INFO @@ -1,52 +1,52 @@ Metadata-Version: 2.1 Name: swh.search -Version: 0.8.0 +Version: 0.8.1 Summary: Software Heritage search service Home-page: https://forge.softwareheritage.org/diffusion/DSEA Author: Software Heritage developers Author-email: swh-devel@inria.fr License: UNKNOWN Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest Project-URL: Funding, https://www.softwareheritage.org/donate Project-URL: Source, https://forge.softwareheritage.org/source/swh-search Project-URL: Documentation, https://docs.softwareheritage.org/devel/swh-search/ Description: swh-search ========== Search service for the Software Heritage archive. It is similar to swh-storage in what it contains, but provides different ways to query it: while swh-storage is mostly a key-value store that returns an object from a primary key, swh-search is focused on reverse indices, to allow finding objects that match some criteria; for example full-text search. Currently uses ElasticSearch, and provides only origin search (by URL and metadata) # Dependencies Python tests for this module include tests that cannot be run without a local ElasticSearch instance, so you need the ElasticSearch server executable on your machine (no need to have a running ElasticSearch server). ## Debian-like host The elasticsearch package is required. As it's not part of debian-stable, [another debian repository is required to be configured](https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html#deb-repo) ## Non Debian-like host The tests expect: - `/usr/share/elasticsearch/jdk/bin/java` to exist. - `org.elasticsearch.bootstrap.Elasticsearch` to be in java's classpath. 
Platform: UNKNOWN Classifier: Programming Language :: Python :: 3 Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3) Classifier: Operating System :: OS Independent Classifier: Development Status :: 3 - Alpha Requires-Python: >=3.7 Description-Content-Type: text/markdown Provides-Extra: testing diff --git a/debian/changelog b/debian/changelog index 6cc77db..72e1254 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,228 +1,230 @@ -swh-search (0.8.0-1~swh1~bpo10+1) buster-swh; urgency=medium +swh-search (0.8.1-1~swh1) unstable-swh; urgency=medium - * Rebuild for buster-swh + * New upstream release 0.8.1 - (tagged by Antoine Lambert + on 2021-04-29 14:36:43 +0200) + * Upstream changes: - version 0.8.1 - -- Software Heritage autobuilder (on jenkins-debian1) Thu, 08 Apr 2021 15:44:24 +0000 + -- Software Heritage autobuilder (on jenkins-debian1) Thu, 29 Apr 2021 12:41:23 +0000 swh-search (0.8.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.8.0 - (tagged by Nicolas Dandrimont on 2021-04-08 17:37:41 +0200) * Upstream changes: - Release swh.search 0.8.0 - Implement a blocklist for origin results - Fix docs typesetting -- Software Heritage autobuilder (on jenkins-debian1) Thu, 08 Apr 2021 15:42:22 +0000 swh-search (0.7.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.7.1 - (tagged by Vincent SELLIER on 2021-03-04 15:59:28 +0100) * Upstream changes: - v0.7.1 - Changelog: - * Allow to instantiate the service with default indexes configuration -- Software Heritage autobuilder (on jenkins-debian1) Thu, 04 Mar 2021 15:06:34 +0000 swh-search (0.7.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.7.0 - (tagged by Vincent SELLIER on 2021-03-04 12:09:12 +0100) * Upstream changes: - v0.7.0 - Changelog: - * Ensure the elasticsearch indexes are initialized before the first request - * Use elasticsearch aliases to simplify maintenance operations - * search.cli: Drop 
unused and untested rpc-serve cli entrypoint - * api.wsgi: Drop unused wsgi module - * Add missing server tests - * Add typing to origin_update's argument and origin_search's return -- Software Heritage autobuilder (on jenkins-debian1) Thu, 04 Mar 2021 11:19:29 +0000 swh-search (0.6.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.6.1 - (tagged by Antoine Lambert on 2021-02-18 18:55:56 +0100) * Upstream changes: - version 0.6.1 -- Software Heritage autobuilder (on jenkins-debian1) Thu, 18 Feb 2021 18:00:51 +0000 swh-search (0.6.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.6.0 - (tagged by Antoine Lambert on 2021-02-18 15:28:07 +0100) * Upstream changes: - version 0.6.0 -- Software Heritage autobuilder (on jenkins-debian1) Thu, 18 Feb 2021 14:31:07 +0000 swh-search (0.5.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.5.0 - (tagged by Vincent SELLIER on 2021-02-18 11:20:43 +0100) * Upstream changes: - v0.5.0 - Add monitoring metrics -- Software Heritage autobuilder (on jenkins-debian1) Thu, 18 Feb 2021 10:25:39 +0000 swh-search (0.4.2-1~swh1) unstable-swh; urgency=medium * New upstream release 0.4.2 - (tagged by Antoine Lambert on 2021-02-17 11:09:21 +0100) * Upstream changes: - version 0.4.2 -- Software Heritage autobuilder (on jenkins-debian1) Wed, 17 Feb 2021 10:14:16 +0000 swh-search (0.4.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.4.1 - (tagged by Vincent SELLIER on 2021-01-07 16:15:23 +0100) * Upstream changes: - v0.4.1 -- Software Heritage autobuilder (on jenkins-debian1) Thu, 07 Jan 2021 15:18:24 +0000 swh-search (0.4.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.4.0 - (tagged by Vincent SELLIER on 2020-12-23 16:37:18 +0100) * Upstream changes: - Support an index name prefix -- Software Heritage autobuilder (on jenkins-debian1) Wed, 23 Dec 2020 15:41:09 +0000 swh-search (0.3.5-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.5 - (tagged by Valentin Lorentz 
on 2020-12-22 17:32:26 +0100) * Upstream changes: - v0.3.5 - * Write some basic documentation to describe what swh-search is. - * Add more comments in elasticsearch.py -- Software Heritage autobuilder (on jenkins-debian1) Tue, 22 Dec 2020 16:38:29 +0000 swh-search (0.3.4-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.4 - (tagged by Antoine R. Dumont (@ardumont) on 2020-12-17 12:13:49 +0100) * Upstream changes: - v0.3.4 - search.journal_client: Actually filter on full origin_visit_status -- Software Heritage autobuilder (on jenkins-debian1) Thu, 17 Dec 2020 11:16:32 +0000 swh-search (0.3.3-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.3 - (tagged by Antoine R. Dumont (@ardumont) on 2020-12-11 15:20:01 +0100) * Upstream changes: - v0.3.3 - Use cross-field search. - Normalize Codemeta documents by expanding them. - Add test for long descriptions. -- Software Heritage autobuilder (on jenkins-debian1) Fri, 11 Dec 2020 14:22:59 +0000 swh-search (0.3.2-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.2 - (tagged by Antoine R. Dumont (@ardumont) on 2020-12-10 09:49:35 +0100) * Upstream changes: - v0.3.2 - search.journal_client: Fix key error - test_journal_client: Migrate to pytest -- Software Heritage autobuilder (on jenkins-debian1) Thu, 10 Dec 2020 08:54:53 +0000 swh-search (0.3.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.1 - (tagged by Antoine R. Dumont (@ardumont) on 2020-12-09 18:21:33 +0100) * Upstream changes: - v0.3.1 - Allow configuration through cli or config file -- Software Heritage autobuilder (on jenkins-debian1) Wed, 09 Dec 2020 18:53:39 +0000 swh-search (0.3.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.3.0 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-12-08 11:30:33 +0100) * Upstream changes: - v0.3.0 - cli: Subscribe journal client to origin_intrinsic_metadata topic - cli: Subscribe journal client to origin_visit_status - cli: Allow topic prefix declaration through cli or configuration - cli: Allow object-type declaration through cli or configuration - tox.ini: pin black to the pre-commit version (19.10b0) to avoid flip-flops - Run isort after the CLI import changes -- Software Heritage autobuilder (on jenkins-debian1) Tue, 08 Dec 2020 10:33:30 +0000 swh-search (0.2.3-1~swh1) unstable-swh; urgency=medium * New upstream release 0.2.3 - (tagged by David Douard on 2020-09-25 12:51:11 +0200) * Upstream changes: - v0.2.3 -- Software Heritage autobuilder (on jenkins-debian1) Fri, 25 Sep 2020 10:53:12 +0000 swh-search (0.2.2-1~swh1) unstable-swh; urgency=medium * New upstream release 0.2.2 - (tagged by Antoine R. Dumont (@ardumont) on 2020-08-03 11:58:53 +0200) * Upstream changes: - v0.2.2 - Fix test_cli.invoke for old PyYAML versions (such as 3.13, in Debian 10). -- Software Heritage autobuilder (on jenkins-debian1) Mon, 03 Aug 2020 10:00:05 +0000 swh-search (0.2.1-1~swh1) unstable-swh; urgency=medium * New upstream release 0.2.1 - (tagged by Antoine R. Dumont (@ardumont) on 2020-08-03 10:59:31 +0200) * Upstream changes: - v0.2.1 - setup.py: Migrate from vcversioner to setuptools-scm -- Software Heritage autobuilder (on jenkins-debian1) Mon, 03 Aug 2020 09:00:39 +0000 swh-search (0.2.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.2.0 - (tagged by Antoine R. Dumont (@ardumont) on 2020-08-03 10:40:39 +0200) * Upstream changes: - v0.2.0 - swh.search: Define an interface for search backends and use it - swh.search.get_search: Simplify instantiation -- Software Heritage autobuilder (on jenkins-debian1) Mon, 03 Aug 2020 08:42:45 +0000 swh-search (0.1.0-1~swh1) unstable-swh; urgency=medium * New upstream release 0.1.0 - (tagged by Antoine R. 
Dumont (@ardumont) on 2020-07-31 14:05:22 +0200) * Upstream changes: - v0.1.0 - Type origin_search(...) -> PagedResult[Dict] - README: Update necessary dependencies for test purposes - Fixes on journal updates - Blackify strings - setup: Update the minimum required runtime python3 version -- Software Heritage autobuilder (on jenkins-debian1) Fri, 31 Jul 2020 12:10:22 +0000 swh-search (0.0.4-1~swh1) unstable-swh; urgency=medium * New upstream release 0.0.4 - (tagged by Antoine R. Dumont (@ardumont) on 2020-01-23 15:00:50 +0100) * Upstream changes: - v0.0.4 docs: Remove swh-py-template label - Only return results where all terms match. - Don't use refresh='wait_for' when updating origins. - Add a 'sha1' field to origin documents, used for sorting. - Add a pre-commit config file - Migrate tox.ini to extras = xxx instead of deps = .[testing] - De-specify testenv:py3 - Include all requirements in MANIFEST.in -- Software Heritage autobuilder (on jenkins-debian1) Thu, 23 Jan 2020 14:04:17 +0000 swh-search (0.0.3-1~swh2) unstable-swh; urgency=medium * Filter out swh/__init__.py from package -- Nicolas Dandrimont Tue, 14 Jan 2020 16:38:23 +0100 swh-search (0.0.3-1~swh1) unstable-swh; urgency=medium * Initial packaging -- Nicolas Dandrimont Mon, 13 Jan 2020 16:59:11 +0100 diff --git a/swh.search.egg-info/PKG-INFO b/swh.search.egg-info/PKG-INFO index b729b40..861b468 100644 --- a/swh.search.egg-info/PKG-INFO +++ b/swh.search.egg-info/PKG-INFO @@ -1,52 +1,52 @@ Metadata-Version: 2.1 Name: swh.search -Version: 0.8.0 +Version: 0.8.1 Summary: Software Heritage search service Home-page: https://forge.softwareheritage.org/diffusion/DSEA Author: Software Heritage developers Author-email: swh-devel@inria.fr License: UNKNOWN Project-URL: Bug Reports, https://forge.softwareheritage.org/maniphest Project-URL: Funding, https://www.softwareheritage.org/donate Project-URL: Source, https://forge.softwareheritage.org/source/swh-search Project-URL: Documentation, 
https://docs.softwareheritage.org/devel/swh-search/ Description: swh-search ========== Search service for the Software Heritage archive. It is similar to swh-storage in what it contains, but provides different ways to query it: while swh-storage is mostly a key-value store that returns an object from a primary key, swh-search is focused on reverse indices, to allow finding objects that match some criteria; for example full-text search. Currently uses ElasticSearch, and provides only origin search (by URL and metadata) # Dependencies Python tests for this module include tests that cannot be run without a local ElasticSearch instance, so you need the ElasticSearch server executable on your machine (no need to have a running ElasticSearch server). ## Debian-like host The elasticsearch package is required. As it's not part of debian-stable, [another debian repository is required to be configured](https://www.elastic.co/guide/en/elasticsearch/reference/current/deb.html#deb-repo) ## Non Debian-like host The tests expect: - `/usr/share/elasticsearch/jdk/bin/java` to exist. - `org.elasticsearch.bootstrap.Elasticsearch` to be in java's classpath. 
Platform: UNKNOWN Classifier: Programming Language :: Python :: 3 Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3) Classifier: Operating System :: OS Independent Classifier: Development Status :: 3 - Alpha Requires-Python: >=3.7 Description-Content-Type: text/markdown Provides-Extra: testing diff --git a/swh/search/interface.py b/swh/search/interface.py index 86dbb07..a37b47a 100644 --- a/swh/search/interface.py +++ b/swh/search/interface.py @@ -1,80 +1,80 @@ # Copyright (C) 2020-2021 The Software Heritage developers # See the AUTHORS file at the top-level directory of this distribution # License: GNU General Public License version 3, or any later version # See top-level LICENSE file for more information from typing import Iterable, List, Optional, TypeVar from typing_extensions import TypedDict from swh.core.api import remote_api_endpoint from swh.core.api.classes import PagedResult as CorePagedResult TResult = TypeVar("TResult") PagedResult = CorePagedResult[TResult, str] class MinimalOriginDict(TypedDict): - """Mandatory keys of an :cls:`OriginDict`""" + """Mandatory keys of an :class:`OriginDict`""" url: str class OriginDict(MinimalOriginDict, total=False): """Argument passed to :meth:`SearchInterface.origin_update`.""" visit_types: List[str] has_visits: bool class SearchInterface: @remote_api_endpoint("check") def check(self): """Dedicated method to execute some specific check per implementation. """ ... @remote_api_endpoint("flush") def flush(self) -> None: """Blocks until all previous calls to _update() are completely applied. """ ... @remote_api_endpoint("origin/update") def origin_update(self, documents: Iterable[OriginDict]) -> None: """Persist documents to the search backend. """ ... 
@remote_api_endpoint("origin/search") def origin_search( self, *, url_pattern: Optional[str] = None, metadata_pattern: Optional[str] = None, with_visit: bool = False, visit_types: Optional[List[str]] = None, page_token: Optional[str] = None, limit: int = 50, ) -> PagedResult[MinimalOriginDict]: """Searches for origins matching the `url_pattern`. Args: url_pattern: Part of the URL to search for with_visit: Whether origins with no visit are to be filtered out visit_types: Only origins having any of the provided visit types (e.g. git, svn, pypi) will be returned page_token: Opaque value used for pagination limit: number of results to return Returns: PagedResult of origin dicts matching the search criteria. If next_page_token is None, there is no longer data to retrieve. """ ... diff --git a/tox.ini b/tox.ini index 428d66f..d71d9af 100644 --- a/tox.ini +++ b/tox.ini @@ -1,34 +1,72 @@ [tox] envlist=black,flake8,mypy,py3 [testenv] extras = testing deps = pytest-cov commands = pytest --cov={envsitepackagesdir}/swh/search \ {envsitepackagesdir}/swh/search \ --cov-branch {posargs} [testenv:black] skip_install = true deps = black==19.10b0 commands = {envpython} -m black --check swh [testenv:flake8] skip_install = true deps = flake8 commands = {envpython} -m flake8 [testenv:mypy] extras = testing deps = mypy commands = mypy swh + +# build documentation outside swh-environment using the current +# git HEAD of swh-docs, is executed on CI for each diff to prevent +# breaking doc build +[testenv:sphinx] +whitelist_externals = make +usedevelop = true +extras = + testing +deps = + # fetch and install swh-docs in develop mode + -e git+https://forge.softwareheritage.org/source/swh-docs#egg=swh.docs + +setenv = + SWH_PACKAGE_DOC_TOX_BUILD = 1 + # turn warnings into errors + SPHINXOPTS = -W +commands = + make -I ../.tox/sphinx/src/swh-docs/swh/ -C docs + + +# build documentation only inside swh-environment using local state +# of swh-docs package +[testenv:sphinx-dev] 
+whitelist_externals = make
+usedevelop = true
+extras =
+  testing
+deps =
+  # install swh-docs in develop mode
+  -e ../swh-docs
+
+setenv =
+  SWH_PACKAGE_DOC_TOX_BUILD = 1
+  # turn warnings into errors
+  SPHINXOPTS = -W
+commands =
+  make -I ../.tox/sphinx-dev/src/swh-docs/swh/ -C docs
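
Note on the `origin_search` docstring in the swh/search/interface.py hunk: it describes pagination through an opaque `page_token`, with `next_page_token` set to None once there is nothing left to retrieve. A minimal sketch of how a caller might walk all pages follows. `FakePagedResult`, `FakeSearch` and `iter_origins` are hypothetical names introduced only for illustration; the in-memory backend and its integer-offset token are assumptions and say nothing about how the ElasticSearch backend encodes its tokens.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional


@dataclass
class FakePagedResult:
    """Hypothetical stand-in for the PagedResult returned by origin_search."""

    results: List[Dict[str, str]] = field(default_factory=list)
    next_page_token: Optional[str] = None


class FakeSearch:
    """Hypothetical in-memory backend; it only mimics part of the
    origin_search signature shown in the diff (assumption, not swh code)."""

    def __init__(self, urls: List[str]) -> None:
        self._urls = sorted(urls)

    def origin_search(
        self,
        *,
        url_pattern: Optional[str] = None,
        page_token: Optional[str] = None,
        limit: int = 50,
    ) -> FakePagedResult:
        if limit < 1:
            raise ValueError("limit must be at least 1")
        # Substring match stands in for the real URL matching (assumption).
        matches = [u for u in self._urls if url_pattern is None or url_pattern in u]
        # The token is an integer offset here; callers must treat it as opaque.
        start = int(page_token) if page_token else 0
        page = matches[start : start + limit]
        end = start + len(page)
        token = str(end) if end < len(matches) else None
        return FakePagedResult(
            results=[{"url": u} for u in page], next_page_token=token
        )


def iter_origins(search, url_pattern: str, limit: int = 50) -> Iterator[Dict[str, str]]:
    """Yield every matching origin, following next_page_token until it is None."""
    page_token = None
    while True:
        page = search.origin_search(
            url_pattern=url_pattern, page_token=page_token, limit=limit
        )
        yield from page.results
        page_token = page.next_page_token
        if page_token is None:
            break
```

The loop passes the token back unchanged and never inspects it, which is the only contract the docstring states; any object exposing a compatible `origin_search` could be passed to `iter_origins` in place of the fake.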