Page MenuHomeSoftware Heritage

D6005.id21733.diff
No OneTemporary

D6005.id21733.diff

diff --git a/docs/index.rst b/docs/index.rst
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -26,6 +26,7 @@
:caption: Contents:
cli
+ query-language
Reference Documentation
-----------------------
diff --git a/docs/query-language.rst b/docs/query-language.rst
new file mode 100644
--- /dev/null
+++ b/docs/query-language.rst
@@ -0,0 +1,190 @@
+Search Query Language
+=====================
+
+
+Every query is composed of filters separated by ``and`` or ``or``.
+These filters have 3 components in the order : ``Name Operator Value``
+
+Some of the examples are :
+ * ``origin = django and language in [python] and visits >= 5``
+ * ``last_revision > 2020-01-01 and limit = 10``
+ * ``last_visit > 2021-01-01 or last_visit < 2020-01-01``
+ * ``visited = false and metadata = "kubernetes" or origin = "minikube"``
+ * ``keyword in ["orchestration", "kubectl"] and language in ["go", "rust"]``
+ * ``(origin = debian or visit_type = ["deb"]) and license in ["GPL-3"]``
+
+**Note**:
+ * Whitespaces are optional between the three components of a filter.
+ * The conjunction operators have left precedence. Therefore ``foo and bar and baz`` means ``(foo and bar) and baz``
+ * Precedence of ``and`` is more than that of ``or``. Therefore ``foo or bar and baz`` means ``foo or (bar and baz)``
+ * This precedence of ``and`` can be overridden with the use of ``(`` and ``)``. For example, you can override the previous query as ``(foo or bar) and baz``
+ * ``and`` or ``or`` can be used inside the filter values using quotation marks. Example : ``metadata : "vcs history and metadata"``
+
+The filters have been classified based on the type of value that they expects.
+
+
+Pattern filters
+---------------
+Returns origins having the given keywords in their url or intrinsic metadata
+
+ * Name:
+ * ``origin``: Keywords from the origin url
+ * ``metadata``: Keywords from all the intrinsic metadata fields
+ * Operator: ``=``
+ * Value: String wrapped in quotation marks(``"`` or ``'``)
+
+**Note:** If a string has no whitespace then the quotation marks become optional.
+
+**Examples:**
+
+ * ``origin = https://github.com/Django/django``
+ * ``origin = kubernetes``
+ * ``origin = "github python"``
+ * ``metadata = orchestration``
+ * ``metadata = "javascript language"``
+
+Boolean filters
+---------------
+Returns origins having their boolean type values equal to given values
+
+ * Name: ``visited`` : Whether the origin has been visited
+ * Operator: ``=``
+ * Value: ``true`` or ``false``
+
+**Examples:**
+
+ * ``visited = true``
+ * ``visited = false``
+
+
+Numeric filters
+---------------
+Returns origins having their numeric type values in the given range
+
+ * Name: ``visits`` : Number of visits of an origin
+ * Operator: ``<`` ``<=`` ``=`` ``!=`` ``>`` ``>=``
+ * Value: Positive integer
+
+**Examples:**
+
+
+ * ``visits > 2``
+ * ``visits = 5``
+ * ``visits <= 10``
+
+
+Un-bounded List filters
+-----------------------
+
+Returns origins that satisfy the criteria based on a given list
+
+ * Name:
+ * ``language`` : Programming languages used
+ * ``license`` : License used
+ * ``keyword`` : keywords (often same as tags) or description (includes README) from the metadata
+ * Operator: ``in`` ``not in``
+ * Value: Array of strings
+
+**Note:**
+ * If a string has no whitespace then the quotation marks become optional.
+
+ * The ``keyword`` filter gives more priority to the keywords field of intrinsic metadata than the description field. So origins having the queried term in their intrinsic metadata keyword will appear first.
+
+
+**Examples:**
+
+ * ``language in [python, js]``
+ * ``license in ["GPL 3.0 or later", MIT]``
+ * ``keyword in ["Software Heritage", swh]``
+
+
+Bounded List filters
+--------------------
+
+Returns origins that satisfy the criteria based on a list of fixed options
+
+ **visit_type**
+
+ * Name: ``visit_type`` : Returns only origins with at least one of the specified visit types
+ * Operator: ``=``
+ * Value: Array of the following values
+
+ ``any``
+ ``cran``
+ ``deb``
+ ``deposit``
+ ``ftp``
+ ``hg``
+ ``git``
+ ``nixguix``
+ ``npm``
+ ``pypi``
+ ``svn``
+ ``tar``
+
+ **sort_by**
+
+ * Name: ``sort_by`` : Sorts origins based on the given list of origin attributes
+ * Operator: ``=``
+ * Value: Array of the following values
+
+ ``visits``
+ ``last_visit``
+ ``last_eventful_visit``
+ ``last_revision``
+ ``last_release``
+ ``created``
+ ``modified``
+ ``published``
+
+**Examples:**
+
+
+ * ``visit_type = [svn, npm]``
+ * ``visit_type = [nixguix, "ftp"]``
+ * ``sort_by = ["last_visit", created]``
+ * ``sort_by = [visits, modified]``
+
+Date filters
+------------
+
+Returns origins having their date type values in the given range
+
+ * Name:
+
+ * ``last_visit`` : Latest visit date
+ * ``last_eventful_visit`` : Latest visit date where a new snapshot was detected
+ * ``last_revision`` : Latest commit date
+ * ``last_release`` : Latest release date
+ * ``created`` Creation date
+ * ``modified`` Modification date
+ * ``published`` Published date
+
+ * Operator: ``<`` ``<=`` ``=`` ``!=`` ``>`` ``>=``
+ * Value: Date in ``Standard ISO`` format
+
+ **Note:** The last three date filters are based on metadata that has to be manually entered
+ by the repository authors. So they might not be correct or up-to-date.
+
+**Examples:**
+
+ * ``last_visit > 2001-01-01 and last_visit < 2101-01-01``
+ * ``last_revision = "2000-01-01 18:35Z"``
+ * ``last_release != "2021-07-17T18:35:00Z"``
+ * ``created <= "2021-07-17 18:35"``
+
+Limit filter
+------------
+
+Limits the number of results to at most N
+
+ * Name: ``limit``
+ * Operator: ``=``
+ * Value: Positive Integer
+
+**Note:** The default value of the limit is 50
+
+**Examples:**
+
+ * ``limit = 1``
+ * ``limit = 15``
\ No newline at end of file

File Metadata

Mime Type
text/plain
Expires
Nov 5 2024, 3:27 PM (11 w, 14 h ago)
Storage Engine
blob
Storage Format
Raw Data
Storage Handle
3226518

Event Timeline