D4971.diff

diff --git a/docs/index.rst b/docs/index.rst
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -19,6 +19,54 @@
graph (:mod:`swh.storage.storage`).
+Using ``swh-storage``
+---------------------
+
+First, note that ``swh-storage`` is an internal API of Software Heritage, which
+is only available to software running on the SWH infrastructure and to developers
+:ref:`running their own Software Heritage <getting-started>`.
+If you want to access the Software Heritage archive without running your own
+instance, you should use the `Web API`_ instead.
+
+As ``swh-storage`` has multiple backends, it is instantiated via the
+:py:func:`swh.storage.get_storage` function, which takes the backend type as its
+first argument (usually ``remote``, if you already have access to a running
+swh-storage).
+
+It returns an instance of a class implementing
+:py:class:`swh.storage.interface.StorageInterface`, which is mostly a set of key-value
+stores, one for each object type.
+
+Many of the arguments and return types are "model objects", i.e. immutable objects
+that are instances of the classes defined in :py:mod:`swh.model.model`.
+
+Methods returning long lists of results are paginated: they return both a list
+of results and an opaque token to get the next page of results.
+For example, to list all the visits of an origin using ``origin_visit_get``
+ten visits at a time, you can do:
+
+.. code-block::
+
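+   from swh.storage import get_storage
+
+   storage = get_storage("remote", url="http://localhost:5002")
+   page_token = None
+   while True:
+       # Pass the token of the previous page to fetch the next ten visits
+       page = storage.origin_visit_get(
+           origin="https://github.com/torvalds/linux", page_token=page_token, limit=10
+       )
+       for visit in page.results:
+           print(visit)
+       page_token = page.next_page_token
+       if page_token is None:
+           break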
+
+Or, using :py:func:`swh.core.api.classes.stream_results` for convenience:
+
+.. code-block::
+
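+   from swh.core.api.classes import stream_results
+   from swh.storage import get_storage
+
+   storage = get_storage("remote", url="http://localhost:5002")
+   # stream_results follows the page tokens and yields visits one by one
+   visits = stream_results(
+       storage.origin_visit_get, origin="https://github.com/torvalds/linux"
+   )
+   for visit in visits:
+       print(visit)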
+
+.. _Web API: https://archive.softwareheritage.org/api/
+
Database schema
---------------
diff --git a/swh/storage/__init__.py b/swh/storage/__init__.py
--- a/swh/storage/__init__.py
+++ b/swh/storage/__init__.py
@@ -28,10 +28,14 @@
`storage_args`.
Args:
- storage (dict): dictionary with keys:
- - cls (str): storage's class, either local, remote, memory, filter,
- buffer
- - args (dict): dictionary with keys
+ cls (str): the storage class, one of:
+ - ``local`` to use a PostgreSQL database
+ - ``cassandra`` to use a Cassandra database
+ - ``remote`` to connect to a swh-storage server
+ - ``memory`` for an in-memory storage, useful for fast tests
+ - ``filter``, ``buffer``, ... to use specific storage "proxies", see their
+ respective documentation
+ args (dict): dictionary with keys
Returns:
an instance of swh.storage.Storage or compatible class

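Review note: the pagination contract described in the new docs section (a list of results plus an opaque token for the next page) can be tried out without a running swh-storage. The sketch below is a self-contained illustration only. ``PagedResult``, ``FakeStorage`` and the local ``stream_results`` are hypothetical stand-ins written for this note, not the actual ``swh.core`` or ``swh.storage`` implementations, and the token format (a stringified offset) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterator, List, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class PagedResult(Generic[T]):
    """Hypothetical stand-in: one page of results plus an opaque token."""

    results: List[T]
    next_page_token: Optional[str] = None


class FakeStorage:
    """Hypothetical in-memory storage exposing a single paginated method."""

    def __init__(self, visits: List[str]) -> None:
        self._visits = visits

    def origin_visit_get(
        self, origin: str, page_token: Optional[str] = None, limit: int = 10
    ) -> PagedResult[str]:
        # Assumption: the token encodes an offset; callers must treat it as opaque.
        start = int(page_token) if page_token is not None else 0
        end = start + limit
        next_token = str(end) if end < len(self._visits) else None
        return PagedResult(self._visits[start:end], next_token)


def stream_results(f: Callable[..., PagedResult[T]], *args, **kwargs) -> Iterator[T]:
    """Yield every result of a paginated method by following the page tokens."""
    page_token = None
    while True:
        page = f(*args, page_token=page_token, **kwargs)
        yield from page.results
        page_token = page.next_page_token
        if page_token is None:
            break


storage = FakeStorage([f"visit-{i}" for i in range(25)])
visits = list(
    stream_results(storage.origin_visit_get, origin="https://example.org/repo", limit=10)
)
print(len(visits))
```

The loop stops only when the returned token is ``None``, and the token from each page is passed back on the next call. A loop that never forwards the token would request the first page forever.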