D4971.diff
diff --git a/docs/index.rst b/docs/index.rst
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -19,6 +19,54 @@
graph (:mod:`swh.storage.storage`).
+Using ``swh-storage``
+---------------------
+
+First, note that ``swh-storage`` is an internal API of Software Heritage, which
+is only available to software running on the SWH infrastructure and to developers
+:ref:`running their own Software Heritage <getting-started>`.
+If you want to access the Software Heritage archive without running your own,
+you should use the `Web API`_ instead.
+
+As ``swh-storage`` has multiple backends, it is instantiated via the
+:py:func:`swh.storage.get_storage` function, which takes as argument the backend type
+(usually ``remote``, if you already have access to a running swh-storage).
+
+It returns an instance of a class implementing
+:py:class:`swh.storage.interface.StorageInterface`, which is mostly a set of key-value
+stores, one for each object type.
+
+Many of the arguments and return types are "model objects", i.e. immutable objects
+that are instances of the classes defined in :py:mod:`swh.model.model`.
+
+Methods returning long lists of results are paginated: they return both a list
+of results and an opaque token to get the next page of results.
+For example, to list all the visits of an origin using ``origin_visit_get``
+ten visits at a time, you can do:
+
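+.. code-block:: python
+
+    from swh.storage import get_storage
+
+    storage = get_storage("remote", url="http://localhost:5002")
+    origin = "https://github.com/torvalds/linux"
+    page_token = None  # None requests the first page
+    while True:
+        page = storage.origin_visit_get(origin, page_token=page_token, limit=10)
+        for visit in page.results:
+            print(visit)
+        page_token = page.next_page_token  # pass the token back to get the next page
+        if page_token is None:
+            break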
+
+Or, using :py:func:`swh.core.api.classes.stream_results` for convenience:
+
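+.. code-block:: python
+
+    from swh.core.api.classes import stream_results
+    from swh.storage import get_storage
+
+    storage = get_storage("remote", url="http://localhost:5002")
+    # stream_results follows next_page_token internally and yields each result
+    visits = stream_results(
+        storage.origin_visit_get, origin="https://github.com/torvalds/linux"
+    )
+    for visit in visits:
+        print(visit)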
+
+.. _Web API: https://archive.softwareheritage.org/api/
+
Database schema
---------------
diff --git a/swh/storage/__init__.py b/swh/storage/__init__.py
--- a/swh/storage/__init__.py
+++ b/swh/storage/__init__.py
@@ -28,10 +28,14 @@
`storage_args`.
Args:
- storage (dict): dictionary with keys:
- - cls (str): storage's class, either local, remote, memory, filter,
- buffer
- - args (dict): dictionary with keys
+ cls (str): storage's class, can be:
+ - ``local`` to use a PostgreSQL database
+ - ``cassandra`` to use a Cassandra database
+ - ``remote`` to connect to a swh-storage server
+ - ``memory`` for an in-memory storage, useful for fast tests
+ - ``filter``, ``buffer``, ... to use specific storage "proxies", see their
+ respective documentation
+ args (dict): dictionary with keys
Returns:
an instance of swh.storage.Storage or compatible class
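
The pagination contract described in the ``docs/index.rst`` hunk (a list of results plus an opaque token for the next page) can be illustrated without a running swh-storage. The following is a minimal, self-contained sketch. ``Page``, ``fake_visit_get`` and ``iter_all`` are hypothetical stand-ins written for this illustration and are not part of ``swh.storage`` or ``swh.core``. Only the ``results`` and ``next_page_token`` attribute names are taken from the documentation in the diff.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


# Hypothetical stand-in for a paged result. It is frozen (immutable), in the
# spirit of the "model objects" mentioned in the documentation.
@dataclass(frozen=True)
class Page:
    results: List[str]
    next_page_token: Optional[str]


# Hypothetical data source: 25 fake visits.
VISITS = [f"visit-{i}" for i in range(1, 26)]


def fake_visit_get(page_token: Optional[str] = None, limit: int = 10) -> Page:
    """Return up to `limit` items, starting where the previous page ended."""
    # In this sketch the token encodes an offset; callers must still treat it
    # as opaque and only ever pass it back unchanged.
    start = int(page_token) if page_token is not None else 0
    end = start + limit
    next_token = str(end) if end < len(VISITS) else None
    return Page(results=VISITS[start:end], next_page_token=next_token)


def iter_all(get_page: Callable[..., Page], **kwargs) -> Iterator[str]:
    """Yield every result by following next_page_token until it is None."""
    page_token = None
    while True:
        page = get_page(page_token=page_token, **kwargs)
        yield from page.results
        page_token = page.next_page_token
        if page_token is None:
            break


all_visits = list(iter_all(fake_visit_get, limit=10))
```

The loop in ``iter_all`` has the same shape as the manual ``while True`` loop in the documentation: the token from one page is fed into the next call, and iteration stops once the token is ``None``.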
D4971: Write introduction to swh-storage.