Page Menu
Home
Software Heritage
Search
Configure Global Search
Log In
Files
F7123258
D7159.id26063.diff
No One
Temporary
Actions
View File
Edit File
Delete File
View Transforms
Subscribe
Mute Notifications
Award Token
Flag For Later
Size
6 KB
Subscribers
None
D7159.id26063.diff
View Options
diff --git a/docs/db.rst b/docs/db.rst
new file mode 100644
--- /dev/null
+++ b/docs/db.rst
@@ -0,0 +1,176 @@
+.. _swh-core-db:
+
+Common database utilities
+=========================
+
+The ``swh.core.db`` module offers a set of common (postgresql) database
+handling utilities and features for other swh packages implementing a
+`datastore`, aka a service responsible for providing a data store via a common
+interface which can use a postgresql database as backend. Examples are
+:mod:`swh.storage` or :mod:`swh.scheduler`.
+
+Most of the time, this database-based data storage facility will depend on a data
+schema (may be based in :mod:`swh.model` or not) and provide a unified interface
+based on an Python class to abstract access to this datastore.
+
+Some packages may implement only a postgresql backend, some may provide more
+backends.
+
+This :mod:`swh.core.db` only deals with the postgresql part and provides common
+features and tooling to manage the database lifecycle in a consistent and
+unified way among all the :mod:`swh` packages.
+
+It comes with a few command line tools to manage the specific :mod:`swh`
+package database.
+
+As such, most of the database management cli commands require a configuration
+file holding the database connection information.
+
+For example, for the :mod:`swh.storage` package, one will be able to create,
+initialize and upgrade the postgresql database using simple commands.
+
+To create the database and perform superuser initialization steps (see below):
+
+.. code-block:: bash
+
+ $ swh db create storage --dbname=postgresql://superuser:passwd@localhost:5433/test-storage
+
+If the database already exists but lacks superuser level initialization steps,
+you may use:
+
+.. code-block:: bash
+
+ $ swh db init-admin storage --dbname=postgresql://superuser:passwd@localhost:5433/test-storage
+
+
+Then assuming the ``config.yml`` file existence:
+
+.. code-block:: yaml
+
+ storage:
+ cls: postgresql
+ db: host=localhost, port=5433, dbname=test-storage, username=normal-user, password=pwd
+ objstorage:
+ cls: memory
+
+then you can run:
+
+.. code-block:: bash
+
+ $ swh db --config-file=config.yml init storage
+ DONE database for storage initialized (flavor default) at version 182
+
+Note: you can define the ``SWH_CONFIG_FILENAME`` environment variable instead
+of using the ``--config-name`` command line option.
+
+or check the actual data model version of this database:
+
+.. code-block:: bash
+
+ $ swh db --config-file=config.yml version storage
+ module: storage
+ flavor: default
+ version: 182
+
+as well as the migration history for the database:
+
+.. code-block:: bash
+
+ $ swh db --config-file=config.yml version --all storage
+ module: storage
+ flavor: default
+ 182 [2022-02-11 15:08:31.806070+01:00] Work In Progress
+ 181 [2022-02-11 14:06:27.435010+01:00] Work In Progress
+
+
+The database migration is done using the ``swh db upgrade`` command.
+
+
+
+Implementation of a swh.core.db datastore
+-----------------------------------------
+
+To use this database management tooling, in a :mod:`swh` package, the following
+conditions are expected:
+
+- the package should provide an ``sql`` directory in its root namespace
+ providing initialization sql scripts. Scripts should be named like
+ ``nn-xxx.sql`` and are executed in order according to the ``nn`` integer
+ value. Scripts having ``-superuser-`` in their name will be executed by the
+ ``init-admin`` tool and are expected to require superuser access level,
+ whereas scripts without ``-superuser-`` in their name will be executed by the
+ ``swh db init`` command and are expected to require write access
+ level (with no need for superuser access level).
+
+- the package should provide a ``sql/upgrade`` directory with SQL migration
+ scripts in its root namespace. Script names are expected to be of the form
+ ``nnn.sql`` where `nnn` is the version to which this script does the
+ migration from a database at version `nnn - 1`.
+
+- the initialization and migration scripts should not create nor fill the
+ metadata related tables (``dbversion`` and ``dbmodule``).
+
+- the package should provide a ``get_datastore`` function in its root namespace
+ returning an instance of the datastore object. Normally, this datastore
+ object uses ``swh.core.db.BaseDb`` to interact with the actual database.
+
+- The datastore object should provide a ``get_current_version()`` method
+ returning the database version expected by the code.
+
+See existing ``swh`` packages like ``swh.storage`` or ``swh.scheduler`` for
+usage examples.
+
+Writing tests
+-------------
+
+The ``swh.core.db.pytest_plugin`` provides a few helper tools to write unit
+tests for postgresql based datastores.
+
+By default, when using these fixtures, a posgresql server will be started (by
+the pytest_postgresql fixture) and a template database will be created using
+the ``postgresql_proc`` fixture factory provided by ``pytest_postgresql``.
+
+Then a dedicated fixture must be declared to use the ``postgresql_proc``
+fixture generated by the fixture factory function.
+
+This template database will then be used to create a new database for test
+using this dedicated fixture.
+
+In order to help the database initialization process and make it consistent
+with the database initialization tools from the ``swh db`` cli, an
+``initialize_database_for_module()`` function is provided to be used with the
+fixture factory described above.
+
+Typically, writing tests for a ``swh`` package ``swh.example`` would look like:
+
+.. code-block:: python
+
+ from functools import partial
+
+ from pytest_postgresql import factories
+ from swh.core.db.pytest_plugin import postgresql_fact
+ from swh.core.db.pytest_plugin import initialize_database_for_module
+
+ example_postgresql_proc = factories.postgresql_proc(
+ dbname="example",
+ load=[partial(initialize_database_for_module,
+ modname="example", version=1)]
+ )
+
+ postgresql_example = postgresql_fact("example_postgresql_proc")
+
+ def test_example(postgresql_example):
+ with postgresql_example.cursor() as c:
+ c.execute("select version from dbversion limit 1")
+ assert c.fecthone()[0] == 1
+
+
+Note: most of the time, you will want to put the scaffolding part of the code
+above in a ``conftest.py`` file.
+
+
+The ``load`` argument of the ``factories.postgresql_proc`` will be used to
+initialize the template database that will be used to create a new database for
+each test, while the ``load`` argument of the ``postgresql_fact`` fixture will
+be executed before each test (in the database created from the template
+database and dedicated to the test being executed).
diff --git a/docs/index.rst b/docs/index.rst
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -9,4 +9,5 @@
:maxdepth: 2
cli
+ db
/apidoc/swh.core
File Metadata
Details
Attached
Mime Type
text/plain
Expires
Wed, Dec 18, 6:50 AM (6 d, 3 h ago)
Storage Engine
blob
Storage Format
Raw Data
Storage Handle
3220011
Attached To
D7159: Add a documentation section for swh.core.db
Event Timeline
Log In to Comment