Page Menu
Home
Software Heritage
Search
Configure Global Search
Log In
Files
F9339231
db.rst
No One
Temporary
Actions
Download File
Edit File
Delete File
View Transforms
Subscribe
Mute Notifications
Award Token
Flag For Later
Size
6 KB
Subscribers
None
db.rst
View Options
..
_swh-core-db:
Common database utilities
=========================
The
``swh.core.db``
module offers a set of common (postgresql) database
handling utilities and features for other swh packages implementing a
`datastore`
, aka a service responsible for providing a data store via a common
interface which can use a postgresql database as backend. Examples are
:mod:
`swh.storage`
or
:mod:
`swh.scheduler`
.
Most of the time, this database-based data storage facility will depend on a data
schema (may be based in
:mod:
`swh.model`
or not) and provide a unified interface
based on an Python class to abstract access to this datastore.
Some packages may implement only a postgresql backend, some may provide more
backends.
This
:mod:
`swh.core.db`
only deals with the postgresql part and provides common
features and tooling to manage the database lifecycle in a consistent and
unified way among all the
:mod:
`swh`
packages.
It comes with a few command line tools to manage the specific
:mod:
`swh`
package database.
As such, most of the database management cli commands require a configuration
file holding the database connection information.
For example, for the
:mod:
`swh.storage`
package, one will be able to create,
initialize and upgrade the postgresql database using simple commands.
To create the database and perform superuser initialization steps (see below):
..
code-block
::
bash
$
swh
db
create
storage
--dbname
=
postgresql://superuser:passwd@localhost:5433/test-storage
If the database already exists but lacks superuser level initialization steps,
you may use:
..
code-block
::
bash
$
swh
db
init-admin
storage
--dbname
=
postgresql://superuser:passwd@localhost:5433/test-storage
Then assuming the
``config.yml``
file existence:
..
code-block
::
yaml
storage
:
cls
:
postgresql
db: host=localhost, port=5433, dbname=test-storage, username=normal-user, password=pwd
objstorage:
cls: memory
then you can run:
..
code-block
::
bash
$
swh
db
--config-file
=
config.yml
init
storage
DONE
database
for
storage
initialized
(
flavor
default
)
at
version
182
Note: you can define the
``SWH_CONFIG_FILENAME``
environment variable instead
of using the
``--config-name``
command line option.
or check the actual data model version of this database:
..
code-block
::
bash
$
swh
db
--config-file
=
config.yml
version
storage
module:
storage
flavor:
default
version:
182
as well as the migration history for the database:
..
code-block
::
bash
$
swh
db
--config-file
=
config.yml
version
--all
storage
module:
storage
flavor:
default
182
[
2022
-02-11
15
:08:31.806070+01:00
]
Work
In
Progress
181
[
2022
-02-11
14
:06:27.435010+01:00
]
Work
In
Progress
The database migration is done using the
``swh db upgrade``
command.
Implementation of a swh.core.db datastore
-----------------------------------------
To use this database management tooling, in a
:mod:
`swh`
package, the following
conditions are expected:
-
the package should provide an
``sql``
directory in its root namespace
providing initialization sql scripts. Scripts should be named like
``nn-xxx.sql``
and are executed in order according to the
``nn``
integer
value. Scripts having
``-superuser-``
in their name will be executed by the
``init-admin``
tool and are expected to require superuser access level,
whereas scripts without
``-superuser-``
in their name will be executed by the
``swh db init``
command and are expected to require write access
level (with no need for superuser access level).
-
the package should provide a
``sql/upgrade``
directory with SQL migration
scripts in its root namespace. Script names are expected to be of the form
``nnn.sql``
where
`nnn`
is the version to which this script does the
migration from a database at version
`nnn - 1`
.
-
the initialization and migration scripts should not create nor fill the
metadata related tables (
``dbversion``
and
``dbmodule``
).
-
the package should provide a
``get_datastore``
function in its root namespace
returning an instance of the datastore object. Normally, this datastore
object uses
``swh.core.db.BaseDb``
to interact with the actual database.
-
The datastore object should provide a
``get_current_version()``
method
returning the database version expected by the code.
See existing
``swh``
packages like
``swh.storage``
or
``swh.scheduler``
for
usage examples.
Writing tests
-------------
The
``swh.core.db.pytest_plugin``
provides a few helper tools to write unit
tests for postgresql based datastores.
By default, when using these fixtures, a posgresql server will be started (by
the pytest_postgresql fixture) and a template database will be created using
the
``postgresql_proc``
fixture factory provided by
``pytest_postgresql``
.
Then a dedicated fixture must be declared to use the
``postgresql_proc``
fixture generated by the fixture factory function.
This template database will then be used to create a new database for test
using this dedicated fixture.
In order to help the database initialization process and make it consistent
with the database initialization tools from the
``swh db``
cli, an
``initialize_database_for_module()``
function is provided to be used with the
fixture factory described above.
Typically, writing tests for a
``swh``
package
``swh.example``
would look like:
..
code-block
::
python
from
functools
import
partial
from
pytest_postgresql
import
factories
from
swh.core.db.pytest_plugin
import
postgresql_fact
from
swh.core.db.pytest_plugin
import
initialize_database_for_module
example_postgresql_proc
=
factories
.
postgresql_proc
(
dbname
=
"example"
,
load
=
[
partial
(
initialize_database_for_module
,
modname
=
"example"
,
version
=
1
)]
)
postgresql_example
=
postgresql_fact
(
"example_postgresql_proc"
)
def
test_example
(
postgresql_example
):
with
postgresql_example
.
cursor
()
as
c
:
c
.
execute
(
"select version from dbversion limit 1"
)
assert
c
.
fecthone
()[
0
]
==
1
Note: most of the time, you will want to put the scaffolding part of the code
above in a
``conftest.py``
file.
The
``load``
argument of the
``factories.postgresql_proc``
will be used to
initialize the template database that will be used to create a new database for
each test, while the
``load``
argument of the
``postgresql_fact``
fixture will
be executed before each test (in the database created from the template
database and dedicated to the test being executed).
File Metadata
Details
Attached
Mime Type
text/x-python
Expires
Jul 4 2025, 9:31 AM (5 w, 5 d ago)
Storage Engine
blob
Storage Format
Raw Data
Storage Handle
3365979
Attached To
rDCORE Foundations and core functionalities
Event Timeline
Log In to Comment