README.md
swh-storage | swh-storage | ||||
=========== | =========== | ||||
Abstraction layer over the archive, allowing to access all stored source code | Abstraction layer over the archive, allowing to access all stored source code | ||||
artifacts as well as their metadata. | artifacts as well as their metadata. | ||||
See the | See the | ||||
[documentation](https://docs.softwareheritage.org/devel/swh-storage/index.html) | [documentation](https://docs.softwareheritage.org/devel/swh-storage/index.html) | ||||
for more details. | for more details. | ||||
Tests | ## Quick start | ||||
----- | |||||
Python tests for this module include tests that cannot be run without a local | ### Dependencies | ||||
Postgres database. You are not obliged to run those tests though: | |||||
- `make test`: will run all tests | Python tests for this module include tests that cannot be run without | ||||
- `make test-nodb`: will run only tests that do not need a local DB | a local Postgresql database, so you need the Postgresql server executable on | ||||
ardumont: `Postgresql` | |||||
- `make test-db`: will run only tests that do need a local DB | your machine (no need to have a running Postgresql server). On a Debian-like | ||||
host: | |||||
If you do want to run DB-related tests, you should ensure you have access with | ``` | ||||
sufficient privileges to a Postgresql database. | $ sudo apt install libpq-dev postgresql | ||||
``` | |||||
### Installation | |||||
It is strongly recommended to use a virtualenv. In the following, we | |||||
consider you work in a virtualenv named `swh`. See the | |||||
[developer setup guide](https://docs.softwareheritage.org/devel/developer-setup.html#developer-setup) | |||||
for more details on how to set up a working environment. | |||||
You can install the package directly from | |||||
[pypi](https://pypi.org/p/swh.storage): | |||||
### Using your system database | ``` | ||||
(swh) :~$ pip install swh.storage | |||||
[...] | |||||
``` | |||||
Or from sources: | |||||
``` | |||||
(swh) :~$ git clone https://forge.softwareheritage.org/source/swh-storage.git | |||||
[...] | |||||
(swh) :~$ cd swh-storage | |||||
(swh) :~/swh-storage$ pip install . | |||||
[...] | |||||
``` | |||||
You need to ensure that your user is authorized to create and drop DBs, and in | Then you can check it's properly installed: | ||||
particular DBs named "softwareheritage-test" and "softwareheritage-dev" | ``` | ||||
(swh) :~$ swh storage --help | |||||
Usage: swh storage [OPTIONS] COMMAND [ARGS]... | |||||
Note: the testdata repository (swh-storage-testdata) is not required any more. | Software Heritage Storage tools. | ||||
### Using pifpaf | Options: | ||||
-h, --help Show this message and exit. | |||||
[pifpaf](https://github.com/jd/pifpaf) is a suite of fixtures and a | Commands: | ||||
command-line tool that allows to start and stop daemons for a quick throw-away | rpc-serve Software Heritage Storage RPC server. | ||||
usage. | ``` | ||||
It can be used to run tests that need a Postgres database without any other | |||||
configuration required nor the need to have special access to a running | |||||
database: | |||||
```bash | ## Tests | ||||
$ pifpaf run postgresql make test-db | The best way of running Python tests for this module is to use | ||||
[snip] | [tox](https://tox.readthedocs.io/). | ||||
---------------------------------------------------------------------- | |||||
Ran 124 tests in 56.203s | |||||
OK | ``` | ||||
(swh) :~$ pip install tox | |||||
``` | ``` | ||||
Note that pifpaf is not yet available as a Debian package, so you may have to | ### tox | ||||
install it in a venv. | |||||
From the sources directory, simply use tox: | |||||
Development | ``` | ||||
----------- | (swh) :~/swh-storage$ tox | ||||
[...] | |||||
========= 315 passed, 6 skipped, 15 warnings in 40.86 seconds ========== | |||||
_______________________________ summary ________________________________ | |||||
flake8: commands succeeded | |||||
py3: commands succeeded | |||||
congratulations :) | |||||
``` | |||||
A test server could locally be running for tests. | ## Development | ||||
The storage server can be locally started. It requires a configuration file and | |||||
a running Postgresql database. | |||||
### Sample configuration | ### Sample configuration | ||||
In either /etc/softwareheritage/storage/storage.yml, | A typical configuration `storage.yml` file is: | ||||
~/.config/swh/storage.yml or ~/.swh/storage.yml: | |||||
``` | ``` | ||||
storage: | storage: | ||||
cls: local | cls: local | ||||
args: | args: | ||||
db: "dbname=softwareheritage-dev user=<user>" | db: "dbname=softwareheritage-dev user=<user> password=<pwd>" | ||||
objstorage: | objstorage: | ||||
cls: pathslicing | cls: pathslicing | ||||
args: | args: | ||||
root: /home/storage/swh-storage/ | root: /tmp/swh-storage/ | ||||
slicing: 0:2/2:4/4:6 | slicing: 0:2/2:4/4:6 | ||||
``` | ``` | ||||
which means, this uses: | which means, this uses: | ||||
- a local storage instance whose db connection is to | - a local storage instance whose db connection is to | ||||
softwareheritage-dev local instance | `softwareheritage-dev` local instance, | ||||
- the objstorage uses a local objstorage instance whose: | - the objstorage uses a local objstorage instance whose: | ||||
- root path is /home/storage/swh-storage | - `root` path is /tmp/swh-storage, | ||||
- slicing scheme is 0:2/2:4/4:6. This means that the identifier of | - slicing scheme is `0:2/2:4/4:6`. This means that the identifier of | ||||
the content (sha1) determines its location on disk: the first level | the content (sha1) determines its location on disk: the first level | ||||
is named after the first 2 hex characters, the second level after the next 2 | is named after the first 2 hex characters, the second level after the next 2 | ||||
hex characters and the third level after the next 2 hex | hex characters and the third level after the next 2 hex | ||||
characters. Finally, a file named after the complete hash holds the raw | characters. Finally, a file named after the complete hash holds the raw | ||||
content. For example: 00062f8bd330715c4f819373653d97b3cd34394c | content. For example: 00062f8bd330715c4f819373653d97b3cd34394c | ||||
will be stored at 00/06/2f/00062f8bd330715c4f819373653d97b3cd34394c | will be stored at 00/06/2f/00062f8bd330715c4f819373653d97b3cd34394c | ||||
Note that the 'root' path should exist on disk. | Note that the `root` path should exist on disk before starting the server. | ||||
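The slicing scheme described above can be illustrated with a short Python sketch. This is only an illustration of the `0:2/2:4/4:6` layout, not the actual swh objstorage code; the helper name `sliced_path` is hypothetical.

```python
def sliced_path(hex_id: str, slicing: str = "0:2/2:4/4:6") -> str:
    """Return the relative on-disk path of an object under a slicing scheme.

    Hypothetical helper: each "start:end" pair selects the characters of the
    hex identifier that name one directory level; the file itself is named
    after the complete identifier.
    """
    parts = []
    for bounds in slicing.split("/"):
        start, end = (int(n) for n in bounds.split(":"))
        parts.append(hex_id[start:end])
    parts.append(hex_id)  # the file holding the raw content
    return "/".join(parts)


# The example identifier from the README:
print(sliced_path("00062f8bd330715c4f819373653d97b3cd34394c"))
# prints 00/06/2f/00062f8bd330715c4f819373653d97b3cd34394c
```

The resulting path is relative to the configured `root` directory.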
### Starting the storage server | |||||
### Run server | If the python package has been properly installed (e.g. in a virtual env), you | ||||
should be able to use the command: | |||||
Command: | |||||
``` | ``` | ||||
python3 -m swh.storage.api.server ~/.config/swh/storage.yml | (swh) :~/swh-storage$ swh storage rpc-serve storage.yml | ||||
``` | ``` | ||||
This runs a local swh-storage API on port 5002. | This runs a local swh-storage API on port 5002. | ||||
``` | |||||
(swh) :~/swh-storage$ curl http://127.0.0.1:5002 | |||||
<html> | |||||
<head><title>Software Heritage storage server</title></head> | |||||
<body> | |||||
<p>You have reached the | |||||
<a href="https://www.softwareheritage.org/">Software Heritage</a> | |||||
storage server.<br /> | |||||
See its | |||||
<a href="https://docs.softwareheritage.org/devel/swh-storage/">documentation | |||||
and API</a> for more information</p> | |||||
``` | |||||
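The same check as the `curl` call above could be scripted, for instance before launching a loader. The following is a minimal sketch using only the Python standard library; the function name `storage_is_up` and its default URL and timeout are assumptions made for this example, not part of swh-storage.

```python
import urllib.request


def storage_is_up(url: str = "http://127.0.0.1:5002/", timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET on `url` answers with status 200.

    Hypothetical helper: it only checks that something answers on the given
    address, not that the storage backend itself is healthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


if __name__ == "__main__":
    print("storage reachable:", storage_is_up())
```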
### And then what? | ### And then what? | ||||
In your upper layer (loader-git, loader-svn, etc...), you can define a | In your upper layer | ||||
remote storage with this snippet of yaml configuration. | ([loader-git](https://forge.softwareheritage.org/source/swh-loader-git/), | ||||
[loader-svn](https://forge.softwareheritage.org/source/swh-loader-svn/), | |||||
etc...), you can define a remote storage with this snippet of yaml | |||||
configuration. | |||||
``` | ``` | ||||
storage: | storage: | ||||
cls: remote | cls: remote | ||||
args: | args: | ||||
url: http://localhost:5002/ | url: http://localhost:5002/ | ||||
``` | ``` | ||||