diff --git a/docker/README.md b/docker/README.md
--- a/docker/README.md
+++ b/docker/README.md
@@ -584,6 +584,59 @@
 (swh) ~/swh-environment$ swh scheduler task respawn 1
 ```
 
+## Data persistence for a development setting
+
+The default docker-compose.yml configuration is geared towards application testing,
+not data persistence.
+
+Volumes defined in associated images are anonymous and may end up unused or be removed.
+
+One safe way to make sure these volumes persist is to use named external volumes,
+created beforehand.
+To use vanilla named volumes, fully managed by Docker, create them like this:
+
+```
+for vn in swh_storage_data swh_objstorage_data;
+do
+  docker volume create "${vn}"
+done
+```
+
+We can also create them as named host volumes so that the data itself can be managed
+outside the Docker-reserved zone, like that of a non-containerized service. This makes it
+easier to access data from the host, but may raise permission issues on this shared data.
+
+```
+for vn in swh_storage_data swh_objstorage_data;
+do
+  sudo mkdir -p "/data/docker/${vn}"
+  docker volume create -d local \
+    --opt type=none --opt o=bind --opt device="/data/docker/${vn}" "${vn}"
+done
+```
+
+Then, the volumes may be defined as follows in a `docker-compose.override.yml`.
+Note that volume definitions are merged with those in `docker-compose.yml` based
+on destination path.
+
+```
+services:
+  swh-storage-db:
+    volumes:
+      - "swh_storage_data:/var/lib/postgresql/data"
+  swh-objstorage:
+    volumes:
+      - "swh_objstorage_data:/srv/softwareheritage/objects"
+
+volumes:
+  swh_storage_data:
+    external: true
+  swh_objstorage_data:
+    external: true
+```
+
+This way, `docker-compose down -v` will not remove those volumes along with the
+anonymous ones; only an explicit `docker volume rm` will.
 
 ## Starting a kafka-powered mirror of the storage
 