diff --git a/docs/install-objstorage.rst b/docs/install-objstorage.rst
deleted file mode 100644

.. highlight:: bash

.. _objstorage_install:


Object Storage
==============

The machine that hosts the Object Storage must have access to enough storage
for the whole content of the Archive to be copied. It should be at least 300TiB
(as of today, the whole content of the archive represents around 200TiB of
compressed data).

The object storage does not require a database; however, it needs a way to
store objects (blobs).

There are several backends currently available:

- `pathslicing`: store objects in a POSIX filesystem,
- `remote`: use an HTTP-based RPC service exposing an existing objstorage,
- `azure`: use Azure's storage,
- `azure-prefixed`: Azure's storage with a prefix; typically used in
  conjunction with the `multiplexer` backend (see below) to distribute the
  storage among a set of Azure tokens for better performance,
- `memory`: keep everything in RAM, for testing purposes,
- `weed`: use seaweedfs as blob storage,
- `rados`: RADOS based object storage (Ceph),
- `s3`: Amazon S3 storage,
- `swift`: OpenStack Swift storage,
- `multiplexer`: assemble several objstorages as one,
- `striping`: xxx,
- `filtered`: xxx.

Beware that not all these backends are production ready.

Please read the documentation of each of these backends for more details on how
to configure them.

Docker based deployment
-----------------------

In a docker based deployment, all machine names must be resolvable from within
a docker container and accessible from there.

When testing this guide on a single docker host, the simplest solution is to
start your docker containers linked to a common bridge::

    $ docker network create swh
    e0d85947d4f53f8b2f0393517f373ab4f5b06d02e1efa07114761f610b1f7afa
    $

In the examples below we will use such a network config.

Build the image
~~~~~~~~~~~~~~~

::

    $ docker build -t swh/base https://forge.softwareheritage.org/source/swh-docker.git

Configure the object storage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this example, we use a local storage backend located in
`/srv/softwareheritage/objects`. Write a configuration file named
`objstorage.yml`::

    objstorage:
      cls: pathslicing
      args:
        root: /srv/softwareheritage/objects
        slicing: 0:5

    client_max_size: 1073741824

Testing the configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

Then start the test WSGI server for the RPC objstorage service::

    $ docker run --rm \
        --network swh \
        -v ${PWD}/objstorage.yml:/etc/softwareheritage/config.yml \
        -v /srv/softwareheritage/objects:/srv/softwareheritage/objects \
        -p 5003:5000 \
        swh/base objstorage

You should be able to query this server (from another terminal, since the
container is started attached to its console)::

    $ http http://127.0.0.1:5003
    HTTP/1.1 200 OK
    Content-Length: 25
    Content-Type: text/plain; charset=utf-8
    Date: Thu, 20 Jun 2019 08:02:32 GMT
    Server: Python/3.5 aiohttp/3.5.1

    SWH Objstorage API server

    $ http http://127.0.0.1:5003/check_config check_write=True
    HTTP/1.1 200 OK
    Content-Length: 4
    Content-Type: application/json
    Date: Thu, 20 Jun 2019 08:06:58 GMT
    Server: Python/3.5 aiohttp/3.5.4

    true

Note: in the examples above, we use httpie_ as HTTP client. You can use any
other tool (curl, wget...).

.. _httpie: https://httpie.org
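With curl, for instance, the second check would look something like this (a
sketch only: it assumes the endpoint accepts a JSON body equivalent to what
httpie builds from `check_write=True`)::

    $ curl -s -X POST \
        -H 'Content-Type: application/json' \
        -H 'Accept: application/json' \
        -d '{"check_write": true}' \
        http://127.0.0.1:5003/check_config
    true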
Since we started this container attached, just hit Ctrl+C in the terminal in
which the docker container is running to quit.

Running in production
~~~~~~~~~~~~~~~~~~~~~~

This container uses gunicorn as WSGI server. However, since the latter does not
handle the HTTP stack well enough for a production system, it is recommended to
run it behind a proper HTTP server like nginx.

First, we start the objstorage container without exposing the TCP port, but
using a mounted directory for the UNIX socket, so it can be shared with other
containers. Here, the socket directory on the host is
`/srv/softwareheritage/socks/objstorage`::

    $ docker run -d --name objstorage \
        --network swh \
        -v ${PWD}/objstorage.yml:/etc/softwareheritage/config.yml \
        -v /srv/softwareheritage/objects:/srv/softwareheritage/objects \
        -v /srv/softwareheritage/socks/objstorage:/var/run/gunicorn/swh \
        swh/base objstorage

Then start an HTTP server that will proxy requests to the UNIX socket created
in `/srv/softwareheritage/socks/objstorage`. Using Nginx, you can use the
following `nginx.conf` file::

    worker_processes  4;

    # Show startup logs on stderr; switch to debug to print, well, debug logs when
    # running nginx-debug
    error_log /dev/stderr info;

    events {
      worker_connections 1024;
    }

    http {
      include            mime.types;
      default_type       application/octet-stream;
      sendfile           on;
      keepalive_timeout  65;

      # Built-in Docker resolver. Needed to allow on-demand resolution of proxy
      # upstreams.
      resolver           127.0.0.11 valid=30s;

      upstream app_server {
        # fail_timeout=0 means we always retry an upstream even if it failed
        # to return a good HTTP response

        # for UNIX domain socket setups
        server unix:/tmp/gunicorn/gunicorn.sock fail_timeout=0;
      }

      server {
        listen 80 default_server;

        # Add a trailing slash to top level requests
        rewrite ^/([^/]+)$ /$1/ permanent;

        location / {
          set $upstream "http://app_server";
          proxy_pass $upstream;
        }
      }
    }

And run nginx in a docker container, mounting the socket directory where the
configuration above expects it::

    $ docker run \
        --network swh \
        -v ${PWD}/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
        -v /srv/softwareheritage/socks/objstorage:/tmp/gunicorn \
        -p 5003:80 \
        nginx

You can then check it is functioning properly::

    $ http :5003/check_config check_write=True
    HTTP/1.1 200 OK
    Connection: keep-alive
    Content-Length: 1
    Content-Type: application/x-msgpack
    Date: Thu, 20 Jun 2019 10:13:39 GMT
    Server: nginx/1.17.0

    true

If you want your docker containers to start automatically, add the
`--restart=always` option to the docker commands above. This should prevent you
from having to write custom service unit files.
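Alternatively, the restart policy can be set on already-created containers with
`docker update` (shown here for the objstorage container started above; adapt
the name if you also named the nginx container)::

    $ docker update --restart=always objstorage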
Manual installation on a Debian system
---------------------------------------

Ensure you have a Debian machine with the Software Heritage apt repository
:ref:`properly configured <swh_debian_repo>`.

There are several storage scenarios supported by the Object Storage. We will
focus on a simple scenario where local storage on a regular filesystem is used.

Let's assume this storage capacity is available on `/srv/softwareheritage`.

- Install the Object Storage package and dependencies::

    ~$ sudo apt install python3-swh.objstorage gunicorn3 nginx-light

- Create a dedicated `swh` user::

    ~$ sudo useradd -md /srv/softwareheritage -s /bin/bash swh

- Create the required directory for object storage::

    ~$ sudo mkdir /srv/softwareheritage/objects
    ~$ sudo chown swh: /srv/softwareheritage/objects

- Configure the Object Storage RPC server::

    ~$ sudo mkdir /etc/softwareheritage/
    ~$ sudo sh -c 'cat > /etc/softwareheritage/objstorage.yml' <<EOF
    > objstorage:
    >   cls: pathslicing
    >   args:
    >     root: /srv/softwareheritage/objects
    >     slicing: 0:5
    >
    > client_max_size: 1073741824
    > EOF
    ~$

- Ensure the Object Storage service can be started by hand::

    ~$ sudo -u swh swh-objstorage -C /etc/softwareheritage/objstorage.yml serve
    ======== Running on http://0.0.0.0:5003 ========
    (Press CTRL+C to quit)

  In another terminal, check the HTTP server responds properly::

    ~$ curl 127.0.0.1:5003
    SWH Objstorage API server
    ~$

  Quit the test server by hitting Ctrl+C in the terminal it is running in.

- Ensure the required directories for gunicorn exist::

    ~$ sudo mkdir -p /etc/gunicorn/instances
    ~$ sudo mkdir -p /var/run/gunicorn/swh-objstorage/
    ~$ sudo chown swh: /var/run/gunicorn/swh-objstorage/

- Copy the gunicorn config file below to `/etc/gunicorn/instances/objstorage.cfg`::

    import traceback
    import gunicorn.glogging

    class Logger(gunicorn.glogging.Logger):
        log_only_errors = True

        def access(self, resp, req, environ, request_time):
            """See http://httpd.apache.org/docs/2.0/logs.html#combined
            for format details
            """

            if not (self.cfg.accesslog or self.cfg.logconfig or self.cfg.syslog):
                return

            # wrap atoms:
            # - make sure atoms will be tested case insensitively
            # - if atom doesn't exist replace it by '-'
            atoms = self.atoms(resp, req, environ, request_time)
            safe_atoms = self.atoms_wrapper_class(atoms)

            try:
                if self.log_only_errors and str(atoms['s']) == '200':
                    return
                self.access_log.info(self.cfg.access_log_format % safe_atoms, extra={'swh_atoms': atoms})
            except Exception:
                self.exception('Failed processing access log entry')

    logger_class = Logger
    logconfig = '/etc/gunicorn/logconfig.ini'

    # custom settings
    bind = "unix:/run/gunicorn/swh-objstorage/gunicorn.sock"
    workers = 16
    worker_class = "aiohttp.worker.GunicornWebWorker"
    timeout = 3600
    graceful_timeout = 3600
    keepalive = 5
    max_requests = 0
    max_requests_jitter = 0
    # Uncomment the following lines if you want statsd monitoring
    # statsd_host = "127.0.0.1:8125"
    # statsd_prefix = "swh-objstorage"

- Copy the logging config file below to `/etc/gunicorn/logconfig.ini`::

    [loggers]
    keys=root, gunicorn.error, gunicorn.access

    [handlers]
    keys=console, journal

    [formatters]
    keys=generic

    [logger_root]
    level=INFO
    handlers=console,journal

    [logger_gunicorn.error]
    level=INFO
    propagate=0
    handlers=journal
    qualname=gunicorn.error

    [logger_gunicorn.access]
    level=INFO
    propagate=0
    handlers=journal
    qualname=gunicorn.access

    [handler_console]
    class=StreamHandler
    formatter=generic
    args=(sys.stdout, )

    [handler_journal]
    class=swh.core.logger.JournalHandler
    formatter=generic
    args=()

    [formatter_generic]
    format=%(asctime)s [%(process)d] [%(levelname)s] %(message)s
    datefmt=%Y-%m-%d %H:%M:%S
    class=logging.Formatter

- Ensure the Object Storage server can be started via gunicorn::

    ~$ SWH_CONFIG_FILENAME=/etc/softwareheritage/objstorage.yml \
         gunicorn3 -c /etc/gunicorn/instances/objstorage.cfg swh.objstorage.api.wsgi
    [...]
    ^C
    ~$
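  While this test instance is running, you can also check that gunicorn answers
  on its UNIX socket (a quick sanity check; it assumes a curl version recent
  enough to support `--unix-socket`)::

    ~$ sudo curl --unix-socket /run/gunicorn/swh-objstorage/gunicorn.sock http://localhost/
    SWH Objstorage API server
    ~$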
- Add a `systemd` service unit file for this gunicorn WSGI server; copy the
  file below to `/etc/systemd/system/gunicorn-swh-objstorage.service`::

    [Unit]
    Description=Gunicorn instance swh-objstorage
    ConditionPathExists=/etc/gunicorn/instances/objstorage.cfg
    PartOf=gunicorn.service
    ReloadPropagatedFrom=gunicorn.service
    Before=gunicorn.service

    [Service]
    User=swh
    Group=swh
    PIDFile=/run/gunicorn/swh-objstorage/pidfile
    RuntimeDirectory=gunicorn/swh-objstorage
    WorkingDirectory=/run/gunicorn/swh-objstorage
    Environment=SWH_CONFIG_FILENAME=/etc/softwareheritage/objstorage.yml
    Environment=SWH_LOG_TARGET=journal
    ExecStart=/usr/bin/gunicorn3 -p /run/gunicorn/swh-objstorage/pidfile -c /etc/gunicorn/instances/objstorage.cfg swh.objstorage.api.wsgi
    ExecStop=/bin/kill -TERM $MAINPID
    ExecReload=/bin/kill -HUP $MAINPID

    [Install]
    WantedBy=multi-user.target

  And the file below to `/etc/systemd/system/gunicorn.service`::

    [Unit]
    Description=All gunicorn services

    [Service]
    Type=oneshot
    ExecStart=/bin/true
    ExecReload=/bin/true
    RemainAfterExit=on

    [Install]
    WantedBy=multi-user.target

- Load these service unit files and activate them::

    ~$ sudo systemctl daemon-reload
    ~$ sudo systemctl enable --now gunicorn-swh-objstorage.service
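  At this point the service should be up; you can check it and inspect its logs
  with the usual systemd tooling::

    ~$ sudo systemctl status gunicorn-swh-objstorage.service
    ~$ sudo journalctl -u gunicorn-swh-objstorage.service -n 20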
- Configure the nginx HTTP server as a reverse proxy for the gunicorn WSGI
  server; here is an example of the file `/etc/nginx/nginx.conf`::

    user www-data;
    worker_processes 16;
    worker_rlimit_nofile 1024;

    pid /var/run/nginx.pid;
    error_log /var/log/nginx/error.log error;

    events {
      accept_mutex off;
      accept_mutex_delay 500ms;
      worker_connections 1024;
    }

    http {

      include /etc/nginx/mime.types;
      default_type application/octet-stream;

      access_log /var/log/nginx/access.log;

      sendfile on;
      server_tokens on;

      types_hash_max_size 1024;
      types_hash_bucket_size 512;

      server_names_hash_bucket_size 128;
      server_names_hash_max_size 1024;

      keepalive_timeout 65s;
      keepalive_requests 100;
      client_body_timeout 60s;
      send_timeout 60s;
      lingering_timeout 5s;
      tcp_nodelay on;

      gzip on;
      gzip_comp_level 1;
      gzip_disable msie6;
      gzip_min_length 20;
      gzip_http_version 1.1;
      gzip_proxied off;
      gzip_vary off;

      client_body_temp_path /var/nginx/client_body_temp;
      client_max_body_size 10m;
      client_body_buffer_size 128k;
      proxy_temp_path /var/nginx/proxy_temp;
      proxy_connect_timeout 90s;
      proxy_send_timeout 90s;
      proxy_read_timeout 90s;
      proxy_buffers 32 4k;
      proxy_buffer_size 8k;
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Proxy "";
      proxy_headers_hash_bucket_size 64;

      include /etc/nginx/conf.d/*.conf;
      include /etc/nginx/sites-enabled/*;
    }

  `/etc/nginx/conf.d/swh-objstorage-gunicorn-upstream.conf`::

    upstream swh-objstorage-gunicorn {
      server unix:/run/gunicorn/swh-objstorage/gunicorn.sock fail_timeout=0;
    }

  `/etc/nginx/sites-enabled/swh-objstorage.conf`::

    server {
      listen 0.0.0.0:5003 deferred;

      server_name 127.0.0.1 localhost ::1;
      client_max_body_size 4G;

      index index.html index.htm index.php;
      access_log /var/log/nginx/nginx-swh-objstorage.access.log combined if=$error_status;
      error_log /var/log/nginx/nginx-swh-objstorage.error.log;

      location / {
        proxy_pass http://swh-objstorage-gunicorn;
        proxy_read_timeout 3600s;
        proxy_connect_timeout 90s;
        proxy_send_timeout 90s;
        proxy_buffering off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Proxy "";
      }
    }

  Note: adapt the `server_name` directive in the example file above to your
  server name.

  `/etc/nginx/conf.d/swh-objstorage-default.conf`::

    server {
      listen 0.0.0.0:5003 default_server;

      server_name nginx-swh-objstorage-default;

      return 444;
      index index.html index.htm index.php;
      access_log /var/log/nginx/nginx-swh-objstorage-default.access.log combined;
      error_log /var/log/nginx/nginx-swh-objstorage-default.error.log;

      location / {
        index index.html index.htm index.php;
      }
    }

- Restart the `nginx` service::

    ~$ sudo systemctl restart nginx.service

- Check the whole stack is responding::

    ~$ curl http://127.0.0.1:5003/
    SWH Objstorage API server
    ~$

diff --git a/docs/install-storage.rst b/docs/install-storage.rst
deleted file mode 100644

.. highlight:: bash

.. _storage_install:


Graph Storage
=============

The machine that hosts the (graph) Storage must have access to a PostgreSQL
database. It must also have access to a running objstorage instance. Setting up
these services will not be covered here.

In this guide, we assume that:

- PostgreSQL is running on machine `pghost` and the database `swh-storage`
  exists and is owned by the PostgreSQL user `swhuser`,
- objstorage is running on machine `objstorage`, listening on port 5003.


Docker based deployment
-----------------------

In a docker based deployment, all machine names must be resolvable from within
a docker container and accessible from there.

When testing this guide on a single docker host, the simplest solution is to
start your docker containers linked to a common bridge::

    $ docker network create swh
    e0d85947d4f53f8b2f0393517f373ab4f5b06d02e1efa07114761f610b1f7afa
    $

In the examples below we will use such a network config.

Build the image
~~~~~~~~~~~~~~~

::

    $ docker build -t swh/base https://forge.softwareheritage.org/source/swh-docker.git

Configure the storage
~~~~~~~~~~~~~~~~~~~~~~

Write a configuration file named `storage.yml`::

    storage:
      cls: local
      args:
        db: postgresql://swhuser:p4ssw0rd@pghost/swh-storage
        objstorage:
          cls: remote
          args:
            url: http://objstorage:5003

    client_max_size: 1073741824


Testing the configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

Then start the test WSGI server for the RPC storage service::

    $ docker run --rm \
        --network swh \
        -v ${PWD}/storage.yml:/etc/softwareheritage/config.yml \
        -p 5002:5000 \
        swh/base storage

You should be able to query this server (from another terminal, since the
container is started attached to its console)::

    $ http http://127.0.0.1:5002
    HTTP/1.1 200 OK
    Content-Length: 22
    Content-Type: text/plain; charset=utf-8
    Date: Thu, 20 Jun 2019 08:02:32 GMT
    Server: Python/3.5 aiohttp/3.5.1

    SWH Storage API server

    $ http http://127.0.0.1:5002/check_config check_write=True
    HTTP/1.1 200 OK
    Content-Length: 4
    Content-Type: application/json
    Date: Thu, 20 Jun 2019 08:06:58 GMT
    Server: Python/3.5 aiohttp/3.5.4

    true
    $

Note: in the examples above, we use httpie_ as HTTP client. You can use any
other tool (curl, wget...).

.. _httpie: https://httpie.org
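If the `check_config` call fails, a quick way to rule out database connectivity
issues is to try the DSN from `storage.yml` directly (assuming the
`postgresql-client` package is installed on the docker host)::

    $ psql postgresql://swhuser:p4ssw0rd@pghost/swh-storage -c 'SELECT 1;'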
Since we started this container attached, just hit Ctrl+C in the terminal in
which the docker container is running to quit.

Running in production
~~~~~~~~~~~~~~~~~~~~~~

This container uses gunicorn as WSGI server. However, since the latter does not
handle the HTTP stack well enough for a production system, it is recommended to
run it behind a proper HTTP server like nginx.

First, we start the storage container without exposing the TCP port, but using
a mounted directory for the UNIX socket, so it can be shared with other
containers. Here, the socket directory on the host is
`/srv/softwareheritage/socks/storage`::

    $ docker run -d --name storage \
        --network swh \
        -v ${PWD}/storage.yml:/etc/softwareheritage/config.yml \
        -v /srv/softwareheritage/socks/storage:/var/run/gunicorn/swh \
        swh/base storage

Then start an HTTP server that will proxy requests to the UNIX socket created
in `/srv/softwareheritage/socks/storage`. Using Nginx, you can use the
following `nginx.conf` file::

    worker_processes  4;

    # Show startup logs on stderr; switch to debug to print, well, debug logs when
    # running nginx-debug
    error_log /dev/stderr info;

    events {
      worker_connections 1024;
    }

    http {
      include            mime.types;
      default_type       application/octet-stream;
      sendfile           on;
      keepalive_timeout  65;

      # Built-in Docker resolver. Needed to allow on-demand resolution of proxy
      # upstreams.
      resolver           127.0.0.11 valid=30s;

      upstream app_server {
        # fail_timeout=0 means we always retry an upstream even if it failed
        # to return a good HTTP response

        # for UNIX domain socket setups
        server unix:/tmp/gunicorn/gunicorn.sock fail_timeout=0;
      }

      server {
        listen 80 default_server;

        # Add a trailing slash to top level requests
        rewrite ^/([^/]+)$ /$1/ permanent;

        location / {
          set $upstream "http://app_server";
          proxy_pass $upstream;
        }
      }
    }

And run nginx in a docker container, mounting the socket directory where the
configuration above expects it::

    $ docker run \
        --network swh \
        -v ${PWD}/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
        -v /srv/softwareheritage/socks/storage:/tmp/gunicorn \
        -p 5002:80 \
        nginx

You can then check it is functioning properly::

    $ http :5002/check_config check_write=True
    HTTP/1.1 200 OK
    Connection: keep-alive
    Content-Length: 1
    Content-Type: application/x-msgpack
    Date: Thu, 20 Jun 2019 10:13:39 GMT
    Server: nginx/1.17.0

    true

If you want your docker containers to start automatically, add the
`--restart=always` option to the docker commands above. This should prevent you
from having to write custom service unit files.
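Other Software Heritage services can then be configured to use this storage
through its RPC endpoint, with a stanza like the one used in the web
application guide below (adjust the host name to wherever nginx is published)::

    storage:
      cls: remote
      args:
        url: http://storage:5002/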
diff --git a/docs/install-webapp.rst b/docs/install-webapp.rst
deleted file mode 100644

.. highlight:: bash

.. _webapp_install:


Web Application
===============

The machine that hosts the front end web application must have access to all
the Software Heritage service RPC endpoints. Some of them may be optional, in
which case parts of the web UI won't work as expected.

In this guide, we assume that:

- the storage RPC server is running on machine `storage`, listening on port 5002,
- the objstorage RPC server is running on machine `objstorage`, listening on
  port 5003.

For the metadata/indexer part to work, it will also require that:

- the indexer storage RPC server is running on machine `indexer-storage`,
  listening on port 5007.

For the Vault part to work, it will also require that:

- the vault RPC server is running on machine `vault`, listening on port 5005,
- the scheduler RPC server is running on machine `scheduler`, listening on
  port 5008.

For the Deposit part to work, it will also require that:

- the deposit RPC server is running on machine `deposit`, listening on port 5006,
- the scheduler RPC server is running on machine `scheduler`, listening on
  port 5008.


Docker based deployment
-----------------------

In a docker based deployment, obviously, all machine names listed above must be
resolvable from within a docker container and accessible from there.

When testing this guide on a single docker host, the simplest solution is to
start your docker containers linked to a common bridge::

    $ docker network create swh
    e0d85947d4f53f8b2f0393517f373ab4f5b06d02e1efa07114761f610b1f7afa
    $

In the examples below we will use such a network config.


Build the image
~~~~~~~~~~~~~~~

::

    $ docker build -t swh/web -f Dockerfile.web \
        https://forge.softwareheritage.org/source/swh-docker.git
    [...]
    Successfully tagged swh/web:latest


Configure the web app
~~~~~~~~~~~~~~~~~~~~~~

Write a configuration file named `web.yml` like::

    storage:
      cls: remote
      args:
        url: http://storage:5002/
        timeout: 1

    objstorage:
      cls: remote
      args:
        url: http://objstorage:5003/

    indexer_storage:
      cls: remote
      args:
        url: http://indexer-storage:5007/

    scheduler:
      cls: remote
      args:
        url: http://scheduler:5008/

    vault:
      cls: remote
      args:
        url: http://vault:5005/

    deposit:
      private_api_url: https://deposit:5006/1/private/
      private_api_user: swhworker
      private_api_password: ''

    allowed_hosts:
      - app_server

    debug: no

    serve_assets: yes

    throttling:
      cache_uri: null
      scopes:
        swh_api:
          limiter_rate:
            default: 120/h
          exempted_networks:
            - 0.0.0.0/0
        swh_vault_cooking:
          limiter_rate:
            default: 120/h
          exempted_networks:
            - 0.0.0.0/0
        swh_save_origin:
          limiter_rate:
            default: 120/h
          exempted_networks:
            - 0.0.0.0/0


Testing the configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~

Then initialize the web app::

    $ docker run --rm \
        --network swh \
        -v ${PWD}/web.yml:/etc/softwareheritage/config.yml \
        swh/web migrate
    Migrating db using swh.web.settings.production
    Operations to perform:
      Apply all migrations: admin, auth, contenttypes, sessions, swh.web.common
    Running migrations:
      No migrations to apply.
    Creating admin user
    $

and start the web app::

    $ docker run --rm \
        --network swh \
        -v ${PWD}/web.yml:/etc/softwareheritage/config.yml \
        -p 5004:5000 \
        swh/web serve
    starting the swh-web server
    [...]

You should be able to navigate the web application using your browser on
http://localhost:5004 .

If everything works fine, hit Ctrl+C in the terminal in which the docker
container is running.

Using memcache
~~~~~~~~~~~~~~

It is strongly advised to use a memcache for the web app. Assuming such a
service is listening on `memcache:11211`, adapt the `throttling.cache_uri`
parameter of your `web.yml` file accordingly::

    [...]

    throttling:
      cache_uri: memcache:11211

    [...]

You can easily start such a memcached server using::

    $ docker run --name memcache --network swh -d memcached
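To make sure the web app container will be able to reach it over the `swh`
network, you can poke the memcached text protocol directly and expect its
`VERSION` banner in return (an ad-hoc check using the stock `python:3` image;
any client able to open a TCP connection to `memcache:11211` would do)::

    $ docker run --rm --network swh python:3 python -c "
    import socket
    # connect to the memcached container through the shared docker network
    s = socket.create_connection(('memcache', 11211), timeout=5)
    s.sendall(b'version\r\n')
    print(s.recv(100).decode().strip())
    "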
Running in production
~~~~~~~~~~~~~~~~~~~~~~

This container uses gunicorn as WSGI server. However, since the latter does not
handle the HTTP stack well enough for a production system, it is recommended to
run it behind a proper HTTP server like nginx, via a UNIX socket.

First, we start the webapp container without exposing the TCP port, but using a
mounted directory for the UNIX socket, so it can be shared with other
containers. Here, the socket ends up under `/srv/softwareheritage/socks/web`::

    $ docker run -d --name webapp \
        --network swh \
        -v ${PWD}/web.yml:/etc/softwareheritage/config.yml \
        -v /srv/softwareheritage/socks:/var/run/gunicorn/swh \
        swh/web serve

Then start an HTTP server that will proxy requests to the UNIX socket in
`/srv/softwareheritage/socks/web`. Using Nginx, you can use the following
`nginx.conf` file::

    worker_processes  4;

    # Show startup logs on stderr; switch to debug to print, well, debug logs when
    # running nginx-debug
    error_log /dev/stderr info;

    events {
      worker_connections 1024;
    }

    http {
      include            mime.types;
      default_type       application/octet-stream;
      sendfile           on;
      keepalive_timeout  65;

      # Built-in Docker resolver. Needed to allow on-demand resolution of proxy
      # upstreams.
      resolver           127.0.0.11 valid=30s;

      upstream app_server {
        # fail_timeout=0 means we always retry an upstream even if it failed
        # to return a good HTTP response

        # for UNIX domain socket setups
        server unix:/tmp/gunicorn/sock fail_timeout=0;
      }

      server {
        listen 80 default_server;

        location / {
          set $upstream "http://app_server";
          proxy_pass $upstream;
        }
      }
    }

Note that the `app_server` name in the file above must be listed in the
`allowed_hosts` config option of the `web.yml` file.

And run nginx in a docker container with::

    $ docker run -d \
        --network swh \
        -v ${PWD}/conf/nginx.conf:/etc/nginx/nginx.conf:ro \
        -v /srv/softwareheritage/socks/web:/tmp/gunicorn \
        -p 5004:80 \
        nginx

You can check it is functioning properly by navigating to http://localhost:5004 .

If you want your docker containers to start automatically, add the
`--restart=always` option to the docker commands above. This should prevent you
from having to write custom service unit files.

diff --git a/docs/installation.rst b/docs/installation.rst
deleted file mode 100644

.. highlight:: bash

.. _installation:

How to Set Up a Software Heritage Archive
==========================================

This series of guides will help you install a (partial) Software Heritage
instance in a production system.


The global architecture of the Software Heritage Archive looks like:

.. thumbnail:: images/general-architecture.svg

   General view of the |swh| architecture.

Each component can be installed alone; however, there are some dependencies
between those services.

The base system used for these guides is a Debian system running the latest
stable version.

Components of the architecture can be installed on a single node (not
recommended) or on a set (cluster) of machines.


.. _swh_debian_repo:

Debian Repository
~~~~~~~~~~~~~~~~~~

On each machine, the Software Heritage apt repository must be configured::

    ~$ sudo apt install apt-transport-https lsb-release
    ~$ echo deb [trusted=yes] https://debian.softwareheritage.org/ $(lsb_release -cs)-swh main | \
        sudo sh -c 'cat > /etc/apt/sources.list.d/softwareheritage.list'
    ~$ sudo apt update


.. toctree::

   install-objstorage
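Once the repository is configured, you can quickly confirm that the Software
Heritage packages are visible to apt before moving on to the per-component
guides (shown here for the objstorage package used in the first guide; other
components ship their own packages)::

    ~$ apt-cache policy python3-swh.objstorage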