# Logging investigation
Storage service crash on connreset vith objstorage service.
https://forge.softwareheritage.org/P837
Thought it was an objstorage crash with no objstorage logs, but after investigation it seems not the case (all but HTTP 500 errors are not logged).
Observation: objstorage server only logs internal server errors, whereas storage also logs access.
How is it so?
## Library server implementations
### Aiohttp
https://docs.aiohttp.org/en/stable/logging.html
https://docs.aiohttp.org/en/stable/web_quickstart.html#aiohttp-web-exceptions
https://docs.aiohttp.org/en/stable/web_advanced.html#middlewares
"If the exception is not an instance of HTTPException it is converted to
500 HTTPInternalServerError after processing the middlewares chain."
Access and error logs active by default, given that logging is configured (not default).
All logs defaults >=DEBUG and to stderr if no handlers.
Access logs are actually out as INFO, whereas errors as ERROR.
### Flask
https://flask.palletsprojects.com/en/1.1.x/logging/
Main logger `app.logger`. Access logger `werkzeug`.
Flask default adds a `StreamHandler` to `app.logger`, if logging not configured.
Werkzeug default adds a `StreamHandler` to `logging.root`, if logging not configured.
Python stream handlers log to stderr by default.
### Gunicorn
https://docs.aiohttp.org/en/stable/deployment.html#aiohttp-deployment-gunicorn
https://docs.gunicorn.org/en/stable/settings.html#accesslog
Defaults: accesslog: None, errorlog: -
Gunicorn main logging options: --error-logfile --access-logfile --loglevel --log-config --capture-output --log-syslog
Example logging config file https://github.com/benoitc/gunicorn/blob/master/examples/logging.conf
## SWH server implementations
Find everything related to logs:
`git grep -iP "[\s_.]log(?:ger|ging)?[\s_.]"`
### `core.api.asynchronous`
Base: aiohttp
Logger: no logging config or getting
Both error and access logs enabled.
Error handler:
bubbles up aiohttp.web.HTTPException errors (not only 4XX) but log and send 5XX errors.
HTTP errors not used beyond the defaults of the framework (403, 404, 405, 413, 417)
`grep -n "raise HTTP" ~/.virtualenvs/swh/lib/python3.7/site-packages/aiohttp*/**`
-> remove special aiohttp.web.HTTPException handling?
### `core.api.__init__`
Base: Flask
`logger = logging.getLogger(__name__)`
Error handler:
handles all errors by logging and sending
### objstorage
Base: `core.api.asynchronous`
No special logging or error handler.
Absolutely no relevant logging code.
### storage
Base: `core.api.__init__`
2 error_handlers: StorageArgumentException = 400, other exceptions = 500
Logging handler is added, seems useless because the same as Flask default.
```
swh/storage/api/server.py: handler = logging.StreamHandler(); app.logger.addHandler(handler) // useless?
bin/swh-storage-add-dir: logging.basicConfig(level=logging.INFO)
swh/storage/cli.py: logging.getLogger("werkzeug").setLevel(ctx.obj["log_level"])
swh/storage/cli.py: logging.basicConfig()
```
Loggers mostly used for warnings (2 occurences) or replay/backfill (irrelevant).
Where does the displayed API access logs DEBUG come from?
### vault
Base: `core.api.asynchronous`
No special logging or error handler.
```
swh/vault/__init__.py:logger = logging.getLogger(__name__)
swh/vault/__init__.py: logger.debug("Instantiating
swh/vault/cli.py: setup_log_handler(loglevel=ctx.obj.get("log_level", logging.INFO),
swh/vault/cookers/base.py: logging.exception("Bundle cooking failed.")
```
## Testing objstorage logging
### Setup
Have a valid FS tree for objstorage:
`sudo mkdir -p /srv/softwareheritage/objects && sudo chown -R me:me /srv`
Devel server:
`swh objstorage -C conf/objstorage.yml rpc-serve`
Production server:
`SWH_CONFIG_FILENAME=conf/objstorage.yml gunicorn --bind 0.0.0.0:5003
--worker-class aiohttp.worker.GunicornWebWorker --log-level DEBUG
'swh.objstorage.api.server:make_app_from_configfile()'`
Added prints in `core.api.asynchronous:middleware_handler` except block.
Base request:
`curl -s -X POST http://127.1:5003/check_config -H "Content-Type: application/json" -d '{"check_write": false}'`
Base logs:
```
ERROR:root:...
Traceback (most recent call last):
...
TypeError: check_config() got an unexpected keyword argument 'check_writ
```
```
INFO:aiohttp.access:127.0.0.1 [29/Oct/2020:16:22:21 +0000] "POST /check_config HTTP/1.1" 500 897 "-" "curl/7.64.0"
```
Objstorage devel server log *all* access like above access log, log *some* errors like above error log.
Server: always above access log, sometimes above logging.root error log
### Tests
- POST, /check_confi, json, {"check_writ": false}
no error log, 404, server no traceback, client "404: Not Found"
- POST, /check_config, json, {"check_writ": false}
error log, 500, server+client traceback "TypeError: .*unexpected keyword argument"
- POST, /check_config, json, {"check_write": false}
no error log, 200, server no traceback, client "."
- POST, /check_config, jso, {"check_write": true}
error log, 500, server+client traceback "ValueError: Wrong content type"
- GET, /check_config
no error log, 405, server no traceback, client "405: Method Not Allowed"
## Testing storage logging
Added "TEST" in `gunicorn.worker.pre_request` log code to verify this is the log source.
`SWH_CONFIG_FILENAME=conf/storage.yml gunicorn --bind 0.0.0.0:5002 --log-level DEBUG
--error-logfile None 'swh.storage.api.server:make_app_from_configfile()'`
## Conclusion
Gunicorn (default) workers call `pre_request` at each `handle_request`. This does log through errorlog.
aiohttp GunicornWebWorker inherits Gunicorn BaseWorker but does not implement some features like Gunicorn server hooks like `pre_request`
Our configs disable access logs but Gunicorn workers do "accesslog" through errorlog...
To have consistant logging accross SWH API servers, we could disable the `pre_request` hook through the Gunicorn config file, and enable accesslog.
Should we remove aiohttp error special case in core.api.async error handler?
Proposed config file would be similar to:
```
pre_request = lambda *args: None
accesslog ="-"
```
## Searching Docker configurations for logging
- docker-compose files: no logging
- SWH service configs (`conf/*.yml`): no logging
- `env/common_python.env`: no logging
- `services/swh-{service}/entrypoint.sh`:
- All same gunicorn config: "accesslog: None, errorlog: -, loglevel: DEBUG"
- swh.core/web.gunicorn_config defines only sentry config,
including Flask (core) Django (web) integration.
- `--worker-class` used to specify async (aiohttp) instead of sync, in objstorage
- Inconsistency in options passing: --worker-class, --reload, --log-level
- All use Gunicorn except Vault which uses aiohttp devel server!
- Deposit is Django != architecture schema, uses core gunicorn_config which does not handle Django
- Graph completely absent from docker configs, not used in prod or run by storage?
- nginx.conf:
- exposes web, keycloack, monitoring, deposit, storages/idx-storage/scheduler, amqp, flower
- NOT exposes kafka, vault, workers, zookeeper, journal