Some workers can use quite a lot of disk space and memory for their computations (e.g. swh-loader-svn).
This can have some unfortunate side effects:
- Out-of-memory error: the worker is killed before it can clean up the disk space it occupies
- The workers that follow have less disk space to work with, so they can in turn be stopped prematurely
- As a result, the disk on the worker machine fills up much faster
- Errors are logged at a higher rate, which can also fill up the disks of other machines (the ones centralizing logs)
- Workers consume messages from their queues without really doing anything, which makes it hard to keep track of what has actually been done
Having some ways of:
- limiting memory usage
- monitoring and cleaning up the dangling temporary directories
could help mitigate this.
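
As an illustration of the cleanup part, here is a minimal sketch of a periodic job that removes stale temporary directories left behind by killed workers. The function name `clean_stale_tmp_dirs`, the `swh.loader.svn.` prefix and the one-day threshold are assumptions made for the example, not what the loaders actually use; the real naming scheme of the temporary directories would need to be checked first.

```python
import shutil
import time
from pathlib import Path


def clean_stale_tmp_dirs(root, prefix, max_age_seconds, now=None):
    """Remove directories directly under `root` whose name starts with
    `prefix` and whose mtime is older than `max_age_seconds`.

    Returns the list of removed paths. `prefix` is a hypothetical naming
    convention for the workers' temporary directories.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in Path(root).iterdir():
        # Skip symlinks, regular files and directories owned by other tools.
        if entry.is_symlink() or not entry.is_dir():
            continue
        if not entry.name.startswith(prefix):
            continue
        try:
            age = now - entry.stat().st_mtime
        except FileNotFoundError:
            # The owning worker removed it in the meantime.
            continue
        if age > max_age_seconds:
            shutil.rmtree(entry, ignore_errors=True)
            removed.append(entry)
    return removed


if __name__ == "__main__":
    # Assumed values: temporary directories in /tmp, stale after one day.
    for path in clean_stale_tmp_dirs("/tmp", "swh.loader.svn.", 24 * 3600):
        print(f"removed {path}")
```

Note that a directory's mtime only changes when its direct entries change, so the threshold should be comfortably larger than the longest expected run of a worker, otherwise a directory still in use could be removed.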