Separate system logs from application logs
Closed, Migrated

Description

Right now, all log entries shipped to the Elasticsearch cluster are put into the same indices.
It would be best to create separate indices for system logs and application (swh-worker) logs if we want to easily apply different retention policies to these two broad kinds of log data in the future.

Event Timeline

zack triaged this task as Normal priority. Feb 14 2018, 11:43 AM

This Logstash configuration appears to behave as expected:

output {
    # Entries from swh-worker@ systemd units go to their own daily indices.
    if "swh-worker@" in [systemd_unit] {
        elasticsearch {
            hosts => ["petitpalais.internal.softwareheritage.org:9200"]
            index => "swh_workers-%{+YYYY.MM.dd}"
        }
    } else {
        # Everything else is treated as system logs.
        elasticsearch {
            hosts => ["petitpalais.internal.softwareheritage.org:9200"]
            index => "systemlogs-%{+YYYY.MM.dd}"
        }
    }
}

However, Logstash applies its default template only to logstash-* indices, not to indices with other names.
Without further configuration, the systemlogs-* and swh_workers-* indices may therefore end up with suboptimal mappings.
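
One possible way to address the mapping concern is to have each elasticsearch output install its own index template, using the plugin's manage_template, template and template_name options. The fragment below is only a sketch of the worker branch: the template file path and the template name are hypothetical and were not part of the deployed configuration, and the referenced JSON file would need to declare an index pattern matching swh_workers-*.

```
output {
    if "swh-worker@" in [systemd_unit] {
        elasticsearch {
            hosts => ["petitpalais.internal.softwareheritage.org:9200"]
            index => "swh_workers-%{+YYYY.MM.dd}"
            # Let the plugin upload a template at startup.
            manage_template => true
            # Assumption: hypothetical path to a custom template file.
            template => "/etc/logstash/templates/swh_workers.json"
            # Assumption: hypothetical name, chosen so it does not
            # overwrite the default "logstash" template.
            template_name => "swh_workers"
        }
    }
}
```

The systemlogs-* branch could be handled the same way with its own template file and name.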

The production Logstash configuration on banco.internal.softwareheritage.org was changed today to follow the above pattern.