Page MenuHomeSoftware Heritage

Manage filebeat configuration via puppet

Authored by vsellier on Sep 16 2020, 11:11 AM.



Related to T2591

Test Plan

Octocatalog-diff for moma :

*** Running octocatalog-diff on host
I, [2020-09-21T14:17:34.928892 #4635]  INFO -- : Catalogs compiled for
I, [2020-09-21T14:17:36.404154 #4635]  INFO -- : Diffs computed for
diff origin/production/ current/
+ File[/etc/filebeat/filebeat.yml] =>
   parameters =>
     "content": "# File managed by puppet - modifications will be lost\nfilebeat....
     "ensure": "present",
     "group": "root",
     "mode": "0644",
     "notify": "Service[filebeat]",
     "owner": "root"
+ File[/etc/filebeat/inputs.d/deposit-non-ssl-access.yml] =>
   parameters =>
     "content": "# File managed by puppet - modifications will be lost\n- type: l...
     "ensure": "present",
     "group": "root",
     "mode": "0644",
     "notify": "Service[filebeat]",
     "owner": "root"
+ File[/etc/filebeat/inputs.d/webapp-non-ssl-access.yml] =>
   parameters =>
     "content": "# File managed by puppet - modifications will be lost\n- type: l...
     "ensure": "present",
     "group": "root",
     "mode": "0644",
     "notify": "Service[filebeat]",
     "owner": "root"
+ File[/etc/filebeat/inputs.d] =>
   parameters =>
     "ensure": "directory",
     "group": "root",
     "mode": "0755",
     "owner": "root",
     "purge": true
+ Profile::Filebeat::Log_input[deposit-non-ssl-access] =>
   parameters =>
     "fields": {
       "apache_log_type": "access_log"
     "paths": [
+ Profile::Filebeat::Log_input[webapp-non-ssl-access] =>
   parameters =>
     "fields": {
       "apache_log_type": "access_log"
     "paths": [
*** End octocatalog-diff on

Diff Detail

rSPSITE puppet-swh-site
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

vsellier created this revision.Sep 16 2020, 11:11 AM
ardumont added inline comments.

swh/deploy/deposit.pp could use this as well ;)

olasd requested changes to this revision.Sep 16 2020, 2:49 PM
olasd added a subscriber: olasd.

I don't like using concat and ERB templates to generate yaml files at all:

  • you need to remember to escape your variables in the ERB template
  • you need to keep track of indent levels across the concat fragments
  • you need to have explicit indents in the ERB which is ugly

I very much prefer generating the config "data structure" as a puppet dict, then serializing that to yaml with the inline_yaml command.

Now this begs the question of how to "integrate" multiple snippets from various profiles (swh-web, swh-deposit) into this main config file.

I think the best way to do this is to support the filebeat "drop-in" input configurations:

  1. generate a base configuration in the filebeat profile, which contains an include for input drop-ins as well as the output config.
  2. have a profile::filebeat::input define which generates a drop-in file with input settings
  3. call the profile::filebeat::input defined resource in the swh-web and swh-deposit deployments, pointing at the apache configs
This revision now requires changes to proceed.Sep 16 2020, 2:49 PM
vsellier updated this revision to Diff 13985.Sep 16 2020, 5:12 PM

Use the capability of filebeat to use a file per input.

It's a lot more simple than build a single big yaml file with several concat

vsellier edited the test plan for this revision. (Show Details)Sep 16 2020, 5:14 PM
olasd added a comment.Sep 16 2020, 5:56 PM

Thanks! This indeed looks much less fragile than the concat version.

I've left a bunch of inline style and functionality comments.

I still don't think we should be templating yaml files. Please build the config structure as a hash within the puppet manifest, and use the inline_yaml function to generate the file contents. That function also generates a "changes will be lost" header automatically.

From memory, you can have a look at the way we generate the prometheus or grafana configurations for inspiration.


There should be one main logstash_hosts variable, set to the proper value, and used as an "%{alias()}" in other "service-specific" variables.

This way we can change a single variable to point at another logstash server, e.g. per environment, while still keeping separate variables per profile.


For now we've mostly avoided using class arguments for profiles, to avoid being surprised by hiera variable "autoloading".

(what i mean by hiera autoloading : if you set a profile::filebeat::config_directory hiera variable, it'll override the "'/etc/filebeat'" you've set here)

I don't know if that's a sensible choice, and it can certainly be reconsidered, but for now I think we should keep it consistent.


misaligned =>


Needs an explicit owner, group and mode (probably '0644'). Also should consider using the purge attribute to remove leftover files.


This should be a Hash, not an Array. (It is, after all, a hash in the configuration file itself).
The "type" of log input should be configurable.


Needs an include profile::filebeat. That way we're sure that the directory resource is defined etc.


I'd even make the inputs.d directory an explicit variable in profile::filebeat, so the directory resource there and the path here are guaranteed to match.

I'll also note that you can access the variable using this scope even if it's defined in the body of the class (no need to have it as a class argument).


Is this really needed?

olasd requested changes to this revision.Sep 17 2020, 11:37 AM
This revision now requires changes to proceed.Sep 17 2020, 11:37 AM
vsellier updated this revision to Diff 14092.EditedSep 21 2020, 9:28 AM

update the diff according the previous feedbacks :

  • The logstash hosts are declared on a uniq property
  • No more yaml templates ;)
  • The profile::filbeat doesn't use parameters anymore
  • Ensure the permissions are correctly set
  • Add the purge option on the inputs.d directory
ardumont accepted this revision as: ardumont.Sep 21 2020, 2:03 PM

Looks clearer this way indeed!

vsellier edited the test plan for this revision. (Show Details)Sep 21 2020, 2:18 PM
olasd accepted this revision.Sep 22 2020, 2:13 PM
This revision is now accepted and ready to land.Sep 22 2020, 2:13 PM
This revision was automatically updated to reflect the committed changes.