journalbeat failed to start after reboot
Closed, Migrated; Edits Locked

Description

After a reboot triggered by Azure, journalbeat fails to start on worker10.euwest.azure and keeps being restarted by systemd:

May 16 07:02:20 worker10 journalbeat[3603569]: INFO [beat] instance/beat.go:1014 Beat info {"beat": {"path": {"config": "/etc/journalbeat", "data": "/var/lib/journalbeat", "home": "/usr/share/journalbeat", "logs": "/var/log/journalbeat"}, "type": "journalbeat", "uuid": "e9808d4c-6689-4f34-80c2-a6c4e3685e6c"}}
May 16 07:02:20 worker10 journalbeat[3603569]: INFO [beat] instance/beat.go:1023 Build info {"build": {"commit": "fd322dad6ceafec40c84df4d2a0694ea357d16cc", "libbeat": "7.15.2", "time": "2021-11-04T13:22:31.000Z", "version": "7.15.2"}}
May 16 07:02:20 worker10 journalbeat[3603569]: INFO [beat] instance/beat.go:1026 Go runtime info {"go": {"os":"linux","arch":"amd64","max_procs":2,"version":"go1.16.6"}}
May 16 07:02:20 worker10 journalbeat[3603569]: INFO [beat] instance/beat.go:1030 Host info {"host": {"architecture":"x86_64","boot_time":"2022-05-14T04:14:46Z","containerized":false,"name":"worker10","ip":["127.0.0.1/8","::1/128","192.168.200.14/22","fe80::20d:3aff:fe23:98a0/64"],"kernel_version":"5.10.0-0.bpo.9-cloud-amd64","mac":["00:0d:3a:23:98:a0"],"os":{"type":"linux","family":"debian","platform":"debian","name":"Debian GNU/Linux","version":"10 (buster)","major":10,"minor":0,"patch":0,"codename":"buster"},"timezone":"UTC","timezone_offset_sec":0,"id":"8688b7fbc2cf470caf90c24067eb50a6"}}
May 16 07:02:20 worker10 journalbeat[3603569]: INFO [beat] instance/beat.go:1059 Process info {"process": {"capabilities": {"inheritable":null,"permitted":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read","38","39","40"],"effective":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read","38","39","40"],"bounding":["chown","dac_override","dac_read_search","fowner","fsetid","kill","setgid","setuid","setpcap","linux_immutable","net_bind_service","net_broadcast","net_admin","net_raw","ipc_lock","ipc_owner","sys_module","sys_rawio","sys_chroot","sys_ptrace","sys_pacct","sys_admin","sys_boot","sys_nice","sys_resource","sys_time","sys_tty_config","mknod","lease","audit_write","audit_control","setfcap","mac_override","mac_admin","syslog","wake_alarm","block_suspend","audit_read","38","39","40"],"ambient":null}, "cwd": "/var/lib/journalbeat", "exe": "/usr/share/journalbeat/bin/journalbeat", "name": "journalbeat", "pid": 3603569, "ppid": 1, "seccomp": {"mode":"filter","no_new_privs":true}, "start_time": "2022-05-16T07:02:19.580Z"}}
May 16 07:02:20 worker10 journalbeat[3603569]: INFO instance/beat.go:309 Setup Beat: journalbeat; Version: 7.15.2
May 16 07:02:20 worker10 systemd[1]: journalbeat.service: Scheduled restart job, restart counter is at 5.

Event Timeline

vsellier changed the task status from Open to Work in Progress. May 16 2022, 9:13 AM
vsellier triaged this task as Normal priority.
vsellier created this task.

The file /var/lib/journalbeat/registry looks corrupted: it contains a truncated XML document instead of registry data.
On worker10.euwest:

root@worker10:/var/lib/journalbeat# cat registry 
<?xml version="1.0" encoding="utf-8"?>
<GoalState xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="goalstate10.xsd">
  <Version>2012-11-30</Version>
  <Incarnation>1</Incarnation>
  <Machine>
    <ExpectedState>Started</ExpectedState>
    <StopRolesDeadlineHint>3

For comparison, on worker09.euwest:

root@worker09:/var/lib/journalbeat# cat registry 
update_time: 2022-05-16T07:11:29.680690647Z
journal_entries:
- path: LOCAL_SYSTEM_JOURNAL
  cursor: s=1b5676c17e22450b80579b9caf065703;i=659f65c;b=97b0842367c749299a4a12ec839f1c3b;m=5b66c4ba4c0;t=5df1bbb86b72f;x=8e43c09dfc1a706e
  realtime_timestamp: 1652685086832431
  monotonic_timestamp: 6281059083456
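
The comparison above can be turned into a quick check. This is only a minimal sketch: it assumes that a healthy registry starts with an `update_time:` line and contains a `journal_entries:` key, which is inferred from the worker09 sample and not from the journalbeat documentation. The function name `registry_looks_valid` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when the file looks like the
# registry seen on worker09, fail otherwise.
# Assumption: a healthy registry starts with "update_time:" and has a
# top-level "journal_entries:" key. This is inferred from one sample only.
registry_looks_valid() {
    # A missing file is reported as not valid.
    [ -f "$1" ] || return 1
    # The first line must be the update_time key, and the entries key must exist.
    head -n 1 "$1" | grep -q '^update_time:' &&
        grep -q '^journal_entries:' "$1"
}
```

On a worker, it could be called as `registry_looks_valid /var/lib/journalbeat/registry || echo "registry looks corrupted"`. The XML content found on worker10 fails the first-line test.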

After moving the file aside and restarting journalbeat, the service runs normally:

root@worker10:/var/lib/journalbeat# mv registry registry-corrupted
root@worker10:/var/lib/journalbeat# systemctl restart journalbeat
root@worker10:/var/lib/journalbeat# systemctl status journalbeat | grep Active
     Active: active (running) since Mon 2022-05-16 07:15:05 UTC; 1min 58s ago
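
The manual fix above could be wrapped in a small function if other workers show the same symptom. This is a sketch under assumptions, not a tested procedure: `quarantine_registry` is a hypothetical name, the health test (first line starts with `update_time:`) is inferred from the worker09 sample, and the `-corrupted` suffix simply mirrors the name used above.

```shell
#!/bin/sh
# Hypothetical helper: move a suspicious registry aside so that it stays
# available for inspection.
# Exit status: 0 when the file was moved (a restart is needed),
#              1 when nothing was done (file missing or looks healthy),
#              2 when a previous "-corrupted" copy already exists.
quarantine_registry() {
    registry="$1"
    target="${registry}-corrupted"
    [ -f "$registry" ] || return 1
    # Assumption: a healthy registry starts with "update_time:".
    if head -n 1 "$registry" | grep -q '^update_time:'; then
        return 1
    fi
    # Never overwrite an earlier quarantined copy.
    if [ -e "$target" ]; then
        echo "refusing to overwrite $target" >&2
        return 2
    fi
    mv "$registry" "$target"
}
```

Run as root on the affected host, it could be combined with the restart: `quarantine_registry /var/lib/journalbeat/registry && systemctl restart journalbeat`. The restart only happens when a file was actually moved.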
vsellier claimed this task.
vsellier moved this task from Backlog to done on the System administration board.
vsellier renamed this task from journalbeat fail to start after reboot to journalbeat fails to start after reboot. May 16 2022, 9:24 AM
vsellier renamed this task from journalbeat fails to start after reboot to journalbeat failed to start after reboot.