Check battery status on storage adapters
Closed, ResolvedPublic

Description

  • At least one of our physical machines at Rocquencourt has a failed battery.
  • All our servers use LSI storage adapters; their BBU (battery backup unit) status can be checked with megacli.
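
As a rough illustration of the megacli check mentioned above, the sketch below runs the BBU status query for all adapters and groups the output per adapter. It is not taken from this task. The binary name (`megacli`, which may be `MegaCli` or `MegaCli64` on some systems), the `BBU status for Adapter: N` header and the `field: value` layout are assumptions about the output format, which can vary with firmware and tool version. `read_bbu_report` and `parse_bbu_report` are hypothetical helper names.

```python
import re
import subprocess

# Assumption: binary name differs between installs (megacli, MegaCli, MegaCli64).
MEGACLI = "megacli"


def read_bbu_report(binary=MEGACLI):
    """Run megacli and return its raw BBU status report for all adapters."""
    result = subprocess.run(
        [binary, "-AdpBbuCmd", "-GetBbuStatus", "-aALL"],
        capture_output=True,
        text=True,
        check=False,  # a failed battery may yield a non-zero exit status
    )
    return result.stdout


def parse_bbu_report(text):
    """Split a report into {adapter_number: {field: value}}.

    Assumes each adapter section starts with a line such as
    'BBU status for Adapter: 0' followed by 'field: value' lines.
    """
    adapters = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"\s*BBU status for Adapter:\s*(\d+)", line)
        if header:
            current = adapters.setdefault(int(header.group(1)), {})
            continue
        if current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return adapters
```

Calling `parse_bbu_report(read_bbu_report())` on a machine with two adapters, such as orsay, would be expected to return two entries, one per BBU. Field names such as `Battery State` should be checked against the actual output before relying on them.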

Event Timeline

ftigeot created this task. Nov 13 2018, 12:16 PM
ftigeot triaged this task as High priority.

List of physical machines at Rocquencourt: louvre, beaubourg, orsay, banco

louvre

  • storage adapters:
03:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02)
21:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
  • BBU status: OK
  • BBU manufacture date: 2011-07-18

beaubourg

  • storage adapters:
04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02)
81:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
  • BBU status: OK
  • BBU manufacture date: invalid

banco

  • storage adapters:
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108 [Invader] (rev 02)
04:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
  • BBU status: OK
  • BBU manufacture date: invalid

orsay

  • storage adapters:
05:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
22:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
  • BBU status: OK, failed
  • BBU manufacture date: 2011-01-31, 2010-10-05

ftigeot closed this task as Resolved. Nov 13 2018, 2:55 PM

In summary, only orsay has a failed BBU.
Since it contains two identical RAID adapters with similar, aging BBUs, it may be worth replacing both at once.

zack added a subscriber: zack. Nov 14 2018, 8:13 AM

@ftigeot thanks for checking!
Please file tasks for replacing the failing batteries (priority high) and for automating battery monitoring, so that we are notified automatically of similar failures in the future (priority normal).
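
One possible shape for the automated monitoring requested above is a check that turns per-adapter battery states into a monitoring result. The sketch below assumes a monitoring system that follows the Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); this task does not say which system is in use. The set of "healthy" state strings is also an assumption to be adjusted to what megacli actually reports on these machines, and `check_batteries` is a hypothetical name.

```python
# Nagios-style plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: strings treated as healthy; actual values depend on firmware.
HEALTHY_STATES = {"Optimal", "Operational"}


def check_batteries(states):
    """Map {adapter_label: battery_state} to (exit_code, message)."""
    if not states:
        # No data at all is reported as UNKNOWN rather than silently OK.
        return UNKNOWN, "UNKNOWN: no BBU found"
    bad = {a: s for a, s in states.items() if s not in HEALTHY_STATES}
    if bad:
        details = ", ".join(f"{a}={s}" for a, s in sorted(bad.items()))
        return CRITICAL, f"CRITICAL: BBU not healthy: {details}"
    return OK, f"OK: {len(states)} BBU(s) healthy"
```

A wrapper script would print the message and exit with the returned code, so that a host with one healthy and one failed BBU, as orsay has here, is reported as CRITICAL.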