LSI RAID Monitoring Plugin setup
The LSI RAID Monitoring Plugin can be used to monitor LSI Controller RAID sets. The plugin is written in Perl and uses the command line tool storcli to interact with the controller.
Current Version
The current check_lsi_raid version can be obtained from GitHub: https://github.com/thomas-krenn/check_lsi_raid
Functionalities
The plugin README lists all available checks:
- Version 2.3 Plugin README (github.com)
GitHub Support
Questions are answered via the project's GitHub page.
Prerequisites
The following sections describe in detail how to install the requirements:
- On the monitored server
- On the Icinga server
- Command definition
- Service definition
- If using the Call-Home-Service
- Define templates (Using Call-Home-Service with Icinga or Nagios)
Installation
Manually
Installing the plugin manually involves copying the file into the directory /usr/lib/nagios/plugins.
:~$ git clone https://github.com/thomas-krenn/check_lsi_raid.git
Cloning into 'check_lsi_raid'...
:~$ cd check_lsi_raid/
:~/check_lsi_raid$ ls
check_lsi_raid  check_lsi_raid.html  README
:~/check_lsi_raid$ sudo cp check_lsi_raid /usr/lib/nagios/plugins/
The LSI Storage Command Line Tool (StorCLI) can be obtained from the LSI website for the corresponding controller and operating system: http://www.lsi.com/Search/Pages/downloads.aspx?k=Latest%20StorCLI
TK Ubuntu-Repository
After the Thomas Krenn Ubuntu repository (Thomas Krenn Ubuntu Repo) has been set up, the package nagios-plugins-thomas-krenn installs check_lsi_raid:
:~$ sudo apt-get install nagios-plugins-thomas-krenn
[...]
Suggested packages:
  arcconf storcli freeipmi-tools libipc-run-perl
The following NEW packages will be installed:
  nagios-plugins-thomas-krenn
0 upgraded, 1 newly installed, 0 to remove and 79 not upgraded.
Need to get 0 B/25.2 kB of archives.
After this operation, 127 kB of additional disk space will be used.
Selecting previously unselected package nagios-plugins-thomas-krenn.
(Reading database ... 68398 files and directories currently installed.)
Unpacking nagios-plugins-thomas-krenn (from .../nagios-plugins-thomas-krenn_0.3-1_all.deb) ...
Setting up nagios-plugins-thomas-krenn (0.3-1) ...
The suggested package storcli must also be installed:
:~$ sudo apt-get install storcli
[...]
The following NEW packages will be installed:
  storcli
0 upgraded, 1 newly installed, 0 to remove and 72 not upgraded.
Need to get 1,385 kB of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://archive.thomas-krenn.com/packages/ precise/optional storcli amd64 1.03.11-1 [1,385 kB]
Fetched 1,385 kB in 0s (1,779 kB/s)
Selecting previously unselected package storcli.
(Reading database ... 64459 files and directories currently installed.)
Unpacking storcli (from .../storcli_1.03.11-1_amd64.deb) ...
Setting up storcli (1.03.11-1) ...
Configuration
The plugin can monitor a remote host via NRPE as well as the local host. In both cases the plugin has to be installed on the monitored server.
Via NRPE
On the Icinga Server
The service definition specifies the command that is called via NRPE on the remote host. The plugin parameters are configured afterwards on the remote host itself:
define service {
    service_description  lsi-raid-nrpe
    display_name         LSI RAID
    use                  generic-service
    host_name            test
    check_command        check_nrpe_1arg!check_lsi_raid
}
Attention: To use the Call-Home-Service, some templates have to be created first (cf. Using Call-Home-Service with Icinga or Nagios). The service definition must then contain the line use thomas-krenn-service instead of use generic-service!
Using TKmon to configure check_lsi_raid
The LSI RAID plugin is already integrated in the TKmon service catalogue. It is sufficient to select "LSI RAID via NRPE" when adding a service to a host.
On the monitored Server
The user nagios must be able to call storcli via sudo without entering a password. Therefore the following sudoers definition is created.
Attention: On CentOS the NRPE daemon runs as user nrpe! In that case, replace nagios with nrpe in the sudoers configuration.
:~$ sudo vi /etc/sudoers.d/check_lsi_raid
nagios ALL=(root)NOPASSWD:/usr/sbin/storcli
:~$ sudo chmod 440 /etc/sudoers.d/check_lsi_raid
If the definition is correct, the following command does not ask for a password:
:~$ sudo su nagios --shell /bin/bash
:~$ sudo /usr/sbin/storcli -V
Storage Command Line Tool Ver 1.03.11 Jan 30, 2013
(c)Copyright 2012, LSI Corporation, All Rights Reserved.
Exit Code: 0x00
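The manual test can also be scripted. The following is a minimal sketch, assuming it is run as the user that NRPE uses (nagios, or nrpe on CentOS). It relies on the sudo option -n (non-interactive), which makes sudo fail instead of prompting when a password would be required. The function name check_passwordless is hypothetical and not part of the plugin.

```shell
# check_passwordless (hypothetical helper): succeeds only if the given
# command can be run through sudo without a password prompt.
# "sudo -n" never prompts; it exits with an error if a password is needed.
check_passwordless() {
    if sudo -n "$@" >/dev/null 2>&1; then
        echo "OK: sudo ran '$*' without a password"
    else
        echo "FAIL: sudo needs a password for '$*' (or the command itself failed)"
        return 1
    fi
}

# Example call, to be run as user nagios on the monitored server:
#   check_passwordless /usr/sbin/storcli -V
```

A failure does not distinguish between a missing sudoers entry and a failing storcli call, so the manual test above remains useful for troubleshooting.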
An NRPE configuration file specifies the check command that is called if check_lsi_raid is used via NRPE. The command name must be the same as the one used on the Icinga side in the service definition.
Notice: On CentOS it may help to specify the script interpreter explicitly, e.g. perl /usr/lib/nagios/plugins/check_lsi_raid -C 0 -p /usr/sbin/storcli.
:~$ sudo vi /etc/nagios/nrpe.d/raid.cfg
command[check_lsi_raid]=/usr/lib/nagios/plugins/check_lsi_raid -C 0 -p /usr/sbin/storcli
:~$ sudo service nagios-nrpe-server restart
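Because the name in square brackets has to match the name requested from the Icinga side, it can help to look up what NRPE would actually execute for a given name. The following sketch assumes the command[...] line format shown above. The helper name nrpe_command_line is hypothetical, and the demo works on a temporary file rather than on /etc/nagios/nrpe.d/raid.cfg.

```shell
# nrpe_command_line (hypothetical helper): print the command line that an
# NRPE configuration file assigns to the given command name.
#   $1: command name, e.g. check_lsi_raid
#   $2: path to the NRPE configuration file
nrpe_command_line() {
    sed -n "s/^command\[$1]=//p" "$2"
}

# Demo against a temporary copy of the definition from this section.
cfg=$(mktemp)
echo 'command[check_lsi_raid]=/usr/lib/nagios/plugins/check_lsi_raid -C 0 -p /usr/sbin/storcli' > "$cfg"
nrpe_command_line check_lsi_raid "$cfg"
rm -f "$cfg"
```

If the function prints nothing, no command with that name is defined in the file, and a request for it from the Icinga server cannot be answered.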
Finally, a test from the Icinga server verifies that the definitions are correct:
:~$ /usr/lib/nagios/plugins/check_nrpe -H 10.0.0.2 -c check_lsi_raid
OK (CTR, LD, PD, CV)|CV_Temperature=27;70;85 ROC_Temperature=62;80;90
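The output consists of the status text, a pipe character, and the performance data in the form label=value;warn;crit. As an illustration of how to read it, the sketch below splits the performance data of the sample output into one line per sensor. The function name split_perfdata is hypothetical; the sample string is copied from the test above.

```shell
# Sample plugin output, copied from the check_nrpe test in this section.
output='OK (CTR, LD, PD, CV)|CV_Temperature=27;70;85 ROC_Temperature=62;80;90'

# split_perfdata (hypothetical helper): print "label value warn crit"
# for every performance data entry that follows the pipe character.
split_perfdata() {
    case $1 in
        *'|'*) ;;        # performance data present
        *) return 0 ;;   # no pipe, nothing to print
    esac
    perfdata=${1#*'|'}   # drop the status text and the pipe
    for entry in $perfdata; do   # entries are separated by spaces
        label=${entry%%=*}
        values=${entry#*=}
        # turn "value;warn;crit" into "value warn crit"
        echo "$label $(echo "$values" | tr ';' ' ')"
    done
}

split_perfdata "$output"
```

For the sample this prints CV_Temperature 27 70 85 and ROC_Temperature 62 80 90, i.e. the current temperature followed by its warning and critical thresholds.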
Check local RAID
A local check is useful if the Icinga server itself uses an LSI RAID controller.
Manually
Requirements for a local installation are the plugin, storcli and the sudoers entry. As a first step an Icinga command definition is created:
:~$ sudo vi /etc/nagios-plugins/config/check_lsi_raid.cfg
define command {
    command_name  check_lsi_raid
    command_line  /usr/lib/nagios/plugins/check_lsi_raid -C '$ARG1$' -p '$ARG2$'
}
A service definition uses this command:
define service {
    use                  generic-service
    host_name            tkmon
    service_description  lsi-raid
    check_command        check_lsi_raid!0!/usr/sbin/storcli
}
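In check_command, the values separated by ! are handed to the command definition as $ARG1$ and $ARG2$. The following sketch only illustrates that substitution for the two definitions above; it is not how Icinga implements it. The function name expand_check_command is hypothetical, and the sketch assumes exactly two arguments.

```shell
# command_line from the command definition above ("\$" keeps the macros literal).
template="/usr/lib/nagios/plugins/check_lsi_raid -C '\$ARG1\$' -p '\$ARG2\$'"

# expand_check_command (hypothetical helper): show the command line that
# results from a check_command value of the form "name!arg1!arg2".
#   $1: command_line template, $2: check_command value
expand_check_command() {
    args=${2#*!}       # strip the command name: "arg1!arg2"
    arg1=${args%%!*}   # text before the next "!"
    arg2=${args#*!}    # text after it
    printf '%s\n' "$1" |
        sed -e 's|\$ARG1\$|'"$arg1"'|' -e 's|\$ARG2\$|'"$arg2"'|'
}

expand_check_command "$template" 'check_lsi_raid!0!/usr/sbin/storcli'
```

For the service definition above, the result is the plugin call with controller 0 and the storcli path /usr/sbin/storcli, which can be run by hand to reproduce what the service check does.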
Example Output
$ sudo ./check_lsi_raid -p /opt/MegaRAID/storcli/storcli64 -vvv
Critical (CTR Warn, LD Crit, PD Warn) [c0/v0_State = Critical (Dgrd)][c0/e252/s2_State = Critical (Rbld)][CTR_Degraded_drives = Warning (1)] [c0/e252/s2_Rebuild = Warning (4)]|CV_Temperature=24;70;85 ROC_Temperature=58;80;90
Used storcli commands:
- /usr/bin/sudo /opt/MegaRAID/storcli/storcli64 /c0 /cv show status
- /usr/bin/sudo /opt/MegaRAID/storcli/storcli64 adpallinfo a0
- /usr/bin/sudo /opt/MegaRAID/storcli/storcli64 /c0/vall show all
- /usr/bin/sudo /opt/MegaRAID/storcli/storcli64 /c0/vall show init
- /usr/bin/sudo /opt/MegaRAID/storcli/storcli64 /c0/eall/sall show all
- /usr/bin/sudo /opt/MegaRAID/storcli/storcli64 /c0/eall/sall show initialization
- /usr/bin/sudo /opt/MegaRAID/storcli/storcli64 /c0/eall/sall show rebuild
Critical sensors:
- c0/v0_State (Dgrd)
- c0/e252/s2_State (Rbld)
Warning sensors:
- CTR_Degraded_drives (1)
- c0/e252/s2_Rebuild (4)
CTR information:
- LSI MegaRAID SAS 9271-4i:
  - Serial No=SV30900638
  - FW Package Build=23.28.0-0010
  - Mfg. Date=02/23/13
  - Revision No=07B
  - BIOS Version=5.46.02.0_4.16.08.00_0x06060900
  - FW Version=3.400.05-3175
  - ROC temperature=58 degree Celcius
LD information:
- c0/v0:
  - Access=RW
  - Cache=RWBD
  - Consist=Yes
  - DG/VD=0/0
  - Size=74.0
  - State=Dgrd
  - TYPE=RAID1
  - ld=c0/v0
  - sCC=-
PD information:
- c0/e252/s1:
  - BBM Error Count=0
  - DG=-
  - DID=6
  - Drive Temperature=N/A
  - EID:Slt=252:1
  - Intf=SATA
  - Med=SSD
  - Media Error Count=0
  - Model=INTEL SSDSC2BB080G4
  - Other Error Count=0
  - PI=N
  - Predictive Failure Count=0
  - S.M.A.R.T alert flagged by drive=No
  - SED=N
  - SeSz=512B
  - Shield Counter=0
  - Size=74.0GB
  - Sp=U
  - State=UGood
  - pd=c0/e252/s1
- c0/e252/s2:
  - BBM Error Count=0
  - DG=0
  - DID=5
  - Drive Temperature=0C (32.00 F)
  - EID:Slt=252:2
  - Intf=SATA
  - Med=SSD
  - Media Error Count=0
  - Model=INTEL SSDSC2BB080G4
  - Other Error Count=0
  - PI=N
  - Predictive Failure Count=0
  - S.M.A.R.T alert flagged by drive=No
  - SED=N
  - SeSz=512B
  - Shield Counter=0
  - Size=74.0GB
  - Sp=U
  - State=Rbld
  - pd=c0/e252/s2
  - rebuild=4
- c0/e252/s3:
  - BBM Error Count=0
  - DG=0
  - DID=4
  - Drive Temperature=N/A
  - EID:Slt=252:3
  - Intf=SATA
  - Med=SSD
  - Media Error Count=0
  - Model=INTEL SSDSC2BB080G4
  - Other Error Count=0
  - PI=N
  - Predictive Failure Count=0
  - S.M.A.R.T alert flagged by drive=No
  - SED=N
  - SeSz=512B
  - Shield Counter=0
  - Size=74.0GB
  - Sp=U
  - State=Onln
  - pd=c0/e252/s3
CV information:
- CV_Replacement_required=No
- CV_Status=OK
- CV_Temperature=24
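In the status text of the example, every affected sensor appears in square brackets as name = severity (value). The sketch below lists the sensor names for a given severity. It assumes exactly this bracket layout, as seen in the example output. The function name list_sensors is hypothetical.

```shell
# Status line copied from the example output above.
status='Critical (CTR Warn, LD Crit, PD Warn) [c0/v0_State = Critical (Dgrd)][c0/e252/s2_State = Critical (Rbld)][CTR_Degraded_drives = Warning (1)] [c0/e252/s2_Rebuild = Warning (4)]|CV_Temperature=24;70;85 ROC_Temperature=58;80;90'

# list_sensors (hypothetical helper): print the names of all sensors that
# are reported with the given severity.
#   $1: severity as printed by the plugin ("Critical" or "Warning")
#   $2: status line of the plugin
list_sensors() {
    printf '%s\n' "${2%%'|'*}" |   # keep only the text before the perfdata
        tr '[]' '\n\n' |           # put every bracketed entry on its own line
        grep " = $1 " |            # keep entries with the requested severity
        sed 's/ = .*//'            # strip " = severity (value)"
}

list_sensors Critical "$status"
```

For the example this prints c0/v0_State and c0/e252/s2_State; calling it with Warning prints CTR_Degraded_drives and c0/e252/s2_Rebuild, matching the sensor lists in the verbose output.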
Author: Georg Schönberger