====== Monitoring ZFS zpool With Opsview or Nagios ======

The paths below assume Opsview, but the plugin is fully compatible with Nagios; just adjust the paths as needed.

1. Create the script /usr/local/nagios/libexec/check_zfs.sh

  #!/bin/bash
  #
  # check_zfs.sh
  #
  # Program: ZFS status check plugin for Nagios
  # License : GPL
  #
  # $Id: check_zfs.sh,v 1.1 2013/03/26 14:12:06 root Exp root $
  #
  # Description :
  #
  #   This plugin checks whether a ZFS pool is online
  #
  # Usage :
  #
  #   check_zfs.sh <poolname>
  #
  
  # Debug
  #set -x
  
  # run from the plugin directory so utils.sh can be found
  cd "$(dirname "$0")"
  
  # get return codes
  . ./utils.sh
  
  # Finally inform Nagios of what we found...
  # (quoting keeps the test valid when zpool prints nothing or several lines)
  if [ "x$(sudo /sbin/zpool list -H -o health "$1")" != xONLINE ]
  then
      echo "CRITICAL - ZFS pool $1 not ONLINE"
      exit $STATE_CRITICAL
  else
      echo "OK - ZFS pool $1 is ONLINE"
      exit $STATE_OK
  fi
  
  # Hey what are we doing here ???
  exit $STATE_UNKNOWN

2. Allow the 'nagios' user to run the 'zpool list' command as root to get the status. Add this line to /etc/sudoers:

  nagios ALL=(root) NOPASSWD: /sbin/zpool list *

3. Test that the script runs when called by the user nagios:

  # sudo -u nagios /usr/local/nagios/libexec/check_zfs.sh datapool
  OK - ZFS pool datapool is ONLINE

4. Create the include directory specified in the nrpe.cfg file:

  # grep ^include_dir /usr/local/nagios/etc/nrpe.cfg
  include_dir=/usr/local/nagios/etc/nrpe_local
  # mkdir /usr/local/nagios/etc/nrpe_local

5. Add the check alias that NRPE uses when called, then restart NRPE:

  # echo 'command[check_zpool]=/usr/local/nagios/libexec/check_zfs.sh $ARG1$' > /usr/local/nagios/etc/nrpe_local/check_zpool.cfg
  # /etc/init.d/opsview-agent restart

6. Test from the Opsview/Nagios server:

  # sudo -u nagios /usr/local/nagios/libexec/check_nrpe -H 66.55.44.33 -c check_zpool -a datapool
  OK - ZFS pool datapool is ONLINE

7. Add the custom test to Opsview/Nagios.
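
For the last step on a plain Nagios server, the check could be wired in with object definitions along the following lines. This is only a sketch and not part of the original setup: the command name ''check_nrpe_zpool'', the host name ''zfs-host'' and the ''generic-service'' template are assumptions, and ''$USER1$'' is assumed to point at the directory that holds check_nrpe. Opsview manages its checks through its own configuration, so this applies to plain Nagios only.

```cfg
# Sketch only: command_name, host_name and the service template are
# illustrative assumptions, adjust them to the local configuration.

# Calls the NRPE alias check_zpool on the monitored host and passes
# the pool name through as $ARG1$.
define command {
    command_name    check_nrpe_zpool
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_zpool -a $ARG1$
}

# One service per pool; the text after "!" becomes $ARG1$.
define service {
    use                     generic-service
    host_name               zfs-host
    service_description     ZFS pool datapool
    check_command           check_nrpe_zpool!datapool
}
```

After adding the definitions, the configuration can be checked with ''nagios -v'' against the main configuration file before reloading Nagios.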