====== Monitoring ZFS zpool With Opsview or Nagios ======
The paths below assume an Opsview agent installation, but the plugin works equally well with Nagios; just adjust the paths as needed.
1. Create the script /usr/local/nagios/libexec/check_zfs.sh
#!/bin/bash
#
# check_zfs.sh
#
# Program: ZFS status check plugin for Nagios
# License : GPL
#
# $Id: check_zfs.sh,v 1.1 2013/03/26 14:12:06 root Exp root $
#
# Description :
#
# This plugin checks whether a ZFS pool is ONLINE
#
# Usage :
#
# check_zfs.sh <poolname>
#
# Debug
#set -x
cd "$(dirname "$0")"
# get return codes
. ./utils.sh
# Finally Inform Nagios of what we found...
if [ "$(sudo /sbin/zpool list -H -o health "$1")" != "ONLINE" ]
then
    echo "CRITICAL - ZFS pool $1 not ONLINE"
    exit $STATE_CRITICAL
else
    echo "OK - ZFS pool $1 is ONLINE"
    exit $STATE_OK
fi
# Hey what are we doing here ???
exit $STATE_UNKNOWN
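The script above reports every state other than ONLINE as CRITICAL. If a finer distinction is wanted, a sketch along the following lines maps the pool health string to a Nagios return code. The function name `health_to_state` and the choice to report DEGRADED as WARNING are assumptions for illustration, not part of the original plugin; the numeric values are the standard plugin return codes that utils.sh exposes as $STATE_OK, $STATE_WARNING and so on.

```shell
# Hypothetical helper, not part of the original check_zfs.sh.
# Maps a zpool health string to a Nagios plugin return code:
# 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
health_to_state() {
    case "$1" in
        ONLINE)   echo 0 ;;
        DEGRADED) echo 1 ;;   # assumption: pool still serves data, so only warn
        FAULTED|OFFLINE|UNAVAIL|REMOVED) echo 2 ;;
        *)        echo 3 ;;   # empty or unexpected output from zpool
    esac
}

# Possible use inside the plugin (same zpool call as the script above):
# health=$(sudo /sbin/zpool list -H -o health "$1")
# exit "$(health_to_state "$health")"
```

An unknown or empty string falls through to UNKNOWN, which also covers a pool name that does not exist.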
2. Allow the 'nagios' user to run the 'zpool list' command as root to get the status. Add this line to /etc/sudoers (edit it with visudo):
nagios ALL=(root) NOPASSWD: /sbin/zpool list *
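A syntax error in sudoers can leave sudo unusable, so it may be worth checking the rule before installing it. The sketch below writes the rule to a scratch file and runs it through `visudo -c -f`, which only checks syntax. The scratch file and variable names are illustrative. If sudoers sets `Defaults requiretty`, sudo will likely refuse calls from the NRPE daemon, since it has no terminal.

```shell
# Write the candidate rule to a scratch file (path chosen by mktemp).
rule='nagios ALL=(root) NOPASSWD: /sbin/zpool list *'
tmpfile=$(mktemp)
printf '%s\n' "$rule" > "$tmpfile"

# visudo -c checks syntax only; -f points it at the scratch file
# instead of the live /etc/sudoers.
if command -v visudo >/dev/null 2>&1; then
    visudo -c -f "$tmpfile" || echo "sudoers rule failed the syntax check"
else
    echo "visudo not found; skipping syntax check"
fi
```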
3. Test that the script runs when called as the nagios user:
# sudo -u nagios /usr/local/nagios/libexec/check_zfs.sh datapool
OK - ZFS pool datapool is ONLINE
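Nagios and Opsview act on the plugin's exit status, not only on the printed text, so it can help to look at both when testing. A minimal sketch, with `show_status` as a hypothetical wrapper name:

```shell
# Hypothetical wrapper: run a plugin, then print its exit status.
# Nagios convention: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
show_status() {
    "$@"
    rc=$?
    echo "exit status: $rc"
    return "$rc"
}

# Example (assumes the plugin path and pool name used above):
# show_status sudo -u nagios /usr/local/nagios/libexec/check_zfs.sh datapool
```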
4. Create the include directory specified in the nrpe.cfg file
# grep ^include_dir /usr/local/nagios/etc/nrpe.cfg
include_dir=/usr/local/nagios/etc/nrpe_local
# mkdir /usr/local/nagios/etc/nrpe_local
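The two commands can be combined so that the directory name is read from the config file rather than retyped. This is only a sketch: `nrpe_include_dir` is a hypothetical helper, and it assumes nrpe.cfg contains a single uncommented include_dir line.

```shell
# Hypothetical helper: print the include_dir value from an NRPE config file.
nrpe_include_dir() {
    sed -n 's/^include_dir=//p' "$1"
}

# Example (assumes the Opsview agent path used above; run as root):
# mkdir -p "$(nrpe_include_dir /usr/local/nagios/etc/nrpe.cfg)"
```

Using `mkdir -p` makes the step safe to repeat if the directory already exists.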
5. Add a command alias for NRPE to run when it is called, then restart NRPE:
# echo 'command[check_zpool]=/usr/local/nagios/libexec/check_zfs.sh $ARG1$' > /usr/local/nagios/etc/nrpe_local/check_zpool.cfg
# /etc/init.d/opsview-agent restart
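The `$ARG1$` in the alias only receives a value if the agent accepts command arguments: NRPE needs `dont_blame_nrpe=1` in nrpe.cfg and must be built with command-argument support. Whether a given agent package ships that way is not covered here, so a quick check might look like the sketch below. The function name `nrpe_args_enabled` is hypothetical.

```shell
# Hypothetical helper: succeed if the given NRPE config enables
# command arguments (dont_blame_nrpe=1), fail otherwise.
nrpe_args_enabled() {
    grep -Eq '^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Example (assumes the config path used above):
# nrpe_args_enabled /usr/local/nagios/etc/nrpe.cfg || echo "NRPE will ignore -a arguments"
```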
6. Test from the Opsview/Nagios server
# sudo -u nagios /usr/local/nagios/libexec/check_nrpe -H 66.55.44.33 -c check_zpool -a datapool
OK - ZFS pool datapool is ONLINE
7. Add the custom check to Opsview/Nagios
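On plain Nagios this last step usually means adding a command and a service object definition; in Opsview the equivalent is set up through its web interface. The sketch below writes one possible pair of definitions to a scratch file. The output path, the host name `fileserver01`, the command name `check_nrpe_zpool` and the `generic-service` template are all assumptions to adapt to the local configuration.

```shell
# Assumption: written to a scratch location; move the file to wherever
# nagios.cfg picks up object files (cfg_file / cfg_dir).
out="${TMPDIR:-/tmp}/zfs_checks.cfg"

# The quoted 'EOF' keeps the shell from expanding the Nagios $MACROS$.
cat > "$out" <<'EOF'
define command {
    command_name    check_nrpe_zpool
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_zpool -a $ARG1$
}

define service {
    use                     generic-service
    host_name               fileserver01
    service_description     ZFS pool datapool
    check_command           check_nrpe_zpool!datapool
}
EOF
```

The value after the `!` in check_command becomes `$ARG1$`, which NRPE then passes to check_zfs.sh as the pool name.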