Monitoring ZFS zpool With Opsview or Nagios

The paths below assume an Opsview installation, but the plugin works equally well with Nagios; just adjust the paths as needed.

1. Create the script /usr/local/nagios/libexec/check_zfs.sh and make it executable (chmod +x):

#!/bin/bash

#
# check_zfs.sh
#
# Program: ZFS status check plugin for Nagios
# License : GPL
#
# $Id: check_zfs.sh,v 1.1 2013/03/26 14:12:06 root Exp root $
#
# Description :
#
#  This plugin checks whether a ZFS pool is online
#
# Usage :
#
#  check_zfs.sh <poolname>
#

# Debug
#set -x

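# Run from the plugin directory so utils.sh can be found
cd "$(dirname "$0")" || exit 3

# Load the Nagios return codes (STATE_OK, STATE_CRITICAL, STATE_UNKNOWN, ...)
. ./utils.sh

# A pool name is required
if [ -z "$1" ]
then
        echo "UNKNOWN - usage: check_zfs.sh <poolname>"
        exit $STATE_UNKNOWN
fi

# Query the pool health; the result is empty if the pool does not exist
health=$(sudo /sbin/zpool list -H -o health "$1")

# Finally inform Nagios of what we found...
if [ "$health" != "ONLINE" ]
then
        echo "CRITICAL - ZFS pool $1 not ONLINE"
        exit $STATE_CRITICAL
else
        echo "OK - ZFS pool $1 is ONLINE"
        exit $STATE_OK
fi

# Not reached; defensive fallback
exit $STATE_UNKNOWN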
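
The script above reports every state other than ONLINE as CRITICAL. As a possible refinement, the sketch below maps the health values that zpool reports to Nagios return codes. The function name health_to_state and the decision to treat DEGRADED as WARNING are assumptions made for illustration and are not part of the original plugin; the numeric codes are the standard ones that utils.sh defines.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original plugin): map a zpool
# health string to a Nagios return code
# (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN).
# Treating DEGRADED as WARNING is an assumption; adjust to taste.
health_to_state() {
        case "$1" in
                ONLINE)   echo 0 ;;
                DEGRADED) echo 1 ;;
                FAULTED|OFFLINE|UNAVAIL|REMOVED) echo 2 ;;
                *)        echo 3 ;;   # empty or unrecognised output
        esac
}
```

Called with the output of the zpool list command as its argument, the printed number could be passed straight to exit in place of the two-way if/else.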

2. Allow the 'nagios' user to run the 'zpool list' command as root to get the status:

/etc/sudoers (edit with visudo)

nagios  ALL=(root) NOPASSWD: /sbin/zpool list *

3. Test that the script runs when called by the nagios user:

# sudo -u nagios /usr/local/nagios/libexec/check_zfs.sh datapool
OK - ZFS pool datapool is ONLINE
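
Nagios and NRPE act on the plugin's exit status rather than on its text, so it is worth checking that as well. A small sketch, assuming the standard plugin return codes; the wrapper name run_and_report is hypothetical:

```shell
#!/bin/bash
# Hypothetical wrapper: run a plugin command, then print its exit
# status together with the standard Nagios label for that status.
run_and_report() {
        "$@"
        rc=$?
        case $rc in
                0) echo "exit status $rc (OK)" ;;
                1) echo "exit status $rc (WARNING)" ;;
                2) echo "exit status $rc (CRITICAL)" ;;
                3) echo "exit status $rc (UNKNOWN)" ;;
                *) echo "exit status $rc (outside the 0-3 plugin range)" ;;
        esac
        return $rc
}
```

For example: run_and_report sudo -u nagios /usr/local/nagios/libexec/check_zfs.sh datapool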

4. Create the include directory specified in the nrpe.cfg file:

# grep ^include_dir /usr/local/nagios/etc/nrpe.cfg
include_dir=/usr/local/nagios/etc/nrpe_local
# mkdir /usr/local/nagios/etc/nrpe_local

5. Add the command alias that NRPE will run when called, then restart NRPE:

# echo 'command[check_zpool]=/usr/local/nagios/libexec/check_zfs.sh $ARG1$' > /usr/local/nagios/etc/nrpe_local/check_zpool.cfg
# /etc/init.d/opsview-agent restart
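
The alias passes the pool name through $ARG1$, and NRPE only accepts arguments from the server when dont_blame_nrpe=1 is set in its configuration (and the daemon was built with command-argument support). The sketch below checks a configuration file for that setting. The function name nrpe_args_enabled is hypothetical, and the Opsview agent may already ship with the option enabled.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given NRPE config file contains
# an uncommented dont_blame_nrpe=1 line, fail otherwise.
nrpe_args_enabled() {
        grep -Eq '^[[:space:]]*dont_blame_nrpe=1[[:space:]]*$' "$1"
}
```

For example: nrpe_args_enabled /usr/local/nagios/etc/nrpe.cfg && echo "command arguments enabled"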

6. Test from the Opsview/Nagios server:

# sudo -u nagios /usr/local/nagios/libexec/check_nrpe -H 66.55.44.33 -c check_zpool -a datapool
OK - ZFS pool datapool is ONLINE

7. Add the custom check to Opsview/Nagios.
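
On plain Nagios this usually means defining a command that calls check_nrpe and a service that uses it; in Opsview the equivalent is set up through the web interface. The sketch below writes such object definitions to a file. The names check_nrpe_zpool and zfs-host, the generic-service template, and the function name write_zpool_objects are all assumptions for illustration; adapt them to the local configuration.

```shell
#!/bin/bash
# Hypothetical sketch: write Nagios object definitions for the zpool
# check to the file given as $1. command_name, host_name and the
# generic-service template are assumptions, not taken from the article.
write_zpool_objects() {
        cat > "$1" <<'EOF'
define command {
        command_name    check_nrpe_zpool
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_zpool -a $ARG1$
}

define service {
        use                     generic-service
        host_name               zfs-host
        service_description     ZFS pool datapool
        check_command           check_nrpe_zpool!datapool
}
EOF
}
```

The generated file has to be picked up through a cfg_file or cfg_dir entry in nagios.cfg, and the configuration can be verified with nagios -v followed by the path to nagios.cfg before reloading.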