Note: Nagios is not designed to be a replacement for a full-blown SNMP management application like HP OpenView or OpenNMS. However, you can set things up so that SNMP traps received by a host on your network can generate alerts in Nagios. Here's how...
Introduction
This example explains how to easily generate alerts in Nagios for SNMP traps that are received by the UCD-SNMP snmptrapd daemon. These directions assume that the host which is receiving SNMP traps is not the same host on which Nagios is running. If your monitoring box is the same box that is receiving SNMP traps, you will need to make a few modifications to the examples I provide. Also, I am assuming that you have installed the nsca daemon on your monitoring server and the nsca client (send_nsca) on the machine that is receiving SNMP traps.
For the purposes of this example, I will be describing how I set up Nagios to generate alerts from SNMP traps sent by the ArcServe backup jobs running on my Novell servers. I wanted to get notified when backups failed, so this worked very nicely for me. You'll have to tweak the examples to suit your needs.
Defining The Service
First off you're going to have to define a service in your object configuration file for the SNMP traps (in this example, I am defining a service for ArcServe backup jobs). Assuming that the host that the alerts are originating from is called novellserver, a sample service definition might look something like this:
define service{
    host_name                novellserver
    service_description      ArcServe Backup
    is_volatile              1
    active_checks_enabled    0
    passive_checks_enabled   1
    max_check_attempts       1
    contact_groups           novell-backup-admins
    notification_interval    120
    notification_period      24x7
    notification_options     w,u,c,r
    check_command            check_none
    }
Note that this service has the volatile option enabled. We want this option enabled because we want a notification to be generated for every alert that comes in. Also note that active checks are disabled for the service, while passive checks are enabled. This means that the service will never be actively checked: all alert information must be sent in passively by the nsca client on the SNMP management host (in my example, it will be called firestorm).
ArcServe and Novell SNMP Configuration
In order to get ArcServe (and my Novell server) to send SNMP traps to my management host, I had to do the following:
SNMP Management Host Configuration
On my Linux SNMP management host (firestorm), I installed the UCD-SNMP (NET-SNMP) software. Once the software was installed I had to do the following:
In order to have the snmptrapd daemon route ArcServe SNMP traps to our Nagios host, we've got to define traphandle directives in the /etc/snmp/snmptrapd.conf file. In my setup, the config file looked something like this:
#############################
# ArcServe SNMP Traps
#############################
# Tape format failures
traphandle ARCserve-Alarm-MIB::arcServetrap9 /usr/local/nagios/libexec/eventhandlers/handle-arcserve-trap 9
# Failure to read tape header
traphandle ARCserve-Alarm-MIB::arcServetrap10 /usr/local/nagios/libexec/eventhandlers/handle-arcserve-trap 10
# Failure to position tape
traphandle ARCserve-Alarm-MIB::arcServetrap11 /usr/local/nagios/libexec/eventhandlers/handle-arcserve-trap 11
# Cancelled jobs
traphandle ARCserve-Alarm-MIB::arcServetrap12 /usr/local/nagios/libexec/eventhandlers/handle-arcserve-trap 12
# Successful jobs
traphandle ARCserve-Alarm-MIB::arcServetrap13 /usr/local/nagios/libexec/eventhandlers/handle-arcserve-trap 13
# Incomplete jobs
traphandle ARCserve-Alarm-MIB::arcServetrap14 /usr/local/nagios/libexec/eventhandlers/handle-arcserve-trap 14
# Job failures
traphandle ARCserve-Alarm-MIB::arcServetrap15 /usr/local/nagios/libexec/eventhandlers/handle-arcserve-trap 15
This example assumes that you have a /usr/local/nagios/libexec/eventhandlers/ directory on your SNMP management host and that the handle-arcserve-trap script exists there. You can modify these to fit your setup. Anyway, the handle-arcserve-trap script on my management host looked something like this:
#!/bin/sh
# Arguments:
#   $1 = trap type
# First line passed from snmptrapd is FQDN of host that sent the trap
read host
# Given a FQDN, get the short name of the host as it is set up in Nagios
hostname="unknown"
case $host in
    novellserver.mylocaldomain.com)
        hostname="novellserver"
        ;;
    nt.mylocaldomain.com)
        hostname="ntserver"
        ;;
esac
# Get severity level (OK, WARNING, UNKNOWN, or CRITICAL) and plugin output based on trap type
state=-1
output="No output"
case "$1" in
    # failed to format tape - critical
    9)
        output="Critical: Failed to format tape"
        state=2
        ;;
    # failed to read tape header - critical
    10)
        output="Critical: Failed to read tape header"
        state=2
        ;;
    # failed to position tape - critical
    11)
        output="Critical: Failed to position tape"
        state=2
        ;;
    # backup cancelled - warning
    12)
        output="Warning: ArcServe backup operation cancelled"
        state=1
        ;;
    # backup success - ok
    13)
        output="Ok: ArcServe backup operation successful"
        state=0
        ;;
    # backup incomplete - warning
    14)
        output="Warning: ArcServe backup operation incomplete"
        state=1
        ;;
    # backup failure - critical
    15)
        output="Critical: ArcServe backup operation failed"
        state=2
        ;;
esac
# Submit passive check result to monitoring host
/usr/local/nagios/libexec/eventhandlers/submit_check_result $hostname "ArcServe Backup" $state "$output"
exit 0
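Before relying on real traps, it can help to exercise the handler logic by hand. snmptrapd passes the sending host's name on the first line of standard input, and the trap number arrives as the argument configured in snmptrapd.conf. The following is only a minimal sketch of that idea, not the script I used: the function names resolve_host, trap_state and simulate_trap are hypothetical, the trap numbers follow the snmptrapd.conf entries shown earlier, and the result is printed instead of being passed to submit_check_result.

```shell
#!/bin/sh
# Hypothetical helper: map the FQDN sent by snmptrapd to the Nagios short name.
resolve_host() {
    case "$1" in
        novellserver.mylocaldomain.com) echo "novellserver" ;;
        nt.mylocaldomain.com)           echo "ntserver" ;;
        *)                              echo "unknown" ;;
    esac
}

# Hypothetical helper: map a trap number to a Nagios state code.
trap_state() {
    case "$1" in
        9|10|11|15) echo "2" ;;   # critical
        12|14)      echo "1" ;;   # warning
        13)         echo "0" ;;   # ok
        *)          echo "-1" ;;  # unrecognized trap number
    esac
}

# Simulate one trap: feed the host name on stdin, as snmptrapd would.
# $1 = FQDN of the sending host, $2 = trap number
simulate_trap() {
    echo "$1" | {
        read -r host
        echo "$(resolve_host "$host") $(trap_state "$2")"
    }
}
```

With the real script installed, the equivalent manual check would be to pipe a host name into it, for example: echo novellserver.mylocaldomain.com | /usr/local/nagios/libexec/eventhandlers/handle-arcserve-trap 15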
Notice that the handle-arcserve-trap script calls the submit_check_result script to actually send the alert back to the monitoring host. Assuming your monitoring host is called monitor, the submit_check_result script might look like this (you'll have to modify this to specify the proper location of the send_nsca program on your management host):
#!/bin/sh
# Arguments
# $1 = name of host in service definition
# $2 = name/description of service in service definition
# $3 = return code
# $4 = output
/bin/echo -e "$1\t$2\t$3\t$4\n" | /usr/local/nagios/bin/send_nsca monitor -c /usr/local/nagios/etc/send_nsca.cfg
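If snmptrapd runs on the Nagios host itself (the case the introduction says needs modifications), send_nsca should not be needed: a passive result can be written directly to the Nagios external command file using the PROCESS_SERVICE_CHECK_RESULT external command. The following is a minimal sketch under assumptions, not a tested replacement. The command file path is an assumption and must match the command_file setting in your nagios.cfg, the function name format_check_result is hypothetical, and date +%s is assumed to be supported by your date program.

```shell
#!/bin/sh
# Assumed location of the external command file; check command_file in nagios.cfg.
commandfile="/usr/local/nagios/var/rw/nagios.cmd"

# Hypothetical helper: build one external command line.
# $1 = host name, $2 = service description, $3 = return code, $4 = output
format_check_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Only write when four arguments were given and the command file (a named pipe) exists.
if [ "$#" -ge 4 ] && [ -p "$commandfile" ]; then
    format_check_result "$1" "$2" "$3" "$4" > "$commandfile"
fi
```

Saved in place of submit_check_result, a script like this would take the same four arguments that handle-arcserve-trap already passes.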
Finishing Up
You've now configured everything you need to, so all you have to do is restart Nagios on your monitoring server. That's it! You should be getting alerts in Nagios whenever ArcServe jobs fail, succeed, etc.
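To confirm that results are actually arriving, one option is to look for passive check entries in the Nagios log on the monitoring server. This is a minimal sketch under assumptions: it expects passive check logging to be enabled (the log_passive_checks option in nagios.cfg), the log path shown is an assumed default that should match your log_file setting, and the function name show_passive_results is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list passive results logged for one host/service.
# $1 = log file, $2 = host name, $3 = service description
show_passive_results() {
    grep -F "PASSIVE SERVICE CHECK: $2;$3;" "$1"
}

# Assumed log location; adjust to the log_file setting in nagios.cfg.
logfile="/usr/local/nagios/var/nagios.log"
if [ -r "$logfile" ]; then
    # Show the five most recent matching entries, if any.
    show_passive_results "$logfile" novellserver "ArcServe Backup" | tail -n 5
fi
```

If nothing shows up after a trap has been sent, the nsca daemon's configuration and the host and service names passed to send_nsca are reasonable places to start looking.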