Custom Nagios Core plugins turn local checks, application metrics, and site-specific probes into scheduled service states. A plugin only has to print a status line to stdout and exit with the status code that represents OK, WARNING, CRITICAL, or UNKNOWN, which makes small shell scripts enough for many private checks.
Packaged and source-built Nagios Core systems can use different plugin and object-definition directories. Match the plugin path and object directory to the active nagios.cfg file before saving a command definition, because Nagios Core only loads files reached from that configuration tree.
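One way to see which files the active configuration actually reaches is to list its include directives. The sketch below assumes a main configuration file that uses the standard cfg_dir, cfg_file, and resource_file directives; the function name list_config_sources is hypothetical.

```shell
#!/bin/sh
# Sketch: list the object and resource files a main config file pulls in.
# cfg_dir, cfg_file, and resource_file are standard nagios.cfg directives;
# the function name list_config_sources is hypothetical.
list_config_sources() {
    main_cfg=$1
    # Keep only uncommented directive lines that start at column one.
    grep -E '^(cfg_dir|cfg_file|resource_file)=' "$main_cfg"
}
```

For example, list_config_sources /etc/nagios4/nagios.cfg prints the directories and files to compare against the intended object directory. The resource file it lists is normally where $USER1$, the plugin directory macro, is defined.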
A queue-depth check gives the plugin a concrete metric, two thresholds, status text, and optional performance data without depending on an external application. Replace the sample metric file with the file, socket, API call, or local command that represents the application condition on the monitoring server.
Related: How to install Nagios plugins
Related: How to run a Nagios plugin manually
Related: How to define a Nagios Core check command
Steps to create a custom Nagios plugin:
- Create the plugin file in the Nagios Core plugin directory.
$ sudoedit /usr/lib/nagios/plugins/check_queue_depth
Ubuntu and Debian package installs use /usr/lib/nagios/plugins. Use /usr/local/nagios/libexec on source installs that follow the upstream default layout.
- Add the plugin script.
#!/bin/sh

# Exit codes that Nagios Core maps to service states.
OK=0
WARNING=1
CRITICAL=2
UNKNOWN=3

usage() {
    echo "UNKNOWN - usage: $0 --path FILE --warning N --critical N"
    exit "$UNKNOWN"
}

metric_path=
warning=
critical=

# Parse the long options; every option requires a value.
while [ "$#" -gt 0 ]; do
    case "$1" in
        --path)
            [ "$#" -ge 2 ] || usage
            metric_path=$2
            shift 2
            ;;
        --warning)
            [ "$#" -ge 2 ] || usage
            warning=$2
            shift 2
            ;;
        --critical)
            [ "$#" -ge 2 ] || usage
            critical=$2
            shift 2
            ;;
        *)
            usage
            ;;
    esac
done

if [ -z "$metric_path" ]; then
    usage
fi

# Reject empty or non-numeric thresholds.
case "$warning:$critical" in
    :*|*:|*[!0-9:]* )
        usage
        ;;
esac

if [ "$warning" -ge "$critical" ]; then
    echo "UNKNOWN - warning threshold must be lower than critical threshold"
    exit "$UNKNOWN"
fi

if [ ! -r "$metric_path" ]; then
    echo "UNKNOWN - cannot read $metric_path"
    exit "$UNKNOWN"
fi

# Read the first line of the metric file; it must be a non-negative integer.
IFS= read -r queue_depth < "$metric_path" || queue_depth=

case "$queue_depth" in
    ""|*[!0-9]* )
        echo "UNKNOWN - $metric_path does not contain an integer"
        exit "$UNKNOWN"
        ;;
esac

# Compare against the critical threshold first, then the warning threshold.
if [ "$queue_depth" -ge "$critical" ]; then
    echo "CRITICAL - queue depth is $queue_depth | queue_depth=$queue_depth;$warning;$critical;0;"
    exit "$CRITICAL"
elif [ "$queue_depth" -ge "$warning" ]; then
    echo "WARNING - queue depth is $queue_depth | queue_depth=$queue_depth;$warning;$critical;0;"
    exit "$WARNING"
fi

echo "OK - queue depth is $queue_depth | queue_depth=$queue_depth;$warning;$critical;0;"
exit "$OK"
The text before | becomes the service status output. The text after | is optional performance data in the format label=value;warning;critical;minimum;maximum.
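To make the split concrete, the following sketch separates one plugin output line into its status text and performance data at the first pipe character. It only illustrates the convention; the function name split_plugin_output is hypothetical and is not part of Nagios Core.

```shell
#!/bin/sh
# Sketch: split one plugin output line at the first "|".
# split_plugin_output is a hypothetical helper, not a Nagios Core tool.
split_plugin_output() {
    line=$1
    case "$line" in
        *"|"*)
            status_text=${line%%"|"*}   # everything before the first pipe
            perfdata=${line#*"|"}       # everything after the first pipe
            ;;
        *)
            status_text=$line           # no pipe means no performance data
            perfdata=
            ;;
    esac
    # Trim the single space that conventionally surrounds the pipe.
    status_text=${status_text% }
    perfdata=${perfdata# }
    printf 'status: %s\nperfdata: %s\n' "$status_text" "$perfdata"
}
```

Calling split_plugin_output "OK - queue depth is 12 | queue_depth=12;70;90;0;" prints the status text on the first line and queue_depth=12;70;90;0; on the second.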
- Make the plugin executable.
$ sudo chmod 0755 /usr/lib/nagios/plugins/check_queue_depth
- Create a sample metric file for the plugin to read.
$ printf '12\n' | sudo tee /var/lib/nagios4/app-queue-depth
12
- Make the sample metric readable by the nagios user.
$ sudo chown nagios:nagios /var/lib/nagios4/app-queue-depth
In production, point --path at the real metric source and keep credentials outside the plugin file when possible.
- Run the plugin manually as the nagios user.
$ sudo -u nagios /usr/lib/nagios/plugins/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning 70 --critical 90
OK - queue depth is 12 | queue_depth=12;70;90;0;
Running as nagios catches file-permission, interpreter, and environment problems before the scheduler starts using the plugin.
Related: How to run a Nagios plugin manually
- Confirm the OK exit code from the previous plugin run.
$ echo $?
0
Nagios Core maps exit code 0 to OK, 1 to WARNING, 2 to CRITICAL, and 3 to UNKNOWN.
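The mapping can be written as a small lookup for use in test scripts. This is a minimal sketch; state_name is a hypothetical helper, and it deliberately makes no claim about how Nagios Core treats codes outside 0 to 3.

```shell
#!/bin/sh
# Sketch: translate a plugin exit code into its state name.
# state_name is a hypothetical helper for manual testing.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "out of range: $1" ;;   # not one of the four plugin states
    esac
}
```

After a manual plugin run, state_name "$?" prints the state that the exit code represents.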
- Set the sample metric to a warning value.
$ printf '75\n' | sudo tee /var/lib/nagios4/app-queue-depth
75
- Run the plugin to confirm the WARNING branch.
$ sudo -u nagios /usr/lib/nagios/plugins/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning 70 --critical 90
WARNING - queue depth is 75 | queue_depth=75;70;90;0;
- Set the sample metric to a critical value.
$ printf '95\n' | sudo tee /var/lib/nagios4/app-queue-depth
95
- Run the plugin to confirm the CRITICAL branch.
$ sudo -u nagios /usr/lib/nagios/plugins/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning 70 --critical 90
CRITICAL - queue depth is 95 | queue_depth=95;70;90;0;
- Set the sample metric to an invalid value.
$ printf 'busy\n' | sudo tee /var/lib/nagios4/app-queue-depth
busy
- Run the plugin to confirm the UNKNOWN branch.
$ sudo -u nagios /usr/lib/nagios/plugins/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning 70 --critical 90
UNKNOWN - /var/lib/nagios4/app-queue-depth does not contain an integer
UNKNOWN is for invalid arguments, unreadable input, malformed local data, or an internal plugin failure that prevents a meaningful check result.
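The four manual branch checks can also be repeated in one pass. The sketch below is a hypothetical harness, not part of the plugin: exercise_plugin writes each test value to a scratch metric file, runs the plugin with the same 70 and 90 thresholds used in the steps, and compares the exit code with the expected one. It assumes the plugin accepts the --path, --warning, and --critical options shown earlier and that the caller can write to the scratch file.

```shell
#!/bin/sh
# Sketch: run a plugin against "value:expected_exit_code" cases.
# exercise_plugin is a hypothetical harness. It overwrites METRIC_FILE,
# so point it at a scratch file rather than a live metric source.
exercise_plugin() {
    plugin=$1
    metric_file=$2
    shift 2
    failures=0
    for case_spec in "$@"; do
        value=${case_spec%%:*}       # text before the colon
        expected=${case_spec##*:}    # text after the colon
        printf '%s\n' "$value" > "$metric_file"
        # Capture the exit code without aborting under "set -e".
        "$plugin" --path "$metric_file" --warning 70 --critical 90 >/dev/null && rc=0 || rc=$?
        if [ "$rc" -eq "$expected" ]; then
            echo "pass: $value -> $rc"
        else
            echo "FAIL: $value -> $rc (expected $expected)"
            failures=$((failures + 1))
        fi
    done
    # Succeed only when every case matched.
    [ "$failures" -eq 0 ]
}
```

A call such as exercise_plugin /usr/lib/nagios/plugins/check_queue_depth /tmp/queue-depth-test 12:0 75:1 95:2 busy:3 covers the OK, WARNING, CRITICAL, and UNKNOWN branches; the /tmp path is only an example scratch location.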
- Reset the sample metric to an OK value for the scheduled service check.
$ printf '12\n' | sudo tee /var/lib/nagios4/app-queue-depth
12
- Create a local object file for the command and service.
$ sudoedit /etc/nagios4/conf.d/queue-depth.cfg
- Add the command and service definitions.
define command {
    command_name    check_queue_depth
    command_line    $USER1$/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning $ARG1$ --critical $ARG2$
}

define service {
    use                     generic-service
    host_name               localhost
    service_description     Queue Depth
    check_command           check_queue_depth!70!90
}
$ARG1$ and $ARG2$ come from the bang-separated values in check_command. Replace localhost with the host object that should own the custom service.
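The bang-separated form can be illustrated outside Nagios Core with a short sketch that splits a check_command value into the command name and numbered arguments. It only mimics the splitting for illustration; show_check_args is a hypothetical name, and Nagios Core performs the real macro substitution itself.

```shell
#!/bin/sh
# Sketch: show how "name!arg1!arg2" breaks into a command name and ARGn values.
# show_check_args is hypothetical; it imitates the split, not the real macro engine.
show_check_args() {
    check_command=$1
    command_name=${check_command%%"!"*}    # text before the first bang
    rest=${check_command#"$command_name"}  # remaining "!arg1!arg2..." part
    echo "command_name=$command_name"
    n=1
    while [ -n "$rest" ]; do
        rest=${rest#"!"}          # drop the leading bang
        arg=${rest%%"!"*}         # take text up to the next bang
        echo "ARG$n=$arg"
        rest=${rest#"$arg"}       # advance past the argument just printed
        n=$((n + 1))
    done
}
```

Running show_check_args 'check_queue_depth!70!90' prints command_name=check_queue_depth, ARG1=70, and ARG2=90, which matches the values substituted into --warning and --critical.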
Related: How to add a service check in Nagios Core
Related: How to use Nagios Core macros in a command
- Validate the Nagios Core configuration.
$ sudo nagios4 -v /etc/nagios4/nagios.cfg
Nagios Core 4.4.6
##### snipped #####
Reading configuration data...
Read main config file okay...
Read object config files okay...
Running pre-flight check on configuration data...
##### snipped #####
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
Do not reload Nagios Core while Total Errors is greater than 0. Fix the first reported object or command error, then run the verifier again.
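That rule can be turned into a guard for scripted reloads. The sketch below reads verifier output on standard input and succeeds only when the Total Errors line reports 0. The function name safe_to_reload is hypothetical, and the parsing assumes the summary format shown in the verifier output above.

```shell
#!/bin/sh
# Sketch: succeed only when verifier output reports zero errors.
# safe_to_reload is a hypothetical helper; it assumes a "Total Errors: N" line.
safe_to_reload() {
    # Extract the number that follows "Total Errors:" from stdin.
    errors=$(sed -n 's/^Total Errors:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
    # A missing or non-zero count is treated as unsafe.
    [ "$errors" = "0" ]
}
```

One possible use is sudo nagios4 -v /etc/nagios4/nagios.cfg | safe_to_reload && sudo systemctl reload nagios4, which skips the reload whenever the count is missing or greater than 0.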
- Reload Nagios Core to load the new command and service objects.
$ sudo systemctl reload nagios4
Ubuntu and Debian package installs use the nagios4 service name. Containers without systemd can send HUP to the running nagios4 process or use the control method supplied by the image.
Related: How to manage the Nagios Core system service
- Check the service result in Nagios Core.
Queue Depth
Current Status: OK
Status Information: OK - queue depth is 12
Performance Data: queue_depth=12;70;90;0;
The Queue Depth service should leave PENDING after its next active check. Force one service check if the scheduler has not run it yet.
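One way to force the check from a shell is the SCHEDULE_FORCED_SVC_CHECK external command. The sketch below only builds the command line; forced_check_line is a hypothetical helper, and sending the line requires external commands to be enabled and the command file path from the command_file setting in nagios.cfg.

```shell
#!/bin/sh
# Sketch: build a SCHEDULE_FORCED_SVC_CHECK external command line.
# forced_check_line is a hypothetical helper; it writes nothing by itself.
forced_check_line() {
    host=$1
    service=$2
    now=$(date +%s)   # current time as a Unix timestamp
    # Format: [time] COMMAND;host_name;service_description;check_time
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' "$now" "$host" "$service" "$now"
}
```

The output of forced_check_line localhost "Queue Depth" can then be written to the command file. On Debian and Ubuntu package installs that file is assumed to be /var/lib/nagios4/rw/nagios.cmd, so confirm the path in nagios.cfg before using it.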
Related: How to reschedule an active check in Nagios Core
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.