Custom Nagios Core plugins turn local checks, application metrics, and site-specific probes into scheduled service states. A plugin only has to print one status line to stdout and exit with 0, 1, 2, or 3 to report OK, WARNING, CRITICAL, or UNKNOWN, so a small shell script is enough for many private checks.

Packaged and source-built Nagios Core systems can use different plugin and object-definition directories. Match the plugin path and object directory to the active nagios.cfg file before saving a command definition, because Nagios Core only loads files reached from that configuration tree.
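
One way to see which files and directories an instance actually reads is to list the cfg_file, cfg_dir, and resource_file directives in its main configuration file. The sketch below assumes the Debian-style default /etc/nagios4/nagios.cfg; the function name show_config_tree is only illustrative and not part of Nagios Core.

```shell
# show_config_tree is a hypothetical helper name, not a Nagios Core tool.
# It prints the directives that decide which object files, object
# directories, and resource files the main configuration file pulls in.
show_config_tree() {
    main_cfg=${1:-/etc/nagios4/nagios.cfg}   # assumed default path
    grep -E '^[[:space:]]*(cfg_file|cfg_dir|resource_file)[[:space:]]*=' "$main_cfg"
}
```

Calling show_config_tree with no argument reads the assumed default path. The $USER1$ macro that command definitions use for the plugin directory is normally set in the file named by resource_file.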

A queue-depth check makes a compact example: the plugin reads one concrete metric, compares it with two thresholds, and reports status text plus optional performance data without depending on an external application. In production, replace the sample metric file with the file, socket, API call, or local command that exposes the real application condition to the monitoring server.
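
When the metric comes from a command rather than a file, the same integer validation can wrap the command's output. The following is a minimal sketch, assuming the command prints the value on its first output line; read_metric is a hypothetical helper name.

```shell
# read_metric is a hypothetical helper. It runs the command given as its
# arguments and accepts the first output line only if that line is a
# non-negative integer.
read_metric() {
    value=$("$@" 2>/dev/null | head -n 1)
    case "$value" in
        ""|*[!0-9]* ) return 3 ;;   # 3 = UNKNOWN in the plugin convention
    esac
    printf '%s\n' "$value"
}
```

For example, read_metric cat /var/lib/nagios4/app-queue-depth prints the sample value, while a command that fails or prints non-numeric text makes the helper return 3 so the caller can report UNKNOWN.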

Steps to create a custom Nagios plugin:

  1. Create the plugin file in the Nagios Core plugin directory.
    $ sudoedit /usr/lib/nagios/plugins/check_queue_depth

    Ubuntu and Debian package installs use /usr/lib/nagios/plugins. Use /usr/local/nagios/libexec on source installs that follow the upstream default layout.

  2. Add the plugin script.
    #!/bin/sh
     
    OK=0
    WARNING=1
    CRITICAL=2
    UNKNOWN=3
     
    usage() {
        echo "UNKNOWN - usage: $0 --path FILE --warning N --critical N"
        exit "$UNKNOWN"
    }
     
    metric_path=
    warning=
    critical=
     
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --path)
                [ "$#" -ge 2 ] || usage
                metric_path=$2
                shift 2
                ;;
            --warning)
                [ "$#" -ge 2 ] || usage
                warning=$2
                shift 2
                ;;
            --critical)
                [ "$#" -ge 2 ] || usage
                critical=$2
                shift 2
                ;;
            *)
                usage
                ;;
        esac
    done
     
    if [ -z "$metric_path" ]; then
        usage
    fi
     
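    # Validate each threshold separately so a value such as 1:2 is rejected.
    for threshold in "$warning" "$critical"; do
        case "$threshold" in
            ""|*[!0-9]* ) usage ;;
        esac
    done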
     
    if [ "$warning" -ge "$critical" ]; then
        echo "UNKNOWN - warning threshold must be lower than critical threshold"
        exit "$UNKNOWN"
    fi
     
    if [ ! -r "$metric_path" ]; then
        echo "UNKNOWN - cannot read $metric_path"
        exit "$UNKNOWN"
    fi
     
    # read returns non-zero at EOF but still stores a final line that has no newline.
    IFS= read -r queue_depth < "$metric_path" || :
    case "$queue_depth" in
        ""|*[!0-9]* )
            echo "UNKNOWN - $metric_path does not contain an integer"
            exit "$UNKNOWN"
            ;;
    esac
     
    if [ "$queue_depth" -ge "$critical" ]; then
        echo "CRITICAL - queue depth is $queue_depth | queue_depth=$queue_depth;$warning;$critical;0;"
        exit "$CRITICAL"
    elif [ "$queue_depth" -ge "$warning" ]; then
        echo "WARNING - queue depth is $queue_depth | queue_depth=$queue_depth;$warning;$critical;0;"
        exit "$WARNING"
    fi
     
    echo "OK - queue depth is $queue_depth | queue_depth=$queue_depth;$warning;$critical;0;"
    exit "$OK"

    The text before | becomes the service status output. The text after | is optional performance data in the format label=value[UOM];warning;critical;minimum;maximum, where the unit and trailing fields may be left empty, as the maximum is here.
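
    The three output lines in the script differ only in the state word, so one possible refactoring is a small formatting helper. This is a sketch rather than part of the plugin above; format_result is a hypothetical name.

```shell
# format_result is a hypothetical helper that builds one plugin output
# line: status text, a pipe, then a single performance-data item with
# minimum 0 and an empty maximum.
# Arguments: STATE VALUE WARNING CRITICAL
format_result() {
    printf '%s - queue depth is %s | queue_depth=%s;%s;%s;0;\n' \
        "$1" "$2" "$2" "$3" "$4"
}
```

    Calling format_result OK 12 70 90 produces the same line as the OK branch of the script.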

  3. Make the plugin executable.
    $ sudo chmod 0755 /usr/lib/nagios/plugins/check_queue_depth
  4. Create a sample metric file for the plugin to read.
    $ printf '12\n' | sudo tee /var/lib/nagios4/app-queue-depth
    12
  5. Make the sample metric readable by the nagios user.
    $ sudo chown nagios:nagios /var/lib/nagios4/app-queue-depth

    In production, point --path at the real metric source and keep credentials outside the plugin file when possible.

  6. Run the plugin manually as the nagios user.
    $ sudo -u nagios /usr/lib/nagios/plugins/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning 70 --critical 90
    OK - queue depth is 12 | queue_depth=12;70;90;0;

    Running as nagios catches file-permission, interpreter, and environment problems before the scheduler starts using the plugin.
    Related: How to run a Nagios plugin manually

  7. Confirm the OK exit code from the previous plugin run.
    $ echo $?
    0

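    Nagios Core maps exit code 0 to OK, 1 to WARNING, 2 to CRITICAL, and 3 to UNKNOWN. Run echo $? immediately after the plugin command, because $? only holds the status of the most recent command; sudo passes the plugin's exit code through.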
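
    During manual testing it can help to print the state name instead of the bare number. A minimal sketch of the mapping, with state_name as a hypothetical helper name:

```shell
# state_name is a hypothetical helper that translates a plugin exit
# code into the state name used in this guide.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "unexpected exit code $1" ;;   # outside the four documented codes
    esac
}
```

    Running state_name "$?" directly after the plugin command prints the matching state.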

  8. Set the sample metric to a warning value.
    $ printf '75\n' | sudo tee /var/lib/nagios4/app-queue-depth
    75
  9. Run the plugin to confirm the WARNING branch.
    $ sudo -u nagios /usr/lib/nagios/plugins/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning 70 --critical 90
    WARNING - queue depth is 75 | queue_depth=75;70;90;0;
  10. Set the sample metric to a critical value.
    $ printf '95\n' | sudo tee /var/lib/nagios4/app-queue-depth
    95
  11. Run the plugin to confirm the CRITICAL branch.
    $ sudo -u nagios /usr/lib/nagios/plugins/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning 70 --critical 90
    CRITICAL - queue depth is 95 | queue_depth=95;70;90;0;
  12. Set the sample metric to an invalid value.
    $ printf 'busy\n' | sudo tee /var/lib/nagios4/app-queue-depth
    busy
  13. Run the plugin to confirm the UNKNOWN branch.
    $ sudo -u nagios /usr/lib/nagios/plugins/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning 70 --critical 90
    UNKNOWN - /var/lib/nagios4/app-queue-depth does not contain an integer

    UNKNOWN is for invalid arguments, unreadable input, malformed local data, or an internal plugin failure that prevents a meaningful check result.
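
    The four manual runs above can also be scripted. The sketch below writes each sample value to a scratch file and compares the plugin's exit code with the expected one; check_branches is a hypothetical helper, and it assumes the plugin is run by a user who can read a file created by mktemp.

```shell
# check_branches is a hypothetical test helper. Its arguments are the
# plugin command to run. Each value:code pair is a sample metric and the
# exit code expected with thresholds 70 (warning) and 90 (critical).
check_branches() {
    scratch=$(mktemp) || return 1
    failures=0
    for pair in 12:0 75:1 95:2 busy:3; do
        value=${pair%%:*}
        expected=${pair##*:}
        printf '%s\n' "$value" > "$scratch"
        actual=0
        "$@" --path "$scratch" --warning 70 --critical 90 >/dev/null || actual=$?
        if [ "$actual" -ne "$expected" ]; then
            echo "FAIL: value=$value expected=$expected got=$actual"
            failures=$((failures + 1))
        fi
    done
    rm -f "$scratch"
    if [ "$failures" -eq 0 ]; then
        echo "all four branches returned the expected exit code"
    fi
    return "$failures"
}
```

    A typical call is check_branches /usr/lib/nagios/plugins/check_queue_depth. The function's return value is the number of failed branches, so 0 means every branch behaved as expected.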

  14. Reset the sample metric to an OK value for the scheduled service check.
    $ printf '12\n' | sudo tee /var/lib/nagios4/app-queue-depth
    12
  15. Create a local object file for the command and service.
    $ sudoedit /etc/nagios4/conf.d/queue-depth.cfg
  16. Add the command and service definitions.
    define command {
        command_name    check_queue_depth
        command_line    $USER1$/check_queue_depth --path /var/lib/nagios4/app-queue-depth --warning $ARG1$ --critical $ARG2$
    }
    
    define service {
        use                    generic-service
        host_name              localhost
        service_description    Queue Depth
        check_command          check_queue_depth!70!90
    }

    $ARG1$ and $ARG2$ come from the bang-separated values in check_command. Replace localhost with the host object that should own the custom service.
    Related: How to add a service check in Nagios Core
    Related: How to use Nagios Core macros in a command
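
    To make the macro mapping concrete, the sketch below splits a check_command value on ! and prints which $ARGn$ macro each field would fill. It is an illustration only, not Nagios Core's own parser, and it ignores escaped ! characters; expand_check_command is a hypothetical name.

```shell
# expand_check_command is a hypothetical illustration, not Nagios code.
# It splits a check_command value on "!" and prints the command name
# followed by the $ARGn$ macros that the remaining fields populate.
expand_check_command() {
    old_ifs=$IFS
    IFS='!'
    set -f              # disable globbing while the value is split
    set -- $1
    set +f
    IFS=$old_ifs
    echo "command_name=$1"
    shift
    n=1
    for arg in "$@"; do
        printf '$ARG%s$=%s\n' "$n" "$arg"
        n=$((n + 1))
    done
}
```

    For the service above, expand_check_command 'check_queue_depth!70!90' reports check_queue_depth as the command name, 70 for $ARG1$, and 90 for $ARG2$.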

  17. Validate the Nagios Core configuration.
    $ sudo nagios4 -v /etc/nagios4/nagios.cfg
    Nagios Core 4.4.6
    ##### snipped #####
    Reading configuration data...
       Read main config file okay...
       Read object config files okay...
    
    Running pre-flight check on configuration data...
    ##### snipped #####
    Total Warnings: 0
    Total Errors:   0
    
    Things look okay - No serious problems were detected during the pre-flight check

    Do not reload Nagios Core while Total Errors is greater than 0. Fix the first reported object or command error, then run the verifier again.
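
    The verifier exits with a non-zero status when it finds errors, so the reload can be gated on it. A minimal sketch, assuming the Debian-style command names and paths used in this guide; verify_then_reload is a hypothetical wrapper name.

```shell
# verify_then_reload is a hypothetical wrapper. It reloads the service
# only when the pre-flight check succeeds, and prints the verifier
# output on failure so the first reported error can be fixed.
verify_then_reload() {
    if output=$(sudo nagios4 -v /etc/nagios4/nagios.cfg 2>&1); then
        sudo systemctl reload nagios4
    else
        printf '%s\n' "$output" >&2
        echo "configuration check failed; not reloading" >&2
        return 1
    fi
}
```

    Running verify_then_reload after each edit to the object files keeps a broken configuration from reaching the running scheduler.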

  18. Reload Nagios Core to load the new command and service objects.
    $ sudo systemctl reload nagios4

    Ubuntu and Debian package installs use the nagios4 service name. Containers without systemd can send HUP to the running nagios4 process or use the control method supplied by the image.
    Related: How to manage the Nagios Core system service

  19. Check the service result in Nagios Core.
    Queue Depth
    Current Status: OK
    Status Information: OK - queue depth is 12
    Performance Data: queue_depth=12;70;90;0;

    The Queue Depth service should leave PENDING after its next active check. Force one service check if the scheduler has not run it yet.
    Related: How to reschedule an active check in Nagios Core
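
    Without the web interface, the same fields can be read from the status file that Nagios Core writes. The sketch below assumes the servicestatus block layout of Nagios Core 4 and takes the status file path as an argument; the path is set by status_file in nagios.cfg and is commonly /var/lib/nagios4/status.dat on Debian-style installs, which is an assumption to confirm locally. show_service_status is a hypothetical helper name.

```shell
# show_service_status is a hypothetical helper.
# Arguments: STATUS_FILE HOST_NAME SERVICE_DESCRIPTION
# It prints current_state, plugin_output, and performance_data from the
# servicestatus block that matches the host and service.
show_service_status() {
    awk -v host="$2" -v svc="$3" '
        $1 == "servicestatus" && $2 == "{" {
            in_block = 1; h = ""; s = ""; out = ""; next
        }
        in_block && $1 == "}" {
            if (h == host && s == svc) printf "%s", out
            in_block = 0; next
        }
        in_block {
            line = $0
            sub(/^[ \t]+/, "", line)          # drop leading indentation
            eq = index(line, "=")             # split at the first "="
            key = substr(line, 1, eq - 1)
            val = substr(line, eq + 1)
            if (key == "host_name") h = val
            if (key == "service_description") s = val
            if (key == "current_state" || key == "plugin_output" || key == "performance_data")
                out = out key "=" val "\n"
        }
    ' "$1"
}
```

    For example, show_service_status /var/lib/nagios4/status.dat localhost "Queue Depth" prints the three fields for the new service; current_state uses the same 0 to 3 numbering as the plugin exit codes.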