Tag Archives: RPM

NRPE for Nagios Core on CentOS 7.x

Introduction

In continuation to part 1 and part 2 of this blog series… NRPE (Nagios Remote Plugin Executor) is yet another Client/Server plugin for Nagios (but can work with other applications too). Unlike NRDP, Nagios isn’t required for NRPE to function which means you can harness the power of this tool and apply it to many other applications too. It does however work ‘very’ well with Nagios and was originally designed for it.

If you read my blog on the NRDP protocol, then you’re already familiar with it’s push architecture where the Applications are responsible for reporting in their status. NRPE however works in the reverse fashion; NRPE requires you to pull the information from the Application Server instead. The status checking responsibility falls on the Nagios Server (instead of the Applications it monitors).

NRPE Overview
NRPE Overview

The diagram above illustrates how the paradigm works.

  • The A represents the Application Server. There is no limit to the number of these guys.
  • The N represents the Nagios Server. You’ll only ever need one Nagios server.

NRPE Query Overview

  1. The Nagios Server will make periodic status checks to the to the Application Server via the NRPE Client (check_nrpe).
  2. The Application Server will analyze the request it received and perform the status check (locally).
  3. When the check completes, it will pass the results back to the Nagios Server (via the same connection the NRPE Client started).
  4. Nagios will take the check_nrpe results and display it accordingly. If the check_nrpe tool can’t establish a connection to the NRPE Server (running on the Application Server), then it will report a failure.

    If the call to the check_nrpe tool takes to long to get a response (or a result) back, then it will be reported as a (timeout) failure. By default check_nrpe will wait up to a maximum of 10 seconds before it times out and gives up. You can change this if you find it too short (or to long).

Here is what you’re getting with this blog and the packaged rpms that go with it:

  • Nagios Core v4.x: updated RPM packaging which I continued to maintain and carry forward from my previous blogs.
    Note: You will need this installed in order to monitor applications with NRPE. See part 1 of this blog series for more information if you don’t already have it set up.
  • NRPE (Nagios Remote Plugin Executor) v3.x: My custom RPM packaging bringing NRPE v3.x to CentOS 7.x for the first time especially since I couldn’t find it available anywhere else (at the time of the this blog entry). I had to make a few modifications to it so that it would be easier used our environment such as:
    • I forwarded the useful patches from the old NRPE v2.x branch to the new NRPE 3.x branch that were applicable still.
    • I patched the SystemD startup script to work with CentOS/Red Hat systems.
    • A sudoer’s file is already to go for those who want to use sudo with their NRPE remote calls.
    • There is firewall configuration all ready to use with FirewallD.

    Best of all, with my RPMs, you can run SELinux in full Enforcing mode for that extra piece of mind from a security standpoint!

    The Goods

    For those who really don’t care and want to just jump right in with the product. Here you go!

    You can download the packages manually if you choose, or reference them using my repository:

    Package Download Description
    nrpe el7.rpm The NRPE Server: This should be installed on any server you want to monitor. The server will allow the machine to respond to requests sent to it via the check_nrpe (Nagios) plugin.
    nrpe-selinux el7.rpm An SELinux add-on package that allows the NRDP Server to operate under Enforcing Mode.

    Note: This RPM is not required by the NRDP server to run correctly.
    nagios-plugins-nrpe el7.rpm Our NRPE Client; this is the Nagios Plugin used to request information from the NRPE Server(s). You’ll only need to install this on the Nagios server (for the purpose of this blog). The RPM provides a small tool called check_nrpe that gets installed into the Nagios Plugin Directory (/usr/lib64/nagios/plugins).

    Through some simple configuration; Nagios can use the check_nrpe tool to monitoring anything you want just as long as the NRPE server is (installed and) running at the other end.

    Note: The source rpm can be obtained here which builds everything you see in the table above. It’s not required for the application to run, but might be useful for developers or those who want to inspect how I put the package together.

    NRPE Server Side Configuration

    The NRPE Server is usually installed onto all of the Application servers you want Nagios to monitor remotely. It provides a means of accepting requests to process (such as, what is your system load like?) and handling the request and returning the response back.

    Assuming you hooked up to my repository here, the NRDP server can can be easily installed with the following command:

    # Install NRPE (Server) on your Application Server
    yum install nrpe nrpe-selinux
    

    NRPE communicates through TCP port 5666; which means you may need to enable the ports in your firewall first to allow remote connections from Nagios. The below commands open up the protocol to everyone attached to your network. You should only perform this command if your Application Server will be running on a local private network:

    # Enable NRPE port (5666)
    firewall-cmd --permanent --add-service=nrpe
    firewall-cmd --add-service=nrpe
    

    If you’re opening this up to the internet, then you might want to just open the (NRPE) port exclusively for the Nagios Server. The following presumes you know the IP of the Nagios Server and will open access to ‘JUST’ that system:

    # Assuming 5.6.7.8 belongs to the Nagios Server you intend to
    # allow to monitor you; you might do the following:
    firewall-cmd --permanent --zone=public \
       --add-rich-rule='\
           rule family="ipv4" \
           source address="5.6.7.8/32" \
           port protocol="tcp" \
           port="5666" \
           accept'
    

    You’ll want to have a look at your NRPE Configuration as you will probably have to update a few lines. See /etc/nagios/nrpe.cfg and have a look for the following directives:

    Directive Details
    allowed_hosts This is added security for NRPE but can come back and haunt you if you ignore it (as nothing will work). You should specify the IP address of your Nagios server here and not allow anyone else! For example, if your Nagios server was 6.7.8.9, then you would put 6.7.8.9/32 here.
    dont_blame_nrpe This allows you to pass options into your remote checks. I personally think this is awesome, but there is no question that depending on the checks that accept arguments, it could could exploit content from your system you wouldn’t have otherwise wanted to share. This is specially the case if you grant NRPE Sudoers permission (discussed a bit lower). Set this value to one (1) to enable argument passing and zero (0) to disable it.
    allow_bash_command_substitution Bash substitution is something like $(hostname) (which might return something like node01.myserver). Set this value to one (1) to enable argument passing and zero (0) to disable it. I don’t use personally use this and therefore have it disabled on my system.

    Setting up NRPE Tokens

    Once your server is set up, there is one last thing you need to do. You need to associate tokens with some of the status checks you want to do. You see, NRPE won’t just execute any program you tell it to, it will only execute programs a specific way that you’ve allowed for. This is purely for security and it makes sense to do so! It’s really not that complicated, consider a token/command mapping like this (as an example):

    Token Command
    check_load # Check the system load:
    # – warning if greater than 10.0
    # – critical if grater then 20.0
    # – we map the check_load token to this command:
    /usr/lib64/nagios/plugins/check_load -w 10 -c 20

    You pass this information into NRPE by creating a .conf file in /etc/nrpe.d (assuming you’re using my RPMs) with the syntax:

    command[token]=/path/to/mapped/command

    So with respect to the example we started; it would look like this:

    command[check_load]=/usr/lib64/nagios/plugins/check_load -w 10 -c 20

    You can specify as many tokens as you like (each one has to be unique from the other). If you have a look in /etc/nrpd.d/, you’ll see one called common_checks.cfg which has a handful of useful commands to start you off with. You can add to this file or start another if you like (in the same directory with a .conf extension). Each time you make changes to the configuration (or another) you’ll need to signal NRPE to have it load any new changes you provided:

    # Restart our NRPE Service
    systemctl restart nrpe.service
    

    Here is a conceptual diagram that will help illustrate what was just explained here using the check_load example above:

    NRPE check_load Example
    NRPE check_load Example

    Sudoer’s Permission

    Substitute User Do (or sudo) allows you to run commands as other users. Most commonly people use sudo to elevate there permission to the root (superuser) privilege to execute a command. When the command has completed, they return back to their normal privileges.

    Some actions require you to have higher system authority to get anything good from them. This includes retrieve certain kinds of system/status information. For example, you can’t check how much mail is spooled and ready for remote delivery (on a Mail Server) unless you reduce the privileges of all of your stored mail (making it accessible by anyone); not a very good idea! But alternatively you can use sudo to grant one user permission to just run a command that can only fetch the number of mail items queued; this is a much better approach! Thus elevating a users permissions for just "special status checking only " commands isn’t so much of a risk. NRPE can be configured to have superuser permissions for it’s status checks which can allow you to monitor a lot more things! Here is how to do it:

    # First make sure that the sudoer's configuration doesn't
    # require tty (teletype terminals) endpoints to use. You
    # do so by commenting out the line that reads 'Defaults requiretty'
    # in the /etc/sudoers file (access it with the command visudo)
    # You can also do this with the below one-liner too
    sed -i -e 's/^\(Defaults[ \t]\+requiretty.*\)$/#\1/g' \
             /etc/sudoers
    
    # By doing this, you allow pseudo-teletype (pty) endpoints
    # to use the sudo command too. NRDP would be a pty service
    # for example since it's not an actual person running the
    # command.
    
    # Update our configuration to use the sudo command
    sed -i -e "s|^[ \t#]*\(command_prefix\)=.*|\1=/usr/bin/sudo|g" \
          /etc/nagios/nrpe.cfg
    
    # Restart our server if it's running
    systemctl restart nrpe.service
    

    SELinux users will want to also do this:

    # Allow NRPE/Nagios calls to run /usr/bin/sudo
    setsebool -P nagios_run_sudo on
    

    NRPE Client Side Configuration

    The client side simply consists of a small application (called check_nrpe) which connects remotely to any NRPE Server you tell it to and hands it a token for processing (provided your NRPE Server is configured correctly). The NRPE Server will take this token and execute a command that was associated with it and return the results back to you.

    # You'll want to be connected to my repositories for this to work:
    # See: https://nuxref.com/repo for more information
    # Install NRPE on the same server running Nagios
    yum install nagios-plugins-nrpe
    

    There is configuration already in place for you if you’re using my RPMS located in /etc/nagios/conf.d called nrpe.cfg. But you can feel free to create your own (or over-write it like so:

    cat << _EOF > /etc/nagios/conf.d/nrpe.cfg
    ; simple wrapper to check_nrpe for remote calls to NRPE servers
    define command{
       command_name check_nrpe
       command_line /usr/lib64/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }
    
    ; check_nrpe with arguments enabled (enable the dont_blame_nrpe for this
    ; to work properly otherwise simply don't use it)
    define command{
       command_name check_nrpe_args
       command_line /usr/lib64/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -a $ARG2$
    }
    

    You can now send command tokens to servers running NRPE via Nagios. A Nagios configuration might look like this:

    cat << _EOF > /etc/nagios/conf.d/my.application.server01.cfg
    ; first we define a host that we want to monitor.
    ; If you already have a host configured; you can skip this part
    define host{
    ; Name of host template to use. This host definition will inherit
    ; all variables that are defined in (or inherited by) the
    ; linux-server host template definition. You can find this
    ; in /etc/nagios/objects/templates.cfg if you're interested
            use                     linux-server
    
    ; Now we define the server we're going to monitor
            host_name               my-application-server01.nuxref.com
            alias                   my-application-server01.nuxref.com
            address                 192.168.1.2
    
            statusmap_image         redhat.png
            icon_image              redhat.png
            icon_image_alt          CentOS 7.x
    }
    
    ; Now we want to define our monitoring service that talks to our
    ; Application server with NRPE installed on it:
    define service{
    ; Name of service template to use. This service definition will inherit
    ; all variables that are defined in (or inherited by) the
    ; local-service service template definition. You can find this
    ; in /etc/nagios/objects/templates.cfg if you're interested
    	use				local-service
            host_name                       my-application-server01.nuxref.com
    	service_description		Our System Load
    
    ; The Exclamation mark (!) lets nagios know that check_load is the argument
    ; we want to pass into our check_nrpe command we defined earlier (as $ARG1$)
    	check_command			check_nrpe!check_load
    }
    _EOF
    

    If you want to make sure it’s going to work, before you go any further you can run the same command you just told Nagios to do (above):

    # The syntax 'check_nrpe!check_load' gets translated to:
    #  /usr/lib64/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    #
    # Which we can further translate (for testing purposes) to:
    /usr/lib64/nagios/plugins/check_nrpe -H 192.168.1.2 -c check_load
    

    If everything worked okay you should see something similar to the output:

    OK – load average: 0.00, 0.00, 0.00|load1=0.000;10.000;20.000;0; load5=0.000;10.000;20.000;0; load15=0.000;10.000;20.000;0

    You’ll want to reload Nagios to pick up on your new configuration so it can start calling this command too:

    # But before you reload it; it doesn't hurt to just check and make
    # sure your configuration is okay; You can test it out with:
    nagios -v /etc/nagios/nagios.cfg
    
    # correct any errors that display and rerun the above command until
    # everything checks out okay!
    
    # Now reload Nagios so it reads in it's new configuration:
    systemctl reload nagios.service
    

    NRPE vs. NRDP

    Which one should you use? Both NRDP and NRPE have their Pro’s and Cons.

    Benefits of using…
    NRPE NRDP
    You have central control over the checking periods of an application. You only ever need to open 1 TCP port; from a security standpoint; this is awesome!
    You don’t need Nagios to use this. If used properly, it can provide a great way to access remote servers for information and even execute administrative and maintenance commands. You only need to manage the Nagios configuration when a new server is added.
    Can send multiple status messages in one single (TCP) transaction.
    Can reflect a remote application status change immediately oppose to NRPE which one reflect the change until the next status check is performed.

    It should be worth noting that nothing is stopping you from using both NRDP and NRPE at the same time. You might choose an NRDP strategy for remote systems while choosing an NRPE strategy for all your systems residing in your private network. NRPE works over the internet as well, but just exercise caution and be sure to have SELinux running in Enforcing mode on all of the server end points.

    Credit

    This blog took me a very (,very) long time to put together and test! The repository hosting alone accommodates all my blog entries up to this date. All of the custom packaging described here was done by me personally. I took the open source available to me and rebuilt it to make it an easier solution and decided to share it. If you like what you see and wish to copy and paste this HOWTO, please reference back to this blog post at the very least. It’s really all I ask.

    Sources

NRDP for Nagios Core on CentOS 7.x

Introduction

In continuation to part 1 of this blog series… NRDP (Nagios Remote Data Processor) is a Client/Server plugin for Nagios. With the NRDP model, it is the Application (or the server hosting it) that is responsible for reporting in it’s status. The beauty of this is you can have applications installed all throughout your business, your home, across data centers, overseas, etc all reporting to 1 single Nagios instance. You can monitor your entire infrastructure with this tool.

NRDP Overview
NRDP Overview

The diagram above illustrates how the paradigm works.

  • The A represents the Application Server. There is no limit to the number of these guys.
  • The N represents the Nagios Server. You’ll only ever need one Nagios server.
  • The Application (and/or the server hosting it) will make periodic status reports to the Nagios server. It should report that everything is okay, or report that it isn’t. Out of the box (using my rpms) Nagios will wait for up to X minutes (configurable) for a message to be received before reporting the service/host as being offline. Ideally you should send at least 2 or 3 notifications within this grace period. Currently X is set to 5 minutes with the templates in the RPMs I provide. But you can change this.
  • If Nagios doesn’t hear from the Application for an extended period of time, then it assumes the worst and reports to the user that there is a problem.

NRDP is a newer chapter in the Nagios world which allows you to monitor more applications you wouldn’t have otherwise been able to do. NRDP replaces it’s predecessor NSCA which performs the exact same task/function. One of the key differences is that NRDP is much easier to set up and use than NSCA is. NRDP is also much less lines of code too! The RPMs I provide make the setup even easier. That said, if you’re interested in using NSCA instead, I still package it here (and client here). But lets get back to talking about NRDP….

NRDP Clients pushing to Central NRDP Server
NRDP Clients pushing to Central NRDP Server

Here is what you’re getting with this blog and the packaged rpms that go with it:

  • Nagios Core v4.x: updated RPM packaging which I continued to maintain and carry forward from my previous blogs.
    Note: You will need this installed in order to use NRDP. See part 1 of this blog series for more information if you don’t already have it set up.
  • NRDP (Nagios Remote Data Processor) v1.x: My custom RPM packaging to just work for everyone right out of the box. Here are some modifications I made to the package to make our lives easier:
    • The default insecure port is configured to use 5668.
    • The default secure port is configured to use 5669.
    • I included my own Apache configuration so that the service would run right out of the box.
    • There are a several patches I had to make in order for the NRDP package to even work in our CentOS environment.
    • There is firewall configuration all ready to use with FirewallD.

Best of all, with my RPMs, you can run SELinux in full Enforcing mode for that extra piece of mind from a security standpoint!

The Goods

For those who really don’t care and want to just jump right in with the product. Here you go!

You can download the packages manually if you choose, or reference them using my repository:

Package Download Description
nrdp el7.rpm The NRDP Server: This must be installed on the same server running Nagios. It is in charge of listening to requests sent by nrdp-clients and reporting their status directly to Nagios. Don’t forget to change your tokens which we’ll discuss further down if you don’t know what I’m talking about!
nrdp-selinux el7.rpm An SELinux add-on package that allows the NRDP Server to operate under Enforcing Mode.

Note: This RPM is not required by the NRDP server to run correctly.
nrdp-client el7.rpm The NRDP Client: You’ll only need to optionally install this on to the remote servers that will be sending their status along to your NRDP Server. Since the protocol is so simple to adapt, you can also just choose to build the client aspect into the program you want to monitor with. If you do choose to install this RPM, you get access to a small program called send_nrdp.php which allows you to post status messages to the Nagios Server.
nrdp-doc el7.rpm Just some documentation that is already publicly available on NRDP’s website.
Note: This RPM is not required by Nagios to run correctly.

Note: The source rpm can be obtained here which builds everything you see in the table above. It’s not required for the application to run, but might be useful for developers or those who want to inspect how I put the package together.

NRDP Server Side Configuration

The NRDP Server is only going to work if it’s installed on the same server as Nagios.

Assuming you hooked up to my repository here, the NRDP server can can be easily installed with the following command:

# Install NRDP (Server) on the same server as the one running Nagios
yum install nrdp nrdp-selinux

NRDP was written in PHP and therefore depends on a few small tweaks on your part. For one, you’ll want to make sure you have a timezone configured too to silence any potential errors and keep the reporting consistent. You’ll want to do this from your /etc/php.ini file; look for a directive called date.timezone. Here is a list of supported timezones by PHP. But you can also just do this which works too:

# This clever trick (a 1 liner) lets the timezone in the /etc/php.ini
# to that of whatever your current system is set to!
sed -i -e "s/^[; \t]\?\(date\.timezone\)[ \t]*=.*/\1 = \"$(date +%Z)\"/g" \
   /etc/php.ini

If Apache isn’t running; now’s the time to start it up.

# This simple command should be ran as root and will
# start Apache if it isn't running already, otherwise it
# will send a reload signal if it is.  Regardless,
# after you finish this command, the new NRDP Configuration
# will be running and ready to go.
systemctl status httpd.service && \
   systemctl reload httpd.service || \
   systemctl start httpd.service

# Make sure we start up on reboots (if we haven't done so already)
systemctl enable httpd.service

NRDP/Nagios Configuration

You need to let Nagios know what services it should expect notifications from. The key thing that separates these entries from any other Nagios entries you specified is the ‘use‘ directive. For hosts reporting in, you’ll want to use the generic-nrdp-host directive. For services reporting in, you’ll want to use the generic-nrdp-service entry.

A Configuration file on your Nagios Server (the same server running your NRDP Server) might look like this:

cat << _EOF > /etc/nagios/conf.d/somehost.cfg
# We might have a test server defined like this
define host {
    # Name of host template to use (notice we're using the
    # 'generic-nrdp-host' template).  If you want to actively ping
    # this server and not await to hear from it, you can change this
    # to use the 'linux-server' template instead.
    use                         generic-nrdp-host
    # Hostname
    host_name                   somehost
    # Human Readable
    alias                       My Test Server
}

# Now we might identify a component to associate with our test server called <em>somehost</em>.
define service{
   use                 generic-nrdp-service
   service_description someservice
   host_name           somehost
}
_EOF

With the templates I provide, Nagios will wait for up to 5 minutes for a notification before assuming nothing is coming (and changing the state to CRITICAL). If this is too long (or too short) of a wait time for you, you can change it by modifying the NRDP templates (or creating your own). The provided NRDP templates can be found here: /etc/nagios/conf.d/nrdp.conf. You’ll want to focus on the entry entitled freshness_threshold which is currently set to 300 (300 seconds is equal to 5 minutes) and change it to your liking.

Alternatively, you can over-ride the template value by just adding the freshness_threshold to the service configuration you create like so:

cat << _EOF > /etc/nagios/conf.d/somehost.10min.service.cfg
# Over-ride our template timeout with a new one specified
# in our service declaration
define service{
   use                 generic-nrdp-service
   service_description someservice
   host_name           somehost
   # Over-ride the defaults in our template and only go stale
   # if 10 minutes elapses (600 seconds = 10 min)
   freshness_threshold 600
}
_EOF

If you’ve modified your configuration in anyway; be sure to reload Nagios so that it can take in our new configuration:

# Reload our service (if it's running)
systemctl reload nagios.service

# Be sure to start it if it isn't already:
systemctl start nagios.service

Testing your NRDP Server Setup

Once NRDP is up and running it will provide you a simple website you can use to interact with it. This interface allows you to manually send passive checks through to Nagios which is really useful for testing if everything is working okay!

Here is what the test page looks like:

NRDP Server Side
NRDP Server Side – use nuxref as a token to test with.

Note: The blog does not touch on the Submit Nagios Command portion of this page; but that’s another topic for another day. You want to focus on the Submit Check Data section.

To use NRDP you will be required to provide a token with every submission you ever make to it. This is just a security feature to prevent people from sending messages into your system who shouldn’t be. By default (if you’re using my rpms), the token is nuxref. You’ll notice in the above example, I’ve gone ahead and filed this field in to show you where it goes.

To change this token (which you really should do) just open up the php file /etc/nrdp/config.php. You can specify as many tokens as you like that you wish to accept. Here is what you’re looking for:

// ...
// look for the authorized_tokens directive
// and add your tokens here:
$cfg['authorized_tokens'] = array(
   "the-longer-and-more-encrypted-your-token-is-the-better",
   "specify-as-many-tokens-as.-you-want",
);

If you followed the blog so far and created the test examples I provided (above), then you can now submit a passive check using the default settings; don’t forget to provide your token!

If everything goes according to plan, you should be able to check out your message safely being acknowledged:

NRDP Test Submission Results (using defaults)
NRDP Test Submission Results (using defaults)

NRDP Server Security

Since you can access a very simple NRDP webpage on the same server hosting Nagios via http://localhost:5668/ (insecure) or https://localhost:5669/ (secure), you’ll want to protect it from those who shouldn’t be visiting it.

The below commands open up the protocol to everyone on your network. You should only perform this command if your Nagios Server will be running on a local private network:

# Enable insecure NRDP port (5668)
firewall-cmd --permanent --add-service=nrdp
firewall-cmd --add-service=nrdp

# Enable secure NRDP port (5669)
firewall-cmd --permanent --add-service=nrdps
firewall-cmd --add-service=nrdps

If you’re opening this up to the internet, then you might want to just open the (NRDP) port(s) exclusively used by the remote applications you intend to allow reporting from. It might be worth investing time into fail2ban as well as an additional precaution. The following presumes you know the IP of the application server and will open access to ‘JUST’ that system:

# Assuming 1.2.3.4 belongs to a system you trust who will be sending you
# Nagios status reports via NRDP:
firewall-cmd --permanent --zone=public \
   --add-rich-rule='\
       rule family="ipv4" \
       source address="1.2.3.4/32" \
       port protocol="tcp" \
       port="5668" \
       accept'

# Here is the Secure version of the same command
firewall-cmd --permanent --zone=public \
   --add-rich-rule='\
       rule family="ipv4" \
       source address="1.2.3.4/32" \
       port protocol="tcp" \
       port="5669" \
       accept'

Note: Again, I want to stress you should only open ports you intend to use and to the Application Servers who intend to use it. If you don’t intend on using the insecure port, then don’t even open it on your firewall!

Once you’re satisfied that everything is working. You might want to disable GET requests on your NRDP Server. This will prevent people from using the test website and only accept POST requests. To do this, just have a look in /etc/httpd/conf.d/nrdp.conf and uncomment (hence, remove the #) the line that reads:

RewriteCond %{REQUEST_METHOD} !^(POST) [NC]

You’ll need to reload your Apache server for the changes to take effect. If ever you want to test again, just comment the line back out again:

# This simple command should be ran as root and will
# start Apache if it isn't running already, otherwise it
# will send a reload signal if it is.  Regardless,
# after you finish this command, the new NRDP Configuration
# will be running and ready to go.
systemctl status httpd.service && \
   systemctl reload httpd.service || \
   systemctl start httpd.service

# Make sure we start up on reboots (if we haven't done so already)
systemctl enable httpd.service

NRDP Client Side Configuration

The protocol is so simple, that you can easily adapt this into programs you write or tool-sets. You can also just simple use the tool provided

# Install Nagios Core using NuxRef
# See: https://nuxref.com/repo for more information
# Install NRDP Client (CLI) on any server you want to
# be able to report it's status to (onto Nagios)
yum install nrdp-client

Now… from our application server we would want to install our nrdp-client package. so that we can access our send_nrdp.php tool.

The tool is very simple to use; here is an example how you can send a simple status message to our Nagios server:

# The below sends a host acknowledgment to our NRDP server
# url:   web address to our Nagios server
# token: the security token (this is to prevent anyone from
#        sending status requests to your Nagios server.
#        The default is 'nuxref' until you change it (which you
#        really should consider doing)
# state: This will work the color coding of Nagios, you can
#        specify one of the following:
#           0 - OKAY (green)
#           1 - WARNING (yellow/orange)
#           2 - CRITICAL (red)
#           3 - UNKNOWN (blue)
# output: Attach a message with our state; this will be visible
#         as well from the Nagios display screen:

# Here is a NRDP notification:
send_nrdp.php --url=http://nagios.server.addr:5668 --token=nuxref \
   --host=somehost \
   --state=0 \
   --output="Test is is working great."

# This will send a warning status to our component we identified
send_nrdp.php --url=http://nagios.server.addr:5668 --token=nuxref \
   --host=somehost \
   --state=0 \
   --output="Test is is working great."

Note: It’s worth noting that there are no WARNING states for hosts. A zero (0) will report that it is online, and anything else will report it as CRITICAL.

NRDP Protocol

Are you a developer? If you are, you may want to avoid using the script and just talk directly to the NRDP Server from your application. It’s incredibly easy to do. If you’re not, don’t worry; the send_nrdp.php script looks after all of this for you. You can still read this section if you’re interested none the less.

NRDP is literally just a small PHP website that relays XML it receives via it’s restful API into a language that Nagios can interpret. In fact NRDP is just a website that writes specially formatted files into the /var/nagios/spool/checkresults/ directory to which Nagios scans for processing.

The NRDP Payload

The payload is XML; the schema looks like this:

<?xml version='1.0'?> 
<!-- For checking a host -->
<checkresults>
    <!-- The type can be either 'host' or 'service' -->
    <!-- The checktype is always 1 for relaying passive status updates -->
  <checkresult type='host' checktype='1'>
    <hostname>somehost</hostname>
    <!-- 0 = Success, 1 = Warning, 2 = Critical -->
    <state>0</state>
    <!-- A message we might want to associate  -->
    <output>Our somehost is behaving perfectly!</output>
  </checkresult>
</checkresults>

Easy peasy right? A Service message looks like this:

<?xml version='1.0'?> 
<!-- For checking a service -->
<checkresults>
    <!-- The type can be either 'host' or 'service' -->
    <!-- The checktype is always 1 for relaying passive status updates -->
  <checkresult type='service' checktype='1'>
    <!-- This has to be the hostname associated with the service -->
    <hostname>somehost</hostname>
    <!-- This has to be the service we configured using our NRDP
         template.  It's identified by the `service_description`
         field as part of our nagios configuration. -->
    <servicename>someservice</servicename>
    <!-- 0 = Success, 1 = Warning, 2 = Critical -->
    <state>2</state>
    <!-- A message we might want to associate  -->
    <output>Danger Will Robinson! Danger!</output>
  </checkresult>
</checkresults>

You can stack as many messages as you want in a single payload which really adds to NRDPs flexibility:

<?xml version='1.0'?> 
<!-- A mix of status updates -->
<checkresults>
  <checkresult type='host' checktype='1'>
    <hostname>somehost</hostname>
    <state>0</state>
    <output>Our testserver is behaving perfectly!</output>
  </checkresult>

  <checkresult type='service' checktype='1'>
    <hostname>somehost</hostname>
    <servicename>someservice</servicename>
    <state>2</state>
    <output>Houston, we have a problem.</output>
  </checkresult>

  <checkresult type='service' checktype='1'>
    <hostname>somehost</hostname>
    <servicename>my-other-service</servicename>
    <state>1</state>
    <output>Let's hope the problem goes away.</output>
  </checkresult>

</checkresults>

The Transaction

Since it’s a PHP website, it’s hosted via Apache, NginX, etc. This blog sets up Apache but nothing is stopping you from using another service.

This means however that all interactions are web requests. A transaction must be an URL encoded HTTP POST and it must include a special token as part of its payload (part of the verification process). If the token specified is invalid or missing, then the NRDP server will just reject the message. The same rules apply if you don’t POST the message.

You could accomplish this with Python like so:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Site: https://nuxref.com
# Desc: A simply python snip-it to show you how easy it is
#       to interface with NRDP server:
#
# Note: 'requests' (which is required) can be installed 
#       like so (if you don't already have it):
#  #> yum install python-requests
import requests

xml = """<?xml version='1.0'?> 
<checkresults>
  <checkresult type='service' checktype='0'>
    <hostname>somehost</hostname>
    <servicename>someservice</servicename>
    <state>0</state>
    <output>Everything is running great!</output>
  </checkresult>
</checkresults>"""

params = {
   'token': 'nuxref',
   'cmd': 'submitcheck',
   'XMLDATA': xml,
}
r = requests.post(
    'https://nagios.server.addr:5669',
    params=params,
    headers={'Content-Type': 'application/xml'},
    # This is only in place for those who have self signed secure
    # certificates for private networks; you may wish to change
    # this to True in a real environment if you have the ability
    # to verify the host
    verify=False,
)

# Ideally you want to see: 200
print(r.status_code)

# Print our data
print(r.text)

Credit

This blog took me a very (,very) long time to put together and test! The repository hosting alone accommodates all my blog entries up to this date. All of the custom packaging described here was done by me personally. I took the open source available to me and rebuilt it to make it an easier solution and decided to share it.

NRDP certainly does not work like this out of the box; here were the patches I created (see pull request #12 on NRDPs Repository for upstream pushes of this):

  1. nrdp-php_headerfix.patch: Fix PHP header so tool can be executed on the shell.
  2. nrdp-basic_authorized_token.patch:Default (nuxref) token for an out-of-the-box working solution.
  3. nrdp-date.timezone.warning.patch: Silence potential Timezone Warning Message.
  4. nrdp-permissions.patch: Better passive check permission handling for NRDP/Nagios
  5. nrdp-client-support_ports.patch: Support more then just port http 80 (as this blog sets up our NRDP server using ports 5669 and/or 5668).

If you like what you see and wish to copy and paste this HOWTO, please reference back to this blog post at the very least. It’s really all I ask.

Sources

Nagios Core 4.x Setup for CentOS 7.x

Introduction

A few years ago I blogged about setting up Nagios Core 4.x for CentOS 6.x. I later made a blog on setting up NSCA and NRPE too. But times have changed so I figured it was time to do an updated blog for CentOS 7.x and Nagios.

Some new great tools I’ve packaged to make a load and go setup for everyone is:

  • Nagios Core v4.x: updated RPM packaging which I continued to maintain and carry forward from my previous blogs.
  • Nagios Plugins v2.x: A ton of out of the box working plugins.

Best of all, with my RPMs, you can run SELinux in full Enforcing mode for that extra piece of mind from a security standpoint!

Nagios Core

Nagios (for those who don’t know) is an application that allows us to monitor other system/applications we manage. It’s primary function is to immediately bring to our attention any outage or anomaly is detected with our systems. This tool is completely free and should be an essential component of anyone’s business infrastructure.

The current version of Nagios (at the time of writing this blog) is v4.2.2. You can download the latest version from my repository (if you’re set up) as follows:

# Install Nagios Core using NuxRef repositories
# at: https://nuxref.com/repo
yum install -y nagios nagios-selinux

You can also download the packages manually if you wish using this table:

Package Download Description
nagios el7.rpm Nagios Core IV is the the actual monitoring server we can use to monitor our applications.
nagios-selinux el7.rpm An add-on package that allows you to run Nagios in Enforcing Mode.

Note: This RPM is not required by Nagios to run correctly.
nagios-contrib el7.rpm Extra tools that add to the great features Nagios already offers (such as distributed monitoring). These tools are not discussed in this blog entry; but maybe useful to you.
Note: This RPM is not required by Nagios to run correctly.
nagios-devel el7.rpm Header files for developers who want to build using the libnagios shared library.
Note: This RPM is not required by Nagios to run correctly.

Note: The source rpm can be obtained here which builds everything you see in the table above. It’s not required for the application to run, but might be useful for developers or those who want to inspect how I put the package together.

Configure Nagios Core

Once Nagios is installed it can be started like so:

# Start Nagios
systemctl start nagios.service

# If you want Nagios to start after the system is rebooted, you
# can type the following:
systemctl enable nagios.service

# Now we want to turn on Apache if it's not running, otherwise
# reload the configuration if it is:
systemctl status httpd.service && \
   systemctl start httpd.service ||\
   systemctl reload httpd.service

If you followed the instructions above you should be able to access the (Nagios) monitoring website right away by visiting http://localhost/nagios. If you’re installing this on another server you may need to open your web ports to access the Nagios Monitoring site:

# The following commands should be ran on your Nagios Server.
# It will enable our http (and secure https) port on our firewall so our
# monitoring website can be accessed remotely:
firewall-cmd --permanent --add-service=http
firewall-cmd --add-service=http
firewall-cmd --permanent --add-service=https
firewall-cmd --add-service=https
Nagios Credentials
Login nagiosadmin
Password nagiosadmin

You’ll be prompted for a user/pass combination; at this time the default values are defined in the table:

Once you’ve logged in, you can click on links like Services (under the Current Status heading) which will list to you all of the system services you’re currently monitoring and their status:

Nagios > Current Status > Services
Nagios > Current Status > Services

In the example, you can see Nagios has picked up on a high system load (denoted by the yellow warning entry).

Nagios at it’s most basic level is set up at this point and can be maintained by having a look at the following directories:

/etc/nagios/nagios.cfg:
The main configuration file that is read by Nagios when starting up. The only things you may want to change in here are:

Directive Description
date_format The default is us (MM-DD-YYYY HH:MM:SS), but personally I like iso8601 (YYYY-MM-DD HH:MM:SS).
# simple 1-liner to toggle date_format to iso8601:
sed -i -e 's/^\(date_format\)=.*$/\1=iso8601/g' \
   /etc/nagios/nagios.cfg
check_for_updates The default is 1 (which is to check for updates). Personally, I don’t want my web page pinging Nagios every time i access the website for updates. Here is how you can do the same:
# disable update check:
sed -i -e 's/^\(check_for_updates\)=.*$/\1=0/g' \
   /etc/nagios/nagios.cfg

/etc/nagios/objects/contacts.cfg:
This is where a default contact has been created. Feel free to open this up and add your name and (especially the) email address.
/etc/nagios/objects/commands.cfg:
All of the possible checks Nagios can perform on it’s own (without any plugins or extensions) are identified here. It’s as easy as defining a command_name (give it some name) and then tell it what you want to execute with the command_line directive.

A Nagios command is really simple to write; you can write one in any language you want. Here is one written in bash shell:

#!/bin/bash
# Path: /usr/lib64/nagios/plugins/check_temp_file_count
# Please keep in mind this script is pretty useless but
# it's goal is to show you how easy it is to write a
# script for Nagios.
#
# The rules are simple:
# Whatever we echo to the screen get's displayed to Nagios
# and our return value from our script/program will
# determine the color code (and whether or not we alarm)

# A return code of zero (0) tells Nagios everything is okay
RET_OKAY=0

# A return code of one (1) tells Nagios we're reporting a warning
# about whatever it is we're monitoring
RET_WARN=1

# A return code of two (2) tells Nagios we're reporting a critical
# error
RET_CRIT=2

# I define 3 here, but quite honestly, anything you return that
# does not fit in the 0, 1 or 2 response type (as identified
# above) is considered an unknown state.  Try to avoid this
# type if you can.
RET_UNKN=3

# As a test script we'll count all of the files and directories
# in the /tmp folder
COUNT=$(find /tmp 2>/dev/null | wc -l)

# If we have less then 10 files we'll tell Nagios everything
# is okay:
if [ $COUNT -lt 10 ]; then
   echo "$COUNT files; everything is good!"
   exit $RET_OKAY
fi

# If we have less then 30 files we'll tell Nagios that
# it should report a warning
if [ $COUNT -lt 30 ]; then
   echo "$COUNT files; caution!"
   exit $RET_WARN
fi

# Anything more we should report a critical alarm
echo "$COUNT files; Critical!!"
exit $RET_CRIT

If you’re familiar with Perl, the Nagios team has made a framework with it which you use to make your packages too! I’ve already gone ahead and packaged the perl-Nagios-Monitoring-Plugin rpm for you if you want it.

Nagios will associate /usr/lib64/nagios/plugins/ as it’s plugin directory; so you should save any plugins you create there (to keep them all in a common location). Plus if you intend to use SELinux, this is the directory that Nagios is allowed to execute from.

Our entry in the commands.cfg for this new script might look like this:

; 'check_temp_file_count' command definition
; $USER1$ gets translated automatically to our Nagios Plugin directory
; so in our case: /usr/lib64/nagios/plugins/
; Ideally you should call the command the same name as the checking
; tool you wrote.
define command{
	command_name	check_temp_file_count
	command_line	$USER1$/check_temp_file_count
	}

/etc/nagios/objects/localhost.cfg:
This is just a general configuration file for the very machine Nagios is running on. If you open it up, you’ll see that it defines a lot of entries that reference commands (defined already in the commands.cfg file).

A new entry in the commands.cfg for this new script might look like this:

; the 'use' directive identifies a template of information to save
; us from typing it all out here.  For now; just leave this as
; local-service.
;
; The 'hostname' defines our server (defined at the very top of this same
; file. Since the host at the to was defined as 'localhost', we need
; to use this same name here too
;
; The next field is just a description field; it will be how this service
; is presented on Nagios through the website
;
; The check_command is the name we gave it in the commands.cfg
; file.
define service{
        use                             local-service
        host_name                       localhost
        service_description             Count our Temporary Files
        check_command                   check_temp_file_count
}

Check to see if Nagios has any errors with any new configuration you provided:

# This command just tells Nagios to read in it's configuration
# and check if it appears valid:
nagios -v /etc/nagios/nagios.cfg

If everything checks out okay, go ahead and reload Nagios with our
new configuration:

# Reload Nagios
systemctl reload nagios.service

Nagios Plugins

If you’ve installed Nagios, there isn’t really any good reason why you shouldn’t just install the Nagios Plugins too. This is just more tools and checking scripts to make Nagios all that more powerful. The best part is, these tools have been tested over the years, so they’re already proven to be reliable and will allow you to accomplish most monitoring without much effort.

Nagios Plugins
Nagios Plugins

It’s important to note that the Nagios Plugin RPMs are NOT required by Nagios to run correctly. They merely just improve it’s existing functionality. You may however want to install the plugins you’re interested in that monitor systems you’re using.

The current version of the Nagios Plugins (at the time of writing this blog) is v2.1.3. You can download the latest version from my repository (if you’re set up) as follows:

# Install Nagios Core using NuxRef repositories
# See: https://nuxref.com/repo for more information
yum install -y nagios-plugins nagios-plugins-selinux

You can also download the packages manually if you wish using this table:

Package Download Description
nagios-plugins el7.rpm 50+ plugins that are fully adaptable to Nagios in every way. If you’re planning on installing Nagios, don’t forget about adding this package for it’s convenience!
nagios-plugins-selinux el7.rpm An optional add-on package that allows you to use the Nagios Plugins in Enforcing Mode.
nagios-plugins-ldap el7.rpm A Nagios plugin that can be used to check integrity and data entries within an LDAP database.
nagios-plugins-mysql el7.rpm A Nagios plugin that can be used to check integrity and data entries within an MySQL (or Maria) database.
nagios-plugins-ntp el7.rpm A Nagios plugin that can be used to check the NTP status of the machine it’s called on.
nagios-plugins-pgsql el7.rpm A Nagios plugin that can be used to check integrity and data entries within an PostgreSQL database.
nagios-plugins-samba el7.rpm A Nagios plugin that can be used to check status of your Samba mounts and their availability.
nagios-plugins-snmp el7.rpm A Nagios plugin that can query SNMP enabled appliances (routers, firewalls, switches, servers) and convert their output back to something Nagios can monitor or report.

Note: The source rpm can be obtained here which builds everything you see in the table above. It’s not required for the application to run, but might be useful for developers or those who want to inspect how I put the package together.

The main thing to know about this package after it is installed is the slew of new plugins available to you in /usr/lib64/nagios/plugins/ and a config file to get you started which references most of them in /etc/nagios/conf.d/nagios-plugin-commands.cfg.

Extra Plugins

There are a lot of great plugins on Nagios Exchange! I packaged just a few of them because they required patches and tweaks to work out of the box. All of these are available on my repository, but feel free to haul them down directly here:

Package Download Description
nagios-plugins-lvm el7.rpm / src.rpm / NE Source This plugin finds all LVM logical volumes, checks their used space, and compares against the supplied thresholds.
nagios-plugins-crm el7.rpm / src.rpm / NE Source A plugin for monitoring a Pacemaker/Corosync cluster.
Note: that this plugin requires perl-Nagios-Monitoring-Plugin to work.
nagios-plugins-drbd84 el7.rpm / src.rpm / NE Source A plugin for monitoring a DRBD v8.4 setup.

Credit

This blog took me a very (,very) long time to put together and test! The repository hosting alone accommodates all my blog entries up to this date. All of the custom packaging in this blog was done by me personally. If you like what you see and wish to copy and paste this HOWTO, please reference back to this blog post at the very least. It’s really all I ask.

Sources

The remaining portions of this series can be found here:

  • Part 2 – NRDP for Nagios Core on CentOS 7.x: This blog explains how awesome NRDP really is and why it might become a vital asset to your own environment. It’s also provides the first set of working RPMs (with SELinux support of course) of it’s kind.
  • Part 3 – NRPE for Nagios Core on CentOS 7.x: This blog explains how to set up NRPE (v3.x) for your Nagios environment. At the time this blog was written, there was no packaging of it’s kind for this version. So allow me be the first to share it with you!