Apprise Nagios Integration

Integrate Apprise into Nagios for More Notification Support

Introduction

Apprise is an open source tool that allows you to send a notification through a wide range of messaging services (such as Discord, Slack, Telegram, Microsoft Teams, etc). When you combine it with Nagios, you open Nagios up to a much larger scope than simply emailing on an alert.

With Apprise you can configure Nagios to text your mobile phone using Amazon’s Web Service, or notify your devops team on Slack and/or Microsoft Teams. You can even trigger an IFTTT event. But it doesn’t stop there; Apprise already supports more than 35 notification services today (and is always expanding), which means Nagios can leverage all of them too. This blog will explain how you can set up your instance of Nagios to notify more endpoints than just email.

The Installation

I’m going to presume you have a copy of Nagios already installed. If you don’t, you can check out my blog here on how to set up your own copy with CentOS 7. Those who are not using CentOS are certainly not out of luck though; there are lots of blogs out there to get you started.

This blog will assume you have root privileges or sudo privileges.

Apprise can be easily added to your system through pip:

# Install Apprise onto the system currently also hosting Nagios
sudo pip install apprise

Configure Nagios

The Nagios configuration files can vary in their location depending on what Linux distribution you’re using. I’m going to just refer to some standard paths used by the stuff I host here (for CentOS/RedHat).

Step 1: Nagios Import Directory

If you’re using the Nuxref RPMs, then you can skip this step and move to the next as you’ll already be configured for this. Those using another distribution will want to update their nagios.cfg to point to a directory we can use to drop in and remove configuration from. The file is presumably going to be located at /etc/nagios/nagios.cfg:

# Place this anywhere in /etc/nagios/nagios.cfg
# preferably put it near the bottom of the file.

# Definitions for global configuration directory
cfg_dir=/etc/nagios/conf.d

Now make sure this directory exists because this is where we’ll place our new apprise configuration:

# Ensure our global include directory exists that we
# just defined in our nagios.cfg file:
mkdir -p /etc/nagios/conf.d

# Place a dummy file in here so that Nagios doesn't
# throw any errors (as it isn't a fan of include directories
# without configuration files in it).
touch /etc/nagios/conf.d/dummy.cfg
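
If you would rather not edit nagios.cfg by hand, here is a minimal sketch of the same step that is safe to run more than once. The function name ensure_cfg_dir is made up for this example (it is not something Nagios ships), and it assumes the entry is written exactly as cfg_dir=<path> on a line of its own:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): append a cfg_dir entry to a
# Nagios main configuration file only when it is not already present.
ensure_cfg_dir() {
    cfg_file="$1"   # e.g. /etc/nagios/nagios.cfg
    cfg_dir="$2"    # e.g. /etc/nagios/conf.d

    # -x matches the whole line, -F treats the text literally
    if grep -qxF "cfg_dir=$cfg_dir" "$cfg_file"; then
        echo "already configured"
    else
        printf '\n# Definitions for global configuration directory\ncfg_dir=%s\n' \
            "$cfg_dir" >> "$cfg_file"
        echo "added"
    fi
}

# Usage (as root):
#   ensure_cfg_dir /etc/nagios/nagios.cfg /etc/nagios/conf.d
```

Running it a second time leaves the file untouched, so you never end up with duplicate cfg_dir lines.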

Step 2: Apprise/Nagios Integration

Now we need to let Nagios know about Apprise. We’ll do this by creating the following file called /etc/nagios/conf.d/apprise.cfg:

#
# Apprise to Nagios Configuration File
# Place this file as /etc/nagios/conf.d/apprise.cfg
#
# 'notify-host-by-apprise' command definition
define command{
   command_name   notify-host-by-apprise
   command_line   /usr/bin/printf "%b" "- *Notification Type*: $NOTIFICATIONTYPE$\n- *Host*: $HOSTNAME$\n- *State*: $HOSTSTATE$\n- *Address*: $HOSTADDRESS$\n- *Info*: $HOSTOUTPUT$\n\n- *Date/Time*: $LONGDATETIME$\n" | /usr/bin/apprise -c /etc/nagios/apprise.yml -n "$HOSTSTATE$" -g "$NOTIFICATIONTYPE$" -t "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **"
}
 
# 'notify-service-by-apprise' command definition
define command{
   command_name   notify-service-by-apprise
   command_line   /usr/bin/printf "%b" "- *Notification Type*: $NOTIFICATIONTYPE$\n- *Service*: $SERVICEDESC$\n- *Host*: $HOSTALIAS$\n- *Address*: $HOSTADDRESS$\n- *State*: $SERVICESTATE$\n- *Date/Time*: $LONGDATETIME$\n\n*Additional Info*:\n$SERVICEOUTPUT$\n" | /usr/bin/apprise -c /etc/nagios/apprise.yml -n "$SERVICESTATE$" -g "$NOTIFICATIONTYPE$" -t "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **"
}

# Register our contact template that we can reference
define contact{
  ; The name of this contact template
  name                             apprise-contact

  ; service notifications can be sent anytime
  service_notification_period      24x7

  ; host notifications can be sent anytime
  host_notification_period         24x7

  ; send notifications for all service states, flapping events,
  ; and scheduled downtime events
  service_notification_options     w,u,c,r,f,s

  ; send notifications for all host states, flapping events,
  ; and scheduled downtime events
  host_notification_options        d,u,r,f,s

  ; send service notifications via Apprise
  service_notification_commands    notify-service-by-apprise

  ; send host notifications via Apprise
  host_notification_commands       notify-host-by-apprise

  ; Don't register this as it is just a template for future
  ; references by contacts who wish to use the apprise plugin
  register                         0
}
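
Before wiring this into Nagios, it can help to see what the message body will look like. The sketch below is modelled on the host command: Nagios normally expands the $MACRO$ values itself, so here sample values are passed in as arguments instead. The function name render_host_body and every sample value are assumptions made up for this example:

```shell
#!/bin/sh
# Hypothetical preview helper: Nagios normally expands the $MACRO$ values
# itself; here we pass sample values in so the layout can be inspected.
render_host_body() {
    type="$1"; host="$2"; state="$3"; address="$4"; info="$5"; when="$6"
    # %b makes printf interpret the \n escapes, as the command_line does
    printf "%b" "- *Notification Type*: $type\n- *Host*: $host\n- *State*: $state\n- *Address*: $address\n- *Info*: $info\n\n- *Date/Time*: $when\n"
}

# Usage (all values are sample data):
#   render_host_body PROBLEM web01 DOWN 192.0.2.10 "PING CRITICAL" "Mon Jan 1 12:00:00 UTC 2024"
```

You could pipe the output straight into apprise (with -c, -g and -t as in the command definition) to see how a given service renders it.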

Now for every contact we set up going forward, we can point it to use Apprise. By default Nagios usually provides us a contacts.cfg file that contains the generic user nagiosadmin. For those using my packaging, you can find this file at /etc/nagios/objects/contacts.cfg; you’ll want to change it to look like this:

define contact{
  ; Short name of (Nagios) user
  contact_name    nagiosadmin

  ; This next line used to read generic-contact; but we want to switch it
  ; over to our new Apprise based one:
  use             apprise-contact

  ; Full name of user
  alias           Nagios Admin

  ; not important if using apprise-contact (defined above)
  email           nagios@localhost
}

Before you advance to the next step, you’ll want to run a pre-flight check on your configuration and make sure it validates okay.

# Perform a flight check on our new configuration (as root)
sudo nagios -v /etc/nagios/nagios.cfg

If you get any errors, you should revisit the first part of this blog and try to iron them out before continuing. If everything is error free, then the next step is to reload our instance of Nagios (if it’s running) so it can re-read this configuration. This can be done with the command:

# You will need to be root to do this; send a SIGHUP
# to all instances of nagios running in memory:
sudo killall -HUP nagios
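
The two commands above can be chained so that Nagios is only signalled when the configuration validates. This is just a sketch: safe_reload and the three overridable variables are names invented for this example, and the defaults simply reuse the commands shown above:

```shell
#!/bin/sh
# Hypothetical helper: only signal Nagios if the pre-flight check passes.
# The defaults mirror the commands used in this blog; override as needed.
NAGIOS_BIN="${NAGIOS_BIN:-nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios/nagios.cfg}"
RELOAD_CMD="${RELOAD_CMD:-killall -HUP nagios}"

safe_reload() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        # RELOAD_CMD is left unquoted on purpose so it splits into words
        $RELOAD_CMD && echo "reloaded"
    else
        echo "configuration invalid; not reloading" >&2
        return 1
    fi
}

# Usage (as root):
#   safe_reload
```

If the check fails, re-run nagios -v by hand to see the actual errors, since the helper hides that output.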

Step 3: Apprise Configuration

Now we need to prepare our Apprise configuration (/etc/nagios/apprise.yml) and fill it with the notification types we want to listen for and the services we want to pass them on to.

Apprise matches the Nagios notification type against the tags in this file. Nagios passes along one of the following $NOTIFICATIONTYPE$ values when an event occurs:

  • PROBLEM: There was an issue with one of the checks.
  • RECOVERY: The issue previously set has been cleared.
  • ACKNOWLEDGEMENT: An outstanding issue has been acknowledged.
  • FLAPPINGSTART: A service is flapping when a PROBLEM is followed moments later by a RECOVERY, over and over. This alert is set when that happens too many times in a row.
  • FLAPPINGSTOP: The service that was previously FLAPPING is no longer doing so.
  • FLAPPINGDISABLED: Someone just disabled FLAPPING for this service/host.
  • DOWNTIMESTART: The scheduled downtime for this service/host has begun.
  • DOWNTIMEEND: The scheduled downtime is over.
  • DOWNTIMECANCELLED: Someone just cancelled the scheduled downtime for this service/host.

Knowing the above notification types that we’ll receive, here is what an Apprise configuration file located at /etc/nagios/apprise.yml might look like:

# This file should be placed in /etc/nagios/apprise.yml

# NOTE: THIS IS JUST AN EXAMPLE CONFIGURATION FILE. YOU WILL WANT
#       TO CUSTOMIZE YOUR OWN WITH THE SERVICE(S) OF YOUR CHOICE
#       VISIT https://github.com/caronc/apprise TO SEE WHAT IS
#       AVAILABLE AND HOW THEY WORK.

# Identify all of the global notification types we want to flag on.
tag:
  - PROBLEM
  - RECOVERY
  - FLAPPINGSTART
  - FLAPPINGSTOP

# Now we want to define our Apprise URLS; you'll want to visit
# https://github.com/caronc/apprise to see all of the supported
# services and how to build their URLs.
urls:

  # Maybe we want to notify a custom service we're hosting to
  # monitor and track Nagios status; Check out the following 
  # for more details https://github.com/caronc/apprise/wiki/Notify_json

  - json://localhost

  # Maybe we want to notify a Slack channel; more details on this
  # are here: https://github.com/caronc/apprise/wiki/Notify_slack
  - slack://T1JJ3T3L2/A1BRTD4JD/TIiajkdnlazkcOXrIdevi7F/#nuxref

  # the Apprise YAML configuration is quite powerful, the
  # following prepares the email URL and sends an email to each
  # user identified below:
  - mailto://user:password@gmail.com:
      - to: george@example.com
      - to: admin@example.com

  # More details on the emails can be found here:
  # https://github.com/caronc/apprise/wiki/Notify_email

  # We can also individually disperse the tags in the same config
  # file.  The below tags will override the globals defined above.
  # A use case for this would be that maybe we just want to
  # send certain notification types to say... the DevOps team:
  - mailto://user:password@gmail.com:
      - to: devops@example.com
        tag: DOWNTIMESTART, DOWNTIMEEND, DOWNTIMECANCELLED

Once your file is ready, make sure it is readable by Nagios but not by other users (to keep it away from prying eyes); otherwise you’re all set and ready to go!

# Here is what one might do to protect this apprise configuration
chmod 640 /etc/nagios/apprise.yml
chown nagios:root /etc/nagios/apprise.yml
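
To see which services each Nagios notification type would reach, you could loop over the types with a dry run. This sketch assumes your installed Apprise CLI supports the --dry-run option (which lists the matched services without sending anything); preview_routing and the two variables are names made up for this example:

```shell
#!/bin/sh
# Assumption: the installed Apprise CLI supports --dry-run, which lists
# the services that would be notified without sending anything.
APPRISE_BIN="${APPRISE_BIN:-apprise}"
APPRISE_CFG="${APPRISE_CFG:-/etc/nagios/apprise.yml}"

preview_routing() {
    for type in PROBLEM RECOVERY ACKNOWLEDGEMENT FLAPPINGSTART FLAPPINGSTOP \
                FLAPPINGDISABLED DOWNTIMESTART DOWNTIMEEND DOWNTIMECANCELLED; do
        echo "== $type =="
        "$APPRISE_BIN" --dry-run --config="$APPRISE_CFG" --tag="$type"
    done
}

# Usage:
#   preview_routing
```

A type that lists no services is one that nobody will be told about, which is worth knowing before the first real outage.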

Verification

To be sure everything works, you may want to test that you got all of your configuration right. You can do this manually as follows:

# Test our configuration with apprise using the PROBLEM tag
# -vvv for some verbose debugging in-case we need it.
apprise -c /etc/nagios/apprise.yml \
    -n CRITICAL -g PROBLEM \
    -vvv \
    -t "A Test Title" \
    -b "a Test Body"
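
The Apprise CLI documents info, success, warning and failure as the values for its -n option. If your version does not accept Nagios state names such as CRITICAL directly, a small translation along these lines could sit in front of it. The function name and the mapping itself are assumptions made for this example, not something Apprise or Nagios provides:

```shell
#!/bin/sh
# Hypothetical mapping from Nagios states to the notification types the
# Apprise CLI documents for -n (info, success, warning, failure).
nagios_to_apprise_type() {
    case "$1" in
        OK|UP)                     echo success ;;
        WARNING)                   echo warning ;;
        CRITICAL|DOWN|UNREACHABLE) echo failure ;;
        *)                         echo info ;;   # UNKNOWN and anything else
    esac
}

# Usage:
#   apprise -c /etc/nagios/apprise.yml -g PROBLEM \
#       -n "$(nagios_to_apprise_type CRITICAL)" \
#       -t "A Test Title" -b "a Test Body"
```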

Here is a screenshot of a test error displayed on gitter.im that was sent by Nagios using Apprise:
Gitter Example

Pan: A Useful NewsReader for Linux

Introduction

PAN is a newsreader that has been around for ages. It allows you to sift through the massive clutter that Usenet has become through its really fast interface loaded with tons of features!

Its development died off way back in 2012, but recently it has picked right back up again. Not only is this product feature rich and open source, but it’s written purely in C++, which makes it incredibly lightweight (thus very, very fast). Some of the subtle enhancements this product has seen in the past few months make it worthy of being in the spotlight again.

So What Can It Do?

  • Header Caching: Tell it the group(s) you want and how much of it you want to see and it will download the headers it retrieves to a local cache file. This is awesome because now you can sift through this content offline.
    Cache Headers

  • Header Scoring: You can flag key aspects of articles with a score. By default every header retrieved has a score of zero (0) unless you start dabbling in this area.

    Anything that scores less than (or equal to) -9999 can be configured to not list itself at all. Some well set scores can greatly clean up your ability to locate content in groups.

    You can score content higher and/or lower based on the post’s author, subject, size, age, etc. You can even apply scoring through regular expressions too!

    Scoring is very powerful when used properly! I’ll talk about it again a bit later in this blog once you’ve gotten set up. But if we were to apply scoring to the previous screenshot (above), it might look like this (all garbage cleaned up and content prioritized with color coding too):

    Header Scoring

  • Multiple Server Support: Got a block account? No problem, you can add it as a secondary server and only pull from it if the Primary one is unavailable.
  • NZB-File Support: The treasure maps of Usenet can be loaded into Pan too and downloaded through it. True automation of these come through systems like NZBGet and SABnzbd, but it’s still worth knowing that not only is this a newsreader, but it can pass as a downloader as well!
  • Concurrent Connections: Like any great browser/downloader on any system, files are retrieved concurrently. This means that you can just keep browsing and tagging content of interest seamlessly without interruptions.
  • Header Compression Support: One of the new enhancements surfaced with the new development of this project. This makes a world of difference when retrieving hundreds of thousands of headers from a Usenet group. Enabling this feature alone (if your Usenet provider supports it) will greatly reduce wait times!

Pan’s Disabled Features

The features page on Pan’s website explains about a parent company (called ChimPanXi) that tries to sell this free product with added functionality. I guess the deal they have with the developers is to disable a few features so that they can be re-enabled in the paid version (purely speculation)?

But since the (Pan) code is open source, the options are right there in front of us but just disabled. Quite honestly… of all this disabled functionality, only one is truly worth pointing out: Pan restricts you to just 4 allowable concurrent connections to your Usenet provider at a time. Here is a small patch I created which increases this number to 99. The build I provide in this blog already has this patch applied. Here are the rest of the missing features (with some of my comments as well); maybe some might see value in the others?

Pans Missing Features

The Goods

Those hooked up to my repository are already set; just type the following:

# install the new version of Pan
yum install pan --enablerepo=nuxref

You can also reference this table for direct links:

Package | Download | Description
pan | el7.rpm, fc22.rpm, fc23.rpm, fc24.rpm, fc25.rpm | The Newsreader: This is the program that this blog focuses on.

Note: The source rpm can be obtained here which builds everything you see in the table above. It’s not required for the application to run, but might be useful for developers or those who want to inspect how I put the package together.

It’s also worth noting (again) that this build includes a small patch to increase the maximum allowable number of concurrent connections from 4 to 99.

Securing Your Connection

There is very little security built into Pan from a connection point of view. What little security is (normally) in place is built using GnuTLS. GnuTLS has a history of not keeping up with the security exploits and vulnerabilities that surface with encryption libraries. It doesn’t make it unsafe; it just doesn’t make it as reliable as its competition (OpenSSL and Crypto). For this reason the packages I provide are intentionally not built against it (GnuTLS).

It’s really not a problem at the end of the day because there are other ways of securing this connection (properly). The way I use (and recommend) is through Stunnel.

Stunnel allows you to take an unencrypted input (from Pan) and connect it to a secure connected one (at your Usenet provider). The best thing about stunnel is that it links to your (OpenSSL) shared system libraries libssl.so and libcrypto.so which are actively maintained and patched! Basically what I’m saying is by attaching Pan to Stunnel: you get the feature rich usage of Pan and the ongoing (reliable) security of OpenSSL.

The following will get you set up with stunnel; you’ll want to be root before running the command below:

# Install stunnel (if it's not installed already)
# you'll need to be connected to either EPEL or NuxRef for this
# to work:
yum install stunnel

You can also reference this table for direct links:

Package | Download | Description
stunnel | el7.rpm, fc22.rpm, fc23.rpm, fc24.rpm, fc25.rpm | Secure Tunnel: for data encryption.

Note: This RPM is not required by PAN to run correctly. It does however offer you a safer and more secure method of encrypting your communication to (and from) your NNTP Server.

# You must have root permissions when setting up
# stunnel

# Create relay bound to local server only (semi-colons are for
# comments):
cat << _EOF > /etc/stunnel/stunnel.conf
; Use it for client mode
; This is the pass through mode you need to encrypt
; your NNTP traffic:
client = yes
 
[nntp]
;
; --- IN ---
;
; local port to listen on (on this PC)
; You will configure PAN to connect here:
accept = 127.0.0.1:119

;
; --- OUT ---
;
; The Remote Usenet Server's (encrypted) connection to use:
; In this example, I'm just pointing to Astraweb, but you
; can provide any Usenet server you wish here. Just be sure
; to point it to their secure transport point!
connect = ssl.astraweb.com:563
_EOF

# The sed line below is not needed on a first run, but it makes these
# instructions safe to copy and paste again later: it removes any stunnel
# entries previously added to your startup file so that duplicates are
# never created. It's harmless to run at any point:
sed -i -e '/bin\/stunnel/d' /etc/rc.d/rc.local

# Configure stunnel to start after each boot
echo "# Start /usr/bin/stunnel on boot each time:" >> /etc/rc.d/rc.local
echo "/usr/bin/stunnel" >> /etc/rc.d/rc.local

# On CentOS 7, rc.local only runs at boot if it is executable:
chmod +x /etc/rc.d/rc.local

# By default stunnel is configured to read
# its configuration from /etc/stunnel/stunnel.conf
# on startup:
stunnel
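
If you switch providers now and then, the same configuration can be generated from a couple of arguments. Treat this as a sketch: make_stunnel_conf is a made-up name, news.example.com is a placeholder host, and the comments from the full example above are left out for brevity:

```shell
#!/bin/sh
# Hypothetical helper: print a client-mode stunnel configuration for a
# given provider. Redirect the output to /etc/stunnel/stunnel.conf as root.
make_stunnel_conf() {
    remote_host="$1"          # your provider's encrypted endpoint
    remote_port="${2:-563}"   # 563 is the usual NNTP-over-TLS port
    local_port="${3:-119}"    # where Pan will connect on this machine
    cat << _EOF
client = yes

[nntp]
accept = 127.0.0.1:$local_port
connect = $remote_host:$remote_port
_EOF
}

# Usage (news.example.com is a placeholder):
#   make_stunnel_conf news.example.com > /etc/stunnel/stunnel.conf
```

Restart stunnel after regenerating the file so that it picks up the new endpoint.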

The next step is to update your PAN server configuration to point to your local server (localhost or 127.0.0.1) instead of the remote one you’re accessing. Make sure to set the port to 119 too like so:

Stunnel Pan Configuration


You’ll provide the same username and password you would have otherwise provided to your Usenet provider.

The end result is a secure connection between you and your Usenet provider like so:
Pan Setup With Stunnel

Scoring

Scoring articles can greatly ease your life when looking through all of the headers in front of you; it’s great for:

  • Eliminating SPAM
  • Filtering out potential malicious content (such as Trojans and Viruses)
  • Increasing the visibility of items of interest
  • Locating Authors of interest with ease

All scores can be optionally associated with a time limit too. When the limit expires, so does the score. This is useful when you only want to temporarily filter content. Otherwise the permanent scores will make up most of your configuration. To add a score, simply click Articles > Add a Scoring Rule…

Add a Scoring Rule

Here is an example of a rule you might add; this one greatly reduces the score of all entries that have potentially dangerous file extensions in the subject line:

Block Potentially Malicious Content


Pan’s built in filter field allows you to sift through all of the articles you found with keywords. Pairing this functionality with the scoring one really shows off the power of Pan.

All created scores are kept in ~/.pan2/Scores so don’t worry if you mess one up. You can just as easily open this file and fix it. Any manual changes to this file will however require you to exit out of Pan (if it’s open) and restart it.

Here are just a few entries of what you might have in your Score file:

%BOS
% Greatly reduce score of potentially malicious content
[alt.bin*]
Score:: -9999
Subject: .*\.(exe|bat|vbs|cpl|msi|scr|vb(script)?|ws(f|h))[^A-Za-z0-9].*
%EOS

%BOS
% Moderately increase the score of compressed content
[alt.bin*]
Score:: 2500
Subject: .*\.(z(ip|[0-9]{2})|r(ar|[0-9]{2})|7z|iso)[^A-Za-z0-9]([0-9]{3}[^A-Za-z0-9])?.*
%EOS

%BOS
% Very slightly decrease the score of PAR content
% This allows it to not quite have the same spot light as
% the item it matches up against. If it were a compressed file
% it would already have +2500 from the previous score entry
% identified above.  These will just sit at +2400 instead.
[alt.bin*]
Score:: -100
Subject: .*(\.vol[0-9]+\+[0-9]+)?\.(par2|sfv)[^A-Za-z0-9].*
%EOS

%BOS
% Very slightly increase the score of NZB-Files
[alt.bin*]
Score:: 250
Subject: .*\.(nzb)[^A-Za-z0-9].*
%EOS 

%BOS
% Mildly drop the score of cross-posted content
[alt.bin*]
Score:: -750
Xref: (.*:){2} % cross-posted to 2 or more groups 
%EOS
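
To get a feel for how these rules add up, you can try the Subject expressions outside of Pan. The sketch below is only an approximation: grep -E stands in for Pan's own regular expression engine, the function names are invented for this example, and the Xref rule is left out because it does not look at the subject line:

```shell
#!/bin/sh
# Sketch only: grep -E stands in for Pan's own regex engine, and the
# helper names are hypothetical. Scores of all matching rules are summed.
subject_matches() {
    printf '%s\n' "$1" | grep -Eq -- "$2"
}

score_subject() {
    subject="$1"
    total=0
    # Potentially malicious content
    if subject_matches "$subject" '.*\.(exe|bat|vbs|cpl|msi|scr|vb(script)?|ws(f|h))[^A-Za-z0-9].*'; then
        total=$((total - 9999))
    fi
    # Compressed content
    if subject_matches "$subject" '.*\.(z(ip|[0-9]{2})|r(ar|[0-9]{2})|7z|iso)[^A-Za-z0-9]([0-9]{3}[^A-Za-z0-9])?.*'; then
        total=$((total + 2500))
    fi
    # PAR/SFV content
    if subject_matches "$subject" '.*(\.vol[0-9]+\+[0-9]+)?\.(par2|sfv)[^A-Za-z0-9].*'; then
        total=$((total - 100))
    fi
    # NZB files
    if subject_matches "$subject" '.*\.(nzb)[^A-Za-z0-9].*'; then
        total=$((total + 250))
    fi
    echo "$total"
}

# Usage (the subject is sample data):
#   score_subject 'backup.rar.vol00+01.par2 (1/1)'
```

One thing this makes easy to spot: every expression ends with [^A-Za-z0-9], so a subject that stops right after the extension (nothing following it) is not matched by these rules.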

Wrapping It Up

I’m certainly not asking anyone to change from their existing system if it works for them. What I am pointing out though is that Pan is completely free, it’s open source and the features it offers are comparable to (if not better than) all of its competition. Although it works great on Linux, it also works on many other platforms as well such as Windows and macOS.

It might not have a beautiful interface, but it wasn’t built to fill your systems memory with bloated eye candy. It was built to be fast and effective… and truly, it really is.

The newer versions coming out are really great! If you haven’t given it a try since its older releases, you really should! If you’re interested in seeing how Usenet is structured, then this is also a great tool to learn with. If you run an indexer (such as newznab or the many forks of it) you can practice your regular expressions (regexs) using Pan. For an Indexer Admin, this tool is especially great in debugging your regexs!

Credit

This blog took me a very long time to put together and test! The repository hosting alone accommodates all my blog entries up to this date. All of the custom packaging described here was done by me personally. I took the open source available to me and rebuilt it to make it an easier solution and decided to share it. If you like what you see and wish to copy and paste this HOWTO, please reference back to this blog post at the very least. It’s really all I ask.

AWStats Setup on CentOS

Introduction

AWStats is a great tool for gathering statistics about your website. It acquires everything it needs to know about your site strictly through your website’s log files. AWStats is able to scan through these logs line by line and present them in a fantastic report. This report can really help you make strategic decisions going forward as well as spot any anomalies that might be taking place. The tool is smart enough to only scan newer log entries (from when it last ran) allowing you to run it again and again (as often as you want). Thus, once you set this tool up to run daily (or even hourly), you’ll have detailed statistics about your website you can call upon anytime.

AWStats collects information such as:

  • Who is visiting your site.
  • How many visitors you’re getting daily.
  • Where they’re coming from (did a site link to you?)
  • Where the visitor is from (geographical location)
  • … and on and on

    The presentation of these collected statistics can be either via a website (HTML), XML and/or as a PDF file. The PDF is especially useful since it combines all of the multiple HTML pages (as presented) into one great big report with a table of contents and hyperlinks throughout it! The PDF is also really easy to navigate and pass along to others who might also be interested.

    Why Use AWStats over Google Analytics?

    Google Analytics Inaccuracy

    The number one reason is because AWStats is much (much) more accurate! AWStats also just works without ‘any’ changes to your website (literally, none at all). Google Analytics however requires you to add a small piece of JavaScript to every web page you want to track. Every time this tiny bit of JavaScript code executes, it passes the information along to Google. The problem is… if that little snippet of JavaScript doesn’t execute, then Google doesn’t track that user (and you’ll never know) because it just won’t get reported.

    It’s really easy to prevent this chunk of JavaScript from running too; you just have to have installed something like Adblock Plus, Disconnect and/or uBlock in your web browser (such as Firefox or Chrome). These plugins specifically block these tracking techniques and eliminate most (if not all) advertising the website might have too.

    It doesn’t mean that online analytic tools (like Google Analytics) are not good; no, not at all! But it’s important to understand that they can’t report (and truly aren’t reporting) everything that’s going on with your website and the traffic generated from it.

    Another point worth mentioning is that Google Analytics cannot monitor and report statistics on traffic generated by third party tools. Therefore you can’t use it to monitor any RESTful API services because the programs accessing them will never run these JavaScript snippets.

    It’s worth pointing out now that if you use AWStats, you’ll have the full picture! You’ll be able to easily identify any anomalies and detect certain forms of malicious intent! You’ll be able to monitor all of your internal (web based) services you may manage. From the public standpoint, you might be very surprised at how much more traffic your website is getting despite what online analytic tools will tell you!

    Let’s Get Started

    First you’ll want to install the proper packages. You should hook up to my repository and the EPEL repository as well! The EPEL repository hosts AWStats too, but mine is a newer version. We need the EPEL repository for its GeoIP packages since they get updated more often there:

    # CentOS 7 users can connect to EPEL this way:
    rpm -Uhi https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
    
    # Similarly, you can hook up to my repository at http://nuxref.com
    # but here is a quick way of doing it (for CentOS/RedHat 7):
    rpm -Uhi http://repo.nuxref.com/centos/7/en/x86_64/custom/nuxref-release-1.0.0-4.el7.nuxref.noarch.rpm
    

    You should be good to go now; the following installs AWStats and a few extra tools to get the best out of it:

    # install awstats
    # install htmldoc too because it'll allow you to create a pdf
    # install geoip-geolite for the ability to track the IPs
    #        to countries
    # install perl-Geo-IP to look up the IP Addresses
    yum install awstats htmldoc geoip-geolite perl-Geo-IP
    
    

    AWStats In a Nutshell

    The steps below will require that you have set up the environment defined below. Obviously you’ll want to change these environment variables to suit your own needs:

    # First define our website as a variable.
    # We will use this value to track and store in an
    # organized structure.
    
    # Those who host other websites for people can
    # change this value and re-run everything below
    # to get stats for those sites too!
    WEBSITE=nuxref.com
    
    # AWSTATS Variable Data
    DATADIR=/var/lib/awstats/$WEBSITE
    
    

    Configuring AWStats: Step 1 of 3

    AWStats works from configuration files you create in /etc/awstats/. But it also needs a directory it can work within (we use /var/lib/awstats/). I’ve provided documentation around each line so you know what’s going on:

    # Make sure our environment variables are defined
    # WEBSITE and DATADIR
    
    # First we need to setup our DATADIR; this is where
    # all our statistics and generated data will be placed
    # into:
    [ ! -d $DATADIR/static ] && \
        mkdir -p $DATADIR/static
    ln -snf /usr/share/awstats/wwwroot/icon \
       $DATADIR/static/icon
    ln -snf /usr/share/awstats/wwwroot/cgi-bin \
       $DATADIR/static/cgi-bin
    
    # Create a configuration file using our website
    # based on the awstats.model.conf example file that
    # ships with AWStats
    sed -e "s|localhost\.localdomain|$WEBSITE|g" \
    	/etc/awstats/awstats.model.conf > \
    		/etc/awstats/awstats.$WEBSITE.conf
    
    ########################################
    # Now update our new configuration
    ########################################
    # Update the LogFile with our access.log file we'll
    # reference. This path doesn't exist yet but
    # we'll be creating it soon enough; leave this entry
    # untouched (don't change it to your real log path!):
    sed -i -e "s|^\(LogFile\)=.*$|\1=\"$DATADIR/access.log\"|g" \
       /etc/awstats/awstats.$WEBSITE.conf
    
    # Disable DNS (for speed mostly)
    sed -i -e "s|^\(DNSLookup\)=.*$|\1=0|g" \
       /etc/awstats/awstats.$WEBSITE.conf
    
    # For PDF Generation we need to update the relative
    # paths for the icons.
    sed -i -e "s|^\(DirIcons\)=.*$|\1=\"icon\"|g" \
       /etc/awstats/awstats.$WEBSITE.conf
    sed -i -e "s|^\(DirCgi\)=.*$|\1=\"cgi-bin\"|g" \
       /etc/awstats/awstats.$WEBSITE.conf
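
    Each of the sed calls above follows the same pattern, so it can be wrapped up once and reused. Here is a minimal sketch; set_awstats_option is a made-up name, and it assumes GNU sed (for -i) and a value that contains no | or & characters:

```shell
#!/bin/sh
# Hypothetical wrapper around the sed pattern used above: replace the
# value of KEY=... in an AWStats configuration file (GNU sed assumed).
set_awstats_option() {
    conf="$1"; key="$2"; value="$3"
    # '|' is the sed delimiter, so the value must not contain one
    sed -i -e "s|^\($key\)=.*$|\1=$value|g" "$conf"
}

# Usage:
#   set_awstats_option /etc/awstats/awstats.$WEBSITE.conf DNSLookup 0
#   set_awstats_option /etc/awstats/awstats.$WEBSITE.conf DirIcons '"icon"'
```

    Keys that are missing from the file are left alone; the helper only rewrites lines that already exist.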
    
    

    Optionally Configuring GeoIP Updates

    The geolite data fetches us a great set of (meta) data we can reference when looking up IP Addresses (of people who visited our site) and determining what part of the world they came from. This information is fantastic when putting together statistics and web page traffic like AWStats does.

    First we want to configure AWStats to use the GEO IP Plugin:

    # Now configure our GEOIP Setup
    sed -i -e '/^LoadPlugin=.*/d' /etc/awstats/awstats.$WEBSITE.conf
    cat << _EOF >> /etc/awstats/awstats.$WEBSITE.conf
    LoadPlugin="geoip GEOIP_STANDARD /usr/share/GeoIP/GeoIP.dat"
    LoadPlugin="geoip_city_maxmind GEOIP_STANDARD /usr/share/GeoIP/GeoIPCity.dat"
    _EOF
    

    Next we want to set up our GEO IP to update itself with the latest meta data for us automatically (so we don’t have to worry about it):

    # downloads all of the latest GEO IP content to
    # /usr/share/GeoIP with this simple command:
    geoipupdate
    
    # This IP information changes often; so the next
    # thing you want to do is create a cronjob to have
    # this tool fetch regular updates automatically for
    # us to keep the GEO IP Content fresh and up to date!
    cat << _EOF > /etc/cron.d/geoipdate
    0 12 * * 3 root /usr/bin/geoipupdate &>/dev/null
    _EOF
    

    Apache Users: Step 2a of 3

    AWStats depends on the log files to build its statistics, so it’s important we point it to the right directory. Apache logs have been pretty much standardized and AWStats just works with them. If your web page is being hosted through Apache then your log files are most likely being placed in /var/log/httpd. If you’re using NginX (and not Apache), you can skip over this section and go to Step 2b of 3 instead.

    Make sure AWStats knows it’s dealing with Apache log files (make sure you’ve still got the $WEBSITE variable defined from above):

    # Make sure our environment variables are defined
    # WEBSITE and DATADIR
    ########################################
    # Apache Users Should run The Following
    ########################################
    # Now if your logs are created from Apache you
    # need to run the following:
    # Log Format (Type 1 is for Apache)
    sed -i -e "s|^\(LogFormat\)=.*$|\1=1|g" \
       /etc/awstats/awstats.$WEBSITE.conf
    
    

    Now what we want to do is take all of the log files associated with our website in /var/log/httpd and build one great big (sorted) log file we can get all of our statistics out of:

    # logresolvemerge.pl is a fantastic tool that ships with
    # awstats and merges (and sorts) all of our logs. We
    # place the output into our $DATADIR (which we declared
    # earlier):
    /usr/share/awstats/tools/logresolvemerge.pl \
       /var/log/httpd/access.log \
       /var/log/httpd/access.log-????????.gz \
        > $DATADIR/access.log
    
    

    NginX Users: Step 2b of 3

    NginX logs have a slightly different format than the Apache logs and therefore require a slightly different configuration to work. If your web page is being hosted through NginX then your log files are most likely being placed in /var/log/nginx. If you’re using Apache (and not NginX), then you can skip over this section as long as you’ve already done Step 2a of 3 instead.

    Make sure AWStats knows it’s dealing with NginX log files otherwise it won’t be able to interpret them. Also be sure to have your $WEBSITE variable defined:

    # Make sure our environment variables are defined
    # WEBSITE and DATADIR
    ########################################
    # NginX Users Should run The Following
    ########################################
    # If you're using NginX, you'll want to adjust
    # your awstats LogFormat entry as follows:
    sed -i -e "s|^\(LogFormat\)=.*$|\1=\"%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot\"|g" \
       /etc/awstats/awstats.$WEBSITE.conf
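
    To see how the LogFormat tokens line up with an actual log entry, here is a rough illustration. It assumes your NginX is writing its default "combined" log format; the sample line is made up and map_fields is a hypothetical helper, so treat it as a sketch rather than how AWStats itself parses anything.

```shell
# A made-up entry in NginX's default "combined" layout
LINE='203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "Mozilla/5.0"'

# map_fields is a hypothetical helper: it splits the entry on double
# quotes and prints each piece next to the LogFormat token it matches.
map_fields() {
   printf '%s\n' "$1" | awk -F'"' '{
      split($1, pre, " ")   # host, other, logname, [time, zone]
      split($3, num, " ")   # status code, bytes sent
      print "%host        = " pre[1]
      print "%other       = " pre[2]
      print "%logname     = " pre[3]
      print "%time1       = " pre[4] " " pre[5]
      print "%methodurl   = " $2
      print "%code        = " num[1]
      print "%bytesd      = " num[2]
      print "%refererquot = " $4
      print "%uaquot      = " $6
   }'
}
map_fields "$LINE"
```

    If you have customized the log_format directive in your NginX configuration, the LogFormat entry above would need to be adjusted to match.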
    

    Now we take all of the log files associated with our website in /var/log/nginx and build one great big (sorted) log file we can get all of our statistics out of:

    # logresolvemerge.pl is a fantastic tool that ships with
    # awstats and merges (and sorts) all of our logs. We
    # place the output into our $DATADIR (which we declared
    # earlier):
    /usr/share/awstats/tools/logresolvemerge.pl \
       /var/log/nginx/access.log \
       /var/log/nginx/access.log-????????.gz \
        > $DATADIR/access.log
    

    Statistic Generation: Step 3 of 3

    At this point we have all the info we need to generate our statistics:

    # Make sure our environment variables are defined
    # WEBSITE and DATADIR
    ########################################
    # (Create) and/or Update our Stats
    ########################################
    /usr/share/awstats/wwwroot/cgi-bin/awstats.pl \
       -config=$WEBSITE
    
    # The following builds us a PDF file containing all
    # of our statistics in addition to a website we can
    # optionally host if we want.
    # The following would allow you to gather statistics for
    # a given year:
    #   /usr/share/awstats/tools/awstats_buildstaticpages.pl \
    #      -config=$WEBSITE -buildpdf \
    #      -month=all -year=$(date +'%Y') \
    #      -dir=$DATADIR/static \
    #      -buildpdf=/usr/bin/htmldoc
    
    # This will build statistics with all the information we have:
    /usr/share/awstats/tools/awstats_buildstaticpages.pl \
       -config=$WEBSITE -buildpdf \
       -dir=$DATADIR/static \
       -buildpdf=/usr/bin/htmldoc
    
    # - The main website will appear as:
    #      $DATADIR/static/awstats.$WEBSITE.html
    #    But this 'main' website links to several other websites
    #    that can also all be found in the $DATADIR/static
    #    directory
    # - The pdf file will appear as:
    #      $DATADIR/static/awstats.$WEBSITE.pdf
    

    Consider throwing the above into a script file and having it run from a cron job!
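
    As a starting point, such a script might look something like the sketch below. The file name awstats-update.sh and the LOGDIR variable are my own assumptions for illustration (LOGDIR defaults to the Apache path; NginX users would set it to /var/log/nginx); the commands themselves are the ones from the steps above.

```shell
# Hypothetical location; adjust to taste (e.g. /usr/local/bin)
SCRIPT="${SCRIPT:-./awstats-update.sh}"

# The quoted '_EOF' keeps the variables unexpanded so they are
# resolved when the script runs, not when it is written.
cat << '_EOF' > "$SCRIPT"
#!/bin/sh
# Rebuild the AWStats statistics for one website.
# WEBSITE and DATADIR must be provided through the environment;
# LOGDIR (an assumption of this sketch) defaults to the Apache path.
set -e
: "${WEBSITE:?WEBSITE must be set}"
: "${DATADIR:?DATADIR must be set}"
LOGDIR="${LOGDIR:-/var/log/httpd}"

# Merge (and sort) the logs
/usr/share/awstats/tools/logresolvemerge.pl \
   "$LOGDIR"/access.log \
   "$LOGDIR"/access.log-????????.gz \
    > "$DATADIR/access.log"

# (Create) and/or update the stats
/usr/share/awstats/wwwroot/cgi-bin/awstats.pl \
   -config="$WEBSITE"

# Build the static pages and the PDF
/usr/share/awstats/tools/awstats_buildstaticpages.pl \
   -config="$WEBSITE" \
   -dir="$DATADIR/static" \
   -buildpdf=/usr/bin/htmldoc
_EOF
chmod +x "$SCRIPT"
```

    A cron entry would then call the script with WEBSITE and DATADIR set to the same values you used throughout this blog, following the same /etc/cron.d layout as the geoipupdate entry earlier.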

    Hosting The Statistics

    This step is purely optional, but here are some simple configurations you can use if you want to access these generated statistics from your browser.

    Note: I intentionally keep things simple in this section. AWStats can be configured so that you can update your statistics from its very own website (see the AllowToUpdateStatsFromBrowser directive in the site configuration). However, I don’t recommend this option and therefore do not document it below.

    NginX

    A simple NginX configuration might look like this:

    # Make sure our environment variables are defined
    # WEBSITE and DATADIR
    cat << _EOF > /etc/nginx/default.d/awstats.$WEBSITE.conf
       # If WEBSITE was equal to nuxref.com, you'd visit your
       # statistics by browsing to:
       #    http://localhost/stats/nuxref.com/
       location /stats/$WEBSITE/ {
          alias   $DATADIR/$WEBSITE/static/;
          index  awstats.$WEBSITE.html;
    
          ## Set 1.2.3.4 to your own IP address and uncomment
          ## the entries below to 'only' allow yourself access to
          ## these stats:
          # allow 1.2.3.4/32;
          # deny all;
    
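       }
    
       # Shared AWStats assets; these sit outside the block above
       # because NginX rejects a nested location that does not fall
       # under its parent prefix.
       location /stats/css/ {
           alias /usr/share/awstats/wwwroot/css/;
       }
    
       location /stats/icon/ {
           alias /usr/share/awstats/wwwroot/icon/;
       }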
    _EOF
    

    Don’t forget to reload NginX so it takes on your new configuration (and makes that statistics page visible):

    # Reload NginX
    systemctl reload nginx.service
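
    If you’d rather not reload a broken configuration, you can have NginX test it first. The sketch below wraps the two steps together; reload_nginx_checked is just a hypothetical helper name, while nginx -t is NginX’s own configuration test.

```shell
# reload_nginx_checked is a hypothetical helper name.
# 'nginx -t' parses the configuration and reports errors without
# touching the running server; the reload only happens if it passes.
reload_nginx_checked() {
   if nginx -t; then
      systemctl reload nginx.service
   else
      echo "NginX configuration test failed; not reloading" >&2
      return 1
   fi
}
```

    Call reload_nginx_checked (as root) in place of the plain reload above.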
    

    Apache

    A simple Apache configuration might look like this:

    # Make sure our environment variables are defined
    # WEBSITE and DATADIR
    cat << _EOF > /etc/httpd/conf.d/awstats.$WEBSITE.conf
       # If WEBSITE was equal to nuxref.com, you'd visit your
       # statistics by browsing to:
       #    http://localhost/stats/nuxref.com/
       Alias /stats/$WEBSITE/ "$DATADIR/$WEBSITE/static/"
       <Directory "$DATADIR/$WEBSITE/static/">
          Options FollowSymLinks
          AllowOverride None
          Order allow,deny
          Allow from all
    
          ## Set 1.2.3.4 to your own IP address and uncomment
          ## the entries below to 'only' allow yourself access to
          ## these stats:
          # Order deny,allow
          # Deny from all
          # Allow from 1.2.3.4/255.255.255.255
       </Directory>
    _EOF
    

    Don’t forget to reload Apache so it takes on your new configuration (and makes that statistics page visible):

    # Reload Apache
    systemctl reload httpd.service
    

    Credit

    This blog took me a long time to put together and test! The repository hosting alone accommodates all my blog entries up to this date. I took the open source available to me and rebuilt it to make it an easier solution and decided to share it. If you like what you see and wish to copy and paste this HOWTO, please reference back to this blog post at the very least. It’s really all I ask.

    Sources