Monitoring Linux Systems with Nagios

17 February 2016

I don’t blog over here very much, partly because I try to keep the blog at Develop CENTS updated on a regular basis (although admittedly, I still don’t even blog over there nearly as often as I should). My topics on this website are more personal in nature, including my feelings on public policy (NSA Surveillance, anyone?), requests for public help (I’m looking for some missing family wedding photos taken in Germany in 1946), and posts on computer security that wouldn’t be a great fit for the Develop CENTS blog.

Nagios is extremely versatile, and can monitor just about anything. I got my first taste of Nagios when I worked as an Operations Intern at Acquia, a Drupal services company in Boston. This was after I spent a year in AmeriCorps working with a Boston nonprofit as a web developer and one of their server administrators.

In today’s post, I’m going to share some of my accumulated knowledge in using Nagios to monitor the infrastructure we manage through Develop CENTS. I recently (in December 2015) gave a presentation to the ChaDevOps Meetup Group on a Basic Introduction to Nagios. You can view all of my workshops & presentations at https://developcents.com/knowledge-base/#past-workshops.

Up until recently, I only used Nagios to monitor public services (namely, whether a URL loads properly and whether the server responds to ICMP pings). Within the last 2 months, I’ve expanded my basic Nagios implementation to use NRPE for monitoring server load, memory usage, and postfix mail queues on various servers.

The Setup

I run all of my infrastructure on CentOS. Most of the servers I manage are running either CentOS 6 or 7, although I still have a couple of legacy CentOS 5 machines under my control. Instead of compiling Nagios from source (who wants to maintain that?), I’ve opted to use the packages from the EPEL repository.

Here’s my setup:

  • EPEL Repo (For CentOS 7, you can install it with `rpm -Uvh http://ftp.linux.ncsu.edu/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm`)

  • After you do a `yum install nagios nagios-plugins-all nagios-nrpe`, you can find the relevant Nagios files as follows:
    • Main config and conf.d directory is in /etc/nagios/
    • Plugins are located in /usr/lib64/nagios/plugins
    • NRPE config is at /etc/nagios/nrpe.cfg

The Monitoring

Here are some of the things that I’m monitoring:

  • Checking for correct DNS values on various hosts
    • check_dns -H host [-s server] [-a expected-address] [-A] [-t timeout] [-w warn] [-c crit] — http://nagios-plugins.org/doc/man/check_dns.html
    • This doesn’t require NRPE, and is a simple check from the monitoring server. Here’s my service definition:

      define service{
          host_name               ns1.developcents.com
          service_description     DNS Check
          check_command           check_dns!ns1.developcents.com
          contact_groups          admins
          max_check_attempts      3
          check_interval          10
          retry_interval          5
          check_period            24x7
          notification_interval   30
          notification_period     24x7
      }

  • Checking to see if server load is reasonable
    • check_load [-r] -w WLOAD1,WLOAD5,WLOAD15 -c CLOAD1,CLOAD5,CLOAD15 — http://nagios-plugins.org/doc/man/check_load.html
    • This does require NRPE. Here’s my service definition on the monitoring server:

      define service{
          host_name               mail.developcents.com
          service_description     Server Load
          contact_groups          admins
          check_command           check_nrpe!check_load
          check_interval          4
          retry_interval          1
          max_check_attempts      3
          check_period            24x7
          notification_period     24x7
      }

    • And here’s my NRPE command (found in nrpe.cfg) on the server that is being monitored:

      command[check_load]=/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 30,25,20

  • Checking the Mail Queue to make sure it’s not clogged
    • This is a 3rd party plugin not included in the default nagios-plugins-all package provided by EPEL. The plugin information is at https://exchange.nagios.org/directory/Plugins/Email-and-Groupware/Postfix/check_postfix_queue/details.
    • Here’s my service definition on the monitoring server:

      define service{
          host_name               mail.developcents.com
          service_description     Mail Queue
          contact_groups          admins
          check_command           check_nrpe!check_queue
          check_interval          4
          retry_interval          1
          max_check_attempts      3
          check_period            24x7
          notification_period     24x7
      }

    • And here’s my NRPE command (again, note that this goes into nrpe.cfg on the server that is actually being monitored):

      command[check_queue]=/usr/lib64/nagios/plugins/check_postfix_queue -w 15 -c 30
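
Both NRPE commands above pass a warning threshold with -w and a critical threshold with -c, and the plugin reports the result through its exit code (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN). Here is a rough Python sketch of that mapping, just to illustrate the idea. It is not the source of either plugin. The function names are made up for this example, the load comparison assumes a value has to be strictly above its threshold, the queue comparison assumes that reaching the threshold is enough, and the queue parser assumes the summary line that `postqueue -p` prints.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def parse_triplet(text):
    """Turn a '15,10,5' style argument into (1 min, 5 min, 15 min) floats."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated values")
    return tuple(parts)


def load_state(loads, warn, crit):
    """Compare 1/5/15-minute load averages with per-interval thresholds.

    Assumption: a load average must be strictly above its threshold.
    """
    if any(load > limit for load, limit in zip(loads, crit)):
        return CRITICAL
    if any(load > limit for load, limit in zip(loads, warn)):
        return WARNING
    return OK


def queue_size(postqueue_output):
    """Pull the message count out of `postqueue -p` style output."""
    if "Mail queue is empty" in postqueue_output:
        return 0
    match = re.search(r"in (\d+) Requests?\.", postqueue_output)
    if match is None:
        raise ValueError("could not find a queue summary line")
    return int(match.group(1))


def queue_state(count, warn, crit):
    """Assumption: reaching a threshold (>=) is enough to trip it."""
    if count >= crit:
        return CRITICAL
    if count >= warn:
        return WARNING
    return OK


if __name__ == "__main__":
    warn = parse_triplet("15,10,5")
    crit = parse_triplet("30,25,20")
    # A 1-minute load of 16.2 is over the 15 warning mark but under 30.
    print(STATE_NAMES[load_state((16.2, 4.0, 1.5), warn, crit)])
    # 31 queued messages is over the critical mark of 30.
    sample = "-- 42 Kbytes in 31 Requests."
    print(STATE_NAMES[queue_state(queue_size(sample), 15, 30)])
```

With the thresholds from my nrpe.cfg, the first line prints WARNING and the second prints CRITICAL.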

I hope that this information is useful to someone! You can also find some of my Nagios-related questions & answers on ServerFault and StackOverflow:

  • My question and answer on how to monitor URLs: http://stackoverflow.com/questions/9246557/monitoring-urls-with-nagios/
  • My question and answer on how to monitor hosts with check_ping: http://stackoverflow.com/questions/26746404/nagios-monitoring-hosts-with-check-ping
  • My answer on how to run a check from the CLI: http://serverfault.com/questions/339968/how-can-i-manually-run-a-nagios-check-from-the-command-line/339969#339969
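
Running a check by hand boils down to executing the plugin's command line and looking at its exit code and first line of output. If you would rather script that than type it, here is a minimal Python sketch. The `run_check` helper is a name I made up for this example and is not part of Nagios or NRPE, and the plugin path in the usage section assumes the EPEL layout described earlier.

```python
import shlex
import subprocess

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(command_line, timeout=30):
    """Run one plugin command line; return (state name, first output line)."""
    try:
        result = subprocess.run(
            shlex.split(command_line),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A missing plugin or a hung check is reported as UNKNOWN here.
        return "UNKNOWN", str(exc)
    lines = result.stdout.splitlines()
    first_line = lines[0] if lines else ""
    # Any exit code outside 0-3 is also treated as UNKNOWN.
    return STATES.get(result.returncode, "UNKNOWN"), first_line


if __name__ == "__main__":
    state, text = run_check(
        "/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 30,25,20"
    )
    print(state, "-", text)
```

On a machine without the plugin installed, this reports UNKNOWN along with the error message instead of crashing.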

Want to share some of your Nagios knowledge? Leave me a comment.

Want me to help you with your Nagios – or other sysadmin – needs? Get a hold of me through Develop CENTS.


Please support TN HB1303 / SB1134

Published on March 31, 2015

This is a letter I wrote last week to my Tennessee state Representative and state Senator, and I wanted to make it public.

Representative Gravitt and Senator Gardenhire,

My name is David White, and I live in your district. I’m the Founder of Develop CENTS, an IT consulting company that works with nonprofits and small businesses. We’re a member of the Chattanooga Area Chamber of Commerce. Thank you for your service in the State Legislature.

I’m asking for your support of TN HB1303 / SB1134: “AN ACT to amend Tennessee Code Annotated, Title 7, Chapter 52, relative to additional authorization to provide broadband and Internet services.”

As you know, Chattanooga is becoming a world leader in technology services and technology infrastructure. The job growth and economic success our city sees is a direct result of EPB’s investment in high-speed internet.

Huge internet service providers like Comcast provide low-quality services, and their customer support isn’t any better. Without competition coming from organizations such as EPB, these services won’t improve for everyone.

This improvement is hindered as a direct result of Tennessee law that currently blocks EPB’s broadband expansion.

As an IT consultant, I work with several nonprofit organizations, and I see firsthand how beneficial a reliable broadband connection can be. Businesses, nonprofit organizations, and individuals deserve a choice, and oftentimes there are only subpar choices available.

EPB’s expansion would improve these choices, increase market competition, and increase the overall quality of internet for thousands of Tennesseans.

I urge you to support the bill in your respective Chamber.

Sincerely,
David White