In my article on Ajax security tools (see Resources), I suggested some application-strengthening tools, including Firefox tools and addons, that you can use to improve or solve security problems within your Ajax applications. In a different article on speeding up your Ajax applications while dodging Web services vulnerabilities, I showed what Web services vulnerabilities are and why Service Level Agreements are important, and suggested some tools for speeding up applications on your network.
In this article, I focus on Nagios, an open source host, service, and network monitoring program for your Ajax applications. I discuss how you can quickly install and start Nagios, access CGIs, and monitor hosts and services. I also show you how you can monitor redundancy and failover, detect and handle state flapping, and solve security and performance issues.
Next on my list of topics are the core addons, such as NRPE, NSCA, and NDOUtils, that you need to use with the Nagios program. Finally, I give some samples of Nagios-based products in automation, environmental monitoring, and enterprise management solutions.
Get Nagios on Fedora
To get started, look for the quickstart guides for Fedora 6, OpenSUSE, and Ubuntu (see Resources for links) on the Nagios Web site. If you want guides for other operating systems and Linux® distributions, go to the Nagios Community and click User-Contributed Documentation in the left navigation. If you still cannot find the documentation you want, you can modify the Fedora code that I give you in this article, so you can install and configure Nagios on a non-Fedora system.
Before you install Nagios, use yum to install Apache, the GCC compiler, and the GD development libraries. Installing Nagios automatically creates the /usr/local/nagios directory to store the plug-ins, and configures Nagios to monitor the CPU load, disk usage, memory usage, and a few other aspects of your local system. After you have completed the installation successfully, you will be able to access Nagios at http://localhost/nagios/.
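Before you start the build, it can help to confirm that the prerequisite tools are actually present. The following is a minimal sketch rather than anything shipped with Nagios: missing_commands is a hypothetical helper name, and the command names passed to it in the example (httpd, gcc, make) are assumptions about what the packages place on your PATH.

```shell
#!/bin/sh
# missing_commands: print each named command that is not found on PATH.
# (Hypothetical helper for illustration; not part of the Nagios distribution.)
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Example: list whichever build tools are still missing.
# httpd usually lives in /usr/sbin, so run this as root or adjust PATH.
#   missing_commands httpd gcc make
#
# On Fedora, the packages themselves are typically installed with:
#   yum install httpd gcc glibc glibc-common gd gd-devel
```

An empty result means every listed command was found. The GD development libraries are not a command, so this check does not cover them.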
Begin by creating an account for Nagios. Become the root user, then create a new nagios user account and give it a password. Listing 1 shows the commands to create the account.
Listing 1. Create user account
su -l
/usr/sbin/useradd nagios
passwd nagios
Create a new nagcmd group to allow external commands to be submitted through the Web interface. Add both the nagios user and the apache user to this group, as shown in Listing 2.
Listing 2. Create new group
/usr/sbin/groupadd nagcmd
/usr/sbin/usermod -a -G nagcmd nagios
/usr/sbin/usermod -a -G nagcmd apache
To store the downloads, first create a directory, as shown in Listing 3.
Listing 3. Create new directory
mkdir ~/downloads
cd ~/downloads
Go to the Nagios Web site and download the Nagios source code tarball into this directory. Then, extract the tarball, as shown in Listing 4.
Listing 4. Extract source code
cd ~/downloads
tar xzf nagios-3.0.2.tar.gz
cd nagios-3.0.2
Run the Nagios configure script next, passing the name of the group you created. Then, compile the Nagios source code, as shown in Listing 5.
Listing 5. Run and compile code
./configure --with-command-group=nagcmd
make all
Install the binaries, init script, and sample config files, and set permissions on the external command directory, as shown in Listing 6.
Listing 6. Install binaries and set permissions
make install
make install-init
make install-config
make install-commandmode
The next step is to edit the /usr/local/nagios/etc/objects/contacts.cfg config file and change the e-mail address associated with the nagiosadmin contact definition to the address you'd like to use for receiving alerts. Then, install the Nagios Web config file in the Apache conf.d directory (the Nagios source tree provides a make install-webconf target for this).
Create admin account
Create a nagiosadmin account for logging into the Nagios Web interface. Restart Apache to make the new settings take effect, as shown in Listing 7.
Listing 7. Create account and restart Apache
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
service httpd restart
Extract the Nagios plug-ins source code tarball and then compile and install the plug-ins, as shown in Listing 8.
Listing 8. Extract and install plug-ins
./configure --with-nagios-user=nagios --with-nagios-group=nagios
make
make install
Add Nagios to the list of system services and have it automatically start when the system boots, as shown in Listing 9.
Listing 9. Add Nagios to system services
chkconfig --add nagios
chkconfig nagios on
Next, verify the sample Nagios configuration files. Finally, if there are no errors, start Nagios. Listing 10 shows this process.
Listing 10. Verify configuration files and start Nagios
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
service nagios start
You are not required to access the CGIs. However, if you attempt to access them, you may get "Internal Server Error" messages. That's because Fedora ships with SELinux (Security Enhanced Linux) installed and in Enforcing mode by default.
It takes two steps to fix the problem. First, check if SELinux is in Enforcing mode. Then, put SELinux into Permissive mode, as shown in Listing 11.
Listing 11. Check SELinux and put into Permissive mode
getenforce
setenforce 0
To make this change permanent, you'll have to modify the settings in /etc/selinux/config and reboot.
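As a rough sketch of that permanent change, the helper below rewrites the SELINUX= line in a file laid out like /etc/selinux/config. The function name set_selinux_mode is hypothetical, and the sketch assumes the file contains a single line beginning with SELINUX=. Try it on a copy of the file first.

```shell
#!/bin/sh
# set_selinux_mode FILE MODE
# Rewrite the SELINUX= line in FILE to MODE (for example, permissive).
# Lines such as SELINUXTYPE=... and comments are left untouched.
# (Hypothetical helper for illustration only.)
set_selinux_mode() {
    file=$1
    mode=$2
    tmp=$(mktemp) || return 1
    if sed "s/^SELINUX=.*/SELINUX=$mode/" "$file" > "$tmp"; then
        # Write back through the original file so its ownership and
        # permissions are preserved.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return $status
}

# Example (as root, then reboot):
#   set_selinux_mode /etc/selinux/config permissive
```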
Alternatively, run the CGIs under SELinux enforcing/targeted mode as shown in Listing 12.
Listing 12. Run CGIs under SELinux mode
chcon -R -t httpd_sys_content_t /usr/local/nagios/sbin/
chcon -R -t httpd_sys_content_t /usr/local/nagios/share/
When you are done, you can access the Nagios Web interface at http://localhost/nagios/.
Make sure your machine's firewall rules are configured to allow access to the Web server if you want to access the Nagios interface remotely.
After you get Nagios installed and running properly, you'll no doubt want to start monitoring more than just your local machine (your monitoring host). One way of monitoring a remote Linux/UNIX™ host is to use the NRPE addon that allows you to monitor disk usage, CPU load, memory usage, and other local resources/attributes on the remote host. See Resources for a list of monitoring links.
You'll most likely want to monitor Windows® machines, Netware servers, routers/switches, network printers, and publicly available services (HTTP, FTP, SSH, and so on).
Monitor redundancy and failover
With redundant hosts, you can maintain the ability to monitor your network when the primary host that runs Nagios fails, or when portions of your network become unreachable, which could impact SLA guarantees. Before you implement redundant monitoring, make sure you know how to implement event handlers for hosts and services, issue external commands to Nagios, execute plug-ins on remote hosts with the NRPE addon, and check the status of the Nagios process with the check_nagios plug-in. You will need to modify the sample scripts in the eventhandlers subdirectory of the Nagios distribution.
In one redundancy implementation scenario, the master and slave hosts monitor the same hosts and services on the network. Under normal circumstances, only the master host sends out notifications to contacts about problems. The slave host running Nagios takes over the job of notifying contacts about problems if the master host goes down or the Nagios process on it stops running.
Just make sure the lag time between the master host failing and the slave host taking over is minimal. You can do this by having, for example, the slave host recheck the master host frequently to allow for fast detection of problems.
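The takeover logic in this scenario can be sketched as an event handler on the slave host. This is a simplified illustration, not the sample script from the Nagios distribution: the function names are hypothetical, and it assumes the handler is attached to a service that checks the master's Nagios process and that it receives the service state and state type as arguments. ENABLE_NOTIFICATIONS and DISABLE_NOTIFICATIONS are Nagios external commands.

```shell
#!/bin/sh
# failover_command STATE STATETYPE
# Print the external command the slave should issue, if any.
# (Hypothetical helper for illustration only.)
failover_command() {
    state=$1      # OK, WARNING, CRITICAL or UNKNOWN
    statetype=$2  # SOFT or HARD
    # Act only on HARD states so one failed check does not trigger failover.
    [ "$statetype" = "HARD" ] || return 0
    case "$state" in
        CRITICAL) echo "ENABLE_NOTIFICATIONS" ;;   # master is down: slave notifies
        OK)       echo "DISABLE_NOTIFICATIONS" ;;  # master is back: slave goes quiet
    esac
}

# submit_command NAME FILE
# Append NAME to FILE in the "[timestamp] COMMAND" external command format.
submit_command() {
    printf '[%s] %s\n' "$(date +%s)" "$1" >> "$2"
}

# Example wiring (the command file path is the usual default; adjust as needed):
#   cmd=$(failover_command "$1" "$2")
#   [ -n "$cmd" ] && submit_command "$cmd" /usr/local/nagios/var/rw/nagios.cmd
```

Keeping the decision separate from the write to the command file makes the logic easy to try out without a running Nagios process.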
The basic goal of failover monitoring is to have the Nagios process on the slave host sit idle while the Nagios process on the master host is running. If the process on the master host stops running (or if the host goes down), the Nagios process on the slave host starts monitoring everything.
Detect and handle state flapping
Flapping occurs when a service or host changes state too frequently, resulting in a storm of problem and recovery notifications. Flapping can be indicative of configuration problems (such as thresholds set too low), troublesome services, or real network problems impacting SLA guarantees.
A host or service is determined to have started flapping when its percent state change first exceeds a high flapping threshold. A host or service is determined to have stopped flapping when its percent state change drops below a low flapping threshold (assuming that it was previously flapping).
For both hosts and services, there are global high and low thresholds and host- or service-specific thresholds that you can configure. Nagios will use the global thresholds for flap detection if you do not specify host- or service-specific thresholds. To enable flap detection, set the enable_flap_detection directive to 1 in the main configuration file and the flap_detection_enabled directive to 1 in the host and service definitions.
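To make the two thresholds concrete, here is a simplified sketch of the calculation. It is not how Nagios implements flap detection internally: Nagios looks at the last 21 check results and weights recent state changes more heavily, while this version counts every change equally. The function names and the threshold values in the example are made up for illustration.

```shell
#!/bin/sh
# percent_state_change STATE...
# Unweighted percentage of adjacent pairs in the history whose state differs.
# (Simplified illustration; Nagios itself weights newer changes more heavily.)
percent_state_change() {
    [ "$#" -ge 2 ] || { echo 0; return 0; }
    total=$(( $# - 1 ))
    changes=0
    prev=$1
    shift
    for s in "$@"; do
        [ "$s" = "$prev" ] || changes=$(( changes + 1 ))
        prev=$s
    done
    echo $(( changes * 100 / total ))
}

# is_flapping WAS_FLAPPING PERCENT LOW HIGH
# Prints 1 if the object should now be treated as flapping, otherwise 0.
# Flapping starts above HIGH and stops only below LOW, so values between
# the two thresholds keep the previous answer.
is_flapping() {
    was=$1; pct=$2; low=$3; high=$4
    if [ "$was" -eq 0 ] && [ "$pct" -gt "$high" ]; then
        echo 1
    elif [ "$was" -eq 1 ] && [ "$pct" -lt "$low" ]; then
        echo 0
    else
        echo "$was"
    fi
}

# Example with hypothetical thresholds of 25 (low) and 50 (high):
#   pct=$(percent_state_change 0 2 0 2 0 0 2)
#   is_flapping 0 "$pct" 25 50
```

The gap between the low and high thresholds is what keeps an object from toggling in and out of the flapping state on every check.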
Some security measures you should consider are to install Nagios for your Ajax applications on a dedicated monitoring server, and to make sure that only the nagios user can read or write in the check result directory. Do not run Nagios as root.
If you are using external commands, make sure you set proper permissions on the /usr/local/nagios/var/rw directory. Require authentication for the CGIs, and use full paths in command definitions.
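One quick way to spot overly loose permissions on that directory is to test whether users outside the owner and group have any access to it. The sketch below is an assumption about what "proper" means here, and is_world_accessible is a hypothetical helper name; adapt the rule to your own policy.

```shell
#!/bin/sh
# is_world_accessible PATH
# Succeeds (exit status 0) if "other" users have read, write, or execute
# permission on PATH itself. Uses only POSIX find options.
# (Hypothetical helper for illustration only.)
is_world_accessible() {
    [ -n "$(find "$1" -prune \( -perm -001 -o -perm -002 -o -perm -004 \) 2>/dev/null)" ]
}

# Example:
#   if is_world_accessible /usr/local/nagios/var/rw; then
#       echo "external command directory is open to other users"
#   fi
```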
Don't forget to hide sensitive information with $USERn$ macros, and secure access to remote agents. Encrypt communication channels between Nagios installations and between Nagios servers and your monitoring agents. Also important is the stripping of dangerous characters from macros before they are used in notifications.
This section discusses some things to consider when you attempt to optimize Nagios to improve server performance. First, disable environment macros, adjust buffer slots, and check service latencies to determine the best value for maximum concurrent checks. Use compiled plug-ins rather than interpreted ones, avoid scheduling regular host checks unless you really need them, and enable cached host checks.
Next, optimize hardware for maximum performance, and set the maximum time that the Nagios daemon can spend processing the results of host and service checks. Most important of all, graph performance statistics with the Multi Router Traffic Grapher (MRTG; see Resources for a link) to keep track of how well your Nagios installation handles the load over time and how your configuration changes affect it.
Get Nagios addons
Nagios comes with three core addons: NRPE, NDOUtils, and NSCA. While they give you the basic command-line options, you can add other options as listed in the Nagios Plugin Manual. See Resources for links to both the addons and manual.
The NRPE addon is designed to let you execute Nagios plug-ins on remote Linux/UNIX machines. Publicly available services such as FTP and HTTP can be checked directly from the monitoring host, but with NRPE, Nagios can also monitor the CPU load, disk usage, memory usage, and other local resources on remote machines.
Because these local resources are not usually exposed to external machines, NRPE must be installed on the remote machines. NRPE can also work with some Windows agent addons, which lets you execute scripts and check metrics on remote Windows machines.
While using SSH is more secure than the NRPE addon, SSH imposes a larger (CPU) overhead on both the monitoring and remote machines. This can become an issue when you start monitoring hundreds or thousands of machines. Many Nagios administrators opt for using the NRPE addon because of the lower load it imposes.
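Whichever transport you choose, the result of a remote check reaches Nagios the same way: as the exit status of the plug-in, where 0 means OK, 1 means WARNING, 2 means CRITICAL, and 3 means UNKNOWN. The sketch below wraps any plug-in command and prints the state name. The helper names are hypothetical, and the host and command names in the example are placeholders.

```shell
#!/bin/sh
# state_name CODE: translate a plug-in exit status into a Nagios state name.
# (Hypothetical helper for illustration only.)
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# run_check COMMAND [ARGS...]
# Run a plug-in, discard its output, and print the resulting state name.
run_check() {
    rc=0
    "$@" >/dev/null 2>&1 || rc=$?
    state_name "$rc"
}

# Example: ask the NRPE daemon on a remote host to run its check_load command.
#   run_check /usr/local/nagios/libexec/check_nrpe -H remotehost -c check_load
```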
The NDOUtils addon lets you export current and historical data of configurations and events from one or more Nagios instances to a MySQL database. Storing information from Nagios in a database will allow for quicker retrieval.
The NSCA addon is installed on the monitoring host, and lets you integrate passive alerts and checks from remote machines and applications with Nagios. This is useful for processing security alerts as well as redundant and distributed Nagios setups.
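On the remote side, a passive service check result is handed to the send_nsca client as one tab-separated line: host name, service description, return code, and plug-in output. The sketch below only formats such a line. The function name passive_result is hypothetical, and the host, service, and paths in the example are placeholders that depend on your installation.

```shell
#!/bin/sh
# passive_result HOST SERVICE CODE OUTPUT
# Print one service check result in the tab-separated form read by send_nsca.
# (Hypothetical helper for illustration only.)
passive_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Example: report a successful backup job to the monitoring host.
#   passive_result web01 "Backup Job" 0 "Backup finished" |
#       /usr/local/nagios/bin/send_nsca -H monitoring-host \
#           -c /usr/local/nagios/etc/send_nsca.cfg
```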
Look at product samples
Websensor is a digital environmental monitoring device capable of monitoring temperature, relative humidity, illumination (light level), DC voltage, and contact closure. It comes with a plug-in to allow you to monitor environmental readings with Nagios.
WebReboot is a device that allows you to restart, power-on, or power-off a server remotely without needing physical access to your server. It's useful for recovering from server lock-ups, BSODs, virus infections, and unexpected power outages.
Opengear Management Gateways and Console Servers can help you detect a problem in your IT infrastructure. They enable you to rectify the problem by providing you with secure access to and control of all servers, routers, switches, and power devices in your remote data centers.
To get information on other products, go to the Nagios Web site.
This article helps you plan ahead to improve the monitoring and performance of your Ajax applications with Nagios, an open source host, service, and network monitoring program. Because network performance is critical not only to developers, but also to testers, system administrators, and potential users, being aware of and resolving potential performance and environmental monitoring issues can make your development team's and users' experiences trouble-free.
Resources
- Explore the OASIS Consortium.
- The Work with Web services in enterprise-wide SOA series by Judith Myerson offers information on how to work with Web services in enterprise-wide SOAs.
- Browse Judith Myerson's series, Use SLAs in a Web services context for details on service-level agreements.
- Nagios comes with three core addons: NRPE, NDOUtils, and NSCA. While they give you the basic command-line options, you can add other options as listed in the Nagios Plugin Manual.
- You'll most likely want to monitor Windows machines, Netware servers, routers/switches, network printers, and publicly available services (HTTP, FTP, SSH and so on).
- Want more information on Ajax tools? Read about them in "Survey of Ajax tools and techniques" (developerWorks, July 2007).
- Use MRTG (the Multi Router Traffic Grapher) to keep track of how well your Nagios installation handles load over time and how your configuration changes affect it.
- Want more information on Nagios? Read manuals and notes on the Nagios Web site and in the Nagios Community.
- Read Judith M. Myerson's The Complete Book of Middleware, which focuses on the essential principles and priorities of system design and emphasizes the new requirements brought forward by the rise of e-commerce and distributed integrated systems.
- To get started, look for the quickstart guides for Fedora, OpenSUSE, and Ubuntu.
- Get the business insight and the technical know-how to ensure successful systems integration by reading Enterprise Systems Integration, Second Edition.
- Bring your organization into the future with RFID in the Supply Chain, which explains business processes, operational and implementation problems, risks, vulnerabilities, and security and privacy.
- Go into the nuts and bolts of developing a service-level agreement in this IBM Redbook for Domino administrators.
- Visit the technology bookstore for books on these and other technical topics.
- Want more on Web services? The developerWorks SOA and Web services zone hosts hundreds of informative articles and introductory, intermediate, and advanced tutorials on how to develop Web services applications.
- Check out the Ajax Resource Center, your one-stop shop for information on the Ajax programming model, including articles and tutorials, discussion forums, blogs, wikis, events, and news. If it's happening, it's covered here.
Get products and technologies
- See how IBM Rational ClearQuest and IBM Rational Functional Tester Plus can help when developing Ajax and other applications. These tools from IBM help increase your productivity by reducing testing time and the costs of test labs in your enterprise.
- IBM trial products for download: Build your next development project with IBM trial software, available for download directly from developerWorks.
- developerWorks blogs: Get involved in the developerWorks community.