ganglia

Updated 7/11/13, 3:49 PM by nagger

Ganglia Performance Monitoring tools on POWER

Sorry, this web page did not survive the transfer from the older AIX wiki.

It was actually pretty old and needed updating - oh well, now we have to.

In the meantime, you can take a look at the following:


What is Ganglia?

Ganglia is an excellent open source tool that uses a web server to graph performance data from large numbers of machines running various operating systems. The home web page at http://ganglia.info/ is full of information and worth a look. The SourceForge page for the open source code is at http://ganglia.sourceforge.net/, and the Ganglia page on Wikipedia gives a general description. We are not going to reproduce that content here.



Briefly: it offers a very small, lightweight data collector, simple collection of the stats to a central machine, storage of the data in fixed-size RRDtool databases, and excellent, flexible graphs at the machine level or summarised at the cluster level. Adding new config details or new stats takes seconds, and the new data is graphed automatically.



The ganglia system consists of:

  1. Two unique daemons:
    • Ganglia Monitoring Daemon (gmond), which runs on each node and collects the metrics
    • Ganglia Meta Daemon (gmetad), which polls all gmond clients and stores the collected metrics in Round-Robin Databases (RRDs) via RRDtool
  2. A PHP-based web frontend
  3. A few other small utility programs
    • gmetric - that can be used to easily extend Ganglia with additional user-defined metrics
    • gstat
    • gexec
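
As a rough illustration of how gmetric can add a user-defined metric, here is a minimal sketch. The wrapper function send_metric, the DRY_RUN switch and the metric app_queue_depth are made up for this example; --name, --value, --type and --units are gmetric options. By default the sketch only prints the command so it can be reviewed first.

```shell
# Hypothetical wrapper around gmetric; not part of Ganglia itself.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
send_metric() {
    name=$1
    value=$2
    mtype=$3
    units=$4
    # Rebuild the positional parameters as the full gmetric command line.
    set -- gmetric --name="$name" --value="$value" --type="$mtype" --units="$units"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Example metric (hypothetical name and value).
send_metric app_queue_depth 42 uint32 jobs
```

On a node where gmond is already running, calling the function with DRY_RUN=0 would run gmetric for real, and the new metric should then appear alongside the standard ones.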

Please note: “Cluster” is used here as a “logical term”!

 


How does Ganglia monitor whole Power Systems machines?

Power Systems can use the standard Ganglia releases but benefit from extra stats specific to POWER based machines. These have been created and released as add-ons to Ganglia by IBMers working in their own time. Chief among these is Michael Perzl in Germany. To monitor whole machines, we initially made every POWER machine a cluster in Ganglia, and that worked well. However, with Live Partition Mobility (LPM), where logical partitions (LPARs) / virtual machines (VMs) can jump between machines, we now use Ganglia views to get the POWER machine view.

 


POWER additional stats for Ganglia

Clean install of AIX

# oslevel -s
7100-01-06-1241

Additional stats are collected using Dynamic extensions. Briefly these are:

  1. mod_ibmpower (AIX and Linux): The Power5/6/7 extensions (23 metrics) - config file: /etc/ganglia/conf.d/ibmpower.conf
    1. capped
    2. cpu_entitlement
    3. cpu_in_lpar
    4. cpu_in_machine
    5. cpu_in_pool
    6. cpu_pool_id
    7. cpu_pool_idle
    8. cpu_used
    9. disk_read
    10. disk_write
    11. disk_iops
    12. fwversion
    13. kernel64bit
    14. lpar i.e. yes/no
    15. lpar_name
    16. lpar_num
    17. modelname
    18. oslevel
    19. serial_num
    20. smt
    21. splpar
    22. weight
  2. mod_ibmrperf (AIX and Linux):
    • IBM rPerf and SPEC CPU2006 extensions (5 metrics) - config file: /etc/ganglia/conf.d/ibmrperf.conf
  3. mod_ibmame (AIX only):
    • Power7 Active Memory Expansion (AME) extensions (11 metrics) - config file: /etc/ganglia/conf.d/ibmame.conf
  4. mod_ibmams (AIX and Linux):
    • Power6/7 Active Memory Sharing (AMS) extensions (9 metrics) - config file: /etc/ganglia/conf.d/ibmams.conf
  5. mod_ibmfc (AIX only):
    • Individual Fibre Channel devices (maximum of 4 metrics per single device) - config file: /etc/ganglia/conf.d/ibmfc.conf
  6. mod_ibmnet (AIX only):
    • Individual Ethernet devices (maximum of 4 metrics per single device) - config file: /etc/ganglia/conf.d/ibmnet.conf
  7. mod_netif (Linux only):
    • Individual Ethernet devices (maximum of 4 metrics per single device) - config file: /etc/ganglia/conf.d/ibmnet.conf (Linux)
  8. mod_aixdisk (AIX only):
    • Individual hard disk devices (maximum of 20 metrics per single device) - config file: /etc/ganglia/conf.d/aixdisk.conf
  9. mod_linuxdisk (Linux only):
    • Individual hard disk devices (maximum of 11 metrics per single device) - config file: /etc/ganglia/conf.d/linuxdisk.conf
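
To see which of these add-ons are configured on a node, something like the following sketch could be used. It only checks whether the config files named in the list exist. The function name list_power_modules is made up for this example, and the directory defaults to /etc/ganglia/conf.d.

```shell
# Hypothetical helper: report which add-on config files exist in a directory.
# The file names are taken from the module list; nothing is read or changed.
list_power_modules() {
    conf_dir=${1:-/etc/ganglia/conf.d}
    for f in ibmpower ibmrperf ibmame ibmams ibmnet aixdisk linuxdisk; do
        if [ -f "$conf_dir/$f.conf" ]; then
            echo "$f.conf: present"
        else
            echo "$f.conf: missing"
        fi
    done
}

# Check the default location.
list_power_modules
```

A file reported as missing is normal when the module does not apply to the operating system, for example aixdisk.conf on Linux.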

Downloading and installing Apache 2 (httpd)

Packages were downloaded from ftp://www.oss4aix.org/latest/aix71/ - this is the repository linked from http://perzl.org

# ls -1 *.rpm
apr-1.4.6-1.aix5.2.ppc.rpm
apr-util-1.5.1-1.aix5.1.ppc.rpm
apr-util-ldap-1.5.1-1.aix5.1.ppc.rpm
bash-4.2-12.aix5.1.ppc.rpm
bzip2-1.0.6-1.aix5.1.ppc.rpm
db4-4.7.25-2.aix5.1.ppc.rpm
expat-2.1.0-1.aix5.1.ppc.rpm
gettext-0.10.40-8.aix5.2.ppc.rpm
httpd-2.4.4-1.aix5.1.ppc.rpm
info-5.0-1.aix5.1.ppc.rpm
libiconv-1.14-2.aix5.1.ppc.rpm
openldap-2.4.23-0.3.aix5.1.ppc.rpm
openssl-1.0.1e-2.aix5.1.ppc.rpm
pcre-8.32-1.aix5.1.ppc.rpm
readline-6.2-4.aix5.1.ppc.rpm
zlib-1.2.7-2.aix5.1.ppc.rpm

Installed in one go:

# rpm -Uvh *.rpm
apr                         ##################################################
apr-util                    ##################################################
apr-util-ldap               ##################################################
bash                        ##################################################
bzip2                       ##################################################
db4                         ##################################################
expat                       ##################################################
gettext                     ##################################################
Group "apache" does not exist.
User "apache" does not exist.
httpd                       ##################################################
warning: /opt/freeware/info/dir created as /opt/freeware/info/dir.rpmnew
info                        ##################################################
Please check that /etc/info-dir does exist.
You might have to rename it from /etc/info-dir.rpmsave to /etc/info-dir.
libiconv                    ##################################################
openldap                    ##################################################
warning: /var/ssl/openssl.cnf saved as /var/ssl/openssl.cnf.rpmorig
openssl                     ##################################################
pcre                        ##################################################
readline                    ##################################################
zlib                        ##################################################

You will find the Apache control file here: /opt/freeware/etc/httpd/conf/httpd.conf

And the default web pages here: /var/www/htdocs

I would recommend making that directory its own file system.

Or change the control file to refer to a dedicated file system, perhaps /webpages, to make it very clear.

Look for: DocumentRoot "/var/www/htdocs"
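
To check what the control file currently points at before moving anything, a small sketch like the one below may help. The function name document_root is made up for this example. It only reads the file and prints the active (uncommented) DocumentRoot value.

```shell
# Hypothetical helper: print the DocumentRoot value from an Apache config file.
document_root() {
    conf=${1:-/opt/freeware/etc/httpd/conf/httpd.conf}
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    # Print only uncommented DocumentRoot lines, with surrounding quotes removed.
    sed -n 's/^[[:space:]]*DocumentRoot[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$conf"
}

# Usage on the web server:
#   document_root
#   df -k "$(document_root)"    # shows which file system holds the web pages
```

If DocumentRoot is changed to something like /webpages, remember that the same path usually appears a second time in httpd.conf, in the Directory section that grants access to it.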

Downloading and installing Apache 2 mod_php

This adds PHP support to the web server.

Packages were downloaded from ftp://www.oss4aix.org/latest/aix71/ (the repository linked from http://perzl.org). Run the following command:

# rpm -Uvh mod_php_ap24-5.4.13-1.aix5.1.ppc.rpm \
php-common-5.4.13-1.aix5.1.ppc.rpm \
curl-7.27.0-1.aix5.1.ppc.rpm \
gd-2.0.35-5.aix5.1.ppc.rpm \
libmcrypt-2.5.8-2.aix5.1.ppc.rpm \
libtool-ltdl-1.5.26-2.aix5.1.ppc.rpm \
libXpm-3.5.10-2.aix6.1.ppc.rpm \
libxml2-2.9.0-1.aix5.1.ppc.rpm \
t1lib-5.1.2-1.aix5.1.ppc.rpm \
freetype2-2.4.11-1.aix5.1.ppc.rpm \
libjpeg-9-1.aix5.1.ppc.rpm \
libpng-1.6.1-1.aix5.1.ppc.rpm \
libidn-1.26-1.aix5.1.ppc.rpm \
libssh2-1.4.3-1.aix5.1.ppc.rpm \
fontconfig-2.8.0-2.aix5.1.ppc.rpm \
libtool-1.5.26-2.aix5.1.ppc.rpm \
xorg-compat-aix-1.1-1.aix5.1.ppc.rpm \
xz-libs-5.0.4-1.aix5.1.ppc.rpm \
lzma-libs-4.32.7-1.aix5.1.ppc.rpm \
automake-1.13.1-1.aix5.1.ppc.rpm \
autoconf-2.69-1.aix5.1.ppc.rpm \
sed-4.2.2-1.aix5.1.ppc.rpm \
grep-2.14-1.aix5.1.ppc.rpm \
pkg-config-0.28-1.aix5.1.ppc.rpm \
m4-1.4.16-1.aix5.1.ppc.rpm \
glib2-2.36.0-1.aix5.1.ppc.rpm \
libsigsegv-2.10-1.aix5.2.ppc.rpm \
libffi-3.0.13-1.aix5.1.ppc.rpm \
libgcc-4.7.2-1.aix7.1.ppc.rpm

mod_php_ap24                ##################################################
Please restart your web server using: '/opt/freeware/sbin/apachectl restart'
php-common                  ##################################################
curl                        ##################################################
gd                          ##################################################
libmcrypt                   ##################################################
libtool-ltdl                ##################################################
libXpm                      ##################################################
libxml2                     ##################################################
t1lib                       ##################################################
freetype2                   ##################################################
libjpeg                     ##################################################
libpng                      ##################################################
libidn                      ##################################################
libssh2                     ##################################################
fontconfig                  ##################################################
libtool                     ##################################################
xorg-compat-aix             ##################################################
xz-libs                     ##################################################
lzma-libs                   ##################################################
automake                    ##################################################
autoconf                    ##################################################
sed                         ##################################################
grep                        ##################################################
pkg-config                  ##################################################
m4                          ##################################################
libgcc                      ##################################################
libffi                      ##################################################
glib2                       ##################################################
libsigsegv                  ##################################################

Downloading and installing rrdtool

Downloaded from perzl.org:

libart_lgpl-2.3.21-1.aix5.1.ppc.rpm
rrdtool-1.2.30-3.aix5.1.ppc.rpm

# rpm -Uvh rrdtool-1.2.30-3.aix5.1.ppc.rpm libart_lgpl-2.3.21-1.aix5.1.ppc.rpm
rrdtool                     ##################################################
libart_lgpl                 ##################################################
#
# rrdtool
RRDtool 1.2.30  Copyright 1997-2008 by Tobias Oetiker <tobi@oetiker.ch>
               Compiled Jul  4 2011 09:47:49

Usage: rrdtool [options] command command_options

Valid commands: create, update, updatev, graph, dump, restore,
                last, lastupdate, first, info, fetch, tune,
                resize, xport

RRDtool is distributed under the Terms of the GNU General
Public License Version 2. (www.gnu.org/copyleft/gpl.html)

For more information read the RRD manpages

#

Excellent, we now have the pre-reqs for Ganglia.

 

 

BELOW I FAILED TO INSTALL THE LATEST rrdtool 1.4.7 - we are still working on this as there are many more pre-reqs and some fail to install

 

Download from the same place

atk-1.32.0-1.aix5.1.ppc.rpm
cairo-1.12.14-1.aix5.1.ppc.rpm
dejavu-lgc-sans-mono-fonts-2.33-1.aix5.1.noarch.rpm
dejavu-sans-mono-fonts-2.33-1.aix5.1.noarch.rpm
gtk2-2.20.1-2.aix5.1.ppc.rpm
jasper-1.900.1-2.aix5.1.ppc.rpm
jbigkit-libs-2.0-2.aix5.1.ppc.rpm
libXrender-0.9.7-2.aix6.1.ppc.rpm
libart_lgpl-2.3.21-1.aix5.1.ppc.rpm
libcroco-0.6.5-1.aix5.1.ppc.rpm
libdatrie-0.2.4-1.aix5.1.ppc.rpm
libdbi-0.8.4-1.aix5.1.ppc.rpm
librsvg2-2.34.2-1.aix5.1.ppc.rpm
libthai-0.1.18-1.aix5.1.ppc.rpm
libtiff-4.0.3-1.aix5.1.ppc.rpm
libxcb-1.7-1.aix5.1.ppc.rpm
lzo-2.06-1.aix5.1.ppc.rpm
pango-1.24.5-1.aix5.1.ppc.rpm
pixman-0.28.2-1.aix5.1.ppc.rpm
rrdtool-1.4.7-2.aix5.1.ppc.rpm

Install

# rpm -Uvh *
atk                         ##################################################
libxcb                      ##################################################
lzo                         ##################################################
pixman                      ##################################################
cairo                       ##################################################
dejavu-lgc-sans-mono-fonts  ##################################################
dejavu-sans-mono-fonts      ##################################################
libdatrie                   ##################################################
libthai                     ##################################################
libXrender                  ##################################################
pango                       ##################################################
var/opt/freeware/tmp/rpm-tmp.8005: 6356998 Memory fault(coredump)
execution of pango-1.24.5-1 script failed, exit status 139
jbigkit-libs                ##################################################
libtiff                     ##################################################
jasper                      ##################################################
gtk2                        ##################################################
/opt/freeware/bin/update-gdk-pixbuf-loaders[13]: 6357000 Memory fault(coredump)
/opt/freeware/bin/update-gtk-immodules[13]: 6357002 Memory fault(coredump)
execution of gtk2-2.20.1-2 script failed, exit status 139
libart_lgpl                 ##################################################
libcroco                    ##################################################
libdbi                      ##################################################
librsvg2                    ##################################################
rrdtool                     ##################################################

Darn, a few post-install script errors, so check whether the packages ended up installed:

# rpm -qa | grep pango
pango-1.24.5-1
# rpm -qa | grep gtk2
gtk2-2.20.1-2

# rrdtool
Memory fault(coredump)

Download and Install Ganglia for AIX

The software required on the monitored AIX or PowerLinux LPARs/VMs is simple, but the data repository and web server need quite a lot of software.

Ganglia rrdtool host and webserver - the central host

Apache 2.4, PHP and rrdtool installed above.

Now start to install Ganglia itself - first the gmetad daemon, which collects the gmond stats from the cluster and records the data in the rrdtool databases.

# ls -1
ganglia-gmetad-3.4.0-1.aix5.3.ppc.rpm
ganglia-gmond-3.4.0-1.aix5.3.ppc.rpm
ganglia-lib-3.4.0-1.aix5.3.ppc.rpm
libconfuse-2.7-1.aix5.1.ppc.rpm
# rpm -Uvh *
ganglia-gmetad              ##################################################
ganglia-gmond               ##################################################
ganglia-lib                 ##################################################
libconfuse                  ##################################################
#

Now we have to decide the amount of data to hold in the rrdtool databases.

I decided to go with the default, so I edited the config file: vi /etc/ganglia/gmetad.conf

My machine is called gold, so I changed

data_source "my cluster" localhost

to

data_source "gold" localhost

and uncommented (removed the "#" at the start of the line) the following line

  • RRAs "RRA:AVERAGE:0.5:1:5856" "RRA:AVERAGE:0.5:4:20160" "RRA:AVERAGE:0.5:40:52704"

The comments in the file say this is 5856 data points at 15 seconds (about the last 24 hours), then two weeks at 1 minute and then a year's worth at 10 minutes. This sounded like a good compromise to me - more data points mean more data and more CPU time generating graphs at run time.
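
The arithmetic behind those three RRA definitions can be checked with a short sketch. It assumes the 15 second base step mentioned in the config file comments. The function name rra_seconds is made up for this example.

```shell
# Retention of one RRA in seconds: base step * steps per row * number of rows.
rra_seconds() {
    echo $(( $1 * $2 * $3 ))
}

# "steps-per-row rows" pairs taken from the RRAs line in gmetad.conf.
for rra in "1 5856" "4 20160" "40 52704"; do
    set -- $rra
    secs=$(rra_seconds 15 "$1" "$2")
    echo "RRA:AVERAGE:0.5:$1:$2 -> one point per $((15 * $1))s, kept for $((secs / 3600)) hours"
done
```

This gives 24 hours (rounded down from 24.4) at 15 seconds, 336 hours (14 days) at 1 minute and 8784 hours (366 days) at 10 minutes, which matches the comments.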

 

Next make sure the directory for rrdtool files is owned by user "nobody" and start gmetad with:

# chown -R nobody:nobody /var/lib/ganglia/rrds
# /etc/rc.d/init.d/gmetad restart
Shutting down GANGLIA gmetad daemon... done.
Starting GANGLIA gmetad... done.
# /etc/rc.d/init.d/gmetad status
GANGLIA gmetad daemon is running with PID 8847424.
#

Next we install the Apache front end part of Ganglia and sort out the file permissions so that the Apache web server can access the directories and files, as we install as the root user.

# rpm -Uvh ganglia-gweb-3.5.4-1.aix5.1.noarch.rpm
# chown -R apache:apache /var/www/htdocs/ganglia
# mkdir /var/lib/ganglia/dwoo/compiled
# mkdir /var/lib/ganglia/dwoo/cache
# chown -R apache:apache /var/lib/ganglia

Now you can point your web browser at http://<your-hostname>/ganglia

There is no data collection yet, so the graphs are empty but we are "good to go" to the next part.

 

Next is running gmond. Edit its config file: vi /etc/ganglia/gmond.conf

Changed

cluster {
  name = "unspecified"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

to

cluster {
  name = "gold"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

Start gmond collecting data for the current node and supplying it to gmetad

# /etc/rc.d/init.d/gmond restart

Now wait a minute and check that the gmetad and gmond processes are running: ps -ef | grep gm

Assuming they are, refresh the web browser page and you should see the first graphs.



 

Wow!! We have some data appearing - now it's best to leave it for, say, 10 minutes so that the graphs slowly start to look more impressive.
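
The earlier "ps -ef | grep gm" check also matches the grep command itself, so it can look as if something is running when it is not. A slightly stricter check might look like the sketch below. The function name check_ganglia_daemons is made up for this example, and it reads ps style output from standard input.

```shell
# Hypothetical helper: report whether gmond and gmetad appear in ps output on stdin.
# Returns 0 only if both daemons were found.
check_ganglia_daemons() {
    ps_out=$(cat)
    status=0
    for d in gmond gmetad; do
        # Drop any grep processes first, then look for the daemon name.
        if printf '%s\n' "$ps_out" | grep -v grep | grep -q "$d"; then
            echo "$d: running"
        else
            echo "$d: NOT running"
            status=1
        fi
    done
    return $status
}

# Usage on the central host:
#   ps -ef | check_ganglia_daemons
```

On a monitored LPAR/VM that only runs gmond, the gmetad line will say NOT running, which is expected there.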

 

 

 

Ganglia monitored LPAR/VM - each node of the machine

 


Set-up for AIX

 


Screen shots of sample graphs on Power Systems