Monitoring an HA cluster queue manager on AIX and Linux
It is usual to provide a way for the high availability (HA) cluster to monitor the state of the queue manager periodically. In most cases, you can use a shell script for this. Examples of suitable shell scripts are given here. You can tailor these scripts to your needs and use them to make additional monitoring checks specific to your environment.
- Use the crtmqenv command from an IBM WebSphere MQ 7.1 installation to identify the correct MQ_INSTALLATION_PATH for a queue manager:
crtmqenv -m qmname
This command returns the correct MQ_INSTALLATION_PATH value for the queue manager specified by qmname.
- Run the monitoring script with the appropriate qmname and MQ_INSTALLATION_PATH parameters.
#!/bin/ksh
su mqm -c "name_of_monitoring_script qmname MQ_INSTALLATION_PATH"
where MQ_INSTALLATION_PATH is an optional parameter that specifies the path to the installation of IBM MQ that the queue manager qmname is associated with.
The following script is not robust to the possibility that runmqsc hangs. Typically, HA clusters treat a hanging monitoring script as a failure and are themselves robust to this possibility.
The script does, however, tolerate the queue manager being in the starting state. This is because it is common for the HA cluster to start monitoring the queue manager as soon as it has started it. Some HA clusters distinguish between a starting phase and a running phase for resources, but it is necessary to configure the duration of the starting phase. Because the time taken to start a queue manager depends on the amount of work that it has to do, it is hard to choose a maximum time that starting a queue manager takes. If you choose a value that is too low, the HA cluster incorrectly assumes that the queue manager failed when it has not completed starting. This could result in an endless sequence of failovers.
This script must be run by the mqm user. It might therefore be necessary to wrap it in a shell script that switches the user from the HA cluster user to mqm (an example shell script is provided in Example shell scripts for starting an HA cluster queue manager on AIX and Linux):
#!/bin/ksh
#
# This script tests the operation of the queue manager.
#
# An exit code is generated by the runmqsc command:
#   0  => Either the queue manager is starting or the queue manager is running and responds.
#         Either is OK.
#   >0 => The queue manager is not responding and not starting.
#
# This script must be run by the mqm user.
QM=$1
MQ_INSTALLATION_PATH=$2
if [ -z "$QM" ]
then
  echo "ERROR! No queue manager name supplied"
  exit 1
fi
if [ -z "$MQ_INSTALLATION_PATH" ]
then
  # No path specified, assume system primary install or MQ level < 7.1.0.0
  echo "INFO: Using shell default value for MQ_INSTALLATION_PATH"
else
  echo "INFO: Prefixing shell PATH variable with $MQ_INSTALLATION_PATH/bin"
  PATH=$MQ_INSTALLATION_PATH/bin:$PATH
fi
# Test the operation of the queue manager. Result is 0 on success, non-zero on error.
echo "ping qmgr" | runmqsc ${QM} > /dev/null 2>&1
pingresult=$?
if [ $pingresult -eq 0 ]
then # ping succeeded
  echo "Queue manager '${QM}' is responsive"
  result=0
else # ping failed
  # Don't condemn the queue manager immediately, it might be starting.
  # The pattern uses alternation, so it needs extended regular expressions (grep -E).
  srchstr="( |-m)$QM *.*$"
  cnt=`ps -ef | tr "\t" " " | grep strmqm | grep -E "$srchstr" | grep -v grep \
         | awk '{print $2}' | wc -l`
  if [ $cnt -gt 0 ]
  then
    # It appears that the queue manager is still starting up, tolerate
    echo "Queue manager '${QM}' is starting"
    result=0
  else
    # There is no sign of the queue manager starting
    echo "Queue manager '${QM}' is not responsive"
    result=$pingresult
  fi
fi
exit $result
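As noted earlier, the monitoring script does not guard against runmqsc hanging and relies on the HA cluster to detect a hung monitor. If your HA cluster does not do this, one possible approach is to put a time limit on the runmqsc call. The following is a minimal sketch only, not part of IBM MQ: the function name run_with_timeout and the 30-second limit in the usage comment are assumptions made for illustration, and the helper temporarily uses file descriptor 3.

```shell
#!/bin/sh
# run_with_timeout LIMIT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND with the caller's standard input and
# kills it if it has not finished after LIMIT seconds.
# Returns COMMAND's exit status, or a non-zero status if it was killed.
run_with_timeout() {
  limit=$1
  shift

  # Background commands normally read from /dev/null, so keep a copy of
  # the caller's standard input on descriptor 3 and hand it over explicitly.
  exec 3<&0
  "$@" <&3 3<&- &
  cmd_pid=$!
  exec 3<&-

  # Watchdog: after LIMIT seconds, kill the command if it is still there.
  ( sleep "$limit"; kill -9 "$cmd_pid" ) > /dev/null 2>&1 &
  watchdog_pid=$!

  # Wait for the command (it ends normally or is killed by the watchdog).
  wait "$cmd_pid"
  status=$?

  # Stop the watchdog. Its sleep may linger until LIMIT expires, but it
  # can no longer kill anything once the watchdog itself is gone.
  kill "$watchdog_pid" 2> /dev/null
  wait "$watchdog_pid" 2> /dev/null
  return $status
}

# Possible use in the monitoring script (30 seconds is an arbitrary value):
#   echo "ping qmgr" | run_with_timeout 30 runmqsc "${QM}" > /dev/null 2>&1
#   pingresult=$?
```

With this change, a hung runmqsc produces a non-zero pingresult once the limit expires, and the script then falls through to the existing check for a starting queue manager. Choose a limit that is comfortably longer than a normal ping takes on your system.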