Technical Blog Post
Abstract
Managing Unplanned Downtime with IBM PowerHA SystemMirror SE 7.2
IBM® PowerHA® SystemMirror technology provides high availability, business continuity, and disaster recovery. It delivers near-continuous application availability through advanced failure detection, failover, and recovery capabilities, and offers robust performance along with a simplified user interface for configuring, managing, and monitoring multi-node clusters.
Table of Contents
- Start
- Stop
- Monitor
- Install & Configure
- Prerequisites
- References
Start
To start the cluster and bring online all resources that are managed by the cluster:
# clmgr start cluster
Stop
To stop the cluster on all nodes:
# clmgr offline cluster
To stop the cluster but leave the services running (unmanaged):
# clmgr offline cluster MANAGE=unmanage
Monitor
To check the status of a cluster:
# clmgr query cluster | grep STATE
# clmgr -a STATE query cluster
To check the status of resource groups:
# clmgr query resource_group
# clRGinfo -v
To check the status of the cluster manager:
# lssrc -ls clstrmgrES
To check the configuration:
# clmgr query <cluster CLASS object to query>
# cllsif
# cllscf
# cllsdisk -g <Resource Group>
# cllsvg -g <Resource Group>
# cllsserv
# cllsres
# cllsgrp
# cllsfs
# clshowres
# clshowsrv
# cltopinfo
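The cluster state can also be checked from a script. The helper below is a minimal sketch that parses output in the form STATE="STABLE", as produced by clmgr -a STATE query cluster; the function name is illustrative:

```shell
#!/bin/ksh
# Hypothetical helper: report whether the cluster state is STABLE.
# Expects input in the form produced by: clmgr -a STATE query cluster
check_state() {
    state=$(echo "$1" | sed -n 's/^STATE="\(.*\)"$/\1/p')
    if [ "$state" = "STABLE" ]; then
        echo "OK"
    else
        echo "NOT STABLE: ${state:-unknown}"
    fi
}

# In a real check this would be: check_state "$(clmgr -a STATE query cluster)"
check_state 'STATE="STABLE"'
```

A wrapper like this can feed an external monitoring tool that alerts when the state is anything other than STABLE.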
Install & Configure
To create a basic dual-node PowerHA SystemMirror v7.2.2 Standard Edition cluster, with a single resource group containing one application controller, one service IP, and one shared volume group of SAN LUN hdisks:
- (optional) Install the PowerHA SystemMirror GUI server.
- Configure the AIX LPAR cluster nodes.
- Zone and mask the cluster shared SAN LUNs to all WWPNs for all the nodes.
- Create the cluster.
- Add nodes.
- Add repository disk(s).
- Add service IP.
- Add application controller.
- Add resource group.
- Synchronize cluster.
- Test and verify cluster functionality.
Install the PowerHA SystemMirror GUI server
With PowerHA SystemMirror v7.2.2 Standard Edition it is recommended to install the GUI server on a separate small AIX LPAR.
If the AIX LPAR does not have Internet access, download the prerequisite RPMs and place them in the /var/hacmp/log/smuiinst.downloads/ directory (the list below is for PowerHA SystemMirror 7.2.2; check the smuiinst.ksh script for future levels):
http://www.bullfreeware.com/download/bin/1255/info-4.13-3.aix5.3.ppc.rpm
http://www.bullfreeware.com/download/bin/2191/cpio-2.11-2.aix6.1.ppc.rpm
http://www.bullfreeware.com/download/bin/1267/readline-6.2-2.aix5.3.ppc.rpm
http://www.bullfreeware.com/download/bin/1260/libiconv-1.13.1-2.aix5.3.ppc.rpm
http://www.bullfreeware.com/download/bin/1231/bash-4.2-5.aix5.3.ppc.rpm
http://www.bullfreeware.com/download/bin/1250/gettext-0.17-6.aix5.3.ppc.rpm
http://www.bullfreeware.com/download/bin/2287/libgcc-4.9.2-1.aix6.1.ppc.rpm
http://www.bullfreeware.com/download/bin/2295/libgcc-4.9.2-1.aix7.1.ppc.rpm
http://www.bullfreeware.com/download/bin/2289/libstdc++-4.9.2-1.aix6.1.ppc.rpm
http://www.bullfreeware.com/download/bin/2297/libstdc++-4.9.2-1.aix7.1.ppc.rpm
Install the RPMs:
# /usr/es/sbin/cluster/ui/server/bin/smuiinst.ksh -i
The options are:
-D // for debug
-d // only do the download
-i // only do the install
-f // force
-p <proxy url> // proxy
-P <secure proxy url> // secure proxy
-h // help
-v // verbose
Install the GUI server:
# loopmount -i POWERHA_SYSTEMMIRROR_V7.2.2_SE.iso -o "-V cdrfs -o ro" -m /mnt
# installp -gYXd /mnt/smui_server/cluster.es.smui.server_722 cluster.es.smui.server cluster.es.smui.common
Log in to the GUI server with a web browser:
https://HostName:8080/#/login
Configure the AIX LPAR cluster nodes
Add all cluster node IP addresses and symbolic names to the /etc/hosts file
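For example, /etc/hosts entries for a two-node cluster might look like the following (all addresses below are illustrative placeholders; the names match the example cluster used in this post):

```
# Boot (base) addresses of the cluster nodes
10.1.1.11   lpar1
10.1.1.12   lpar2
# Service address, managed by PowerHA
10.1.1.20   cl1srvip1
```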
Add all cluster node boot IP addresses (or symbolic names from /etc/hosts file) to:
- /usr/es/sbin/cluster/etc/rhosts
- /etc/cluster/rhosts
Add filesystems to be exported over NFS, whether to other cluster nodes or to non-cluster servers, to:
- /usr/es/sbin/cluster/etc/exports
Such as:
- /userdata -sec=sys,rw,access=lpar99,root=lpar99 // rw access for lpar99; root= also grants root access
Enable CRITICAL VG on each LPAR's rootvg:
# chvg -r y rootvg
Enable poll_uplink on each PowerVM VIOS backed network interface device:
# chdev -a poll_uplink=yes -l ent0 -P
Assign a PVID to the repository disk, on one of the cluster node LPARs:
# chdev -a pv=yes -l hdisk33
Check the AIX level:
# oslevel -s
Check and install the prerequisite filesets for PowerHA SystemMirror:
- bos.cluster.rte
- rsct.compat.basic.hacmp
- rsct.compat.clients.hacmp
Install PowerHA SystemMirror:
# loopmount -i POWERHA_SYSTEMMIRROR_V7.2.2_SE.iso -o "-V cdrfs -o ro" -m /mnt
# installp -gYXd /mnt/installp/ppc cluster.es.server // and additional filesets as required, such as the GUI Agent cluster.es.smui.common, or for cluster NFS also cluster.es.nfs.rte
Install any PowerHA SystemMirror fixes and check level:
# halevel -s
Zone and mask the cluster shared SAN LUNs to all WWPNs for all the nodes
- IF using NPIV, zone and mask to both WWPNs of each LPAR's NPIV adapters, for all cluster nodes on all Managed Systems
- IF using VSCSI, zone and mask to the WWPN of each VIOS physical FC adapter, for all VIOS on all Managed Systems
Check that all the shared LUN hdisk devices are available on all cluster nodes (compare the UUIDs):
# lspv -u
Check that all the shared LUN hdisk devices have no_reserve set on all cluster nodes:
# lsdev -Cc disk -Fname | xargs -i lsattr -Pl {} -a reservation_policy // check that no_reserve is set on the shared cluster disks
# lsdev -Cc disk -Fname | xargs -i chdev -Pl {} -a reservation_policy=no_reserve // change to no_reserve on the shared cluster disks for next boot/load
Check that none of the shared disks are currently reserved on any cluster node:
# lsdev -Cc disk -Fname | xargs -i devrsrv -c query -l {} // query the current reservation of the shared cluster disks; ensure they are not locked
Create the cluster
Ensure the PATH environment variable is set and exported, such as:
# export PATH=$PATH:/usr/es/sbin/cluster/utilities
# clmgr add cluster CL1 nodes=LPAR1 HEARTBEAT_TYPE=unicast
Add nodes
# clmgr add node LPAR2
Add repository disk(s)
# clmgr add repository hdisk33
Add service IP
Ensure the IP and hostname are present in /etc/hosts on all cluster nodes:
# clmgr add service_ip CL1SRVIP1
Add application controller
Ensure the application start, stop (and monitor) scripts are available on all cluster nodes and have execute permission (chmod +x <script>):
# clmgr add application_controller CL1AC1 startscript=/usr/local/pbin/startAC1.sh stopscript=/usr/local/pbin/stopAC1.sh
IF not using IBM-supported Smart Assist scripts, ensure the custom scripts exit in a controlled way and that each command execution within them is checked for success. If a custom script exits with a non-zero exit code, manual intervention is required to clear the config_too_long condition this can result in, such as with clruncmd LPAR1 (if the issue occurred on cluster node LPAR1).
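A custom start script following this pattern might look like the sketch below. The log path and the placeholder commands are illustrative assumptions; replace them with the real pre-checks and application start command:

```shell
#!/bin/ksh
# Hypothetical start script skeleton for application controller CL1AC1.
# Every step is checked; any failure exits non-zero in a controlled way,
# so PowerHA can report the failed event instead of waiting indefinitely.
LOG=${LOG:-/tmp/startAC1.log}    # illustrative log path

start_step() {
    desc="$1"; shift
    echo "$(date) starting: $desc" >> "$LOG"
    "$@" >> "$LOG" 2>&1
    rc=$?
    if [ $rc -ne 0 ]; then
        echo "$(date) FAILED ($rc): $desc" >> "$LOG"
        exit $rc    # controlled non-zero exit
    fi
    return 0
}

start_step "sanity check" true        # replace 'true' with real pre-checks
start_step "start application" true   # replace 'true' with the real start command
# a real script would end with: exit 0
```

The matching stop script can reuse the same step wrapper, so every action is logged and any failure surfaces immediately rather than causing a hang.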
Add resource group
Create the resource group and connect with the service IP and application controller:
# clmgr add rg CL1RG1 nodes=LPAR1,LPAR2 service_label=CL1SRVIP1 application=CL1AC1
To create the resource group including a shared volume group:
# clmgr add rg CL1RG1 nodes=LPAR1,LPAR2 service_label=CL1SRVIP1 volume_group=cl1vg1 application=CL1AC1
Synchronize cluster
Synchronize the cluster and allow fixing any configuration issues on all nodes:
# clmgr sync cluster FIX=YES
Create a configuration snapshot:
# clmgr add snapshot CL1$(date +"%Y%m%d")
# clsnapshotinfo
Test and verify cluster functionality
Start by moving the resource group between the cluster nodes. After each move, check that the service IP is assigned properly, that the application is started, that all volume groups with filesystems and any NFS exports or NFS mounts are properly in place, and that the hacmp.out logfile and other logs do not show any error conditions.
# for i in 2 1 2 1 2 1 2 1 2;do clRGmove -g CL1RG1 -n LPAR$i; for j in 1 2 3 4 5 6 7 8 9;do clRGinfo; sleep 45; done; done // 45s delay between moves
Or with the clmgr command:
# for i in 2 1 2 1 2 1 2 1 2;do clmgr move resource_group CL1RG1 node=LPAR$i; for j in 1 2 3 4 5 6 7 8 9;do clmgr query resource_group CL1RG1; sleep 45; done; done
Prerequisites
Review the PowerHA SystemMirror for AIX Version Compatibility Matrix TECHDOC (see References).
PowerHA SystemMirror 7.2.2 (5765-H39) minimum AIX levels:
- AIX 7100-04 and 7100-05
- AIX 7200-00, 7200-01 and 7200-02
Review the PowerHA SystemMirror Known Fixes information (see References).
Such as, for AIX 7200-02, APAR IJ04268: http://www-01.ibm.com/support/docview.wss?uid=isg1IJ04268
References
clmgr command
clmgr command: Quick reference
PowerHA SystemMirror commands
IBM PowerHA SystemMirror Version 7.2.2 for AIX documentation
PowerHA SystemMirror graphical user interface (GUI)
https://www.ibm.com/support/knowledgecenter/en/SSPHQG_7.2.2/com.ibm.powerha.gui/ha_gui_kickoff.htm
Installing PowerHA SystemMirror GUI
PowerHA SystemMirror systems monitoring and recovery - Video
PowerHA SystemMirror Cloud Enabled Administration - Video
PowerHA Release Notes PowerHA SystemMirror 7.2.2
- https://www.ibm.com/support/knowledgecenter/SSPHQG_7.2.2/com.ibm.powerha.navigation/releasenotes.htm
IBM Support PowerHA SystemMirror new support experience
PowerHA SystemMirror FLRT Lite
PowerHA SystemMirror for AIX Version Compatibility Matrix TECHDOC
PowerHA SystemMirror Known Fixes Information
Fix Level Recommendation Tool
PowerHA SystemMirror Technology level update images at Entitled Systems Support
5765-H39 = "PowerHA for AIX Standard Edition", feature 2322
5765-H37 = "PowerHA SystemMirror Enterprise Edition", feature 2323
PowerHA/CAA Tunable Guide
PowerHA SystemMirror Forums
LinkedIn:
DeveloperWorks:
QA Forum:
AIX Vulnerability Checker:
AIX HIPER APARs:
