High Availability is a key component of business resiliency. It is widely documented that outages increase the total cost of IT ownership, as well as causing potential damage to client relationships and loss of revenue. The IBM Power Systems strategy is to not only deliver more advanced functional capabilities for business resiliency but to enhance product usability and robustness through deep integration with AIX and affiliated software stack technologies.
PowerHA for AIX®, part of the Power Systems branding strategy, is the flagship Power Systems HA package for UNIX® environments. PowerHA is widely recognized as a robust and mature high availability product: it supports a wide variety of configurations and gives the cluster administrator a great deal of flexibility. PowerHA is designed by IBM to provide integrated resiliency for AIX environments, and for over a decade it has provided reliable monitoring, failure detection, and automated failover for business application environments.
In the past, PowerHA/HACMP was difficult to configure and manage, but it has become much easier over the years. One reason is the PowerHA/HACMP Smart Assist programs, which are available for enterprise applications such as DB2, Oracle, and WebSphere. Smart Assist simplifies the implementation and configuration of PowerHA in DB2, Oracle, and WebSphere Application Server environments by applying application-specific knowledge: it provides the necessary application monitors and start and stop scripts to streamline the configuration process. However, all of the Smart Assist scripts validate the application installation before configuring it under PowerHA/HACMP, and these validations are based on a few assumptions. If the validation checks fail, Smart Assist may fail to deploy the configuration. This article helps resolve common WebSphere Application Server Smart Assist deployment problems.
PowerHA/XD and PowerHA Smart Assist features provide additional automated data backup, disaster recovery, and database environment configuration assistance to help protect your business. All of the facilities of PowerHA are available for and with the IBM System p Capacity on Demand offerings. This enables users to configure clusters that are scalable and to expand easily the clusters' CPU and memory capacity as the need arises, without having to pay upfront for unused hardware.
PowerHA Smart Assist for WebSphere extends an existing PowerHA configuration to include monitoring and recovery support for the following components:
- WebSphere Application Server
- WebSphere Deployment Manager
- WebSphere Transaction Log
- IBM HTTP Server
- Tivoli Directory Server
Please note that this article uses the terms PowerHA and HACMP interchangeably and assumes PowerHA V5.5 is running on the cluster.
For PowerHA Smart Assist for WebSphere to manage the availability of the WebSphere components, it must be installed on a shared volume group. PowerHA Smart Assist for WebSphere does not move WebSphere components from their original installation location. For each WebSphere Deployment Manager, IBM HTTP Server, and WebSphere Application Server, users should plan a shared volume group to contain the application files. Note that WebSphere Application Server Smart Assist does not create PowerHA configurations to support WebSphere Application Servers installed in an ND environment. The user can take advantage of the HA cluster capabilities that are built into WebSphere Network Deployment.
This article assumes that you have a two-node cluster with Node1 as the primary node and Node2 as the failover (takeover) node, as shown in Figure 1 below. As mentioned above, components such as WebSphere Deployment Manager and WebSphere Application Server should be installed on shared volume groups on the primary node, Node1. If the application becomes unavailable, PowerHA starts it on the takeover node of the cluster to continue service.
Figure 1. Sample PowerHA cluster
PowerHA creates a resource group to protect the specified application and performs the following tasks:
- Creates a PowerHA application server for the application.
- Ensures that the application has a service IP label that can be transferred to another system, thus keeping the application highly available.
- Creates PowerHA application monitors to detect failures in application processes.
- Provides start and stop scripts for the application (that is, application servers).
- Stores the generated PowerHA configuration in the PowerHA configuration database (ODM).
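The start, stop, and monitor entries created in the steps above correspond to application control scripts. The following is a purely hypothetical sketch of the shape such a script takes; Smart Assist generates its own scripts, and the names and messages here are illustrative only:

```shell
# Hypothetical sketch of a PowerHA-style application control script:
# one entry point each for start, stop, and monitor. Not what Smart
# Assist actually generates; names and messages are placeholders.
app_ctl() {
    case "$1" in
        start)   echo "starting application" ;;   # e.g. call the server start script
        stop)    echo "stopping application" ;;   # e.g. call the server stop script
        monitor) echo "monitoring application" ;; # return 0 when the app is healthy
        *)       echo "usage: app_ctl {start|stop|monitor}" >&2; return 2 ;;
    esac
}
app_ctl start
```

PowerHA invokes the start and stop scripts during resource group events, and runs the monitor periodically to decide whether recovery is needed.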
When PowerHA Smart Assist for WebSphere creates the PowerHA resource configuration, it updates the PowerHA configuration and changes the IP address (for IBM HTTP Server, WebSphere Application Server, and WebSphere Deployment Manager) or the Transaction Log directory (for the WebSphere cluster), so that if the application becomes unavailable, PowerHA can start it on the takeover node (for example, Node2) to continue service.
Before getting into the details, review the WebSphere Application Server detailed system requirements on AIX and refer to the Smart Assist for WebSphere Application Server User's Guide for the different configurations of WebSphere Application Server components supported by PowerHA.
Basic configuration steps
These are the basic steps to follow. Before getting into the details, refer to the Smart Assist for WebSphere User's Guide.
- Install the prerequisite software, such as an X server and a browser.
- Make sure a shared volume group (VG) is available for each WebSphere Application Server component installation (for example, one VG for IBM HTTP Server and one for WebSphere Application Server). It is recommended to create shared VGs using PowerHA C-SPOC, even though they can be created using general AIX commands.
To create a volume group using C-SPOC:
smitty hacmp -> System Management (C-SPOC) -> PowerHA Logical Volume Management -> Shared Volume Groups -> Create a Shared Volume Group.
Select the appropriate nodes to create the VG.
- Create a file system (FS) with the required name.
- If the VG and FS were created using C-SPOC, the definitions are imported to all cluster nodes automatically. If AIX commands were used, the definitions must be imported to the other cluster nodes using the same major number.
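The manual import can be sketched as follows. This is a hedged sketch: wasvg, hdisk2, and major number 57 are placeholder values, and the guard lets the sketch degrade gracefully on non-AIX hosts:

```shell
# Sketch: import a shared VG definition on the second cluster node using
# the same major number as on the node that created it. wasvg, hdisk2,
# and major number 57 are placeholders.
import_shared_vg() {
    if command -v importvg >/dev/null 2>&1; then
        # On Node1, note the major number from the VG device file:
        ls -l /dev/wasvg
        # On Node2, import using the identical major number:
        importvg -V 57 -y wasvg hdisk2
        varyoffvg wasvg   # leave the VG offline; PowerHA manages varyon
    else
        echo "AIX LVM commands not available on this host"
    fi
}
import_shared_vg
```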
- Install WebSphere Application Server on one of the shared VGs (wasvg). After installation, start First steps and check that it was installed properly.
- Install IBM HTTP Server on the shared VG (ihsvg) of the standby node, or on the same node, depending on the requirement. Install the required plug-ins for IBM HTTP Server and WebSphere Application Server as needed.
- Install PowerHA on all the cluster nodes and configure a minimal cluster.
- Add the WebSphere Application Server components to PowerHA, as shown below.
To configure WebSphere Application Server applications
smitty hacmp -> Initialization and Standard Configuration -> Configuration Assistants -> Make Applications Highly Available (Use Smart Assists) -> Add an Application to the PowerHA Configuration
PowerHA discovers the nodes where WebSphere Application Server components are installed and shows the node names against WebSphere Smart Assist in the selector screen, as in Figure 2 below.
Figure 2. WebSphere Smart Assist selector screen
Make sure that the node names are displayed properly against WebSphere Smart Assist.
Once WebSphere Smart Assist is selected, PowerHA automatically discovers the WebSphere components installed on the nodes and displays a selector screen with the discovered components. Assuming the base WebSphere Application Server was discovered in the sample cluster, PowerHA displays the component selector screen shown in Figure 3.
Figure 3. Component selector screen
Next, PowerHA displays a selector screen with the discovered WebSphere Application Server nodes. Select one from the list, as shown in Figure 4 below.
Figure 4. WebSphere Application Server node selector screen
Finally, a dialog screen is displayed where the user specifies the takeover node and the service interface.
Figure 5. Attributes of chosen application
After all of the required fields are filled in, press Enter. The WebSphere Application Server configuration is now integrated with PowerHA. Initiate verify and synchronize from this node; the rest of the nodes then detect the integration.
If PowerHA does not detect WebSphere Application Server, or for any other issue, refer to the section below, Troubleshooting WebSphere Application Server integration with PowerHA.
Troubleshooting WebSphere Application Server Integration with PowerHA
Even with proper installation and configuration, WebSphere Application Server Smart Assist may fail to discover the WebSphere Application Server applications running on the node, or fail to integrate WebSphere Application Server with PowerHA, and Smart Assist itself may not be the culprit. Sometimes users forget simple steps, such as mounting the file systems or varying on the VG, before running WebSphere Application Server Smart Assist. Some of the problems that you might encounter with WebSphere Application Server Smart Assist are described in the following sections.
Problem 1: WebSphere Application Server Smart Assist is unable to detect or discover applications (WebSphere Application Server or IBM HTTP Server)
The solution here is to make sure the VGs of WebSphere Application Server and IBM HTTP Server are varied on and the file systems are mounted on the node where the WebSphere Application Server or IBM HTTP Server applications are installed. If the discovery is failing on all nodes other than the node on which discovery is run, make sure that clcomd is running on all the cluster nodes.
To check the status of clcomd, enter the following:
#lssrc -s clcomdES
To start clcomd, enter the following:
#startsrc -s clcomdES
To stop clcomd, enter the following:
#stopsrc -s clcomdES
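The clcomd check can be wrapped in a small script to run on every cluster node. This is a hedged sketch using the AIX SRC commands shown above; the guard lets it degrade gracefully on non-AIX hosts:

```shell
# Sketch: confirm the clcomd subsystem is active before rerunning
# discovery, and start it if it is not. Uses AIX SRC commands.
check_clcomd() {
    if command -v lssrc >/dev/null 2>&1; then
        if lssrc -s clcomdES 2>/dev/null | grep -q active; then
            echo "clcomdES is active"
        else
            echo "clcomdES is not active; starting it"
            startsrc -s clcomdES
        fi
    else
        echo "SRC commands not available (not an AIX host)"
    fi
}
check_clcomd
```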
Make sure the output of the command lists all WebSphere Application Server discovery scripts as shown below:
#odmget -q "name=DISCOVERY_COMMAND" HACMPsa

HACMPsa:
        sa_id = "zzOther"
        component_id = "GASA"
        name = "DISCOVERY_COMMAND"
        value = "/usr/es/sbin/cluster/sa/gasa/sbin/discovery"
        reserved = 0

HACMPsa:
        sa_id = "WAS_6.0"
        component_id = "WAS_6.0_TIVOLI_LDAP"
        name = "DISCOVERY_COMMAND"
        value = "/usr/es/sbin/cluster/sa/was/sbin/cl_wassatdsquery"
        reserved = 0

HACMPsa:
        sa_id = "WAS_6.0"
        component_id = "WAS_6.0_IHS_SERVER"
        name = "DISCOVERY_COMMAND"
        value = "/usr/es/sbin/cluster/sa/was/sbin/cl_wassaihsquery"
        reserved = 0

HACMPsa:
        sa_id = "WAS_6.0"
        component_id = "WAS_6.0_APP_SERVER"
        name = "DISCOVERY_COMMAND"
        value = "/usr/es/sbin/cluster/sa/was/sbin/cl_wassaserverquery -n"
        reserved = 0

HACMPsa:
        sa_id = "WAS_6.0"
        component_id = "WAS_6.0_DEPLOYMENT_MANAGER"
        name = "DISCOVERY_COMMAND"
        value = "/usr/es/sbin/cluster/sa/was/sbin/cl_wassaserverquery -d"
        reserved = 0
If it does not list the discovery scripts, uninstall the PowerHA WebSphere Smart Assist fileset and reinstall it.
If the WebSphere Application Server or IBM HTTP Server application is still not discovered, check if the physical volume (PV) where the application was installed has an entry in the /usr/es/sbin/cluster/etc/config/clvg_config file. If the entry for PV does not exist, remove the clvg_config file and rerun the Smart Assist discovery. A new clvg_config file should be created with the entry of PV on which the application was installed.
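The clvg_config check above can be scripted; the following is a hedged sketch in which the PV name hdisk2 is a placeholder, while the clvg_config path is the one given in the text:

```shell
# Sketch: verify that the PV holding the application has an entry in
# clvg_config, and remove the file so discovery rebuilds it if not.
# hdisk2 is a placeholder PV name.
check_clvg_entry() {
    clvg=${1:-/usr/es/sbin/cluster/etc/config/clvg_config}
    pv=${2:-hdisk2}
    if [ ! -f "$clvg" ]; then
        echo "clvg_config not found; discovery will create it on the next run"
    elif grep -q "$pv" "$clvg"; then
        echo "$pv is recorded in clvg_config"
    else
        echo "$pv missing; removing clvg_config so discovery rebuilds it"
        rm -f "$clvg"
    fi
}
check_clvg_entry
```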
Up to HACMP 5.4.1, the clvg_config file is created only when discovery runs for the first time and is not updated on subsequent discovery runs, even if a new PV was added; APAR IZ26108 fixes this issue in HACMP 5.4.1. From PowerHA 5.5 on, the clvg_config file is updated every time application discovery runs.
Problem 2: While importing the WebSphere Application Server transaction log configuration into HACMP using WebSphere Application Server Smart Assist, claddres may report that the VG is not a shareable volume group
The following sample code illustrates this problem.
Searching for WebSphere Clusters in cell rac1n6Cell01.
- Found WebSphere Cluster cluster was member rac1n7.
Creating service label wasTRserviceip.
Creating HACMP Resource Group WAS_CLUS_rac1n6Cell01_cluste_rg.
Auto Discover/Import of Volume Groups was set to true.
Gathering cluster information, which may take a few minutes.
claddres: VGforTLog is not a shareable volume group.
Could not perform all imports.
No ODM values were changed.
The solution is to make sure that the shared VG for the transaction log is varied off on all the cluster nodes.
Problem 3: While importing the WebSphere Application Server transaction log configuration into HACMP using WebSphere Application Server Smart Assist, even if the file system exists on the shared VG, Smart Assist may still try to create a new FS and fail
The solution is to make sure that the file system on the shared VG for the transaction log is created with a name of the form WAS_CLUS_node1Cell01_tmpclu, where node1Cell01 is the cell name and tmpclu is the cluster name.
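The expected file system name can be derived from the cell and cluster names; the following sketch uses the example names from the text above:

```shell
# Build the transaction-log file system name Smart Assist expects:
# WAS_CLUS_<cellName>_<clusterName>
make_tlog_fs_name() {
    cell_name=$1
    cluster_name=$2
    echo "WAS_CLUS_${cell_name}_${cluster_name}"
}
make_tlog_fs_name node1Cell01 tmpclu   # prints WAS_CLUS_node1Cell01_tmpclu
```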
Problem 4: Failed setting End Point host for Deployment Manager while importing WebSphere Application Server configuration into HACMP
Creating service label wasserviceip.
Creating HACMP Application Server WAS_DM_node1Cell01_1_as.
Creating HACMP Custom Application Monitor WAS_DM_node1Cell01_1_monitor.
Creating HACMP Resource Group WAS_DM_node1Cell01_rg.
Auto Discover/Import of Volume Groups was set to true.
Gathering cluster information, which may take a few minutes.
Setting End Point host for Deployment Manager on node1 for WebSphere Cell node1Cell01, Node , Server dmgr to the value "wasserviceip".
ERROR: Failed setting End Point host for Deployment Manager on node1 for WebSphere Cell node1Cell01, Node , Server dmgr.
Removing HACMP Application Server WAS_DM_node1Cell01_1_as from HACMP.
Application monitor WAS_DM_node1Cell01_1_monitor is no longer in use by any application servers and will be removed.
Removing HACMP Application Monitor WAS_DM_node1Cell01_1_monitor from HACMP.
ERROR: Failed removing HACMP Application Monitor WAS_DM_node1Cell01_1_monitor from HACMP.
Removing Service IP Label wasserviceip from HACMP.
Service IP label wasserviceip has been removed, removing it from resource group WAS_DM_node1Cell01_rg.
Removing the resource group WAS_DM_node1Cell01_rg.
The solution is to check that sufficient paging space is available and whether security was enabled during WebSphere Application Server installation. If security was enabled, supply the authentication information as suggested in the WebSphere Application Server documentation, so that WebSphere Application Server Smart Assist can communicate with the Deployment Manager and other servers. The sas.client.props or soap.client.props file should be modified depending on the connector being used.
Problem 5: WebSphere Application Server could not be stopped using the HACMP stop script
The solution here is to add the password information to the soap.client.props or sas.client.props file, depending on whether you connect with a SOAP connector or a Remote Method Invocation (RMI) connector.
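For example, with the SOAP connector the relevant soap.client.props entries look like the following. Here wasadmin and secret are placeholder credentials; the property names are the standard WebSphere SOAP client security properties:

```properties
# soap.client.props (SOAP connector) - placeholder credentials
com.ibm.SOAP.securityEnabled=true
com.ibm.SOAP.loginUserid=wasadmin
com.ibm.SOAP.loginPassword=secret
```

For the RMI connector, set the corresponding com.ibm.CORBA.loginUserid and com.ibm.CORBA.loginPassword properties in sas.client.props. Once the connection works, consider encoding the stored password with WebSphere's PropFilePasswordEncoder utility rather than leaving it in clear text.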
Problem 6: WebSphere Application Server restarting once cluster services are up
This could be because of insufficient CPU or memory resources. Try adding resources, or increase the monitor stabilization interval to a value large enough that the start script has time to start WebSphere Application Server and let it stabilize before the application monitor begins checking it.
Problem 7: Cluster becoming unstable after removal of RG
Make sure the RG is brought offline before removing it from the WebSphere Application Server-HACMP configuration.
Problem 8: Tivoli service IP label mismatch
Mon Apr 28 03:45:13 CDT 2008 - Application Server tdsas_node1 created.
Mon Apr 28 03:45:13 CDT 2008 - Resource Group tdsrg_node1 created.
Mon Apr 28 03:45:13 CDT 2008 - Application Monitor for application server tdsas_node1 created.
Auto Discover/Import of Volume Groups was set to true.
Gathering cluster information, which may take a few minutes.
Mon Apr 28 03:45:13 CDT 2008 - Adding resources to resource group failed.
ERROR: Import Failed. Removing all added entries.
clrmres: Group name tdsrg_node1 not found.
Mon Apr 28 03:45:13 CDT 2008 - Resources removed to resource group tdsrg_node1.
clrmappmon: Monitor "tdsas_node1" removed.
Mon Apr 28 03:45:13 CDT 2008 - Application Monitor for application server tdsas_node1 deleted.
Mon Apr 28 03:45:13 CDT 2008 - Application Server tdsas_node1 deleted.
Make sure the service IP label provided for Tivoli Smart Assist exactly matches the service IP label name given in the WebSphere Application Server Smart Assist configuration.
Problem 9: After inputting values for Tivoli, the import fails with "ERROR: Failed to create import scripts"
Make sure all the required VGs are online.
Problem 10: WebSphere Application Server Network Deployment imported HACMP cluster name and node/host name mismatch
ERROR: The WebSphere Cluster hclus has a member hosted on node node1.in.ibm.com and that node is not part of the resource group WAS_CLUS_node2Cell01_hclus_rg. Please remove the member from the WebSphere Cluster or use Smart Assist for WebSphere to remove transaction log recovery for this cluster.
Make sure the node names (short names) used in the WebSphere Application Server Network Deployment cluster and in PowerHA are the same. The WebSphere Application Server V6.1: System Management and Configuration Redbook is a good resource when working with WebSphere Application Server and WebSphere Application Server Network Deployment.
Problem 11: Service IP label for Tivoli missing from HACMP
Wed Apr 30 01:33:02 CDT 2008 - Application Server tdsas_node1 created.
Wed Apr 30 01:33:02 CDT 2008 - Resource Group tdsrg_node1 created.
Wed Apr 30 01:33:02 CDT 2008 - Application Monitor for application server tdsas_node1 created.
Auto Discover/Import of Volume Groups was set to true.
Gathering cluster information, which may take a few minutes.
Wed Apr 30 01:33:04 CDT 2008 - Adding resources to resource group failed.
ERROR: Import Failed. Removing all added entries.
clrmres: Group name tdsrg_node1 not found.
Wed Apr 30 01:33:04 CDT 2008 - Resources removed to resource group tdsrg_node1.
clrmappmon: Monitor "tdsas_node1" removed.
Wed Apr 30 01:33:04 CDT 2008 - Application Monitor for application server tdsas_node1 deleted.
Wed Apr 30 01:33:04 CDT 2008 - Application Server tdsas_node1 deleted.
clrmgrp: Group name tdsrg_node1 removed from HACMPgroup Class.
Wed Apr 30 01:33:04 CDT 2008 - Resource Group tdsrg_node1 deleted.
Wed Apr 30 01:33:04 CDT 2008 - TDS import delete complete.
The solution is to add the service IP label to HACMP before adding it to the WebSphere Application Server TDS Smart Assist configuration.
Smart Assists help minimize the effort of configuring applications under PowerHA. Once applications such as WebSphere Application Server and DB2 are enabled for PowerHA, PowerHA provides monitoring and recovery support for them on failures.