Across every industry, the business environment is becoming more complex, fast paced, and unpredictable. Today's companies need the flexibility to stay ahead of the competition while ensuring their operations are efficient and resilient. IBM WebSphere delivers application infrastructure and integration software that helps companies address the key, critical priorities in an on demand world, such as maintaining maximum system uptime. This article shows you a step-by-step procedure for building a highly available configuration for WebSphere Application Server.
The WebSphere Application Server is a J2EE® and Web services application server, designed to deliver a high-performance and extremely scalable transaction engine for dynamic e-business applications.
WebSphere Application Server Network Deployment (ND) provides an operating environment with advanced performance and availability capabilities to support dynamic application environments. In addition to the features and functions in the WebSphere Application Server, this specific configuration delivers advanced deployment services that include clustering, edge-of-network services, Web services enhancements, and high availability for distributed configurations.
The first article in this series introduced you to HA concepts and how to install and configure heartbeat. This article shows the HA implementation for WebSphere Application Server and ND in a cold standby configuration using heartbeat.
In this implementation, heartbeat detects a failure with the primary and then initiates failover by:
- Stopping the WebSphere Application Server(s) and the node agent on the primary
- Stopping the ND deployment manager on the primary
- Releasing the shared disk on the primary
- Removing the service IP address on the primary
- Adding the service IP address to the standby
- Mounting the shared disk on the standby
- Starting the deployment manager on the standby machine
- Starting the application server(s) and the node agent on the standby machine
To get the most out of this article, you need a basic understanding of WebSphere Application Server Base and Network Deployment and High Availability clusters. Also, for helpful background, read the first article in this series, High-availability middleware on Linux, Part 1: Heartbeat and Apache Web server.
WebSphere Application Server and HA
In WebSphere Application Server ND, deployment managers are administrative agents that provide a centralized management view for all nodes in a cell. The management of clusters and the management of workload balancing of the application servers across one or several nodes are accomplished via the deployment manager.
The deployment manager also hosts the administrative console and provides a single central point of administrative control for all the elements of the entire WebSphere Application Server distributed cell. When a deployment manager is unavailable, this impacts the ability both to make configuration changes and to propagate the changes to the application servers. This makes the deployment manager a single point of failure.
The remainder of this article shows how to make the deployment manager of the WebSphere Application Server ND environment highly available, and how a WebSphere base node can be failed over to a backup. As in the previous articles in this series, critical files are on a shared filesystem (/ha for this example) that is available to a backup machine in the event of a WebSphere Application Server node failure.
Figure 1 shows the organization of the filesystem.
Figure 1. The WebSphere high-availability setup
In this setup:
- The machine ha1 serves as a primary WebSphere Application Server deployment manager machine and a WebSphere Application Server node.
- The machine ha2 serves as a backup for the WebSphere Application Server deployment manager and the WebSphere Application Server node.
- The machine ha3 serves as a WebSphere Application Server node.
- The entire WebSphere deployment manager (/ha/WebSphere/DeploymentManager) and WebSphere Node (/ha/WebSphere/AppServer) installation is kept on the shared disk. Only the log directories are kept on the local machines.
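Because the whole installation lives on /ha, it can help to confirm that the shared filesystem is actually mounted on a machine before touching anything under it. The following is a minimal sketch, not part of the original setup; the function name is_mounted and its optional second argument are hypothetical, and it assumes a Linux system that exposes /proc/mounts.

```shell
# is_mounted MOUNT_POINT [MOUNTS_FILE]
# Succeeds if MOUNT_POINT appears as a mount point in MOUNTS_FILE.
# MOUNTS_FILE defaults to /proc/mounts; the parameter is a hypothetical
# addition so the check can be tried against a sample file.
is_mounted() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    # Field 2 of each line in /proc/mounts is the mount point.
    awk -v mp="$mount_point" '$2 == mp { found = 1 } END { exit !found }' "$mounts_file"
}

# Example: report whether the shared filesystem is present on this machine.
if is_mounted /ha; then
    echo "/ha is mounted"
else
    echo "/ha is not mounted" >&2
fi
```

On the primary you would expect the first message while heartbeat is running; on the standby you would expect the second until a failover occurs.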
Installing WebSphere Application Server ND and base in a HA configuration
In this section, you will install both WebSphere Application Server ND and the base.
Installing WebSphere Application Server ND
To install WebSphere Application Server ND 5.1 with the necessary fix packs on both the primary and the backup node:
- Make sure heartbeat is running on both of the nodes. This will ensure that ha1 is serving the cluster IP address and that the filesystem /ha is mounted on ha1 as well.
- Create the installation directories for WebSphere Application Server ND and base on the ha1 node:
mkdir /ha/WebSphere/
mkdir /ha/WebSphere/DeploymentManager
mkdir /ha/WebSphere/AppServer
- Extract the WebSphere Application Server ND 5.1 installation image on the ha1 node:
rm -rf /tmp/was5.1nd-install
mkdir /tmp/was5.1nd-install
tar xf c53t6ml.tar -C /tmp/was5.1nd-install
Here, c53t6ml.tar is the installation tar file for WebSphere Application Server ND. Your image filename may differ, depending on how you obtained it.
- Run the installation wizard on the node ha1:
cd /tmp/was5.1nd-install/linuxi386
./launchpad.sh
Put the following information in the wizard screen fields:
- Installation directory: /ha/WebSphere/DeploymentManager
- Node: haManager
- Host: ha.haw2.ibm.com
- Cell: haNetwork
- In this setup, an HTTP server and MQ are already installed, so I chose not to install either of them. I also chose not to install the Web services gateway.
- Clean up the installation image directory:
rm -rf /tmp/was5.1nd-install
- Extract the WebSphere Application Server ND 5.1 Fix Pack 1 installation image on the ha1 node:
rm -rf /tmp/was5.1.1nd-install
mkdir /tmp/was5.1.1nd-install
tar xzf was51_nd_fp1_linux.tar.gz -C /tmp/was5.1.1nd-install
Here, was51_nd_fp1_linux.tar.gz is the installation tar file for WebSphere Application Server ND 5.1 Fix Pack 1. Your image filename may differ, depending on how you obtained it.
- Run the silent update on ha1:
. /ha/WebSphere/DeploymentManager/bin/setupCmdLine.sh
cd /tmp/was5.1.1nd-install/
./updateSilent.sh -installDir /ha/WebSphere/DeploymentManager -fixpack -install -fixpackDir /tmp/was5.1.1nd-install/fixpacks -fixpackID was51_nd_fp1_linux -skipIHS -skipMQ
- Clean up the fix pack installation image directory:
rm -rf /tmp/was5.1.1nd-install
- Extract the WebSphere Application Server ND 5.1.1 Cumulative Fix 1 installation image on the ha1 node:
rm -rf /tmp/was5.1.1.1nd-install
mkdir /tmp/was5.1.1.1nd-install
unzip -q was511_nd_cf1_linux.zip -d /tmp/was5.1.1.1nd-install
Here, was511_nd_cf1_linux.zip is the installation zip file for WebSphere Application Server ND 5.1.1 Cumulative Fix 1.
- Run the silent update on ha1:
. /ha/WebSphere/DeploymentManager/bin/setupCmdLine.sh
cd /tmp/was5.1.1.1nd-install
./updateSilent.sh -installDir /ha/WebSphere/DeploymentManager -fixpack -install -fixpackDir /tmp/was5.1.1.1nd-install/fixpacks -fixpackID was511_nd_cf1_linux -skipIHS -skipMQ
- Clean up the fix pack installation image directory:
rm -rf /tmp/was5.1.1.1nd-install
- Now, you'll create the directory links shown in Figure 1.
- Remove the deployment manager log directories from the installation on ha1:
rm -rf /ha/WebSphere/DeploymentManager/logs
- Create directories for logs on a local filesystem on both the nodes, ha1 and ha2:
mkdir /var/log/waslog
mkdir /var/log/waslog/DeploymentManager
- Set the correct permissions on both the nodes, ha1 and ha2:
chmod 755 /var/log/waslog
chmod 755 /var/log/waslog/DeploymentManager
- Create the symbolic links on node ha1 only:
ln -s /var/log/waslog/DeploymentManager /ha/WebSphere/DeploymentManager/logs
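The log relocation steps above (remove the logs directory from the shared installation, create a local directory, set its permissions, and link the two) can be collected into one helper. This is a hedged sketch rather than part of the original procedure; relocate_logs is a hypothetical name, and the function is written so that running it a second time leaves an existing link alone.

```shell
# relocate_logs INSTALL_DIR LOCAL_LOG_DIR
# Replaces INSTALL_DIR/logs with a symbolic link to LOCAL_LOG_DIR.
relocate_logs() {
    install_dir="$1"
    local_log_dir="$2"

    # Create the local log directory (and any missing parents), then set
    # permissions on the leaf directory.
    mkdir -p "$local_log_dir" || return 1
    chmod 755 "$local_log_dir" || return 1

    # Nothing more to do if the link is already in place.
    if [ -L "$install_dir/logs" ]; then
        return 0
    fi

    # Remove the logs directory created by the installer and link the
    # local directory in its place.
    rm -rf "$install_dir/logs" || return 1
    ln -s "$local_log_dir" "$install_dir/logs"
}

# Example for the deployment manager on ha1 (not executed here):
# relocate_logs /ha/WebSphere/DeploymentManager /var/log/waslog/DeploymentManager
```

Only ha1 needs the full function, because the link itself lives on the shared disk. On ha2, creating the local directory with the same permissions is enough.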
To install WebSphere Application Server 5.1 Base with the necessary fix packs on both the primary and the backup node:
- Extract the WAS Base 5.1 installation image on the ha1 node:
rm -rf /tmp/was5.1base-install
mkdir /tmp/was5.1base-install
tar xf c53ipml.tar -C /tmp/was5.1base-install
Here, c53ipml.tar is the installation tar file for WAS Base 5.1. Your image filename may differ, depending on how you obtained it.
- Run the installation wizard on the node ha1:
cd /tmp/was5.1base-install/linuxi386
./launchpad.sh
Put the following information in the wizard screen fields:
- Installation directory: /ha/WebSphere/AppServer
- Node: ha
- Host: ha.haw2.ibm.com
- In this setup, I already have an HTTP server and MQ installed, so I chose not to install either of them. You can disable installation of these features by selecting the Custom Setup option.
- Clean up the installation image directory using the following command:
rm -rf /tmp/was5.1base-install
- Extract the WAS Base 5.1 Fix Pack 1 installation image using the commands shown below on the ha1 node:
rm -rf /tmp/was5.1.1base-install
mkdir /tmp/was5.1.1base-install
tar xzf was51_fp1_linux.tar.gz -C /tmp/was5.1.1base-install
Here, was51_fp1_linux.tar.gz is the installation tar file for WAS Base 5.1 fix pack 1. Your image filename might be different based on how you obtained it.
- Run the silent update on ha1 using the command shown below:
. /ha/WebSphere/AppServer/bin/setupCmdLine.sh
cd /tmp/was5.1.1base-install
./updateSilent.sh -installDir /ha/WebSphere/AppServer -fixpack -install -fixpackDir /tmp/was5.1.1base-install/fixpacks -fixpackID was51_fp1_linux -skipIHS -skipMQ
- Clean up the fixpack installation image directory using the following command on node ha1:
rm -rf /tmp/was5.1.1base-install
- Extract the WAS Base 5.1.1 Cumulative Fix 1 installation image using the commands shown below on the ha1 node:
rm -rf /tmp/was5.1.1.1base-install
mkdir /tmp/was5.1.1.1base-install
unzip -q was511_cf1_linux.zip -d /tmp/was5.1.1.1base-install
Here, was511_cf1_linux.zip is the installation zip file for WAS Base 5.1.1 Cumulative Fix 1.
- Run the silent update on ha1 using the command shown below.
. /ha/WebSphere/AppServer/bin/setupCmdLine.sh
cd /tmp/was5.1.1.1base-install
./updateSilent.sh -installDir /ha/WebSphere/AppServer -fixpack -install -fixpackDir /tmp/was5.1.1.1base-install/fixpacks -fixpackID was511_cf1_linux -skipIHS -skipMQ
- Clean up the fixpack installation image directory using the following command on node ha1:
rm -rf /tmp/was5.1.1.1base-install
- Now, you'll create the directory links shown in Figure 1.
- Remove the WebSphere log directories from the installation on ha1:
rm -rf /ha/WebSphere/AppServer/logs
- Create directories for logs on a local filesystem on both the nodes, ha1 and ha2:
mkdir /var/log/waslog/AppServer
- Set the correct permissions on both the nodes, ha1 and ha2:
chmod 755 /var/log/waslog/AppServer
- Create the symbolic links on node ha1 only:
ln -s /var/log/waslog/AppServer /ha/WebSphere/AppServer/logs
- Install WebSphere Application Server base on the node ha3 (steps 1 - 9 above, only) with the following information:
- Installation directory: /opt/WebSphere/AppServer
- Node: ha3
- Host: ha3.haw2.ibm.com
- In this setup, an HTTP server and MQ are already installed, so I chose not to install either of them.
- Start the deployment manager on ha1 by running the startManager.sh from the bin directory of the deployment manager installation.
- Add the WAS nodes ha and ha3 (created during the WebSphere Application Server base installs above) to the cell haNetwork (created in step 4 of the WebSphere Application Server ND install) by running the following command on each node (from the application server bin directory):
addNode.sh ha
- Verify through the admin console that the cell appears to be correct. Open the console (http://ha.haw2.ibm.com:9090/admin) and make sure that you see all of the nodes for the Application Servers.
- Stop everything. This means stopping the deployment manager and the node agents on each of the WAS nodes. Use these commands:
- Node agents: stopNode.sh (from the bin directory of the Application Server)
- Deployment manager: stopManager.sh (from the bin directory of the deployment manager)
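The four fix pack and cumulative fix updates in this section all use the same updateSilent.sh options and differ only in the installation directory, the image directory, and the fix pack ID. As a rough sketch, the command line can be generated by a small helper. The name fixpack_update_cmd is hypothetical, the function only prints the command rather than running it, and the leading hyphen on -installDir is an assumption to check against your installer's documentation.

```shell
# fixpack_update_cmd INSTALL_DIR IMAGE_DIR FIXPACK_ID
# Prints (does not run) the updateSilent.sh invocation for one fix pack,
# using the options shown in this article.
fixpack_update_cmd() {
    install_dir="$1"
    image_dir="$2"
    fixpack_id="$3"
    printf '%s\n' "./updateSilent.sh -installDir $install_dir -fixpack -install -fixpackDir $image_dir/fixpacks -fixpackID $fixpack_id -skipIHS -skipMQ"
}

# Example: the Fix Pack 1 update for the base installation.
fixpack_update_cmd /ha/WebSphere/AppServer /tmp/was5.1.1base-install was51_fp1_linux
```

Printing the command first makes it easy to review the paths before running the update from the extracted image directory.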
Configuring heartbeat to manage the deployment manager
Now you can configure heartbeat to manage the WebSphere Application Server ND deployment manager. First, create a script to start and stop the deployment manager process. A basic script (wasdmgr) is shown in Listing 1. You can further customize it to suit your setup. Place this script in the /etc/rc.d/init.d directory.
Listing 1. Basic script (wasdmgr) for starting and stopping the deployment manager
#!/bin/bash
#
# /etc/rc.d/init.d/wasdmgr
#
# Starts the WebSphere Deployment Manager
#
# chkconfig: 345 88 57
# description: Runs WAS DMGR
# Source function library.
. /etc/init.d/functions
PATH=/usr/bin:/bin:/ha/WebSphere/DeploymentManager/bin
#==============================================================================
SU="sh"
#======================================================================
start() {
echo "$0: starting websphere deployment manager"
$SU -c "startManager.sh"
}
#======================================================================
stop() {
echo "$0: stopping websphere deployment manager"
$SU -c "stopManager.sh"
}
case $1 in
'start')
start
;;
'stop')
stop
;;
'restart')
stop
start
;;
*)
echo "usage: $0 {start|stop|restart}"
;;
esac
Next, configure the /etc/ha.d/haresources file on both the nodes ha1 and ha2 to include the wasdmgr script. Here is the relevant portion of the modified file:
ha1.haw2.ibm.com 9.22.7.46 Filesystem::hanfs.haw2.ibm.com:/ha::/ha::nfs::rw,hard wasdmgr
This line dictates that on startup of heartbeat, ha1 serves the cluster IP address, mounts the shared filesystem, and starts the WebSphere Application Server deployment manager. On shutdown, heartbeat will first stop the deployment manager, then un-mount the shared filesystem, and finally give up the IP.
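The wasdmgr script in Listing 1 handles start, stop, and restart only. Heartbeat's documentation for haresources-style resource scripts also describes a status action that reports whether the resource is running, so you may want to add one when you customize the script. The following is a minimal sketch under that assumption: pid_status is a hypothetical helper, and it relies on the process writing a PID file whose location you would need to confirm for your installation.

```shell
# pid_status PID_FILE
# Prints "running" and returns 0 when PID_FILE names a live process.
# Prints "stopped" and returns 3 (the LSB "not running" status) otherwise.
# Intended to run as root, as init scripts normally do, so that kill -0
# can probe the process.
pid_status() {
    pid_file="$1"
    if [ -r "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "running"
        return 0
    fi
    echo "stopped"
    return 3
}

# Hypothetical use inside the case statement of wasdmgr (path is an
# assumption, not taken from the article):
# 'status')
#     pid_status /ha/WebSphere/DeploymentManager/logs/dmgr/dmgr.pid
#     ;;
```

The same helper could serve the wasnode and wasserver scripts with their own PID files.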
Configuring heartbeat to manage the WebSphere Application Server node
Now configure heartbeat to manage the WebSphere Application Server node agent and application server processes. First, create a couple of scripts to start and stop the node agent and the application server processes. A basic script to start a node agent (wasnode) is shown in Listing 2, and the script to start application servers (wasserver) on a node is shown in Listing 3. You can further customize these to suit your setup. Place these scripts in the /etc/rc.d/init.d directory.
Listing 2. wasnode script to start a node agent
#!/bin/bash
#
# /etc/rc.d/init.d/wasnode
#
# Starts the WebSphere Node Agent
#
# chkconfig: 345 88 57
# description: Runs WAS NODE
# Source function library.
. /etc/init.d/functions
PATH=/usr/bin:/bin:/ha/WebSphere/AppServer/bin
#======================================================================
SU="sh"
#======================================================================
start() {
echo "$0: starting websphere node agent"
$SU -c "startNode.sh"
}
#==============================================================================
stop() {
echo "$0: stopping websphere node agent"
$SU -c "stopNode.sh"
#sleep 30
}
case $1 in
'start')
start
;;
'stop')
stop
;;
'restart')
stop
start
;;
*)
echo "usage: $0 {start|stop|restart}"
;;
esac
Listing 3. wasserver script to start application servers
#!/bin/bash
#
# /etc/rc.d/init.d/wasserver
#
# Starts the WebSphere Application Server
#
# chkconfig: 345 88 57
# description: Runs WAS Server
# Source function library.
. /etc/init.d/functions
PATH=/usr/bin:/bin:/ha/WebSphere/AppServer/bin
WASSERVERS="server1"
#======================================================================
SU="sh"
#======================================================================
start() {
for wasserver in $WASSERVERS ; do
export wasserver
echo "$0: starting websphere application server $wasserver"
$SU -c "startServer.sh $wasserver"
done
}
#==============================================================================
stop() {
for wasserver in $WASSERVERS ; do
export wasserver
echo "$0: stopping websphere application server $wasserver"
$SU -c "stopServer.sh $wasserver"
done
}
case $1 in
'start')
start
;;
'stop')
stop
;;
'restart')
stop
start
;;
*)
echo "usage: $0 {start|stop|restart}"
;;
esac
Now, configure the /etc/ha.d/haresources file on both the nodes ha1 and ha2 to include the wasnode and wasserver scripts. Here is the relevant portion of the modified file:
ha1.haw2.ibm.com 9.22.7.46 Filesystem::hanfs.haw2.ibm.com:/ha::/ha::nfs::rw,hard wasdmgr wasnode wasserver
This line dictates that on the startup of heartbeat, ha1 serves the cluster IP address, mounts the shared filesystem, and starts the WebSphere Application Server deployment manager, node agent, and application servers. On shutdown, heartbeat will first stop the application servers, then the node agent, then the deployment manager, then un-mount the shared filesystem, and finally give up the IP.
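The start and stop sequences described above are mirror images: resources are acquired from left to right as they appear in haresources and released from right to left. The small sketch below only illustrates that ordering; stop_order is a hypothetical helper and is not something heartbeat itself provides.

```shell
# stop_order RESOURCE...
# Prints the given resources in reverse, which is the order in which
# they are released on shutdown.
stop_order() {
    reversed=""
    for resource in "$@"; do
        # Prepend each resource so the last one listed comes first.
        reversed="$resource${reversed:+ $reversed}"
    done
    printf '%s\n' "$reversed"
}

# Example with the resources from the haresources line (the node name
# at the start of the line is not a resource, so it is left out).
stop_order 9.22.7.46 Filesystem::hanfs.haw2.ibm.com:/ha::/ha::nfs::rw,hard wasdmgr wasnode wasserver
```

This is why the order in the file matters: the application servers are listed last so that they stop first, while the filesystem they depend on is still mounted.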
Testing deployment manager failover
This section shows you how to test the high availability of the deployment manager.
- Start the heartbeat service on the primary and then on the backup node with this command:
/etc/rc.d/init.d/heartbeat start
After heartbeat starts successfully, you will see a new interface with the service IP address that you configured in the haresources file. Then check the log file (default is /var/log/ha-log) on the primary and make sure that heartbeat takes over the IP address and then starts the deployment manager, node agent, application servers, and other resources. Heartbeat does not start any resources on the backup; that happens only after the primary fails.
- Start the WebSphere node agent and the WebSphere Application Server on the ha3 node.
- From the admin console (http://ha.haw2.ibm.com:9090/admin), make sure the application servers on both machines (ha1 and ha3) are running. If not, start them.
- Deploy the sample enterprise application, TestWebSphereHA.ear, from the was\sample_ver_1 directory (see the Download section below) using the admin console. Make sure you deploy it on both the ha and ha3 WAS nodes. Start the application using the console.
- Verify that the application runs on both nodes by pointing the browser at these URLs:
http://ha.haw2.ibm.com:9080/TestWebSphereHAWeb/Test
http://ha3.haw2.ibm.com:9080/TestWebSphereHAWeb/Test
For both URLs, the browser should display the following text:
Test:doGet() Invoked the HA Test Servlet.
At this point, you have successfully deployed an application on two WebSphere Application Server nodes being managed by a deployment manager running on ha1. Now, check to see if this configuration information survives a failover to the backup.
- Simulate failover by stopping heartbeat on the primary system using this command:
/etc/rc.d/init.d/heartbeat stop
You should see all the services come up on the backup machine. Verify that the deployment manager is running on the backup by checking the /var/log/ha-log file. Once the backup has taken over control, start the admin console again. You should see the two application servers and the enterprise application, TestWebSphereHA. This shows that the configuration information survived the failover. Also repeat step 5 to verify that the application works on both the nodes.
- Update the enterprise application to a newer version of TestWebSphereHA.ear under the was\sample_ver_2 directory by using the admin console. Make sure you select both ha and ha3 WAS nodes while updating.
- After the update, make sure you save the master configuration. Also make sure that the Synchronize changes with Nodes option is selected. You should be able to successfully update the application. Repeat step 5 again.
- For the URL http://ha.haw2.ibm.com:9080/TestWebSphereHAWeb/Test, the browser displays:
Test:doGet() Invoked the HA Test Servlet on remote host : ha2.haw2.ibm.com
The hostname of the machine is printed as well. This verifies that the cluster IP ha.haw2.ibm.com is being served by the backup (ha2.haw2.ibm.com) machine.
- For the URL http://ha3.haw2.ibm.com:9080/TestWebSphereHAWeb/Test, the browser displays:
Test:doGet() Invoked the HA Test Servlet on remote host : ha3.haw2.ibm.com
- Start the heartbeat service back on the primary. This should stop the WebSphere Application Server processes on the secondary and start them on the primary. The primary should also take over the cluster IP. Use this command:
/etc/rc.d/init.d/heartbeat start
- Start the admin console again. You should see the two application servers and the application TestWebSphereHA. This shows that the updated configuration information survived a failover to the primary. Also, repeat step 5 to verify that the application works on both the nodes.
You have now seen how configuration information of the deployment manager survives a failover from a primary machine to a standby machine.
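If you prefer to check from a command line rather than a browser which machine is answering on the cluster address, the host name can be pulled out of the servlet's response. The sketch below is an illustration only: served_by is a hypothetical helper, it assumes the response contains the "on remote host :" text shown in the steps above, and the commented example assumes curl is installed.

```shell
# served_by
# Reads servlet output on standard input and prints the host name that
# follows "on remote host :". Prints nothing if no host is reported.
served_by() {
    sed -n 's/.*on remote host : \([A-Za-z0-9.-]*\).*/\1/p' | head -n 1
}

# Example with the text from the failover test.
echo "Test:doGet() Invoked the HA Test Servlet on remote host : ha2.haw2.ibm.com" | served_by

# Against a live system (not executed here):
# curl -s http://ha.haw2.ibm.com:9080/TestWebSphereHAWeb/Test | served_by
```

Before a failover the cluster address should report the primary's host name, and afterward the backup's.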
Testing WebSphere Application Server node failover
For testing the failover for a node, I have modified the sample test application so that it keeps track of how many times it has been invoked by maintaining a persistent counter (count). Here, I have chosen the filesystem as the mechanism to keep the counter persistent. For the failover of the application to work, this data must be kept on the shared disk.
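The idea behind the persistent counter can be illustrated in a few lines of shell. This is not the sample application's code, which is a servlet; it is only a sketch of the same mechanism, and the name bump_count is hypothetical. The count lives in a file, so whichever machine has that file available continues from the last stored value.

```shell
# bump_count COUNT_FILE
# Reads the stored count (0 if the file does not exist yet), adds one,
# writes the new value back, and prints it.
bump_count() {
    count_file="$1"
    count=0
    if [ -r "$count_file" ]; then
        count=$(cat "$count_file")
    fi
    count=$((count + 1))
    printf '%s\n' "$count" > "$count_file"
    printf '%s\n' "$count"
}
```

If the file were on a local disk, the standby would start again from 1 after a failover. Keeping it under /ha is what lets the count carry over.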
To test the high availability of a WebSphere node and application:
- Start the heartbeat service on the primary and then on the backup node with this command:
/etc/rc.d/init.d/heartbeat start
- Start the WebSphere node agent and the WebSphere Application Server on the ha3 node.
- From the admin console (http://ha.haw2.ibm.com:9090/admin), make sure the application servers on both machines (ha1 and ha3) are running. If not, start them. Also, you should see the application TestWebSphereHA.
- Update the enterprise application to a newer version of TestWebSphereHA.ear under the was\sample_ver_3 directory using the admin console. Make sure you select both ha and ha3 WAS nodes while updating. After the update, make sure you save the master configuration. Also make sure that the Synchronize changes with Nodes option is selected. You should be able to successfully update the application.
- Verify that the application runs on the ha WebSphere node by pointing a browser to the following URL: http://ha.haw2.ibm.com:9080/TestWebSphereHAWeb/Test. The output in the browser should be the following:
Test:doGet() Invoked the HA Test Servlet on remote host : ha1.haw2.ibm.com
Test:doGet() This servlet has been invoked 1 times
Test:doGet() Count data file : /ha/WebSphere/AppServer/installedApps/haNetwork/TestWebSphereHA.ear/TestWebSphereHAWeb.war/WEB-INF/count.dat
This output shows that the application is successfully running on the WAS node ha, which is being run on the primary node ha1. Also note that the count.dat file is on the shared filesystem /ha. Repeat this step one more time so that the count is 2.
- Simulate failover by stopping heartbeat on the primary system using this command:
/etc/rc.d/init.d/heartbeat stop
Once the backup has taken over control, start the admin console again. You should see the two application servers and the application TestWebSphereHA. Point a browser to the following URL: http://ha.haw2.ibm.com:9080/TestWebSphereHAWeb/Test. The output in the browser for this run should be the following:
Test:doGet() Invoked the HA Test Servlet on remote host : ha2.haw2.ibm.com
Test:doGet() This servlet has been invoked 3 times
Test:doGet() Count data file : /ha/WebSphere/AppServer/installedApps/haNetwork/TestWebSphereHA.ear/TestWebSphereHAWeb.war/WEB-INF/count.dat
This output shows that the application is successfully running on the WAS node ha, which is being run on the backup node ha2. Also, the application data (value of count) has survived a failover as it is on the shared disk.
This is a trivial example of application data failover. A more practical example would be one that uses a database. In that case, HA for the database should be implemented as well, which the next article in this series will demonstrate.
- Start the heartbeat service back on the primary. This should stop the WebSphere Application Server processes on the secondary and start them on the primary. The primary should also take over the cluster IP. Use this command:
/etc/rc.d/init.d/heartbeat start
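The check performed by eye in the steps above, that the count after failover continues from the count before it, can also be scripted. The following sketch assumes the servlet output contains the "has been invoked N times" line shown earlier; invocation_count and count_survived are hypothetical helper names.

```shell
# invocation_count
# Reads servlet output on standard input and prints the reported count.
invocation_count() {
    sed -n 's/.*has been invoked \([0-9][0-9]*\) times.*/\1/p' | head -n 1
}

# count_survived BEFORE AFTER
# Succeeds when AFTER is exactly one more than BEFORE, which is what you
# expect if the counter file was carried across the failover.
count_survived() {
    [ "$2" -eq $(( $1 + 1 )) ]
}

# Example with the values from the test above: 2 before, 3 after.
after=$(echo "Test:doGet() This servlet has been invoked 3 times" | invocation_count)
if count_survived 2 "$after"; then
    echo "count survived the failover"
fi
```

A count that restarts at 1 after failover would indicate that the data file is not on the shared disk.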
Availability is a key requirement for building an on demand operating environment where IT systems run 24/7. This article shows how you can lessen the business impact of downtime by implementing high availability for WebSphere Application Server using open source software on the Linux™ operating system. Using the approach outlined in this series, you can significantly reduce planned and unplanned outages, allowing for cluster upgrades and system maintenance without interrupting operations.
| Description | Name | Size | Download method |
|---|---|---|---|
| Sample code package for this series of articles | hahbcode.tar.gz | 25 KB | HTTP |
- Read the other articles in the High-availability middleware on Linux series.
- Find more information on the Network Deployment configuration of WebSphere Application Server on the WebSphere Application Server Network Deployment page.
- Check out the High-Availability Linux project Web site for more information on heartbeat, including Heartbeat success stories.
- You can download most of the software needed for this series of articles at these locations (note that not all of the downloads are free):
- Red Hat Enterprise Linux 3.0 (2.4.21-15.EL)
- Heartbeat 1.2.3
- IBM Java™ 2 SDK 1.4.2
- IBM DB2® Universal Database™ Enterprise Server Edition V8.1 for Linux
- For a detailed discussion of availability and how to plan for and maintain it in an enterprise middleware environment, read "Planning for Availability in the Enterprise" (developerWorks, December 2003).
- Clusters are well known for their high-availability features: "Build a heterogeneous cluster with coLinux and openMosix" (developerWorks, February 2005) and "Linux clustering cornucopia" (developerWorks, May 2000) can add to your cluster knowledge.
- The IBM WebSphere Application Server support page contains links to Fix Packs and product documentation.
- IBM also offers a list of high-availability resources such as white papers and documentation.
- Find more resources for Linux developers in the developerWorks Linux zone.
- Get involved in the developerWorks community by participating in developerWorks blogs.
- Build your next Linux development project with IBM trial software, available for download directly from developerWorks.
Hidayatullah H. Shaikh is a Senior Software Engineer on the IBM T.J. Watson Research Center's On-Demand Architecture and Development Team. His areas of interest and expertise include business process modeling and integration, service-oriented architecture, grid computing, e-commerce, enterprise Java, database management systems, and high-availability clusters. You can contact Hidayatullah at hshaikh@us.ibm.com.




