IBM WebSphere Developer Technical Journal: Building a high availability database environment using WebSphere middleware, Part 1

Using DB2 High Availability Disaster Recovery with WebSphere Application Server

IBM® DB2® High Availability Disaster Recovery (HADR), plus the Automatic Client Reroute capability, enables its client applications to recover from a failed database server with minimal interruption. This article shows you the steps for building a highly available database environment by leveraging DB2 HADR and Automatic Client Reroute using IBM WebSphere® Application Server Network Deployment V6.1 as the client application.

Share:

Li-Fang Lee (lifang@us.ibm.com), Advisory Software Engineer, IBM

Li-Fang Lee is a test strategist working for the WebSphere Test & Quality Organization in Rochester, MN. Her current focus area is high availability, understanding customer business requirements, designing customer-like test scenarios across the organization, and leading a team to carry out test scenarios to ensure high availability of WebSphere Application Server.



Wayne Rosario (rwayne@us.ibm.com), Software Engineer, IBM

Wayne Rosario is a software engineer at IBM in Rochester, MN. Wayne currently works on WebSphere Application Server's System Verification Test, and has over three years of test experience.



Soloman Barghouthi (soloman@us.ibm.com), Advisory Software Engineer, IBM

Soloman Barghouthi is a Senior Software Engineer at the IBM Rochester Lab. Soloman is the architect and leader of the WebSphere Application Server EJBContainer team. Soloman is a database expert and has published many articles on database/WebSphere Application Server interaction.



February 2008 (First published 09 May 2007)

Also available in Chinese

From the IBM WebSphere Developer Technical Journal.

Introduction

IBM DB2 Universal Database (UDB) Enterprise Server Edition V8.2 introduced a new high availability feature, called High Availability Disaster Recovery (HADR), that provides a failover capability to its client applications by replicating data from a primary database to a standby database. This enables client applications to easily recover from a partial or complete disaster. In addition, DB2 HADR is associated with an Automatic Client Reroute (ACR) capability to make the failover behavior virtually transparent to its client applications.

IBM WebSphere Application Server Network Deployment (ND) V6.1, for example, uses DB2 to store its application persistent data. WebSphere Application Server applications need only to make connections to the primary database. To keep a synchronized copy of the database on the standby server, DB2 HADR log files are consistently sent from the primary to the standby to keep the two in sync. When the primary server goes offline for (planned or unplanned) downtime, the standby server is available to be brought online. When the Automatic Client Reroute feature has been enabled, the DB2 reroute logic will check whether or not the standby server can be found. If it is found, the reroute first retries the connection to the failed primary server, and if that fails again, the reroute will connect to the alternate standby server. At this time, the WebSphere Application Server connection is re-established with the alternate server, and the transaction is rolled back and then re-issued on the alternate server. The entire DB2 HADR failover process is transparent to the WebSphere Application Server ND applications.

DB2 HADR plus the Automatic Client Reroute capability enable its client applications to recover from a failed database server with minimal interruption. This article describes the steps for building a highly available database environment by leveraging DB2 HADR and Automatic Client Reroute using WebSphere Application Server ND V6.1 as the client application.

This article assumes familiarity with IBM WebSphere Application Server Network Deployment V6.x and DB2 Universal Database (UDB) Enterprise Server basic concepts and setup.


DB2 HADR preparation

DB2 HADR requirements

Before enabling DB2 with HADR as the WebSphere Application Server application datastore, you need to be aware of these basic requirements for both the primary and standby DB2 servers:

  • Identical operating system and DB2 version are required.
  • Same container file system and installation path are necessary, such as /home/db2inst1/sqllib.
  • If using by reference, communication ports for HADR need to be specified in:
    • For UNIX® or Linux®: /etc/services
    • For Windows®: c:\windows\system32\drivers\etc\services
  • The standby server machine has to be reachable by TCP/IP from the primary server and by the client application.

DB2 HADR setup

Let's look at what is involved in setting up the DB2 HADR primary and standby servers:

  1. Install DB2 UDB Enterprise Server Edition on both the primary and standby machines. Start the DB2 servers on both machines, if they are not already running, then create your database and the required tables on the primary machine. For illustration purposes, we will use "Sample" as the database name. (See the DB2 Information Center for detailed installation information.)

  2. Next, identify the TCP/IP connection communication ports for each database on the primary and standby machines (as per normal client/server database connectivity). The port name is user-defined and the port numbers can be any numbers, provided there are no conflicts. It is not required that the ports on the primary and standby servers be identical; however, if you can keep the ports the same on both machines, the configuration will be much easier. For our Sample database, two ports (51012 and 51013) are used on the primary and standby servers:

    Listing 1. Two HADR ports used for the "Sample" database
    >more /etc/services 
    
    # HADR ports assigned by user
    ha_sample      51012/tcp
    ha_sample_int  51013/tcp

    You need to edit the /etc/services file (UNIX/Linux) or c:\windows\system32\drivers\etc\services (Windows) to specify the ports. (Note that only the administrator account can edit these files.)

  3. Configure the HADR variables for each database on the primary machine. (The steps shown here are for the primary machine only.) This process must be repeated for each database. When setting up these variables, verify that the hard_local_hos" and hard_remote_host variables refer to the proper machines. Also, the hadr_local_svc and hadr_remote_svc variables must match the names defined in the /etc/services file described above. The "Sample" database is used here as an example:

    Listing 2. HADR variables for the Sample database
    >db2 update db cfg for Sample using hadr_local_host    <primary machine IP address>
    >db2 update db cfg for Sample using hadr_remote_host   <standby machine IP address>
    >db2 update db cfg for Sample using hadr_local_svc     ha_sample
    >db2 update db cfg for Sample using hadr_remote_svc    ha_sample_int
    >db2 update db cfg for Sample using hadr_remote_inst   <DB2 instance name on the standby>
    >db2 update db cfg for Sample using hadr_timeout       120
    >db2 update db cfg for Sample using hadr_syncmode      nearsync
    >db2 update db cfg for Sample using logretain 	  on
    >db2 update db cfg for Sample using LOGINDEXBUILD 	  on
    >db2 update alternate server for database Sample using hostname <Standby IP address>
    port 60000

    Above, NEARSYNC is selected as the sync mode. This mode provides less protection against transaction loss than SYNC, in exchange for a shorter transaction response time than SYNC mode. Notice that the last command in Listing 2 enables the Automatic Client Reroute capability for DB2 HADR and port 60000 is the DB2 instance port number on the standby machine.

  4. Verify your configuration values by entering this command, also shown in Listing 3:

    db2 get db cfg for <database_name> | grep HADR

    Listing 3. HADR configuration for our Sample database on the primary machine
    >db2 get db cfg for  sample  | grep HADR
    
    HADR database role                                 = STANDARD
    HADR local host name             (HADR_LOCAL_HOST) = svtlewis.rchland.ibm.com
    HADR local service name           (HADR_LOCAL_SVC) = ha_sample
    HADR remote host name           (HADR_REMOTE_HOST) = svtclark.rchland.ibm.com
    HADR remote service name         (HADR_REMOTE_SVC) = ha_sample_int
    HADR instance name of remote server  (HADR_REMOTE_INST) = db2inst1
    HADR timeout value                       (HADR_TIMEOUT) = 120
    HADR log write synchronization mode     (HADR_SYNCMODE) = NEARSYNC
  5. Now, it is time to backup the database from your primary machine and restore it on the standby. The following commands will backup your databases on the primary machine (also shown in Listing 4). You need to repeat the command for each of your databases. Be aware that you cannot perform a backup on a database that has an existing connection with its client application.

    cd <temp_backup_directory>
    db2 backup db <database_name>

    The backup copy of each database will be stored in the <temp_backup_directory>.

    Listing 4. Backup database on the primary machine
    >cd tmpdir
    >db2 backup db sample
    
    Backup successful. The timestamp for this backup image is :20061101161943

    Transfer the backup files generated on the primary machine to the standby machine. To restore the database to the standby machine, log on to the standby machine with your DB2 user ID. The command below will restore your database on the standby machine (Listing 5). You need to repeat this command for each of your databases:

    db2 restore db <database_name> from <temp_restore_directory> replace history file

    Listing 5. Restore database on the standby machine
    >cd tmpdir
    >db2 restore db sample replace history file
              
    DB20000I  The RESTORE DATABASE command completed successfully.
  6. After restoring all databases on the standby machine, all the DB2 HADR variables shown in Listing 6 for each database must be copied from the primary DB2 server to the standby server. From the standby DB2 server perspective, the primary DB2 server is the remote host. Thus, you need to modify the hadr_local_host, hadr_remote_host, hadr_local_svc and hadr_remote_svc values on the standby machine. In addition, the Automatic Client Reroute must be updated to point to the primary machine.

    Notice that the port 60000 is the DB2 instance port number on the primary machine.

    Listing 6. HADR variables of the Sample database on the standby machine
    >db2 update db cfg for Sample using hadr_local_host     <standby machine IP address>
    >db2 update db cfg for Sample using hadr_remote_host    <primary machine IP address>
    >db2 update db cfg for Sample using hadr_local_svc      ha_sample_int
    >db2 update db cfg for Sample using hadr_remote_svc     ha_sample
    
    >db2 update alternate server for database Sample using hostname <Primary machine IP 
    address> port 60000
  7. It is critical that all the HADR parameters be set with proper values prior to starting DB2 HADR on either the primary or standby machine. If any of these parameters are empty or misconfigured, DB2 HADR servers will not start properly. In addition, DB2 HADR needs to be started on the standby machine first, before it is started on the primary machine.

    On the standby machine, first deactivate the database and then issue the start hadr command to start it as the standby database (Listing 7).

    db2 deactivate db <database_name>
    db2 start hadr on db <database_name> as standby

    Listing 7. Deactivate and start the Sample as the standby database
    >db2 deactivate db sample
    >db2 start hadr on db sample as standby
         
         DB20000I  The START HADR  ON DATABASE command completed successfully
    
    >db2 get snapshot for db on sample  | grep Role
    
         Role                   = Standby

    On the primary machine, first activate the database and then issue the start hadr command to start it as the primary database (Listing 8).

    db2 activate db <database_name>
    db2 start hadr on db <database_name> as primary
    Listing 8. Activate and start the Sample as the primary database
    > db2 activate db sample
    > db2 start hadr on db sample as primary    
     
        DB20000I  The START HADR  ON DATABASE command completed successfully 
    
    > db2 get snapshot for db on sample | grep Role
    
        Role                   = Primary
  8. When starting DB2 HADR servers, you should receive a message indicating that the "START HADR ON DATABASE command completed successfully" message. To ensure that HADR servers are running on the primary and standby machines with the proper roles, you can issue this command on the primary and standby machine to verify:

    db2 get snapshot for db on <database_name> | grep Role

    Under the HADR status role, you will see "Standby" for the standby machine and "Primary" for the primary machine. If you can't get the proper indicator for each database, you need to review all previous steps.

  9. After the databases on the standby and primary machines are started successfully, you need to ensure that both databases are in sync. If they are not, the failover will not work successfully, which makes it possible to get undesirable results, such as data loss. Issue this command to check the state for both databases on the primary and standby database machines:

    db2 get snapshot for database on <database_name> | grep State

    You will need to wait until the standby database makes a connection with the primary database to get both databases in peer mode. Once the databases are in peer mode, the DB2 HADR servers on the primary and standby machines are ready for use.


DB2 Universal JDBC Driver behavior when running DB2 HADR

Although DB2 JDBC Universal Driver will transparently connect to the appropriate database server (that is, the primary versus the alternate), there are differences when comparing the DB2 Universal JDBC Driver Type2 and Driver Type4:

  • DB2 Universal Driver Type2

    After the first successful connect to the DB2 database, DB2 will update the JDBC driver with the alternate server information. The alternate server information then gets stored on the JDBC driver side, in memory as well as in a DB2 database directory (persistent on disk). When the connection to the primary DB2 server fails, DB2 will retrieve the alternate server information from memory (or from the DB2 persistent copy, if the alternate server information is not found in memory). The DB2 driver will then use the alternate server information to connect to the right DB2 server. The entire process is transparent to the user.

  • DB2 Universal Driver Type4

    After the first successful connect to the DB2 database, DB2 will also update the JDBC driver with the alternate server information. However, the alternate server information will only be stored in memory on the JDBC driver side. When the connection to the primary DB2 server fails, DB2 will retrieve the alternate server information from memory and will use that alternate server information to connect to the right DB2 server. The entire process is transparent to the user.

The fact that Type4 only updates the information in memory and does not persist it in the DB2 persistent copy (as is the case with Type2) causes the alternate server information to be lost if the client side JVM (for example, WebSphere Application Server) is shutdown (normally or abnormally). The information is, of course, restored when the connection is established again; however, an issue could arise if the client side JVM is restarted after the DB2 primary failed-over to a standby. In this case, the client will not know about the alternate server information, and instead will continue to connect to the primary (which is down). See the Appendix for more details on how to persist the HADR alternate server information when using the DB2 Universal JDBC Driver Type4.


Configuring WebSphere Application Server

Defining your data source in WebSphere Application Server ND should be straightforward, since there is no special requirement for setting up a connection with an HADR-enabled DB2 server. You can configure your DB2 data source to be of Type2 or Type4 to connect to DB2. For the DB2 server name, you need only enter the primary DB2 server machine. WebSphere Application Server doesn't need to know anything about the standby machine, since the high availability and client reroute features are supported on the DB2 server side. Hence, you will define your DB2 datasource in the same manner as if you were not using HADR.

Here is an example of configuring a DB2 HADR connection on WebSphere Application Server ND:

  1. Create DB2 Universal JDBC Driver provider.
  2. Create a new data_source by selecting JDBC providers => DB2 Universal JDBC Driver Provider => Data sources => <DataSource_Name>. Create the data source with all the information of the primary DB2 machine only (Figure 1).

Before testing the DB2 HADR takeover behavior, you need to verify that the connection between WebSphere Application Server and the DB2 HADR primary machine works well.

Figure 1. Data source configuration on the WebSphere Application Server console
Figure 1. Data source configuration on the WebSphere Application Server console

By default, the DB2 Automatic Client Reroute feature retries to establish a connection to the database repeatedly for up to 10 minutes. It is, however, possible to configure the exact retry behavior. Users of Type4 connectivity with the DB2 Universal JDBC Driver can use the following JDBC custom properties to do so:

  • maxRetriesForClientReroute: Use this property to limit the number of retries, if the primary connection to the server fails. This property is only used if the retryIntervalClientReroute property is also set.
  • retryIntervalForClientReroute: Use this property to specify the amount of time (in seconds) to sleep before retrying again. This property is only used if the maxRetriesForClientReroute property is also set.

These properties can have an effect of better turnaround time by limiting the duration in which an automatic reroute is attempted and returning an error quicker to the application, if a reroute is not possible.

From the WebSphere Application Server administrative console:

  1. To configure resources, navigate to JDBC providers => DB2 Universal JDBC Driver Provider => Data sources => <DataSource_Name> => Custom properties.
  2. Select New and add these custom properties:
    • maxRetriesForClientReroute: 2
    • retryIntervalForClientReroute: 15
  3. Apply and save your configuration.

The above example has the effect of limiting reroute to 2 attempts with a 15 second sleep time between attempts. Figure 2 shows how to set these properties. The actual values you set will depend on the hardware and topology in your environment.

Figure 2. Add custom properties for Data source configuration
Figure 2. Add custom properties for Data source configuration

Refer to the DB2 documentation on Automatic Client Reroute configuration for the equivalent DB2 Registry variables and additional details.


HADR takeover

When the WebSphere Application Server application is connected to the DB2 HADR primary machine, you can start verifying the HADR failover behavior. Issue the following commands on the standby server for a DB2 HADR takeover to occur (Listing 9).

db2 takeover hadr on db <database_name>

It is a good practice to check that your database on the standby machine has the "primary role" after the takeover.

db2 get snapshot for db on <database name> | grep Role

Listing 9. Database takeover on the standby machine
> db2 takeover hadr on db sample

    DB20000I The TAKEOVER HADR ON DATABASE command completed successfully.

After the takeover, WebSphere Application Server needs to get a new connection to access the database on the standby machine. You will see the following messages logged in the SystemOut.log files (Listing 10) indicating that the application first tried to connect to the primary machine and then re-attempted to connect to the standby machine. Once the application retries, it should obtain a new connection to the standby database.

Listing 10. WebSphere Application Server SystemOut.log
[11/1/06 17:15:39:298 CST] 00000039 ServletWrappe E   SRVE0068E: Uncaught exception 
thrown in one of the service methods of the servlet: /dbview.jsp. Exceptionthrown : 
javax.servlet.ServletException: A connection failed but has been re-established. The 
hostname or IP address is "svtlewis.rchland.ibm.com" and the service name 
or port number is 60000 . Special registers may or may not be re-attempted (Reason 
code = 1  DB2ConnectionCorrelator: G9056D89.O37F.061101231714

There are other types of failure situations in which database takeover must be performed by force for the connection between WebSphere Application Server and DB2 HADR to work. Below are some sample situations so you can examine the differences.

Case 1

The primary is unavailable from the network. The state on the standby machine becomes "Remote catchup pending" (Listing 11), meaning that the standby has lost the connection to the primary.

Listing 11. Remote catchup pending state on the standby
> db2 get snapshot for database on sample | grep State

 State                  = Remote catchup pending

In this case, WebSphere Application Server connections with DB2 UDB will timeout. Failover does not occur automatically, since WebSphere Application Server still has a connection with the primary database. Therefore, a "takeover by force" command, shown below, must be performed on the standby machine. Once the takeover takes place, WebSphere Application Server will establish new connections to the standby database.

db2 takeover hard on db <database name> by force

The state will be changed to "Disconnected" for a while, since the primary is not available from the network at the time of takeover (Listing 12).

Listing 12. Database takeover by force on the standby
> db2 takeover hadr on db Sample by force
     
     DB20000I  The TAKEOVER HADR ON DATABASE command completed successfully.

> db2 get snapshot for database on sample | grep Role

     Role    = Primary

> db2 get snapshot for database on sample | grep State

     State   = Disconnected

Case 2

A regular DB2 database is used as the WebSphere Application Server application datastore. A JDBC provider is configured to connect to the database from the applications. Later, it is decided to switch the database to HADR-enabled DB2. After setting up HADR and starting the database server, the first time the application tries to connect to the primary HADR database, it will log the following message in SystemOut.log:

Listing 13. WebSphere Application Sever SystemOut.log
[10/27/06 0:26:27:796 CDT] 0000002b WebApp  E   [Servlet Error]-[/dbview.j
sp]: com.ibm.websphere.ce.cm.StaleConnectionException: A connection failed but has been 
re-established. The hostname or IP address is "svtlewis.rchland.ibm.com" and 
the service name or port number is 60000. Special registers may or may not be re-attempted
(Reason code = 1 DB2ConnectionCorrelator: G9056D89.A7AD.061027052803...)

Once the application reattempts to connect to the database, it will get the connection successfully.

Case 3

During the time that the primary machine is off-line, requests keep coming through WebSphere Application Server to access the standby database. The data on the standby database might be modified based on client requests. Therefore, when the primary is back online again, the databases on the primary and standby might be out of sync. In this situation, you need to start the database on the primary machine as the standby to get the two databases back in sync again.

Since the two databases are out of sync, HADR will automatically sync both databases by changing the state from S-RemoteCatchup to S-NearlyPeer, and finally to S-Peer. The following messages logged in the db2diag.log file show the state changes during the database sync operation.

Listing 14. State changes logged in the db2diag.log file
2006-11-01-17.06.15.113214-360 I383930C400        LEVEL: Warning
PID     : 4021                 TID  : 4160127008  PROC : db2agent (SAMPLE) 0
INSTANCE: db2inst1             NODE : 000         DB   : SAMPLE
APPHDL  : 0-8                  APPID: *LOCAL.db2inst1.061101230614
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduStartup, probe:211
52
MESSAGE : Info: HADR Startup has completed.

2006-11-01-17.06.15.669454-360 E384331C363        LEVEL: Event
PID     : 4032                 TID  : 4160127008  PROC : db2hadrs (SAMPLE) 0
INSTANCE: db2inst1             NODE : 000         DB   : SAMPLE
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10
000
CHANGE  : HADR state set to S-RemoteCatchup (was S-RemoteCatchupPending)

2006-11-01-17.06.16.145289-360 E385032C353        LEVEL: Event
PID     : 4032                 TID  : 4160127008  PROC : db2hadrs (SAMPLE) 0
INSTANCE: db2inst1             NODE : 000         DB   : SAMPLE
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10
000
CHANGE  : HADR state set to S-NearlyPeer (was S-RemoteCatchup)

2006-11-01-17.06.16.240636-360 E385386C344        LEVEL: Event
PID     : 4032                 TID  : 4160127008  PROC : db2hadrs (SAMPLE) 0
INSTANCE: db2inst1             NODE : 000         DB   : SAMPLE
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSetHdrState, probe:10
000
CHANGE  : HADR state set to S-Peer (was S-NearlyPeer)

Case 4

To avoid transaction failures for a planned takeover when WebSphere Application Server is not actively serving clients, or to reduce transaction failures for a planned takeover while WebSphere Application Server is actively serving clients, you can first purge the connection pools, then issue the takeover command and purge the pool again (in the case of the application server actively serving clients). For example, suppose the plan is to shut down the primary machine overnight. The next day, starting client requests without purging the connection pool first (even if takeover was done on the database side from the primary to alternate) will result in the client requests using bad connections from the WebSphere Application Server connection pool (assuming they haven't aged out). Hence, the requests will fail until the pool cleans itself of the bad connections. Therefore, it is best to purge the pool manually to avoid this situation.

To purge the connection pool of WebSphere Application Server, go to the deployment manager bin directory and issue the following commands for each application server (see WebSphere connection pool can be purged using MBeans Technote for details):

./wsadmin.sh
set ds [$AdminControl queryNames type=DataSource, name=<ds_name>]
$AdminControl invoke $ds purgePoolContents

Listing 15. Purge the database connection pool for server2
>./wsadmin.sh

>set ds [$AdminControl queryNames type=DataSource,name=myDS,process=server2,*]

>$AdminControl invoke $ds purgePoolContents

You can also purge all the pools in WebSphere Application Server with one command:

Listing 16. Purge all pools
>./wsadmin.sh

>set mbeans [$AdminControl queryNames type=DataSource,*]  
foreach mbean $mbeans {  
>$AdminControl invoke $mbean purgePoolContents
}

You can then issue the takeover command on the standby machine. Since the connection pool on the primary has been purged and the takeover command has been issued, all client requests will be automatically directed to the standby without getting lots of failure messages logged in the SystemOut.log (Listing 13). Know if the takeover occurs while WebSphere Application Server is actively serving clients, there is still a small window in which some connections will be opened after the pool is purged and before the takeover is complete.


Fallback to the primary machine

Prior to fallback to the primary machine, you need to ensure that the DB2 UDB server has been started successfully on the primary, then issue the command below on the primary machine to take over the database from the standby machine.

db2 takeover hadr on db <database name>

Double check that the database on the primary machine has the proper role:

db2 get snapshot for database on <database name> | grep Role

It should return with "Primary." Also, the state now for both databases on the standby and primary should be in the "Peer" state.


Troubleshooting

For your reference, here is a collection of causes and solutions for some of the possible problems that you might encounter using DB2 HADR:

  • SQL1768N Unable to start HADR. Reason code = "7" when starting the primary or standby database.

    The possible cause of this problem is that the primary database failed to establish a connection to its standby within the HADR timeout interval. You can adjust the interval to fit with your environment.

  • SQL1768N Unable to start HADR. Reason code = "8" when starting the primary or standby database.

    Check the required HADR variables below (see Listing 3). They must be correct and cannot be empty:

    • HADR database role
    • HADR local host name
    • HADR remote host name
    • HADR remote service name
    • HADR instance name of remote server
    • HADR timeout value
    • HADR log write synchronization mode
  • SQL1768N Unable to start HADR. Reason code = "99" when starting the primary or standby database.

    Examine the HADR ports and names used for each database in the /etc/services file. They should have identical hadr_remote_svc and hard_local_svc settings, as described above.

  • SQL1769N Stop HADR cannot complete. Reason code = "2" when stopping the primary or standby database.

    In this case, you can deactivate the database prior to stopping the HADR:

    db2 deactivate db <database_name>
    db2 stop hadr on db <database_name>

  • SQL30081N A communication error has been detected when starting HADR on the primary or standby machine.

    Check key parameters to make sure they are configured correctly on all DB2 UDB server machines. Login in to the database machine with the DB2 UDB user account (for example: db2inst1), then type these commands at the user prompt:

    db2start (if DB2 is not running)
    db2set DB2COMM=TCPIP
    db2set DB2AUTOSTART=YES
    db2 update dbm cfg using SVCENAME DB2_db2inst1
    db2stop
    db2start

    (The SVCENAME is the TCP/IP service name on your configuration is in the /etc/services file.)

  • SQL2406N The BACKUP cannot be performed because the data base needs to be rolled forward. SQLSTATE=57019 when backing up the database.

    You need to roll forward your database by using this command (also in Listing 17).

    db2 rollforward db <database_name> to end of logs and stop

    Listing 17. Database rollfoward
    >db2 rollforward db sample to end of logs and stop
    
      Rollforward Status
    
     Input database alias                   = sample
     Number of nodes have returned status   = 1
    
     Node number                            = 0
     Rollforward status                     = not pending
     Next log file to be read               =
     Log files processed                    = S0000000.LOG - S0000006.LOG
     Last committed transaction             = 2006-10-06-16.22.35.000000
    
           DB20000I  The ROLLFORWARD command completed successfully
  • Takeover doesn't work on the primary or standby machine. You can issue the takeover command by force:

    db2 takeover hard on db <database_name> by force


Conclusion

DB2 Universal Database (UDB) High Availability Disaster Recovery (HADR) provides a high availability solution for its client application. As a client application, WebSphere Application Server has the capability to distinguish between a database failure verses a takeover situation. When a DB2 HADR takeover occurs, WebSphere Application Server re-establishes its connection, leveraging the DB2 Automated Client Reroute feature automatically to the standby HADR server.

This article used WebSphere Application Server Network Deployment V6.1 and DB2 UDB HADR to demonstrate how the takeover happens between these two products. You have gone through the steps to enable DB2 with HADR and to configure WebSphere Application Server ND to connect to DB2 HADR. You also learned about the takeover behavior and the expected messages logged in the WebSphere Application Server log files during the takeover. In addition, you received some troubleshooting tips and techniques for several common error situations.

There is no extra configuration step required on WebSphere Application Server to make the HADR failover to work successfully. WebSphere Application Server takes the advantage of the technologies provided by DB2 UDB HADR and the Automatic Client Reroute feature to provide a highly available environment for your applications.

Part 2 of this series will look at attaining high availability databases using Oracle® Real Application Cluster with WebSphere Process Server.


Appendix

To persist HADR alternate server information when using DB2 Universal JDBC Driver Type4:

  1. Add a new DataSource custom property in the WebSphere Application Server administrative console with these values:

    • Name: clientRerouteServerListJNDIName
    • Type: java.lang.String
    • Value: cell/persistent/alSrvlist(here, "cell" should be used as is; this is NOT the cell name)

    The value of the clientRerouteServerListJNDIName must have the cell/persistent/ in front of it. Otherwise, the JNDI object will not be persisted in WebSphere Application Server. The value here should match the JNDI name passed in the T4CRBind.java file (see the registry.bind and regirstry.unbind lines in Listing 18). Save and sync your entire cell.

  2. Copy the program in Listing 18, and compile it with the db2jcc.jar file. From the WAS_HOME/bin directory, execute the compile command below, then put the T4CRBind.class in WAS_HOME/bin directory:

    javac -classpath /home/db2inst1/sqllib/java/db2jcc.jar:$CLASSPATH T4CRBind.java

    Be sure "java" is in your path (that is, set PATH=$JAVA_HOME/bin). If the db2jcc.jar file is located in a different path, then adjust as required.

  3. Create a launcher program that can launch the T4CRBind class as shown in Listing 19.

  4. Restart the WebSphere Application Server that is running your application and start the launcher program from the WAS_HOME/bin directory. You can launch the program by executing:

    launchT4CRBind.sh T4CRBind

    Prior to running this command, make sure that your entire cell is running. This includes the deployment manager, node agents, and application servers.

    Be aware that the program below will not work when WebSphere Application Server global security is enabled. Furthermore, if you are running with global security enabled and want the persistence of alternate server information to work at run time, you will need APAR PK48854 (available for 6.1.0.15 and later).

    Listing 18. T4CRBind.java
    import javax.naming.InitialContext;
    import com.ibm.db2.jcc.DB2ClientRerouteServerList;
    
    public class T4CRBind
    {
    
        public static void main(String[] args)
        {
    	try
    	{
    		InitialContext registry = new InitialContext();
    
    		DB2ClientRerouteServerList address = new DB2ClientRerouteServerList();
    
    		int[] alternateServerPorts =
    		{ 60000 };
    		String[] alternateServerHosts =
    		{ "svtclark.rchland.ibm.com" };
    
    		address.setPrimaryPortNumber(60000);
    		address.setPrimaryServerName("svtlewis.rchland.ibm.com");
    
    		address.setAlternatePortNumber(alternateServerPorts);
    		address.setAlternateServerName(alternateServerHosts);
    
    		registry.unbind("cell/persistent/alSrvlist");
    		registry.rebind("cell/persistent/alSrvlist", address);
    
    		System.out.println("JNDI Registration done....");
    	}
    	catch (Exception e)
    	{
    		e.printStackTrace();
    	}
        }
    }
    Listing 19. launchT4CRBind.sh
    #!/bin/sh
    
    binDir=`dirname $0`
    . "$binDir/setupCmdLine.sh"
    
    CMD_NAME=`basename $0`
    
    if [ -f ${JAVA_HOME}/bin/java ]; then
        JAVA_EXE="${JAVA_HOME}/bin/java"
    else
        JAVA_EXE="${JAVA_HOME}/jre/bin/java"
    fi
    
    DB2JCC_HOME="/home/db2inst1/sqllib/java/db2jcc.jar"
    export DB2JCC_HOME
    
    
    CLASSPATH=".:$DB2JCC_HOME:$WAS_CLASSPATH"
    export CLASSPATH
    
    # In the case where more than one application server is installed, you will need to run 
    # the synch to make sure that the name space of the entire cell is updated.  Also, if the
    # port for the naming space is not the default, then you will need to specify the 
    # –Djava.naming.provider.url= <specify the url here>
    
    
    "$JAVA_HOME/bin/java" \
      "-Dwas.install.root=$WAS_HOME" \
      "-Djava.naming.factory.initial=com.ibm.websphere.naming.WsnInitialContextFactory" \ *
       -classpath "$CLASSPATH" com.ibm.ws.bootstrap.WSLauncher "$@"
    exit 0

Resources

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into WebSphere on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=WebSphere, Information Management
ArticleID=218280
ArticleTitle=IBM WebSphere Developer Technical Journal: Building a high availability database environment using WebSphere middleware, Part 1
publish-date=02092008