Using roving high availability (HA) failover in partitioned database environments
When you are using an N Plus M failover policy with 'N' active nodes and one standby node, you can enable roving HA failover.
Before you begin
Roving HA failover support must be set the same way on every node in the cluster: either enabled on all nodes or disabled on all nodes.
In partitioned database environments where roving HA failover is not enabled, the designated standby node is usually the only node with access to all the disks and volume groups, including the file systems on these volume groups. In those environments, ensure that the external storage LUN mappings and the SAN zones in the cluster can see all the disks in the database instance. In addition, verify that all the volume groups controlled by the cluster are imported on all the cluster nodes. After importing the volume groups, disable the auto-varyon attribute of volume groups and the auto-mount attribute of the file systems on all the active cluster nodes.
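On AIX, the last step described above might look like the following sketch. It only prints the commands so that they can be reviewed before anyone runs them. AIX itself is an assumption here (the text does not name the operating system), and the helper name print_storage_prep and the example volume group and mount point names are hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: assumes the AIX chvg and chfs commands; nothing is executed.
# print_storage_prep is a hypothetical helper, not a Db2 or RSCT tool.
# Usage: print_storage_prep vg <volume group>...
#        print_storage_prep fs <mount point>...
print_storage_prep() {
    kind=$1
    shift
    for name in "$@"; do
        case $kind in
            vg) printf 'chvg -a n %s\n' "$name" ;;   # no automatic varyon at system startup
            fs) printf 'chfs -A no %s\n' "$name" ;;  # no automatic mount at system restart
            *)  echo "unknown kind: $kind" >&2; return 1 ;;
        esac
    done
}

# Example with hypothetical names:
# print_storage_prep vg db2datavg
# print_storage_prep fs /db2/data
```

Printing instead of executing keeps the sketch safe to try on any host; the generated commands would still have to be run on each active cluster node.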
If you want to use roving HA failover, you must enable it again using these steps after applying a new fix pack.
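Because the setting has to be applied again after a fix pack, it can help to check its current value on each host. The following is a minimal sketch that reads the ROVING_STANDBY_ENABLED assignment from a start script; the function name roving_standby_state is hypothetical, and the sketch assumes the variable is assigned on a line of its own.

```shell
#!/bin/sh
# Sketch: report whether ROVING_STANDBY_ENABLED is true or false in a script.
# roving_standby_state is a hypothetical helper, not part of Db2 or RSCT.
roving_standby_state() {
    script_path=$1
    [ -r "$script_path" ] || { echo "unknown"; return 1; }
    # Take the last assignment in the file, since a later line would override an earlier one.
    value=$(sed -n 's/^[[:space:]]*ROVING_STANDBY_ENABLED=\([A-Za-z]*\).*/\1/p' "$script_path" | tail -n 1)
    case $value in
        true)  echo "enabled" ;;
        false) echo "disabled" ;;
        *)     echo "unknown"; return 1 ;;
    esac
}

# Example (script path as given in the procedure):
# roving_standby_state /usr/sbin/rsct/sapolicies/db2/db2V115_start.ksh
```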
About this task
When you use an N Plus M failover policy with 'N' active nodes and exactly one standby node, a failover operation occurs when one of the active nodes fails. As part of failover, the standby node begins hosting the resources of the failed node. When the failed node comes back online, you would usually have to take a momentary outage to move the resources back to their original active node. Instead, you can configure roving HA failover so that the last failed node in the cluster becomes the standby node for all other partitions in the cluster, without requiring any additional failback operations.
For example, consider the following initial resource model, in which hostA, hostB, and hostC are active and hostD is the standby:
Online IBM.ResourceGroup:db2_db2inst1_0-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_0-rs
                |- Online IBM.Application:db2_db2inst1_0-rs:hostA
                '- Offline IBM.Application:db2_db2inst1_0-rs:hostD
Online IBM.ResourceGroup:db2_db2inst1_1-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_1-rs
                |- Online IBM.Application:db2_db2inst1_1-rs:hostB
                '- Offline IBM.Application:db2_db2inst1_1-rs:hostD
Online IBM.ResourceGroup:db2_db2inst1_2-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_2-rs
                |- Online IBM.Application:db2_db2inst1_2-rs:hostC
                '- Offline IBM.Application:db2_db2inst1_2-rs:hostD
If a failure occurs on hostB, the resource model would then look as follows (without the roving HA failover feature):
Online IBM.ResourceGroup:db2_db2inst1_0-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_0-rs
                |- Online IBM.Application:db2_db2inst1_0-rs:hostA
                '- Offline IBM.Application:db2_db2inst1_0-rs:hostD
Online IBM.ResourceGroup:db2_db2inst1_1-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_1-rs
                |- Offline IBM.Application:db2_db2inst1_1-rs:hostB
                '- Online IBM.Application:db2_db2inst1_1-rs:hostD
Online IBM.ResourceGroup:db2_db2inst1_2-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_2-rs
                |- Online IBM.Application:db2_db2inst1_2-rs:hostC
                '- Offline IBM.Application:db2_db2inst1_2-rs:hostD
If roving HA failover is enabled, the resource model would instead look as follows after the failure of hostB:
Online IBM.ResourceGroup:db2_db2inst1_0-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_0-rs
                |- Online IBM.Application:db2_db2inst1_0-rs:hostA
                '- Offline IBM.Application:db2_db2inst1_0-rs:hostB
Online IBM.ResourceGroup:db2_db2inst1_1-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_1-rs
                |- Offline IBM.Application:db2_db2inst1_1-rs:hostB
                '- Online IBM.Application:db2_db2inst1_1-rs:hostD
Online IBM.ResourceGroup:db2_db2inst1_2-rg Nominal=Online
        |- Online IBM.Application:db2_db2inst1_2-rs
                |- Online IBM.Application:db2_db2inst1_2-rs:hostC
                '- Offline IBM.Application:db2_db2inst1_2-rs:hostB
In the above resource model, after the failure of hostB, hostB is now the standby host for all active partitions on hostA, hostC, and hostD.
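To see which host currently acts as the standby for each partition, the Offline members of a resource model like the ones above can be listed. The following is a rough sketch, assuming the model text is supplied on standard input in the format shown above and that every Offline member is a standby location (which would not hold while a node is actually down). The function name list_offline_members is hypothetical.

```shell
#!/bin/sh
# Sketch: print "<resource> <host>" for every Offline member in a resource model.
# list_offline_members is a hypothetical helper; it reads the model text from stdin.
list_offline_members() {
    awk '
        /Offline IBM\.Application:/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^IBM\.Application:/) {
                    # Member entries have three parts: class, resource name, host name.
                    n = split($i, parts, ":")
                    if (n == 3) print parts[2], parts[3]
                }
            }
        }
    '
}

# Example: save the model text to a file (hypothetical name), then run:
# list_offline_members < resource_model.txt
```

With roving HA failover in effect after the hostB failure, every line of the result would name hostB; without it, the standby host differs between partitions.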
Procedure
To enable the roving HA failover feature, perform the following steps on each host in the cluster:
- Ensure that there is no failover operation in progress.
- Make a backup copy of the db2V115_start.ksh or db2V121_start.ksh script (whichever exists for your Db2 version) located in the /usr/sbin/rsct/sapolicies/db2/ directory.
- Edit the db2V115_start.ksh or db2V121_start.ksh script. Find the following line:
  ROVING_STANDBY_ENABLED=false
  and make the following change to set the ROVING_STANDBY_ENABLED variable to true:
  ROVING_STANDBY_ENABLED=true
- Save your changes.
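The backup and edit steps could be scripted along the following lines. This is a minimal sketch, not an IBM-supplied tool: the function name enable_roving_standby and the .bak suffix are hypothetical choices, and the sketch assumes that enabling the feature means the ROVING_STANDBY_ENABLED variable ends up set to true on a line of its own.

```shell
#!/bin/sh
# Sketch: back up a start script, then set ROVING_STANDBY_ENABLED=true in it.
# enable_roving_standby is a hypothetical helper, not part of Db2 or RSCT.
enable_roving_standby() {
    script_path=$1
    [ -f "$script_path" ] || { echo "not found: $script_path" >&2; return 1; }
    grep -q '^[[:space:]]*ROVING_STANDBY_ENABLED=' "$script_path" || {
        echo "ROVING_STANDBY_ENABLED not found in $script_path" >&2
        return 1
    }
    # Keep an untouched copy next to the original (hypothetical .bak suffix).
    cp -p "$script_path" "$script_path.bak" || return 1
    # Rewrite from the backup instead of using sed -i, which is not portable.
    # Redirecting into the existing file keeps its owner and permissions.
    sed 's/^\([[:space:]]*\)ROVING_STANDBY_ENABLED=.*/\1ROVING_STANDBY_ENABLED=true/' \
        "$script_path.bak" > "$script_path"
}

# Example (run on each host, with no failover in progress):
# enable_roving_standby /usr/sbin/rsct/sapolicies/db2/db2V115_start.ksh
```

The function only edits the file it is given, so it still has to be run on every host in the cluster.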