WAS/DB2 cluster model and use cases

This model contains a DB2® database, a WebSphere® Application Server (WAS) application that depends on DB2, and four WebSphere applications (WS1 through WS4). The parent/child dependencies for this model are that DB2 must be available before WAS is activated, and that the WebSphere (WS#) applications depend on the availability of WAS.

The location dependencies of the resource groups are as follows: DB2 and WAS must not be activated on the same node; WAS is Online On The Same Node Dependent with WS4 (see figure); and DB2 is Online On The Same Node Dependent with WS1, WS2, and WS3. The location dependency for the WS# groups is purely artificial in this example. The configuration does, however, reflect a setup in which one node is fine-tuned for DB2 (and is therefore the highest-priority node for DB2) and another is fine-tuned for WAS. Both groups share a common backup node, which can host only one of the two at a time.

Figure 1. A WAS cluster and a DB2 cluster with location and parent and child dependencies

Resource group policies

All resource groups have the following policies:

  • Startup Policy: Online On First Available Node
  • Fallover Policy: Fallover to Next priority node
  • Fallback Policy: Never Fallback

Participating nodes:

  • DB2 [2, 3]
  • WS1 [2, 3]
  • WS2 [2, 3]
  • WS3 [2, 3]
  • WS4 [1, 3]
  • WAS [1, 3]

Location dependency:

Online On The Same Node Dependent Groups:

  1. DB2, WS1, WS2, WS3
  2. WAS, WS4

Online On Different Nodes Dependent Groups:

  • DB2, WAS

Parent/child dependency:

  1. WS1, WS2, WS3 and WS4 (children) depend on WAS (parent)
  2. WAS (child) depends on DB2 (parent)
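
The configuration above can be written down as plain data, which makes a few properties easy to check. The following is a minimal sketch in Python; the dictionary names and helper functions are hypothetical and are not a PowerHA configuration format or API. It assumes, as a sanity check, that all members of a Same Node set use the same node list.

```python
# Illustrative model only: these structures mirror the lists above and are
# not a PowerHA configuration format.
PARTICIPATING_NODES = {
    "DB2": [2, 3],
    "WS1": [2, 3],
    "WS2": [2, 3],
    "WS3": [2, 3],
    "WS4": [1, 3],
    "WAS": [1, 3],
}
SAME_NODE_SETS = [["DB2", "WS1", "WS2", "WS3"], ["WAS", "WS4"]]
DIFFERENT_NODES_SETS = [["DB2", "WAS"]]
PARENT_OF = {"WAS": "DB2", "WS1": "WAS", "WS2": "WAS", "WS3": "WAS", "WS4": "WAS"}


def same_node_sets_share_node_lists():
    """True if every Same Node set uses one node list for all its members."""
    return all(
        len({tuple(PARTICIPATING_NODES[group]) for group in members}) == 1
        for members in SAME_NODE_SETS
    )


def common_nodes(group_a, group_b):
    """Nodes on which both groups are allowed to run, in ascending order."""
    return sorted(
        set(PARTICIPATING_NODES[group_a]) & set(PARTICIPATING_NODES[group_b])
    )


def ancestors(group):
    """Chain of parents for a group, nearest parent first."""
    chain = []
    while group in PARENT_OF:
        group = PARENT_OF[group]
        chain.append(group)
    return chain


if __name__ == "__main__":
    print(same_node_sets_share_node_lists())
    for group_a, group_b in DIFFERENT_NODES_SETS:
        # The shared node is the common backup node for the two groups.
        print(group_a, group_b, common_nodes(group_a, group_b))
    print(ancestors("WS4"))
```

In this model, the only node shared by DB2 and WAS is Node 3, which corresponds to the common backup node that can host only one of the two groups at a time.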

Use case 1: Start first node (Node 1)

Note: All resource groups are offline, all nodes are offline.

Steps:

  1. Start Node 1. (a) Parent/child dependency not met.
  2. Node 1: WAS: ERROR
  3. Node 1: WS4: ERROR

Post-condition/Resource group states:

  • Node 1: WAS: ERROR, WS4: ERROR
  • Node 2: DB2, WS1, WS2, WS3 (no state shown)
  • Node 3: WAS, DB2, WS1, WS2, WS3, WS4 (no state shown)

WAS and WS4 could have started on Node 1, but the parent resource group DB2 is still in the offline state. Therefore, WAS and WS4 are put in the ERROR state.
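
The parent/child check described here can be illustrated with a small toy model. This sketch is an assumption-laden simplification: it looks only at whether a participating node is up and whether the parent is online, and it deliberately ignores location dependencies. The names `states_after_startup` and `PARENTS_FIRST` are hypothetical.

```python
# Toy model of the parent/child check only; location dependencies are ignored.
PARTICIPATING_NODES = {
    "DB2": [2, 3],
    "WAS": [1, 3],
    "WS1": [2, 3],
    "WS2": [2, 3],
    "WS3": [2, 3],
    "WS4": [1, 3],
}
PARENT_OF = {"WAS": "DB2", "WS1": "WAS", "WS2": "WAS", "WS3": "WAS", "WS4": "WAS"}
# Evaluate parents before their children so a parent's state is known first.
PARENTS_FIRST = ["DB2", "WAS", "WS1", "WS2", "WS3", "WS4"]


def states_after_startup(started_nodes):
    """Return a state per group, given the set of nodes that are up."""
    states = {}
    for group in PARENTS_FIRST:
        parent = PARENT_OF.get(group)
        if not any(node in started_nodes for node in PARTICIPATING_NODES[group]):
            states[group] = "OFFLINE"  # no participating node is up
        elif parent is None or states[parent] == "ONLINE":
            states[group] = "ONLINE"
        else:
            states[group] = "ERROR"  # parent/child dependency not met
    return states


if __name__ == "__main__":
    for group, state in states_after_startup({1}).items():
        print(group, state)
```

With only Node 1 started, the model reports ERROR for WAS and WS4 and leaves the remaining groups offline, which matches the post-condition of this use case.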

Use case 2: Start second node (Node 2)

Note: Cluster state as in the post-condition from use case 1.

Steps:

  1. Start Node 2.
  2. Node 2: Acquire DB2.
  3. Node 1: Acquire WAS.
  4. Node 1: Acquire WS4. Node 2: Acquire WS1, WS2, WS3.

Post-condition/Resource group states:

  • Node 1: WAS: ONLINE, WS4: ONLINE
  • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
  • Node 3: WAS, DB2, WS1, WS2, WS3, WS4 (no state shown)

Node 2 starts DB2 (the parent RG), which in turn triggers processing of WAS (child of DB2). Finally all the grandchildren are started on their respective nodes.
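
The parent, child, grandchild ordering described above is a topological ordering of the dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to group the resource groups into acquisition waves. The `DEPENDS_ON` mapping and the `acquisition_waves` function are hypothetical names for illustration and do not reflect how the cluster manager is implemented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each group maps to the set of groups it depends on (its parents).
DEPENDS_ON = {
    "WAS": {"DB2"},
    "WS1": {"WAS"},
    "WS2": {"WAS"},
    "WS3": {"WAS"},
    "WS4": {"WAS"},
}


def acquisition_waves():
    """Group the resource groups into waves; each wave needs only earlier waves."""
    sorter = TopologicalSorter(DEPENDS_ON)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # groups whose parents are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(acquisition_waves(), start=1):
        print(number, wave)
```

The three waves correspond to DB2 first, then WAS, then the four WS# groups.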

Consolidated view of start node sequence 1, 2, 3

  • Start Node 1
    • Node 1: WAS: ERROR, WS4: ERROR
    • Node 2: DB2, WS1, WS2, WS3 (no state shown)
    • Node 3: WAS, DB2, WS1, WS2, WS3, WS4 (no state shown)
  • Start Node 2
    • Node 1: WAS: ONLINE, WS4: ONLINE
    • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
    • Node 3: WAS, DB2, WS1, WS2, WS3, WS4 (no state shown)
  • Start Node 3
    • Node 1: WAS: ONLINE, WS4: ONLINE
    • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
    • Node 3: WAS: OFFLINE, DB2: OFFLINE, WS1: OFFLINE, WS2: OFFLINE, WS3: OFFLINE, WS4: OFFLINE

Use case 3: Start nodes out of order (Node 3)

Note: All cluster nodes and resource groups are in the offline state.

Steps:

  1. Start Node 3.
  2. Node 3: Acquire DB2.

Post-condition/Resource group states:

  • Node 1: WAS, WS4 (no state shown)
  • Node 2: DB2, WS1, WS2, WS3 (no state shown)
  • Node 3: WAS: ERROR, DB2: ONLINE, WS1: ERROR, WS2: ERROR, WS3: ERROR, WS4: ERROR

Node 3 is a participating node for all the resource groups. However, WAS and DB2 cannot coexist on the same node. DB2, being a parent, is started on Node 3, which means that WAS cannot be started on the same node. Since WAS is not online, none of the children of WAS can come online on Node 3.
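
The effect of the Online On Different Nodes dependency can be shown with a small filter over candidate nodes. This is a sketch under stated assumptions: `eligible_nodes`, `DIFFERENT_NODES_PAIRS`, and the `placement` mapping are hypothetical names, and only DB2 and WAS are modeled.

```python
# Only the two groups bound by the Different Nodes dependency are modeled.
PARTICIPATING_NODES = {"DB2": [2, 3], "WAS": [1, 3]}
DIFFERENT_NODES_PAIRS = [("DB2", "WAS")]


def eligible_nodes(group, started_nodes, placement):
    """Started participating nodes not occupied by a group that must be elsewhere.

    placement maps each group that is already online to its current node.
    """
    blocked = set()
    for group_a, group_b in DIFFERENT_NODES_PAIRS:
        if group in (group_a, group_b):
            other = group_b if group == group_a else group_a
            if other in placement:
                blocked.add(placement[other])
    return [
        node
        for node in PARTICIPATING_NODES[group]
        if node in started_nodes and node not in blocked
    ]


if __name__ == "__main__":
    # Only Node 3 is up and DB2 already runs there: WAS has nowhere to go.
    print(eligible_nodes("WAS", {3}, {"DB2": 3}))
```

With only Node 3 started and DB2 placed on it, the list of eligible nodes for WAS is empty, which is why WAS and its children end up in the ERROR state.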

Use case 4: Start second node out of order (Node 2)

Note: Cluster and RG states as at the end of the previous use case.

Steps:

  1. Start Node 2.
  2. Node 3: Release DB2.
  3. Node 2: Acquire DB2.
  4. Node 3: Acquire WAS.
  5. Node 2: Acquire WS1, WS2, WS3. Node 3: Acquire WS4.

Post-condition/Resource group states:

  • Node 1: WAS, WS4 (no state shown)
  • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
  • Node 3: WAS: ONLINE, DB2: OFFLINE, WS1: OFFLINE, WS2: OFFLINE, WS3: OFFLINE, WS4: ONLINE

Node 2 is the higher priority node for DB2. Therefore DB2 falls back to Node 2 and WAS (Online On Different Nodes Dependency set) can now be acquired on Node 3.
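
The end state described above can be reproduced by a simple placement rule: process parents before children, take the highest-priority started node from each group's node list, and honor the location dependencies. The sketch below is a toy solver under those assumptions. It computes only an end state from the set of started nodes and does not model the event sequence or the fallback policy; `settle` and the mapping names are hypothetical.

```python
PARTICIPATING_NODES = {
    "DB2": [2, 3],
    "WAS": [1, 3],
    "WS1": [2, 3],
    "WS2": [2, 3],
    "WS3": [2, 3],
    "WS4": [1, 3],
}
PARENT_OF = {"WAS": "DB2", "WS1": "WAS", "WS2": "WAS", "WS3": "WAS", "WS4": "WAS"}
SAME_NODE_AS = {"WS1": "DB2", "WS2": "DB2", "WS3": "DB2", "WS4": "WAS"}
DIFFERENT_NODE_FROM = {"WAS": "DB2"}
PARENTS_FIRST = ["DB2", "WAS", "WS1", "WS2", "WS3", "WS4"]


def settle(started_nodes):
    """Return {group: node}, with None for a group that cannot come online."""
    placement = {}
    for group in PARENTS_FIRST:
        parent = PARENT_OF.get(group)
        if parent is not None and placement[parent] is None:
            placement[group] = None  # parent is not online
            continue
        # Node lists are in priority order, so the first match wins.
        candidates = [n for n in PARTICIPATING_NODES[group] if n in started_nodes]
        anchor = SAME_NODE_AS.get(group)
        if anchor is not None:
            candidates = [n for n in candidates if n == placement[anchor]]
        rival = DIFFERENT_NODE_FROM.get(group)
        if rival is not None:
            candidates = [n for n in candidates if n != placement[rival]]
        placement[group] = candidates[0] if candidates else None
    return placement


if __name__ == "__main__":
    print(settle({3}))
    print(settle({2, 3}))
```

With Nodes 2 and 3 started, the solver places DB2, WS1, WS2, and WS3 on Node 2 and WAS and WS4 on Node 3, which agrees with the post-condition of this use case.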

Use case 5: Start third node (Node 1)

Note: Cluster and RG states as at the end of the previous use case.

Steps:

  1. Start Node 1.
  2. Node 2: Release WS1, WS2, and WS3. Node 3: Release WS4.
  3. Node 1: Acquire WAS.
  4. Node 1: Acquire WS4.
  5. Node 2: Acquire WS1, WS2 and WS3.

Post-condition/Resource group states:

  • Node 1: WAS: ONLINE, WS4: ONLINE
  • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
  • Node 3: WAS: OFFLINE, DB2: OFFLINE, WS1: OFFLINE, WS2: OFFLINE, WS3: OFFLINE, WS4: OFFLINE

All groups are now online.

Consolidated view of start node sequence: 3, 2, 1

  • Start Node 3
    • Node 1: WAS, WS4 (no state shown)
    • Node 2: DB2, WS1, WS2, WS3 (no state shown)
    • Node 3: WAS: ERROR, DB2: ONLINE, WS1: ERROR, WS2: ERROR, WS3: ERROR, WS4: ERROR
  • Start Node 2
    • Node 1: WAS, WS4 (no state shown)
    • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
    • Node 3: WAS: ONLINE, DB2: OFFLINE, WS1: OFFLINE, WS2: OFFLINE, WS3: OFFLINE, WS4: ONLINE
  • Start Node 1
    • Node 1: WAS: ONLINE, WS4: ONLINE
    • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
    • Node 3: WAS: OFFLINE, DB2: OFFLINE, WS1: OFFLINE, WS2: OFFLINE, WS3: OFFLINE, WS4: OFFLINE

Use case 6: Acquisition failure example

Note: Node 1 is offline and all resource groups are ONLINE on Nodes 2 and 3.

Initial resource group states:

  • Node 1: WAS, WS4 (no state shown)
  • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
  • Node 3: WAS: ONLINE, DB2: OFFLINE, WS1: OFFLINE, WS2: OFFLINE, WS3: OFFLINE, WS4: ONLINE

Steps:

  1. Node_up 1.
  2. Node 2: Release WS1, WS2, WS3. Node 3: Release WS4.
  3. Node 3: Release WAS.
  4. Node 1: Acquire WAS. Comment: Acquisition failure for WAS.
  5. rg_move WAS. Comment: Normal rg_move event.
  6. Node 3: Acquire WAS.
  7. Node 2: Acquire WS1, WS2, WS3. Node 3: Acquire WS4.

Post-condition/Resource group states:

  • Node 1: WAS: OFFLINE, WS4: OFFLINE
  • Node 2: DB2: ONLINE, WS1: ONLINE, WS2: ONLINE, WS3: ONLINE
  • Node 3: WAS: ONLINE, DB2: OFFLINE, WS1: OFFLINE, WS2: OFFLINE, WS3: OFFLINE, WS4: ONLINE

As Node 1 joins the cluster, WAS attempts to fall back to Node 1 but encounters an acquisition failure. The acquisition failure launches a resource_state_change event, which triggers an rg_move event that moves WAS back to its original node.
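
The recovery path described above can be sketched as a small function that tries the target node and returns the group to its original node on failure. This is an illustration only: `move_with_recovery` and the `try_acquire` callback are hypothetical, the event strings are descriptive labels rather than real log output, and the sketch assumes that reacquiring the group on its original node succeeds.

```python
def move_with_recovery(group, origin, target, try_acquire):
    """Try to move a group to target; on failure, bring it back to origin.

    try_acquire(group, node) is a caller-supplied callback that returns True
    when the group can be brought online on that node.
    Returns the node the group ends up on and the list of events in order.
    """
    events = [f"release {group} on Node {origin}"]
    if try_acquire(group, target):
        events.append(f"acquire {group} on Node {target}")
        return target, events
    events.append(f"acquisition failure for {group} on Node {target}")
    events.append(f"resource_state_change for {group}")
    events.append(f"rg_move {group} to Node {origin}")
    # Assumption: the original node can still host the group.
    events.append(f"acquire {group} on Node {origin}")
    return origin, events


def node_1_fails(group, node):
    """Stand-in callback: acquisition fails on Node 1 and succeeds elsewhere."""
    return node != 1


if __name__ == "__main__":
    final_node, log = move_with_recovery(
        "WAS", origin=3, target=1, try_acquire=node_1_fails
    )
    print(final_node)
    for event in log:
        print(event)
```

In this run WAS ends up on Node 3 again, which mirrors steps 3 through 6 of the use case.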