Pacemaker base components
In the integrated high availability (HA) solution with Pacemaker, the cluster software stack is composed of several components, all of which are needed to run Pacemaker effectively.
Resources
A set of Db2-defined entities whose states are monitored and that can be started or stopped. For HADR, this includes the Db2 instance, HADR-capable databases, Ethernet network adapters, and virtual IP addresses. For Mutual Failover (MF) and Data Partitioning Feature (DPF), this includes partitions, mount points, Ethernet network adapters, and virtual IP addresses. Lastly, for pureScale, this includes Cluster Caching Facilities (CF), Db2 members, the primary CF, idle members, mount points, Ethernet network adapters, the Db2 instance, and the fence agent.
Resources that are concurrently active on multiple cluster domain nodes are called resource clones. Examples of resource clones include Storage Scale mount points and Ethernet network adapters.
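Once the cluster is up, these resources can be inspected with standard Pacemaker tooling. The following is a minimal sketch using the stock crm_resource and crm_mon commands; the resource names mentioned in the comments are illustrative, drawn from the HADR examples later in this section.
  # List every resource defined in the cluster (run as root on any cluster node)
  crm_resource --list
  # One-shot snapshot of the cluster, including resource and clone states
  # (for example, db2_db2inst1_db2inst1_HADRDB-clone and its ethmonitor clones)
  crm_mon -1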
Constraints
These are rules set up during cluster creation to govern resource behavior:
- Location constraints specify where resources can run and where they prefer to run.
- Ordering constraints specify the order in which certain resource actions must occur.
- Co-location constraints specify that one resource's location depends on that of another resource.
- Location constraints specify where resources can run and where they prefer to run:
- HADR: the following location constraint specifies that the instance resource db2_hadr-srv-1_db2inst1_0 prefers to run on the hadr-srv-1 host.
  <rsc_location id="pref-db2_hadr-srv-1_db2inst1_0-runsOn-hadr-srv-1" rsc="db2_hadr-srv-1_db2inst1_0" score="200" node="hadr-srv-1"/>
- pureScale: the following location constraints specify that the member resource db2_member_db2inst1_0 can run on both the ps-srv-3 host and the ps-srv-4 host, but prefers to run on the ps-srv-3 host. The higher score value represents greater location priority.
  <rsc_location id="pref-db2_member_db2inst1_0-runsOn-ps-srv-3" rsc="db2_member_db2inst1_0" resource-discovery="exclusive" score="INFINITY" node="ps-srv-3"/>
  <rsc_location id="pref-db2_member_db2inst1_0-runsOn-ps-srv-4" rsc="db2_member_db2inst1_0" resource-discovery="exclusive" score="100" node="ps-srv-4"/>
- Location constraints can also be conditional, specifying that a resource will run only depending on the state of another resource. Within these constraints, the expression attributes represent the resource state, where 1 means started and 0 means stopped. The commonly used negative INFINITY score means that the host must be avoided.
- HADR: the following location constraint specifies that the database resource will only run if the Ethernet network adapter eth0 is healthy.
  <rsc_location id="dep-db2_db2inst1_db2inst1_HADRDB-clone-dependsOn-db2_ethmonitor_hadr-srv-1_eth0" rsc="db2_db2inst1_db2inst1_HADRDB-clone">
    <rule score="-INFINITY" id="dep-db2_db2inst1_db2inst1_HADRDB-clone-dependsOn-db2_ethmonitor_hadr-srv-1_eth0-rule">
      <expression attribute="db2ethmon_cib" operation="eq" value="0" id="dep-db2_db2inst1_db2inst1_HADRDB-clone-dependsOn-db2_ethmonitor_hadr-srv-1_eth0-rule-expression"/>
    </rule>
  </rsc_location>
- pureScale: the following location constraint specifies that the primary CF resource cannot run on a host where there is no active CF (neither the db2_cf_db2inst1_128 nor the db2_cf_db2inst1_129 resource is running).
  <rsc_location id="dep-db2_cfprimary_db2inst1-dependsOn-cf" rsc="db2_cfprimary_db2inst1">
    <rule score="-INFINITY" id="dep-db2_cfprimary_db2inst1-dependsOn-cf-rule">
      <expression attribute="db2_cf_db2inst1_128_cib" operation="ne" value="1" id="dep-db2_cfprimary_db2inst1-dependsOn-cf-rule-expression"/>
      <expression attribute="db2_cf_db2inst1_129_cib" operation="ne" value="1" id="dep-db2_cfprimary_db2inst1-dependsOn-cf-rule-expression-0"/>
    </rule>
  </rsc_location>
- DPF example: the following location constraint specifies that partitions cannot run on a host where the public network is not healthy.
  <rsc_location id="dep-db2_partition_db2inst1-dependsOn-public_network" rsc-pattern="db2_partition_db2inst1_[0-9]{1,3}">
    <rule score="-INFINITY" id="dep-db2_partition_db2inst1-dependsOn-public_network-rule">
      <expression attribute="db2ethmon_cib" operation="eq" value="0" id="dep-db2_partition_db2inst1-dependsOn-public_network-rule-expression"/>
    </rule>
  </rsc_location>
- Ordering constraints ensure that the resources start in the correct order.
- HADR example: the following ordering constraint ensures that the database resource starts before the primary VIP resource does.
  <rsc_order id="order-rule-db2_db2inst1_db2inst1_HADRDB-then-primary-VIP" kind="Mandatory" first="db2_db2inst1_db2inst1_HADRDB-clone" first-action="start" then="db2_db2inst1_db2inst1_HADRDB-primary-VIP" then-action="start"/>
- pureScale example: the following ordering constraint ensures that the shared GPFS mount resource must start before the member resource db2_member_db2inst1_0 does.
  <rsc_order id="order-db2_mount_db2fs_sharedFS-clone-then-db2_member_db2inst1_0" kind="Mandatory" symmetrical="true" first="db2_mount_db2fs_sharedFS-clone" first-action="start" then="db2_member_db2inst1_0" then-action="start"/>
- Co-location constraints ensure that resources that need to run on the same host are concurrently active on the same host.
- HADR example: the following co-location constraint ensures that the primary VIP is running on the same host as the primary HADR database:
  <rsc_colocation id="db2_db2inst1_db2inst1_HADRDB-primary-VIP-colocation" score="INFINITY" rsc="db2_db2inst1_db2inst1_HADRDB-primary-VIP" rsc-role="Started" with-rsc="db2_db2inst1_db2inst1_HADRDB-clone" with-rsc-role="Master"/>
Resource set
A group of resources under the effect of a specific constraint. A constraint with a resource set is also ordered. For example, in a DPF configuration, the following co-location constraint ensures that the partition resources db2_partition_db2inst1_0, db2_partition_db2inst1_1, and db2_partition_db2inst1_2 all reside on the same host and are started in that order.
<rsc_colocation id="co-db2_partition_db2inst1_0-with-db2_partition_db2inst1_1-with-db2_partition_db2inst1_2-with-set" score="INFINITY">
<resource_set id="db2_partition_db2inst1_0-with-db2_partition_db2inst1_1-with-db2_partition_db2inst1_2-with-set">
<resource_ref id="db2_partition_db2inst1_0"/>
<resource_ref id="db2_partition_db2inst1_1"/>
<resource_ref id="db2_partition_db2inst1_2"/>
</resource_set>
</rsc_colocation>
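All of the constraints shown in this section are stored in the cluster information base (CIB). As a hedged illustration, the raw XML can be dumped with the stock cibadmin tool, or summarized with the pcs front end where it is installed:
  # Dump the constraints section of the CIB as XML (the same form as the snippets above)
  cibadmin --query --scope constraints
  # Human-readable summary of location, ordering, and co-location constraints
  pcs constraint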
Resource model
A Pacemaker resource model for Db2 refers to the predefined relationships and constraints of all resources. The resource model is created as part of the cluster setup using the db2cm utility with the -create option. Any deviation from or alteration of the model without approval from Db2 renders the model unsupported.
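For illustration only, the following sketch shows the general shape of the db2cm -create calls for an HADR configuration, reusing the host, instance, and database names from the earlier examples. The domain name hadom is an arbitrary placeholder, and the exact flags vary by Db2 release, so consult the db2cm command reference before running them.
  # Create the Pacemaker cluster domain across the two HADR hosts
  db2cm -create -cluster -domain hadom -host hadr-srv-1 -publicEthernet eth0 -host hadr-srv-2 -publicEthernet eth0
  # Create the instance resource model (one call per host)
  db2cm -create -instance db2inst1 -host hadr-srv-1
  db2cm -create -instance db2inst1 -host hadr-srv-2
  # Create the HADR database resource model
  db2cm -create -db HADRDB -instance db2inst1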
Resource agents
Resource agents in Pacemaker are the Db2 user exits: a set of shell scripts developed and supported by Db2 to perform actions on the resources defined in the resource model.
- HADR
  db2hadr
    The resource agent to monitor, start, and stop individual HADR-enabled databases. This is at the Db2 database level.
  db2inst
    The resource agent to monitor, start, and stop the Db2 member process. This is at the Db2 instance level.
  db2ethmon
    The resource agent to monitor the defined Ethernet network adapter. This is at the host level.
- Mutual Failover (MF) and Data Partitioning Feature (DPF)
  db2partition
    The resource agent to monitor, start, and stop the Db2 partition process. This is at the Db2 instance level.
  db2fs
    The resource agent to monitor, start, and stop a file system.
  db2ethmon
    The resource agent to monitor the defined Ethernet network adapter. This is at the host level.
- pureScale
  db2cf
    The resource agent to monitor, start, and stop cluster caching facilities (CF). This is at the Db2 instance level.
  db2member
    The resource agent to monitor, start, and stop Db2 members. This is at the Db2 instance level.
  db2cfprimary
    The resource agent to monitor the primary CF. This is at the Db2 instance level.
  db2idle
    The resource agent to monitor, start, and stop idle processes. This is at the host level.
  db2ethmonitor
    The resource agent to monitor the defined Ethernet network adapter. This is at the host level.
  db2instancehost
    The resource agent to monitor the Db2 instance. This is at the host level.
  db2mount_ss
    The resource agent to monitor the mount points. This is at the host level.
  db2fence_ps
    The resource agent to perform Db2 node fencing and un-fencing operations.
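Because resource agents are OCF scripts, their metadata (supported actions, parameters, and timeouts) can be inspected with Pacemaker's own tooling. A brief sketch, assuming the Db2 agents are registered under the ocf:heartbeat provider (verify the provider on your installation):
  # Show the OCF metadata for the HADR resource agent
  crm_resource --show-metadata ocf:heartbeat:db2hadr
  # List all agents shipped under the same provider
  crm_resource --list-agents ocf:heartbeat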
Cluster topology and communication layer
All HA cluster manager software must be able to ensure that each node has the same view of the cluster topology (or membership). Pacemaker uses the Corosync Cluster Engine, an open source group communication system, to provide a consistent view of cluster topology, to ensure a reliable messaging infrastructure so that events are executed in the same order on each node, and to apply quorum constraints.
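On a running cluster, Corosync's view of the membership and quorum can be checked with the utilities that ship with it; for example:
  # Show the local node's ring status and link health
  corosync-cfgtool -s
  # Show quorum state, expected votes, and current membership
  corosync-quorumtool -s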
Cluster domain leader
One of the nodes in the cluster is elected as the domain leader (known as the Designated Controller (DC) in Pacemaker terminology); the Pacemaker controller daemon residing on the DC assumes the role of making all cluster decisions. A new domain leader is elected if the current domain leader's host fails.
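The current domain leader is reported directly in the cluster status output. A quick illustration (the host name is taken from the earlier HADR examples, and the exact output format varies by Pacemaker version):
  # One-shot status; the "Current DC" line identifies the domain leader
  crm_mon -1 | grep "Current DC"
  # Illustrative output:
  #   Current DC: hadr-srv-1 (version 2.1.x) - partition with quorum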
For more information on Pacemaker internal components and their interactions, refer to the Pacemaker architecture.