Pacemaker base component

In the integrated high availability (HA) solution with Pacemaker, the cluster software stack is composed of several components, all of which are required for Pacemaker to run effectively.

Important: In Db2® 11.5.8 and later, Mutual Failover high availability is supported when using Pacemaker as the integrated cluster manager. In Db2 11.5.6 and later, the Pacemaker cluster manager for automated failover to HADR standby databases is packaged and installed with Db2. In Db2 11.5.5, Pacemaker is included and available for production environments. In Db2 11.5.4, Pacemaker is included as a technology preview only, for development, test, and proof-of-concept environments.

Resources

A set of Db2-defined entities whose states are monitored and which can be started or stopped. For HADR, the resources include the Db2 instance, HADR-capable databases, Ethernet network adapters, and virtual IP addresses. For Mutual Failover (MF) and the Database Partitioning Feature (DPF), they include partitions, mount points, Ethernet network adapters, and virtual IP addresses. For pureScale, they include the cluster caching facilities (CFs), Db2 members, the primary CF, idle members, mount points, Ethernet network adapters, the Db2 instance, and the fence agent.

Resources that are concurrently active on multiple cluster domain nodes are called resource clones. Examples of resource clones include Storage Scale mount points and Ethernet network adapters.
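
For illustration, the following is a minimal sketch of how a cloned resource, such as an Ethernet network adapter monitor, can be expressed in the cluster information base (CIB). The IDs, the provider value, the parameter name, and the operation interval shown here are illustrative assumptions; the definitions that db2cm generates use their own names and settings.

<clone id="db2_ethmonitor_eth0-clone">
        <!-- The primitive names its resource agent through class, provider, and type -->
        <primitive id="db2_ethmonitor_eth0" class="ocf" provider="db2" type="db2ethmon">
          <instance_attributes id="db2_ethmonitor_eth0-attrs">
            <!-- Illustrative parameter identifying the adapter to monitor -->
            <nvpair id="db2_ethmonitor_eth0-interface" name="interface" value="eth0"/>
          </instance_attributes>
          <operations>
            <!-- Recurring health check that runs on every node hosting a copy of the clone -->
            <op id="db2_ethmonitor_eth0-monitor" name="monitor" interval="10s" timeout="30s"/>
          </operations>
        </primitive>
</clone>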

Constraints

These are rules set up during cluster creation to augment the behavior of resources:

  • Location constraints specify where resources can run and where they prefer to run.
  • Ordering constraints specify the order in which certain resource actions must occur.
  • Co-location constraints specify that the location of one resource depends on the location of another resource.
The following are examples of how these constraints function:
  • Location constraints specify where resources can run and where they prefer to run:
    • HADR: the following location constraint specifies that the instance resource db2_hadr-srv-1_db2inst1_0 prefers to run on the hadr-srv-1 host.
      <rsc_location id="pref-db2_hadr-srv-1_db2inst1_0-runsOn-hadr-srv-1" rsc="db2_hadr-srv-1_db2inst1_0" score="200" node="hadr-srv-1"/>
    • pureScale: the following location constraints specify that the member resource db2_member_db2inst1_0 can run on both the ps-srv-3 and ps-srv-4 hosts, but prefers to run on the ps-srv-3 host. The higher score value represents the greater location priority.
      <rsc_location id="pref-db2_member_db2inst1_0-runsOn-ps-srv-3" rsc="db2_member_db2inst1_0" resource-discovery="exclusive" score="INFINITY" node="ps-srv-3"/>
      <rsc_location id="pref-db2_member_db2inst1_0-runsOn-ps-srv-4" rsc="db2_member_db2inst1_0" resource-discovery="exclusive" score="100" node="ps-srv-4"/>
  • Location constraints can also be conditional, specifying that a resource runs only when another resource is in a given state. Within these constraints, the expression attribute represents the resource state, where 1 means started and 0 means stopped. The commonly used negative INFINITY score means that any host matching the rule must be avoided.
    • HADR: the following location constraint specifies that the database resource will only run if the Ethernet network adapter eth0 is healthy.
      <rsc_location id="dep-db2_db2inst1_db2inst1_HADRDB-clone-dependsOn-db2_ethmonitor_hadr-srv-1_eth0" rsc="db2_db2inst1_db2inst1_HADRDB-clone">
            <rule score="-INFINITY" id="dep-db2_db2inst1_db2inst1_HADRDB-clone-dependsOn-db2_ethmonitor_hadr-srv-1_eth0-rule">
              <expression attribute="db2ethmon_cib" operation="eq" value="0" id="dep-db2_db2inst1_db2inst1_HADRDB-clone-dependsOn-db2_ethmonitor_hadr-srv-1_eth0-rule-expression"/>
            </rule>
      </rsc_location>
      
    • pureScale: the following location constraint specifies that the primary CF resource cannot run on a host where there is no active CF (that is, where neither the db2_cf_db2inst1_128 resource nor the db2_cf_db2inst1_129 resource is running).
      <rsc_location id="dep-db2_cfprimary_db2inst1-dependsOn-cf" rsc="db2_cfprimary_db2inst1">
            <rule score="-INFINITY" id="dep-db2_cfprimary_db2inst1-dependsOn-cf-rule">
              <expression attribute="db2_cf_db2inst1_128_cib" operation="ne" value="1" id="dep-db2_cfprimary_db2inst1-dependsOn-cf-rule-expression"/>
              <expression attribute="db2_cf_db2inst1_129_cib" operation="ne" value="1" id="dep-db2_cfprimary_db2inst1-dependsOn-cf-rule-expression-0"/>
            </rule>
      </rsc_location>
    • DPF: the following location constraint specifies that partitions cannot run on a host where the public network is not healthy.
      <rsc_location id="dep-db2_partition_db2inst1-dependsOn-public_network" rsc-pattern="db2_partition_db2inst1_[0-9]{1,3}">
            <rule score="-INFINITY" id="dep-db2_partition_db2inst1-dependsOn-public_network-rule">
              <expression attribute="db2ethmon_cib" operation="eq" value="0" id="dep-db2_partition_db2inst1-dependsOn-public_network-rule-expression"/>
            </rule>
      </rsc_location>
  • Ordering constraints ensure that the resources start in the correct order.
    • HADR example: the following ordering constraint ensures that the database resource starts before the primary VIP resource does.
      <rsc_order id="order-rule-db2_db2inst1_db2inst1_HADRDB-then-primary-VIP" kind="Mandatory" first="db2_db2inst1_db2inst1_HADRDB-clone" first-action="start" then="db2_db2inst1_db2inst1_HADRDB-primary-VIP" then-action="start"/>
    • pureScale example: the following ordering constraint ensures that the shared GPFS mount resource must start before the member resource db2_member_db2inst1_0 does.
      <rsc_order id="order-db2_mount_db2fs_sharedFS-clone-then-db2_member_db2inst1_0" kind="Mandatory" symmetrical="true" first="db2_mount_db2fs_sharedFS-clone" first-action="start" then="db2_member_db2inst1_0" then-action="start"/>
  • Co-location constraints ensure that resources that need to be on the same host are concurrently active on the same host.
    • HADR example: the following co-location constraint ensures that the primary VIP is running on the same host as the primary HADR database:
      <rsc_colocation id="db2_db2inst1_db2inst1_HADRDB-primary-VIP-colocation" score="INFINITY" rsc="db2_db2inst1_db2inst1_HADRDB-primary-VIP" rsc-role="Started" with-rsc="db2_db2inst1_db2inst1_HADRDB-clone" with-rsc-role="Master"/>

Resource set

A group of resources under the effect of a specific constraint. The resources within a set are also ordered, so the constraint implies a sequence. For example, in a DPF configuration, the following co-location constraint ensures that the partition resources db2_partition_db2inst1_0, db2_partition_db2inst1_1, and db2_partition_db2inst1_2 all reside on the same host and are started in that order.

<rsc_colocation id="co-db2_partition_db2inst1_0-with-db2_partition_db2inst1_1-with-db2_partition_db2inst1_2-with-set" score="INFINITY">
        <resource_set id="db2_partition_db2inst1_0-with-db2_partition_db2inst1_1-with-db2_partition_db2inst1_2-with-set">
          <resource_ref id="db2_partition_db2inst1_0"/>
          <resource_ref id="db2_partition_db2inst1_1"/>
          <resource_ref id="db2_partition_db2inst1_2"/>
        </resource_set>
</rsc_colocation>

Resource model

A Pacemaker resource model for Db2 refers to the predefined relationships and constraints of all resources. The resource model is created as part of the cluster setup by using the db2cm utility with the -create option. Any deviation from or alteration of the model without approval from Db2 renders the model unsupported.
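
As a point of reference, and heavily abbreviated, the resource model corresponds to the resources and constraints sections of the CIB configuration. The comments in the sketch below only indicate what each section holds, not the exact content that the Db2 setup creates.

<cib>
        <configuration>
          <crm_config/>  <!-- cluster-wide properties -->
          <nodes/>       <!-- cluster domain nodes -->
          <resources>
            <!-- primitives and clones managed by the Db2 resource agents -->
          </resources>
          <constraints>
            <!-- location, ordering, and co-location rules such as the examples above -->
          </constraints>
        </configuration>
</cib>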

Resource agents

Resource agents in Pacemaker are the Db2 user exits: a set of shell scripts that are developed and supported by Db2 to perform actions on the resources defined in the resource model. A sketch of how an agent is referenced from the resource model follows the list below.

The resource agents provided are:
  • HADR

    • db2hadr

      The resource agent to monitor, start, and stop individual HADR-enabled databases. This is at the Db2 database level.
    • db2inst

      The resource agent to monitor, start, and stop the Db2 member process. This is at the Db2 instance level.
    • db2ethmon

      The resource agent to monitor the defined Ethernet network adapter. This is at host level.

  • Mutual Failover (MF) and Database Partitioning Feature (DPF)

    • db2partition

      The resource agent to monitor, start, and stop the Db2 partition process. This is at the Db2 instance level.
    • db2fs

      The resource agent to monitor, start, and stop a file system.
    • db2ethmon

      The resource agent to monitor the defined Ethernet network adapter. This is at host level.

  • pureScale

    • db2cf

      The resource agent to monitor, start, and stop cluster caching facilities (CF). This is at the Db2 instance level.
    • db2member

      The resource agent to monitor, start, and stop Db2 members. This is at the Db2 instance level.
    • db2cfprimary

      The resource agent to monitor the primary CF. This is at the Db2 instance level.
    • db2idle

      The resource agent to monitor, start, and stop idle processes. This is at host level.
    • db2ethmonitor

      The resource agent to monitor the defined Ethernet network adapter. This is at host level.
    • db2instancehost

      The resource agent to monitor the Db2 instance. This is at host level.
    • db2mount_ss

      The resource agent to monitor the mount points. This is at host level.
    • db2fence_ps

      The resource agent to perform Db2 node fencing and un-fencing operations.
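
As a rough sketch of how one of these agents ties into the resource model, each resource in the CIB is a primitive whose type names the agent that manages it. The hypothetical excerpt below shows an HADR database resource defined as a promotable clone of a db2hadr primitive, where the promoted copy corresponds to the HADR primary; the provider value, the meta attribute, and the parameter names are assumptions, not the exact definition that db2cm generates.

<clone id="db2_db2inst1_db2inst1_HADRDB-clone">
        <meta_attributes id="db2_db2inst1_db2inst1_HADRDB-clone-meta">
          <!-- Promotable clone: one copy is promoted to act as the HADR primary -->
          <nvpair id="db2_db2inst1_db2inst1_HADRDB-clone-promotable" name="promotable" value="true"/>
        </meta_attributes>
        <primitive id="db2_db2inst1_db2inst1_HADRDB" class="ocf" provider="db2" type="db2hadr">
          <instance_attributes id="db2_db2inst1_db2inst1_HADRDB-attrs">
            <!-- Illustrative parameters identifying the instance and database that the agent acts on -->
            <nvpair id="db2_db2inst1_db2inst1_HADRDB-instname" name="instname" value="db2inst1"/>
            <nvpair id="db2_db2inst1_db2inst1_HADRDB-dbname" name="dbname" value="HADRDB"/>
          </instance_attributes>
        </primitive>
</clone>

Constraints such as the ordering and co-location examples shown earlier then reference the clone ID db2_db2inst1_db2inst1_HADRDB-clone and its promoted (Master) role.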

Cluster topology and communication layer

All HA cluster manager software must be able to ensure that each node has the same view of the cluster topology (or membership). Pacemaker uses the Corosync Cluster Engine, an open-source group communication system, to provide a consistent view of the cluster topology, to ensure a reliable messaging infrastructure so that events are executed in the same order on each node, and to apply quorum constraints.

Cluster domain leader

One of the nodes in the cluster is elected as the domain leader, also known as the Designated Controller (DC) in Pacemaker terms. The Pacemaker controller daemon that resides on the DC assumes the role of making all cluster decisions. A new domain leader is elected if the current domain leader's host fails.

For more information on Pacemaker internal components and their interactions, refer to the Pacemaker architecture.