Viewing status information for members and cluster caching facilities in a Db2 pureScale instance
You can view details about the operational status of members and cluster caching facilities (also known as CFs) in a Db2 pureScale instance, such as the role played by each CF (for example, primary or peer) and whether members have failed over to a different host.
About this task
Procedure
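To view the status of members and cluster caching facilities, run the following command:
db2instance -list
The command returns output similar to the following: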
ID TYPE STATE HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
-- ---- ----- --------- ------------ ----- ---------------- ------------ ---------
0 MEMBER STARTED HOSTA HOSTA NO 0 0 HOSTA-ib0
1 MEMBER STARTED HOSTB HOSTB NO 0 0 HOSTB-ib0
2 MEMBER STARTED HOSTC HOSTC NO 0 0 HOSTC-ib0
128 CF PRIMARY HOSTD HOSTD NO - 0 HOSTD-ib0
129 CF PEER HOSTE HOSTE NO - 0 HOSTE-ib0
HOSTNAME STATE INSTANCE_STOPPED ALERT
-------- ----- ---------------- -----
HOSTA ACTIVE NO NO
HOSTC ACTIVE NO NO
HOSTD ACTIVE NO NO
HOSTE ACTIVE NO NO
HOSTB ACTIVE NO NO
Results
- Hosts HOSTA through HOSTC are configured as members. Each member is started and is running on its own home host.
- There are two cluster caching facilities (CFs) running, on hosts HOSTD and HOSTE.
- The primary CF is running on HOSTD; another CF is running on HOSTE in peer mode, indicating that it is ready to take over the responsibilities of the primary CF in the event of a primary CF failure.
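If you want to check these results in a script rather than by eye, the report can be split into its two tables. The following is a minimal Python sketch, not part of Db2: the function name parse_db2instance_list is hypothetical, and the sketch assumes that the report text has already been captured into a string and that no field contains embedded spaces.

```python
def parse_db2instance_list(text):
    """Return one list of row dicts per table found in the report text."""

    def is_separator(fields):
        # A separator line consists only of runs of dashes, e.g. "-- ---- -----".
        return bool(fields) and all(set(f) == {"-"} for f in fields)

    lines = [line.split() for line in text.splitlines()]
    tables = []
    header = None
    for i, fields in enumerate(lines):
        if not fields or is_separator(fields):
            continue
        next_fields = lines[i + 1] if i + 1 < len(lines) else []
        if is_separator(next_fields):
            # A line directly above a separator line is a table header.
            header = fields
            tables.append([])
        elif header is not None and len(fields) == len(header):
            # Assumption: a data row has exactly as many fields as its header.
            tables[-1].append(dict(zip(header, fields)))
    return tables


# Sample text copied from the db2instance -list output shown in this topic.
SAMPLE = """\
ID TYPE STATE HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
-- ---- ----- --------- ------------ ----- ---------------- ------------ ---------
0 MEMBER STARTED HOSTA HOSTA NO 0 0 HOSTA-ib0
1 MEMBER STARTED HOSTB HOSTB NO 0 0 HOSTB-ib0
2 MEMBER STARTED HOSTC HOSTC NO 0 0 HOSTC-ib0
128 CF PRIMARY HOSTD HOSTD NO - 0 HOSTD-ib0
129 CF PEER HOSTE HOSTE NO - 0 HOSTE-ib0
HOSTNAME STATE INSTANCE_STOPPED ALERT
-------- ----- ---------------- -----
HOSTA ACTIVE NO NO
HOSTC ACTIVE NO NO
HOSTD ACTIVE NO NO
HOSTE ACTIVE NO NO
HOSTB ACTIVE NO NO
"""

instances, hosts = parse_db2instance_list(SAMPLE)
primary = [row["CURRENT_HOST"] for row in instances if row["STATE"] == "PRIMARY"]
print(primary)  # ['HOSTD']
```

Because rows are matched to a header by field count, trailing message text in the report is normally skipped, but a message line that happens to have the same number of words as the host table header would be misread as a row. Treat the sketch as a starting point only.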
Example
You can also retrieve status information for members and cluster caching facilities using the following interfaces:
- DB2_MEMBER or DB2_CF administrative view
- DB2_GET_INSTANCE_INFO table function
- LIST INSTANCE command-line processor (CLP) command
- db2cluster system command.
The examples that follow illustrate the use of some of these interfaces.
- Example 1: Retrieving status information using the DB2_MEMBER administrative view
- The DB2_MEMBER administrative view shows status information for members in a Db2 pureScale instance. What follows is an example of how to use this administrative view to retrieve member status:
SELECT ID, varchar(STATE,21) AS STATE, varchar(HOME_HOST,10) AS HOME_HOST, varchar(CURRENT_HOST,10) AS CUR_HOST, ALERT FROM SYSIBMADM.DB2_MEMBER
Results:
ID     STATE                 HOME_HOST  CUR_HOST   ALERT
------ --------------------- ---------- ---------- --------
     0 WAITING_FOR_FAILBACK  HOSTA      HOSTB      NO
     1 STARTED               HOSTB      HOSTB      NO
     2 STARTED               HOSTC      HOSTC      NO

  3 record(s) selected.
In this example, member 0 has failed on its home host and has failed over to HOSTB. Member 0 is waiting to fail back to its home host, HOSTA.
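The same observation can be made programmatically by comparing the home host and current host of each row. The following Python sketch assumes that the result set has already been fetched into a list of dictionaries keyed by the column aliases used in the query; the function name members_away_from_home is hypothetical and is not a Db2 interface.

```python
def members_away_from_home(rows):
    """Return (id, home_host, current_host) for members not on their home host."""
    return [
        (row["ID"], row["HOME_HOST"], row["CUR_HOST"])
        for row in rows
        if row["HOME_HOST"] != row["CUR_HOST"]
    ]


# Rows copied from the Example 1 result set.
rows = [
    {"ID": 0, "STATE": "WAITING_FOR_FAILBACK", "HOME_HOST": "HOSTA", "CUR_HOST": "HOSTB", "ALERT": "NO"},
    {"ID": 1, "STATE": "STARTED", "HOME_HOST": "HOSTB", "CUR_HOST": "HOSTB", "ALERT": "NO"},
    {"ID": 2, "STATE": "STARTED", "HOME_HOST": "HOSTC", "CUR_HOST": "HOSTC", "ALERT": "NO"},
]

print(members_away_from_home(rows))  # [(0, 'HOSTA', 'HOSTB')]
```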
- Example 2: Retrieving status information for CFs using the DB2_GET_INSTANCE_INFO table function
- The DB2_GET_INSTANCE_INFO table function lets you retrieve status information for members and cluster caching facilities in a Db2 pureScale instance. One benefit of the table function is that you can pass parameters to it to narrow the scope of the results returned. For example, to retrieve information about only the CFs in a Db2 pureScale instance, you can construct a query such as:
SELECT ID, varchar(STATE,17) AS STATE, varchar(HOME_HOST,10) AS HOME_HOST, varchar(CURRENT_HOST,10) AS CUR_HOST, ALERT FROM TABLE(DB2_GET_INSTANCE_INFO(NULL,'','','CF',NULL))
Results:
ID     STATE            HOME_HOST  CUR_HOST   ALERT
------ ---------------- ---------- ---------- --------
   128 RESTARTING       HOSTD      HOSTD      NO
   129 BECOMING_PRIMARY HOSTE      HOSTE      NO

  2 record(s) selected.
In this example, the CF with the ID of 128 has failed and is restarting. The CF with the ID of 129 is in the process of taking over as the primary CF.
- Example 3: Investigating alerts reported with the db2instance -list command
- In this example, the results of running the db2instance -list command are as follows:
$ db2instance -list
ID  TYPE   STATE   HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
--  ----   -----   --------- ------------ ----- ---------------- ------------ -------
0   MEMBER STARTED HostA     HostA        NO    0                0            -
1   MEMBER STARTED HostB     HostB        NO    0                1            -
2   MEMBER STARTED HostC     HostC        NO    0                2            -
128 CF     ERROR   HostD     HostD        YES   -                0            -
129 CF     ERROR   HostE     HostE        YES   -                0            -

HOSTNAME STATE  INSTANCE_STOPPED ALERT
-------- -----  ---------------- -----
HostA    ACTIVE NO               NO
HostB    ACTIVE NO               NO
HostC    ACTIVE NO               NO
HostD    ACTIVE NO               YES
HostE    ACTIVE NO               YES

There is currently an alert for a member, CF, or host in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -cm -list -alert'
In this example, there are alerts for both cluster caching facilities in the instance, and the state of each CF appears as ERROR. As the message at the end of the report suggests, you can use the db2cluster command with the -cm -list -alert options to view more information about the alerts:
$ db2cluster -cm -list -alert
Alert: CF '128' failed to start the PRIMARY role on host 'HostD'. Check the cadiag*.log for failures related to CF '128' for more information.
Action: This alert must be cleared manually with the command: 'db2cluster -cm -clear -alert'.
Impact: CF '128' on host 'HostD' will be unavailable to service requests from Db2 members until the alert is cleared.
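Each alert in this report has three labeled parts: Alert, Action, and Impact. If you collect alerts in a script, those parts can be separated with a regular expression. The following Python sketch is illustrative only: the function name parse_alerts is hypothetical, and the sketch assumes that the report text is already in a string and that every alert carries all three labels in the order shown above.

```python
import re

# One alert: the three labeled parts, up to the next "Alert:" label or end of text.
ALERT_PATTERN = re.compile(
    r"Alert:\s*(.*?)\s*Action:\s*(.*?)\s*Impact:\s*(.*?)\s*(?=Alert:|\Z)",
    re.DOTALL,
)


def parse_alerts(text):
    """Return a list of dicts with 'alert', 'action', and 'impact' keys."""
    alerts = []
    for match in ALERT_PATTERN.finditer(text):
        # Collapse line breaks and repeated spaces inside each part.
        parts = [" ".join(part.split()) for part in match.groups()]
        alerts.append(dict(zip(("alert", "action", "impact"), parts)))
    return alerts


# Sample text copied from the db2cluster -cm -list -alert output in Example 3.
SAMPLE_ALERTS = """\
Alert: CF '128' failed to start the PRIMARY role on host 'HostD'. Check the cadiag*.log for failures related to CF '128' for more information.
Action: This alert must be cleared manually with the command: 'db2cluster -cm -clear -alert'.
Impact: CF '128' on host 'HostD' will be unavailable to service requests from Db2 members until the alert is cleared.
"""

alerts = parse_alerts(SAMPLE_ALERTS)
for entry in alerts:
    print(entry["action"])
```

Keeping the action text separate makes it straightforward to distinguish alerts that must be cleared manually from alerts that clear themselves, as the later examples in this topic show.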
- Example 4: Member alert, a member failed to start in the Db2 pureScale instance
- In this example, the results of running the db2instance -list command are as follows:
$ db2instance -list
ID  TYPE   STATE   HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
--  ----   -----   --------- ------------ ----- ---------------- ------------ -------
0   MEMBER ERROR   HostA     HostA        YES   0                0            -
1   MEMBER STARTED HostB     HostB        NO    0                1            -
2   MEMBER STARTED HostC     HostC        NO    0                2            -
128 CF     PRIMARY HostD     HostD        NO    -                0            -
129 CF     PEER    HostE     HostE        NO    -                0            -

HOSTNAME STATE  INSTANCE_STOPPED ALERT
-------- -----  ---------------- -----
HostA    ACTIVE NO               NO
HostB    ACTIVE NO               NO
HostC    ACTIVE NO               NO
HostD    ACTIVE NO               NO
HostE    ACTIVE NO               NO

There is currently an alert for a member, CF, or host in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -cm -list -alert'
In this example, a member failed to start in the Db2 pureScale instance. Running the db2cluster command with the -cm -list -alert options recommends an action to take and outlines the impact of this failure in the Db2 pureScale instance.
$ db2cluster -cm -list -alert
Alert: Db2 member '0' failed to start on its home host 'HostA'. The cluster manager will attempt to restart the Db2 member in restart light mode on another host. Check the db2diag.log for messages concerning failures on host 'HostA' for member '0'.
Action: This alert must be cleared manually with the command: 'db2cluster -cm -clear -alert'.
Impact: Db2 member '0' will not be able to service requests until this alert has been cleared and the Db2 member returns to its home host.
- Example 5: CF error, the secondary CF failed the CATCHUP phase
- In this example, the results of running the db2instance -list command are as follows:
$ db2instance -list
ID  TYPE   STATE   HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
--  ----   -----   --------- ------------ ----- ---------------- ------------ -------
0   MEMBER STARTED HostA     HostA        NO    0                0            -
1   MEMBER STARTED HostB     HostB        NO    0                1            -
2   MEMBER STARTED HostC     HostC        NO    0                2            -
128 CF     PRIMARY HostD     HostD        NO    -                0            -
129 CF     ERROR   HostE     HostE        YES   -                0            -

HOSTNAME STATE  INSTANCE_STOPPED ALERT
-------- -----  ---------------- -----
HostA    ACTIVE NO               NO
HostB    ACTIVE NO               NO
HostC    ACTIVE NO               NO
HostD    ACTIVE NO               NO
HostE    ACTIVE NO               NO

There is currently an alert for a member, CF, or host in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -cm -list -alert'
In this example, the secondary CF failed the CATCHUP phase. Running the db2cluster command with the -cm -list -alert options recommends an action to take and outlines the impact of this failure in the Db2 pureScale instance.
$ db2cluster -cm -list -alert
Alert: CF '129' failed to complete CATCHUP on host 'HostE'. Check the db2diag.log for failure messages pertaining to CATCHUP on CF '129'.
Action: Contact IBM support to determine the reason for the failure. To re-attempt CATCHUP, restart the failed CF with the commands: 'db2stop 129; db2start 129'. This alert will clear itself when the CF is restarted.
Impact: CF '129' on host 'HostE' will not be available until it can undergo CATCHUP successfully.
- Example 6: Host alert, the host "HostA" lost a network connection
- In this example, the results of running the db2instance -list command are as follows:
$ db2instance -list
ID  TYPE   STATE                HOME_HOST CURRENT_HOST ALERT PARTITION_NUMBER LOGICAL_PORT NETNAME
--  ----   -----                --------- ------------ ----- ---------------- ------------ -------
0   MEMBER WAITING_FOR_FAILBACK HostA     HostA        NO    0                0            -
1   MEMBER STARTED              HostB     HostB        NO    0                1            -
2   MEMBER STARTED              HostC     HostC        NO    0                2            -
128 CF     PRIMARY              HostD     HostD        NO    -                0            -
129 CF     PEER                 HostE     HostE        NO    -                0            -

HOSTNAME STATE    INSTANCE_STOPPED ALERT
-------- -----    ---------------- -----
HostA    INACTIVE NO               YES
HostB    ACTIVE   NO               NO
HostC    ACTIVE   NO               NO
HostD    ACTIVE   NO               NO
HostE    ACTIVE   NO               NO

There is currently an alert for a member, CF, or host in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -cm -list -alert'.
In this example, the host "HostA" lost a network connection. Running the db2cluster command with the -cm -list -alert options recommends an action to take and outlines the impact of this failure in the Db2 pureScale instance.
$ db2cluster -cm -list -alert
Alert: Host 'HostA' is INACTIVE. Ensure the host is powered on and connected to the network.
Action: This alert will clear itself when the host is ACTIVE.
Impact: While the host is INACTIVE, the Db2 members on this host will be in restart light mode on other hosts and will be in the WAITING_FOR_FAILBACK state. Any CF defined on the host will not be able to start, and the host will not be available as a target for restart light.