Interpretation of status information
When you query hosts, members or cluster caching facilities for status information, the system presents state and alert information that tells you about the status of the various components in your Db2® pureScale® environment. When problems arise, you generally need to examine both states and alerts to understand what is happening in the system.
The state of a host, member or cluster caching facility (also known as a CF) reflects its operational status. When everything is operating normally, the values reported for the state of hosts, members and CFs can give you a general idea of the status of your system. For example, a state of RESTARTING or WAITING_FOR_FAILBACK on a member does not in itself indicate that there is a problem. There might be several valid reasons why a member is failing over to a new host, or restarting on its home host, such as when hosts are taken offline for maintenance. However, if a member fails over frequently and repeatedly, there might be a problem that warrants further investigation.
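One way to notice repeated failovers is to poll a member's state at intervals and count how often it enters WAITING_FOR_FAILBACK. The following is a minimal sketch only: the function names, the polling approach, and the default threshold of 3 are assumptions made for illustration and are not part of any Db2 interface.

```python
def count_failovers(states):
    """Count how many times a member enters WAITING_FOR_FAILBACK.

    `states` is a chronological list of STATE values polled for one member
    (hypothetical input; how the values are collected is up to you).
    """
    count = 0
    previous = None
    for state in states:
        # Count only the transition into the state, not every poll spent in it.
        if state == "WAITING_FOR_FAILBACK" and previous != "WAITING_FOR_FAILBACK":
            count += 1
        previous = state
    return count


def failing_over_frequently(states, threshold=3):
    """Return True if the failover count reaches the (assumed) threshold."""
    return count_failovers(states) >= threshold


polled = [
    "STARTED", "WAITING_FOR_FAILBACK", "WAITING_FOR_FAILBACK",
    "STARTED", "RESTARTING", "STARTED", "WAITING_FOR_FAILBACK",
]
print(count_failovers(polled))
```

A suitable threshold and polling interval depend on your maintenance practices. A single failover during planned maintenance is expected behavior.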
An alert for a host, member or CF is an indication that a problem exists that might require investigation or intervention. Looking at alerts in the context of the state of a given system component can reveal additional information about the source of the problem. The sections that follow outline the various combinations of state and alert information that you might encounter for hosts, members or cluster caching facilities, and how to interpret different combinations of states and alerts.
The state and alert information that is returned depends on the following factors:
- The type of instance in which the table function, administrative view, or command is being run (for example, Db2 pureScale instances or other database instances)
- Whether a supported cluster manager is employed in that instance. All Db2 pureScale Feature deployments use a cluster manager.
Host status
You can view host states and alerts by using the DB2_CLUSTER_HOST_STATE administrative view. For example:
SELECT varchar(HOSTNAME,10) AS HOST,
varchar(STATE,8) AS STATE,
varchar(INSTANCE_STOPPED,7) AS STOPPED,
ALERT
FROM SYSIBMADM.DB2_CLUSTER_HOST_STATE
This query returns output similar to the following:
HOST       STATE    STOPPED ALERT
---------- -------- ------- --------
HOSTD      ACTIVE   NO      NO
HOSTB      ACTIVE   NO      NO
HOSTA      ACTIVE   YES     NO
HOSTC      ACTIVE   NO      NO
4 record(s) selected.
(In the preceding example, the STOPPED column corresponds to the INSTANCE_STOPPED column returned by the administrative view.)
STATE | INSTANCE_STOPPED | ALERT | Description |
---|---|---|---|
ACTIVE | NO | NO | The host is active and operating normally. |
ACTIVE | NO | YES | The host is active (that is, it responds to system commands); however, there might be a problem preventing it from participating in the Db2 pureScale instance. For example, there might be a file system problem or a network communication issue, or the idle processes that the Db2 pureScale Feature requires for performing failovers might not be running. |
ACTIVE | YES | NO | The host is active. The instance has been stopped explicitly on this host by the administrator using the db2stop instance on hostname command. |
ACTIVE | YES | YES | The host is active; however, an alert exists for the host that has not been cleared. The administrator has explicitly stopped the instance. |
INACTIVE | NO | NO | Not applicable. A host cannot be INACTIVE when both INSTANCE_STOPPED and ALERT are set to NO. |
INACTIVE | NO | YES | The host is not responding to system commands. The instance was not stopped explicitly by the administrator; however, there is an alert. This combination of status information indicates the abnormal shutdown of a host. Such a shutdown might arise, for example, from a power failure on a host. |
INACTIVE | YES | NO | Normal state when the instance has been stopped by the administrator. Such a combination of status information might arise when the host is being taken offline for the installation of software updates. |
INACTIVE | YES | YES | The host is not responding to system commands. An alert exists for the host that has not been cleared, but the instance was stopped explicitly by the administrator (that is, the system did not shut down abnormally). |
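The combinations in the preceding table can be expressed as a simple lookup keyed on the three column values. The sketch below is illustrative only: the function name and the short summary strings are paraphrases introduced here, not values returned by Db2.

```python
# Short paraphrases of the host status table; the wording is illustrative.
NOT_APPLICABLE = "not applicable or unknown combination"

HOST_INTERPRETATION = {
    ("ACTIVE", "NO", "NO"): "operating normally",
    ("ACTIVE", "NO", "YES"): "active, but might not be participating in the instance",
    ("ACTIVE", "YES", "NO"): "instance stopped explicitly by the administrator",
    ("ACTIVE", "YES", "YES"): "instance stopped explicitly, alert not cleared",
    ("INACTIVE", "NO", "YES"): "abnormal shutdown of the host",
    ("INACTIVE", "YES", "NO"): "host offline after an explicit stop",
    ("INACTIVE", "YES", "YES"): "host offline after an explicit stop, alert not cleared",
}


def interpret_host(state, instance_stopped, alert):
    """Map one (STATE, INSTANCE_STOPPED, ALERT) row to a short summary."""
    key = (
        state.strip().upper(),
        instance_stopped.strip().upper(),
        alert.strip().upper(),
    )
    # INACTIVE / NO / NO is deliberately absent: the table lists it as not applicable.
    return HOST_INTERPRETATION.get(key, NOT_APPLICABLE)


# Row for HOSTA from the example output: ACTIVE, instance stopped, no alert.
print(interpret_host("ACTIVE", "YES", "NO"))
```

Keeping the combinations in a dictionary rather than nested conditionals makes it easy to compare the code against the table row by row.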
Member status
You can view member states and alerts using several different interfaces. One such interface is the DB2_MEMBER administrative view, which shows status information for members in a Db2 pureScale instance. The following example uses this administrative view to retrieve member status:
SELECT ID,
varchar(STATE,21) AS STATE,
varchar(HOME_HOST,10) AS HOME_HOST,
varchar(CURRENT_HOST,10) AS CUR_HOST,
ALERT
FROM SYSIBMADM.DB2_MEMBER
STATE | ALERT | Description |
---|---|---|
STARTED | NO | The member is started in the instance and is operating normally. |
STARTED | YES | The member is started in the instance. However, at some point, there was an unsuccessful attempt to fail over to another host. Since that unsuccessful attempt, the member was able to fail over successfully to another host, or it has failed back to its home host. If the member is running on its home host, it is running normally; if it is running on a guest host, it is running in light mode. Either way, investigate the alert to determine what happened. |
STOPPED | NO | The member has been stopped by the administrator using the db2stop command. |
STOPPED | YES | The member has been stopped by the administrator using the db2stop command; however, the alert field has not yet been cleared. |
RESTARTING | NO | The member is starting. |
RESTARTING | YES | The member is starting. However, at some point, there was an unsuccessful attempt to start the member on the home host or to fail over to another host. The alert field has not yet been cleared. |
WAITING_FOR_FAILBACK | NO | The member is running in light mode on a guest host, and is waiting to fail back to the home host. You might want to examine the status of the home host to see if anything is preventing the member from failing back to the home host (for example, a failed network adapter). |
WAITING_FOR_FAILBACK | YES | An attempt to restart the member on the home host might have failed, automatic failback is disabled, or crash recovery might have failed. You need to resolve the problem and clear the alert manually before the member can automatically fail back to its home host. If automatic failback is disabled, manually clear the alert and enable automatic failback using the db2cluster command. |
ERROR | YES | Db2 cluster services was not able to start the member on any host. You need to resolve the problem and clear the alert manually before attempting to restart the instance. |
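When many members are involved, it can help to sort them by the kind of follow-up the preceding table calls for. The sketch below groups members into three buckets. The bucket names and the function are hypothetical, and the input is assumed to be (ID, STATE, ALERT) rows that you have already retrieved.

```python
# States for which the table says the alert must be cleared manually.
MANUAL_CLEAR_STATES = {"WAITING_FOR_FAILBACK", "ERROR"}


def triage_members(rows):
    """Group member IDs by follow-up category.

    `rows` is an iterable of (member_id, state, alert) tuples.
    Bucket names are illustrative, not Db2 terminology.
    """
    result = {"no_alert": [], "investigate": [], "manual_clear": []}
    for member_id, state, alert in rows:
        state = state.strip().upper()
        alert = alert.strip().upper()
        if alert != "YES":
            bucket = "no_alert"
        elif state in MANUAL_CLEAR_STATES:
            bucket = "manual_clear"
        else:
            # STARTED, STOPPED or RESTARTING with an alert that is still set.
            bucket = "investigate"
        result[bucket].append(member_id)
    return result


sample_rows = [
    (0, "STARTED", "NO"),
    (1, "WAITING_FOR_FAILBACK", "YES"),
    (2, "RESTARTING", "YES"),
    (3, "ERROR", "YES"),
]
print(triage_members(sample_rows))
```

A member in the "no_alert" bucket is not necessarily healthy. For example, a member in WAITING_FOR_FAILBACK with no alert might still warrant a look at its home host.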
Cluster caching facility status
You can view CF states and alerts by using the DB2_GET_INSTANCE_INFO table function. For example:
SELECT ID,
varchar(STATE,17) AS STATE,
varchar(HOME_HOST,10) AS HOME_HOST,
varchar(CURRENT_HOST,10) AS CUR_HOST,
ALERT
FROM TABLE(DB2_GET_INSTANCE_INFO(NULL,'','','CF',NULL))
STATE | ALERT | Description |
---|---|---|
STOPPED | NO | The cluster caching facility (also known as a CF) has been manually stopped using the db2stop command. |
STOPPED | YES | There has been an unsuccessful attempt by the CF to become the primary CF. The cluster caching facility has been manually stopped in the instance by the administrator using the db2stop command. |
RESTARTING | NO | The CF is restarting, either as a result of the db2start command, or after a primary CF failure. |
RESTARTING | YES | The CF is restarting, however, there is a pending alert from a previous failed attempt by the CF to take on the primary role that must be cleared manually. |
BECOMING_PRIMARY | NO | The CF will take on the role of primary CF if there is no other primary CF already running in the instance. |
BECOMING_PRIMARY | YES | Not applicable. The CF cannot attempt to take on the primary role with an alert condition set. |
PRIMARY | NO | The CF has taken on the role of primary CF and is operating normally. |
PRIMARY | YES | Not applicable. The CF cannot be acting as the primary CF with an alert condition set. |
CATCHUP(n%) | NO | This secondary CF is in the process of copying information from the primary CF required for it to operate in PEER mode. Note: When you view the status of the secondary CF using the command db2instance -list, it will be in CATCHUP state until a connection is made to the database. Once the first connection is made, the process of copying data from the primary CF begins. |
CATCHUP(n%) | YES | This secondary CF is in the process of copying information from the primary CF required for it to operate in PEER mode. There is a pending alert from a previous failed attempt by this CF to take on the primary role that must be cleared manually. |
PEER | NO | This secondary CF is ready to assume the role of primary CF if the current primary CF fails. |
PEER | YES | This secondary CF is ready to assume the role of primary CF if the current primary CF fails. There is a pending alert from a previous failed attempt by this CF to take on the primary role that must be cleared manually. |
ERROR | YES | The CF could not be started on any host in the instance. You need to resolve the problem and clear the alert manually before attempting to restart the instance. |
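Because the CATCHUP state carries a percentage, scripts that inspect CF status usually need to separate the state name from the progress value. The following sketch assumes the state string has the form shown in the preceding table, for example CATCHUP(50%). The function names are hypothetical.

```python
import re

# Matches values such as "CATCHUP(50%)"; the format is taken from the table above.
_CATCHUP_PATTERN = re.compile(r"^CATCHUP\((\d+)%\)$")


def parse_cf_state(state):
    """Return (base_state, catchup_percent), where the percent may be None."""
    text = state.strip().upper()
    match = _CATCHUP_PATTERN.match(text)
    if match:
        return "CATCHUP", int(match.group(1))
    return text, None


def secondary_ready(state):
    """A secondary CF can take over as primary only once it reaches PEER."""
    base_state, _ = parse_cf_state(state)
    return base_state == "PEER"


for value in ("CATCHUP(50%)", "PEER", "PRIMARY"):
    print(value, parse_cf_state(value), secondary_ready(value))
```

Note that `secondary_ready` looks only at the state. A CF in PEER state can still have a pending alert that must be cleared manually.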
Differences in reporting for data-sharing and environments other than Db2 pureScale environments
All the various table functions, administrative views and commands that report status data for hosts, members and cluster caching facilities can be used outside of a Db2 pureScale instance. However, the results returned by these interfaces might be different from what you see in a Db2 pureScale instance.
HOSTNAME STATE  INSTANCE_STOPPED ALERT
-------- ------ ---------------- -----
HOSTA    ACTIVE -                NO
HOSTB    ACTIVE -                NO
HOSTC    ACTIVE -                NO
HOSTD    ACTIVE -                NO
ID     CURRENT_HOST        STATE      ALERT
------ ------------------- ---------- -----
0 record(s) selected.
HOSTNAME STATE  INSTANCE_STOPPED ALERT
-------- ------ ---------------- -----
HOSTA    -      -                -
HOSTB    -      -                -
HOSTC    -      -                -
HOSTD    -      -                -
4 record(s) selected.
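Scripts that consume output like the preceding examples need to cope with columns that contain a dash instead of a value. The sketch below parses such text into dictionaries and maps "-" to None. It assumes whitespace-separated columns with no embedded spaces and a dashed separator on the second line. The function name is hypothetical.

```python
def parse_status_output(text):
    """Parse command-line style output into a list of row dictionaries.

    Assumes: first non-blank line is the header, second is the dashed
    separator, and a "-" cell means that no value was reported.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[2:]:
        if "record(s) selected" in line:
            break  # trailer line, not a data row
        values = [None if cell == "-" else cell for cell in line.split()]
        rows.append(dict(zip(header, values)))
    return rows


sample = """\
HOSTNAME STATE  INSTANCE_STOPPED ALERT
-------- ------ ---------------- -----
HOSTA    ACTIVE -                NO
HOSTB    -      -                -
2 record(s) selected.
"""

for row in parse_status_output(sample):
    print(row)
```

Treating a dash as a missing value rather than as a state name prevents code that compares against ACTIVE or YES from misreading an unreported column.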