Values for member and cluster caching facility states and alerts

Many of the table functions, administrative views, and commands that you can use to query the status of components in a Db2® pureScale® environment return state and alert information. The state of a host, member or cluster caching facility (also known as a CF) reflects its operational status. An alert for a host, member or CF is an indication that a problem exists that might require investigation or intervention.

States for hosts, members and cluster caching facilities

State information is returned by many of the table functions, administrative views, and commands that you can use to query the status of the components of a Db2 pureScale environment. The possible values for the state of each component are shown in Table 1:
Table 1. Possible states for hosts, members and cluster caching facilities
Component Possible states Description
Host ACTIVE Host is available for use. This means that the host system is running and can respond to operating system or networking commands, such as the TCP/IP ping command.
INACTIVE Host is not available for use. This means that the host system is not running, not available or not responding to system commands. The reasons for being in this state can range from a power loss at the host to connection or networking issues.
cluster caching facility (CF) STOPPED CF has been manually stopped using the db2stop command as part of a normal shutdown by the administrator.
RESTARTING CF is in the process of starting, either from the db2start command, or after a CF failure.
BECOMING_PRIMARY Once a CF has started, it attempts to take on the role of the primary CF in the instance if no other CF already has this role.
PRIMARY The CF is operating normally as the primary CF.
CATCHUP (n%) When a secondary CF is initially started, it does not contain any information from the primary CF. During CATCHUP state, the secondary CF is in the process of obtaining a copy of all relevant information from the primary CF. This information enables it to assume the role of primary CF if the primary CF fails. n% indicates how far along the secondary CF is in the process of copying information from the primary CF. When this copying process is complete, the secondary CF moves into PEER state.
Note: When you view the status of the secondary CF using the command db2instance -list, it will be in CATCHUP state until a connection is made to the database. Once the first connection is made, the process of copying data from the primary CF begins.
PEER The secondary CF is ready to take over the responsibilities of the primary CF in the event of a primary CF failure. Duplexing continues while the secondary CF is in PEER state.
ERROR Db2 cluster services could not automatically restart the CF. When the CF is in the ERROR state for this reason, the ALERT field is always set to YES, indicating that investigation and intervention by the administrator are required. Once the CF is in the ERROR state, Db2 cluster services does not attempt to restart it again until the alert has been cleared.

The ERROR state can also occur if a connection to a CF cannot be established to query its state. In this case, the ALERT field is not set to YES because the problem might be temporary. Rerun the db2instance -list command for a couple of iterations with some delay in between to check if the CF state changes to a healthy state, or if the ALERT field changes to YES.

Member STARTED Member is started in the instance and operating normally. All databases are consistent and member is ready to accept or is already accepting connections to databases. If a member failed and started again, it is possible that the process model has started, but that crash recovery of the database is not yet complete. Use the LIST UTILITIES command with the SHOW DETAIL option from any member to monitor the recovery progress.
STOPPED Member has been manually stopped using the db2stop command as part of a normal shutdown by the administrator.
RESTARTING Member is in the process of starting or restarting. If the current host is the same as the home host, a local member restart is taking place. If the current host is different from the home host, the member has failed over to the current host and is restarting in light mode.
WAITING_FOR_FAILBACK The process model for this member has successfully restarted on the current host in light mode. The member is waiting for its home host to become available, at which point it will fail back to the home host. Use the LIST UTILITIES command with the SHOW DETAIL option from any active member to monitor recovery progress, and to see if crash recovery is complete for all databases. The member does not accept any new connections, nor does it process any transactions. Indoubt transactions might still exist.
ERROR Db2 cluster services could not automatically restart the member, either on its home host or on any other host in the Db2 pureScale instance. When the member is in the ERROR state, the ALERT field is always set to YES, indicating that investigation and intervention by the administrator are required. Once the member is in the ERROR state, Db2 cluster services does not attempt to restart it again until the alert has been cleared.
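
For scripts that consume status output, the states in Table 1 can be encoded as a small lookup. The following Python sketch is illustrative only. The names VALID_STATES, parse_state, and needs_attention are hypothetical and are not part of any Db2 API. It assumes that state strings appear as listed in Table 1 and that catchup progress is reported in the form "CATCHUP (n%)". Treating INACTIVE and ERROR as states that need attention is a choice made for this example, not a rule stated by the product.

```python
import re

# State names per component type, transcribed from Table 1.
VALID_STATES = {
    "HOST": {"ACTIVE", "INACTIVE"},
    "CF": {"STOPPED", "RESTARTING", "BECOMING_PRIMARY", "PRIMARY",
           "CATCHUP", "PEER", "ERROR"},
    "MEMBER": {"STARTED", "STOPPED", "RESTARTING",
               "WAITING_FOR_FAILBACK", "ERROR"},
}

# Matches "CATCHUP (42%)". The space before the parenthesis is optional
# because the exact spacing in command output is an assumption here.
_CATCHUP = re.compile(r"^CATCHUP\s*\((\d+)%\)$")


def parse_state(component, raw_state):
    """Return (state, catchup_percent); the percent is None unless reported."""
    text = raw_state.strip().upper()
    percent = None
    match = _CATCHUP.match(text)
    if match:
        text, percent = "CATCHUP", int(match.group(1))
    if text not in VALID_STATES[component.upper()]:
        raise ValueError(f"unknown {component} state: {raw_state!r}")
    return text, percent


def needs_attention(component, raw_state, alert):
    """True when ALERT is YES or the state is ERROR or INACTIVE."""
    state, _ = parse_state(component, raw_state)
    return alert.strip().upper() == "YES" or state in {"ERROR", "INACTIVE"}
```

With these definitions, needs_attention("CF", "CATCHUP (42%)", "NO") returns False, while any row whose ALERT value is YES returns True regardless of its state.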

Alerts for hosts, members and cluster caching facilities

In addition to returning state information, the commands that query the status of the components in a Db2 pureScale environment also return alert information. The alert value for every component is either YES or NO. An alert value of NO generally indicates that the component is running normally. An alert value of YES indicates that there is a problem that might require manual intervention. In some cases, the alert condition is temporary, and the alert field might clear itself, such as when a host is rebooted. In other cases, the alert field remains set until the administrator resolves the problem and manually resets the alert field using the db2cluster command with the -clear -alert options.
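
Because an ERROR state without an alert can be temporary, a monitoring script can re-check a component a few times before escalating, in line with the advice to rerun db2instance -list with some delay between iterations. The following Python sketch shows one way to structure that loop. The function wait_for_settled_status and its query argument are hypothetical: the caller is assumed to supply a function that returns the current (state, alert) pair for one component, for example by running db2instance -list and extracting the relevant row. The attempt count and delay are arbitrary illustration values.

```python
import time


def wait_for_settled_status(query, attempts=3, delay_seconds=10.0,
                            sleep=time.sleep):
    """Re-check one component until its status settles.

    query is a caller-supplied callable (hypothetical) returning
    a (state, alert) pair of strings.

    Returns:
      "alert"      - ALERT is YES; intervention is likely required
      "ok"         - no alert and the state is not ERROR
      "unresolved" - every attempt showed ERROR with no alert
    """
    for attempt in range(attempts):
        state, alert = query()
        if alert.strip().upper() == "YES":
            return "alert"
        if state.strip().upper() != "ERROR":
            return "ok"
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next check
    return "unresolved"
```

A result of "alert" corresponds to the case in which the administrator must resolve the problem and then clear the alert with the db2cluster command. A result of "unresolved" only means that the sketch gave up waiting, and the component should be checked again later.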