Setting the hadr_timeout and hadr_peer_window database configuration parameters

You can configure the hadr_timeout and hadr_peer_window database configuration parameters for optimal response to a connection failure.

hadr_timeout database configuration parameter
If an HADR database does not receive any communication from its partner database for longer than the length of time that is specified by the hadr_timeout database configuration parameter, then the database concludes that the connection with the partner database is lost. If the database is in peer state when the connection is lost, then it moves into disconnected peer state if the hadr_peer_window database configuration parameter is greater than zero, or into remote catchup pending state if hadr_peer_window is not greater than zero. The state change applies to both primary and standby databases.
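The transition rule described for a database in peer state can be sketched as a small Python function. This is an illustrative model only, not Db2 code: the function name and the state strings are hypothetical labels for the states named in the text, and hadr_peer_window is taken to be a number of seconds.

```python
def state_after_connection_loss(hadr_peer_window: int) -> str:
    """Model the state that a database in peer state moves to when the
    HADR connection is lost.

    The function name and the returned labels are hypothetical; they
    only mirror the state names used in the text.
    """
    if hadr_peer_window < 0:
        raise ValueError("hadr_peer_window must not be negative")
    if hadr_peer_window > 0:
        # A nonzero peer window keeps the database in disconnected peer state.
        return "disconnected peer"
    # With no peer window, the database goes to remote catchup pending state.
    return "remote catchup pending"
```

The same rule applies whether the database is the primary or the standby, so the sketch takes no role argument.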
hadr_peer_window database configuration parameter
The hadr_peer_window configuration parameter does not replace the hadr_timeout configuration parameter. The hadr_timeout configuration parameter determines how long an HADR database waits before it considers its connection with the partner database to have failed. The hadr_peer_window configuration parameter determines whether the database goes into disconnected peer state after the connection is lost, and how long the database remains in that state. HADR breaks the connection as soon as a network error is detected during send, receive, or poll on the TCP socket. HADR polls the socket every 100 milliseconds, which allows it to respond quickly to network errors that the operating system detects. Only in the worst case does HADR wait until the timeout expires to break a bad connection. In this case, a database application that is running at the time of failure can be blocked for a time equal to the sum of the hadr_timeout and hadr_peer_window database configuration parameters.
Note: The HADR peer window is not supported in a Db2® pureScale® environment. Attempts to update it to a nonzero value fail with a warning, and the START HADR command fails if hadr_peer_window is not set to 0.
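The worst-case blocking time described for hadr_peer_window is simple arithmetic, which the following Python sketch makes explicit. The function name is hypothetical, both parameters are assumed to be in seconds, and the peer window of 300 seconds is an arbitrary example value rather than a recommendation.

```python
def worst_case_block_seconds(hadr_timeout: int, hadr_peer_window: int) -> int:
    """Upper bound, in seconds, on how long an application can be blocked
    when HADR has to wait for the full timeout before breaking the
    connection and then stays in disconnected peer state for the whole
    peer window.
    """
    if hadr_timeout < 0 or hadr_peer_window < 0:
        raise ValueError("parameters must not be negative")
    return hadr_timeout + hadr_peer_window


# 120 is the default hadr_timeout; 300 is an example peer window.
print(worst_case_block_seconds(120, 300))  # prints 420
```

In most failures the network error is detected well before the timeout expires, so the actual blocking time is usually closer to the peer window alone.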
Setting the hadr_timeout and hadr_peer_window database configuration parameters
It is desirable to minimize the time that a database application must wait. Setting the hadr_timeout and hadr_peer_window configuration parameters to small values reduces the time that a database application must wait if an HADR standby database loses its connection with the primary database. However, also consider the following details when you choose values for the hadr_timeout and hadr_peer_window configuration parameters:
  • Set the hadr_timeout database configuration parameter to a value that is long enough to avoid false alarms on the HADR connection that are caused by short, temporary network interruptions. For example, the default value of hadr_timeout is 120 seconds, which is a reasonable value on many networks.
  • Set the hadr_peer_window database configuration parameter to a value that is long enough to allow the system to perform automated failure responses. If the HA system, for example a cluster manager, detects the primary database failure before disconnected peer state ends, a failover to the standby database takes place. Data is not lost in the failover because all data from the old primary is replicated to the new primary. If the peer window is too short, the HA system might not have enough time to detect the failure and respond.
    Note: The principal standby uses the primary's setting for hadr_peer_window (the effective peer window). The setting for hadr_peer_window on any auxiliary standby is meaningless because that type of standby always runs in SUPERASYNC mode.
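One way to sanity-check a candidate peer window against the guidance above is to compare it with the time that the HA system needs to detect a failure and respond. The Python sketch below assumes that this time can be estimated as a detection interval plus a response interval; the function name and both interval parameters are hypothetical and must be measured for your own cluster manager.

```python
def peer_window_is_sufficient(hadr_peer_window: int,
                              detection_seconds: int,
                              response_seconds: int) -> bool:
    """Return True if the peer window (in seconds) is at least as long as
    the estimated time the HA system needs to detect a primary failure
    and start its response.

    detection_seconds and response_seconds are hypothetical estimates
    supplied by the caller; they are not Db2 configuration parameters.
    """
    return hadr_peer_window >= detection_seconds + response_seconds


# Example values only: a 300-second window against 60 + 30 seconds.
print(peer_window_is_sufficient(300, 60, 30))  # prints True
print(peer_window_is_sufficient(60, 60, 30))   # prints False
```

Because the principal standby uses the primary's hadr_peer_window setting, the value to check is the one that is configured on the primary.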