You can use database configuration parameters to help achieve optimal performance with DB2® HADR.
Most database configuration parameters are not replicated by HADR; you set them independently on the primary and standby databases. The logfilsiz database configuration parameter is an exception to this behavior. Although its value is still not replicated to the standby database, the setting for the logfilsiz configuration parameter on the standby is ignored, to help ensure that the log files on both databases are identical. Instead, the standby database creates local log files whose sizes match the size of the log files on the primary database.
After a takeover, the original standby (new primary) uses the logfilsiz parameter value that you set on the original primary until you restart the database. At that point, the new primary reverts to using the value that you set locally. In addition, the current log file is truncated and any pre-created log files are resized on the new primary.
If the databases keep switching roles as a result of a non-forced takeover and neither database is deactivated, the log file size that is used is always the one from the original primary database. However, if there is a deactivation and then a restart on the original standby (new primary), the new primary uses the log file size that you configured locally. This log file size continues to be used if the original primary takes over again. Only after a deactivation and restart on the original primary would the log file size revert to the settings on the original primary.
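As a sketch of the behavior described above, you would change the log file size on the primary only; the database name and value below are illustrative assumptions:

```shell
# Sketch: update the log file size on the primary database.
# hadrdb and the value 4096 (4-KB pages) are illustrative.
db2 "UPDATE DB CFG FOR hadrdb USING LOGFILSIZ 4096"

# The standby ignores its own logfilsiz setting and creates local log
# files that match the primary's log file size.
```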
If you issue the LOAD command on the primary database with the COPY YES parameter, the command executes on the primary database, and the data is replicated to the standby database if the load copy can be accessed through the path or device that is specified by the command. If load copy data cannot be accessed from the standby database, the table space in which the table is stored is marked invalid on the standby database. Any future log records that pertain to this table space are skipped. To ensure that the load operation can access the load copy on the standby database, use a shared location for the output file from the COPY YES parameter. Alternatively, you can deactivate the standby database while performing the load on the primary, place a copy of the output file in the standby path, and then activate the standby database.
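A minimal sketch of a load with a copy written to a shared location that the standby can also read; the table, input file, and path names are illustrative assumptions:

```shell
# Sketch: load on the primary, writing the load copy to shared storage
# so that the standby can replay it. All names are illustrative.
db2 "LOAD FROM /data/staff.del OF DEL INSERT INTO staff COPY YES TO /sharedfs/loadcopy"
```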
If you issue the LOAD command with the NONRECOVERABLE parameter on the primary database, the command executes on the primary database, and the table on the standby database is marked invalid. Any future log records that pertain to this table are skipped. You can issue the LOAD command with the COPY YES and REPLACE parameters to bring the table back, or you can drop the table to recover the space.
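To recover a table that was invalidated on the standby by a NONRECOVERABLE load, you can reload it with COPY YES and REPLACE, as sketched below with illustrative names:

```shell
# Sketch: bring back a table that was marked invalid on the standby by
# reloading it with a replicable load copy. Names are illustrative.
db2 "LOAD FROM /data/staff.del OF DEL REPLACE INTO staff COPY YES TO /sharedfs/loadcopy"

# Alternatively, drop the table on the primary to recover the space:
# db2 "DROP TABLE staff"
```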
Because a load operation with the COPY NO parameter is not supported with HADR, the operation is automatically converted to a load operation with the NONRECOVERABLE parameter. To enable a load operation with the COPY NO parameter to be converted to a load operation with the COPY YES parameter, set the DB2_LOAD_COPY_NO_OVERRIDE registry variable on the primary database. This registry variable is ignored on the standby database. Ensure that the device or directory that you specify for the primary database can be accessed by the standby database by using the same path, device, or load library.
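A sketch of setting this registry variable on the primary; the shared path is an illustrative assumption that must be accessible from both databases:

```shell
# Sketch: convert COPY NO loads into COPY YES loads on the primary.
# /sharedfs/loadcopy is an illustrative shared location.
db2set DB2_LOAD_COPY_NO_OVERRIDE="COPY YES TO /sharedfs/loadcopy"
```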
If you are using the Tivoli® Storage Manager (TSM) software to perform a load operation with the COPY YES parameter, you might have to set the vendoropt configuration parameter on the primary and standby databases. Depending on how you configured TSM, the values on the primary and standby databases might not be the same. Also, when using TSM to perform a load operation with the COPY YES parameter, you must issue the db2adutl command with the GRANT parameter to give the standby database read access to the files that are loaded.
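A sketch of granting read access to the standby with db2adutl; the user, node, and database names are illustrative assumptions:

```shell
# Sketch: grant the standby instance owner read access to the
# TSM-stored load copy files. All names are illustrative.
db2adutl grant user db2inst2 on nodename host2 for db hadrdb
```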
Honoring the peer window transition when a database breaks out of peer state ensures peer window semantics for safe takeover in all cases. If the primary fails during the transition, normal peer window protection still applies: a safe takeover can be performed from the standby if the standby is still in disconnected peer state.
On the standby side, after a disconnection, the database continues replaying the logs that it has already received. After replaying the received logs, the standby reconnects to the primary. Upon reconnection, the normal state transitions follow: first remote catchup state, then peer state.
The hadr_timeout database configuration parameter does not break the primary out of peer state if the primary keeps receiving heartbeat messages from the standby while blocked. The hadr_timeout database configuration parameter specifies a timeout value for the HADR network layer. An HADR database breaks the connection to its partner database if it has not received any message from its partner for the period that is specified by the hadr_timeout configuration parameter. The timeout does not control timeout for higher-layer operations such as log shipping and ack (acknowledgement) signals. If log replay on the standby database is stuck on a large operation such as load or reorganization, the HADR component still sends heartbeat messages to the primary database on the normal schedule. In such a scenario, the primary is blocked as long as the standby replay is blocked unless you set the DB2_HADR_PEER_WAIT_LIMIT registry variable.
The DB2_HADR_PEER_WAIT_LIMIT registry variable unblocks primary logging regardless of connection status. Even if you do not set the DB2_HADR_PEER_WAIT_LIMIT registry variable, the primary always breaks out of peer state when a network error is detected or the connection is closed, possibly as result of the hadr_timeout configuration parameter.
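A sketch of setting this registry variable on the primary; the 120-second value is an illustrative assumption:

```shell
# Sketch: cap how long primary logging can block waiting on the standby.
# The value is in seconds and is illustrative.
db2set DB2_HADR_PEER_WAIT_LIMIT=120
```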
Some HADR configuration parameters are static, such as hadr_local_host and hadr_remote_host. Static parameters are loaded at database startup, and changes are ignored at run time. HADR parameters are also loaded when the START HADR command completes. On the primary database, HADR can be started and stopped dynamically, with the database remaining online. Thus, one way to refresh the effective value of an HADR configuration parameter without shutting down the database is to stop and restart HADR. In contrast, the STOP HADR command brings down the database on the standby, so the standby's parameters cannot be refreshed while the database remains online.
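The stop-update-restart sequence on the primary can be sketched as follows; the database name and parameter value are illustrative assumptions:

```shell
# Sketch: refresh a static HADR parameter on the primary without
# shutting down the database. Names and values are illustrative.
db2 "STOP HADR ON DATABASE hadrdb"
db2 "UPDATE DB CFG FOR hadrdb USING HADR_TIMEOUT 180"
db2 "START HADR ON DATABASE hadrdb AS PRIMARY"
```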
The target list specifies a set of host:port pairs: on a primary database, the databases that act as its standbys; on a standby database, the hosts that will act as standbys if that standby takes over as the new HADR primary database. For a detailed description of its usage, see the hadr_target_list topic in the Related links.
TCP connections are used for communication between the primary and standby databases. The "local" parameters specify the local address and the "remote" parameters specify the remote address. A primary database or primary database member listens on its local address for new connections. A standby database that is not connected to a primary database retries connection to its remote address.
The standby database also listens on its local address. In some scenarios, another HADR database can contact the standby database on this address and send it messages.
When two HADR databases connect, each side checks that the address pairing is consistent:

my local address = your remote address
my remote address = your local address

The check uses the IP address and port number, rather than the literal strings in the configuration parameters. In a NAT (Network Address Translation) environment, set the DB2_HADR_NO_IP_CHECK registry variable to bypass the check.

The following table shows the IP mode that HADR uses for each combination of the hadr_local_host and hadr_remote_host settings:

IP mode used for hadr_local_host parameter | IP mode used for hadr_remote_host parameter | IP mode used for HADR communications |
---|---|---|
IPv4 address | IPv4 address | IPv4 |
IPv4 address | IPv6 address | Error |
IPv4 address | host name, maps to IPv4 only | IPv4 |
IPv4 address | host name, maps to IPv6 only | Error |
IPv4 address | host name, maps to IPv4 and v6 | IPv4 |
IPv6 address | IPv4 address | Error |
IPv6 address | IPv6 address | IPv6 |
IPv6 address | host name, maps to IPv4 only | Error |
IPv6 address | host name, maps to IPv6 only | IPv6 |
IPv6 address | host name, maps to IPv4 and IPv6 | IPv6 |
hostname, maps to IPv4 only | IPv4 address | IPv4 |
hostname, maps to IPv4 only | IPv6 address | Error |
hostname, maps to IPv4 only | hostname, maps to IPv4 only | IPv4 |
hostname, maps to IPv4 only | hostname, maps to IPv6 only | Error |
hostname, maps to IPv4 only | hostname, maps to IPv4 and IPv6 | IPv4 |
hostname, maps to IPv6 only | IPv4 address | Error |
hostname, maps to IPv6 only | IPv6 address | IPv6 |
hostname, maps to IPv6 only | hostname, maps to IPv4 only | Error |
hostname, maps to IPv6 only | hostname, maps to IPv6 only | IPv6 |
hostname, maps to IPv6 only | hostname, maps to IPv4 and IPv6 | IPv6 |
hostname, maps to IPv4 and IPv6 | IPv4 address | IPv4 |
hostname, maps to IPv4 and IPv6 | IPv6 address | IPv6 |
hostname, maps to IPv4 and IPv6 | hostname, maps to IPv4 only | IPv4 |
hostname, maps to IPv4 and IPv6 | hostname, maps to IPv6 only | IPv6 |
hostname, maps to IPv4 and IPv6 | hostname, maps to IPv4 and IPv6 | IPv6 |
The primary and standby databases can make HADR connections only if they use the same format, IPv4 or IPv6. If one server is IPv6 enabled (but also supports IPv4) and the other server supports IPv4 only, at least one of the hadr_local_host and hadr_remote_host parameters on the IPv6 server must specify an IPv4 address, to force the database on this server to use IPv4.
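Forcing IPv4 on the IPv6-enabled server can be sketched as follows; the database name and address are illustrative assumptions:

```shell
# Sketch: pin the local host to an IPv4 address so that HADR on this
# server communicates over IPv4. Address and name are illustrative.
db2 "UPDATE DB CFG FOR hadrdb USING HADR_LOCAL_HOST 192.0.2.10"
```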
You can set the HADR local service and remote service parameters (hadr_local_svc and hadr_remote_svc) to either a port number or a service name. The values that you specify must map to ports that are not being used by any other service, including other DB2 components or other HADR databases. In particular, you cannot set either parameter value to the TCP/IP port that is used by the server to await communications from remote clients (the value of the svcename database manager configuration parameter) or the next port (the value of the svcename parameter + 1).
If the primary and standby databases are on different servers, they can use the same port number or service name; otherwise, they must have different values.
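A sketch of mapping a service name to a dedicated port and referencing it in the HADR configuration; the service name, port, and database name are illustrative assumptions:

```shell
# Sketch: reserve a port for HADR in /etc/services (run as root), then
# reference the service name in the configuration. Values illustrative.
echo "hadr_service 55001/tcp" >> /etc/services
db2 "UPDATE DB CFG FOR hadrdb USING HADR_LOCAL_SVC hadr_service HADR_REMOTE_SVC hadr_service"
```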
Reconfiguration is predicated on the values of the hadr_target_list configuration parameter being correct; if anything is wrong in a target list entry, you must correct it manually.
In a non-forced takeover, the values for the hadr_remote_host, hadr_remote_inst, and hadr_remote_svc configuration parameters on the new primary are automatically updated to its principal standby, and these parameters on the standbys listed in the new primary's hadr_target_list are automatically updated to point to the new primary. Any database that is not listed in the new primary's hadr_target_list is not updated. Those databases continue to attempt to connect to the old primary and get rejected because the old primary is now a standby. The old primary is guaranteed to be in the new primary's target list because of the requirement of mutual inclusion in the target list.
In a forced takeover, the automatic update on the new primary and its standbys (excluding the old primary) works the same way as in a non-forced takeover. However, the automatic update on the old primary does not happen until it is shut down and restarted as a standby for reintegration.
Any database that is not online during the takeover is automatically reconfigured after it starts. Automatic reconfiguration might not take effect immediately on startup, because it relies on the new primary to periodically contact the standby. On startup, a standby might attempt to connect to the old primary and follow the old primary's log stream, causing it to diverge from the new primary's log stream and making that standby unable to pair with the new primary. As a result, you must shut down the old primary before takeover to avoid that kind of split-brain scenario.
For more detailed information, see "DB2 high availability disaster recovery (HADR) synchronization mode".
The timeout period, which you specify with the hadr_timeout configuration parameter, must be identical on the primary and standby databases. The consistency of the values of these configuration parameters is checked when an HADR pair establishes a connection.
After an HADR pair establishes a connection, the two databases exchange heartbeat messages. The heartbeat interval is computed from factors such as the hadr_timeout and hadr_peer_window configuration parameters. It is reported by the HEARTBEAT_INTERVAL field in the MON_GET_HADR table function and the db2pd command. If one database does not receive any message from the other database within the number of seconds that is specified by the hadr_timeout configuration parameter, it initiates a disconnect. This means that it takes at most the number of seconds specified by the hadr_timeout configuration parameter for an HADR database to detect the failure of either its partner database or the intervening network. If you set the hadr_timeout configuration parameter too low, you will get false alarms and frequent disconnections.
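The heartbeat interval and connection status can be inspected as sketched below; the database name is an illustrative assumption:

```shell
# Sketch: query the computed heartbeat interval and HADR state.
db2 "SELECT HADR_ROLE, HADR_STATE, HEARTBEAT_INTERVAL FROM TABLE(MON_GET_HADR(NULL))"

# Similar fields are reported by db2pd:
# db2pd -db hadrdb -hadr
```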
If you have the hadr_peer_window configuration parameter set to a nonzero value and the primary loses connection to the standby in peer state, the primary database does not commit transactions until the connection with the standby database is restored or the time value of the hadr_peer_window configuration parameter elapses, whichever happens first.
For maximal availability, the default value for the hadr_peer_window database configuration parameter is 0. When this parameter is set to 0, as soon as the connection between the primary and the standby is closed, the primary drops out of peer state to avoid blocking transactions. The connection can close because the standby closed the connection, a network error is detected, or timeout is reached. For increased data consistency, but reduced availability, you can set the hadr_peer_window database configuration parameter to a nonzero value.
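Setting a nonzero peer window on both databases can be sketched as follows; the database name and the 120-second value are illustrative assumptions:

```shell
# Sketch: trade some availability for data consistency by setting a
# 120-second peer window. Run on both primary and standby; values
# are illustrative.
db2 "UPDATE DB CFG FOR hadrdb USING HADR_PEER_WINDOW 120"
```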
For more information, see "Setting the hadr_timeout and hadr_peer_window database configuration parameters".
The following sample configurations are for the primary and standby databases.

On the primary database (host1.ibm.com, instance dbinst1):
HADR_TARGET_LIST host2.ibm.com:hadr_service
HADR_LOCAL_HOST host1.ibm.com
HADR_LOCAL_SVC hadr_service
HADR_REMOTE_HOST host2.ibm.com
HADR_REMOTE_SVC hadr_service
HADR_REMOTE_INST dbinst2
HADR_TIMEOUT 120
HADR_SYNCMODE NEARSYNC
HADR_PEER_WINDOW 120
On the standby database (host2.ibm.com, instance dbinst2):

HADR_TARGET_LIST host1.ibm.com:hadr_service
HADR_LOCAL_HOST host2.ibm.com
HADR_LOCAL_SVC hadr_service
HADR_REMOTE_HOST host1.ibm.com
HADR_REMOTE_SVC hadr_service
HADR_REMOTE_INST dbinst1
HADR_TIMEOUT 120
HADR_SYNCMODE NEARSYNC
HADR_PEER_WINDOW 120