Several things can be done to improve FTO failover time:
- The following variable can save up to a few minutes of failover time. It sets how many times a MIRROR HUB attempts to reconnect to its peer HUB before assuming the primary HUB responsibilities. The default is 2. Setting it to 1 means the MIRROR HUB makes only one reconnect attempt before assuming the primary HUB role.
- Another variable that can be used to decrease failover time is the following:
This variable is called the check interval. It dictates how often each HUB checks whether its peer is still reachable when no data has been received for a period of time. There are two parts to this check. If either part fails, the peer HUB is marked as disconnected. The default value is 120 seconds.
- Assuming the default value, if no data has been received from the peer HUB for 60 seconds (half the specified value), a request called a "ping" is sent to the peer HUB. If the ping request cannot be sent successfully, the peer is marked as disconnected.
- If the ping request is sent successfully, the peer HUB is expected to return the ping request within 2 check intervals. Assuming the default value, this means that the request must be returned within 4 minutes (240 seconds). The peer HUB is marked as disconnected if the request is not returned in that time.
With a check interval of 30 seconds, the ping is sent after 15 seconds without data instead of 60, so this setting can remove another 45 seconds from the failover time. I do not recommend making the check interval much smaller than 30, because doing so increases the likelihood of falsely marking the peer HUB as disconnected. A smaller check interval leaves less time for the peer HUB to return the ping request, which can lead to a false disconnect state. Increase this value if you encounter false disconnect states.
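The arithmetic behind the two parts of the check can be sketched in a few lines of Python. This is only an illustration of the timings described above, not product code; the function name `check_timings` and the dictionary keys are hypothetical and chosen for this example.

```python
def check_timings(check_interval_s: int) -> dict:
    """Derive the two thresholds described above from a check interval in seconds.

    Hypothetical helper for illustration only; it is not part of the product.
    """
    return {
        # A ping is sent after half a check interval without data from the peer.
        "idle_before_ping_s": check_interval_s / 2,
        # The peer must return the ping within two check intervals.
        "ping_return_deadline_s": check_interval_s * 2,
    }


default = check_timings(120)  # default check interval
reduced = check_timings(30)   # smaller check interval discussed above

# Time saved before the ping is sent when going from 120 to 30 seconds.
saved_s = default["idle_before_ping_s"] - reduced["idle_before_ping_s"]

print("default:", default)
print("reduced:", reduced)
print("seconds saved before the ping is sent:", saved_s)
```

With an interval of 30, the peer has only 60 seconds to return the ping instead of 240, which is why very small values raise the risk of a false disconnect.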
- One last thing that may improve failover time is to specify the following variables at both HUBs: