Dead gateway detection

A host can be configured to detect whether a gateway it is using is down, and can adjust its routing table accordingly.

If the passive_dgd network option is set to 1, passive dead gateway detection is enabled for the entire system. If no response is received for dgd_packets_lost consecutive ARP requests to a gateway, that gateway is assumed to be down and the distance metrics (also known as hopcount or cost) for all routes that use that gateway are raised to the maximum possible value. After dgd_retry_time minutes have passed, the routes' costs are restored to their user-configured values. The host also takes action based on failing TCP connections. If dgd_packets_lost consecutive TCP packets are lost, the ARP entry for the gateway in use is deleted and the TCP connection tries the next-best route. The next time the gateway is used, the ARP-based actions described above take place if the gateway is actually down. The passive_dgd, dgd_packets_lost, and dgd_retry_time parameters can all be configured by using the no command.
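The ARP-based part of passive detection can be pictured as a small state machine. The following Python sketch is only an illustrative model of the behavior described above, not the system's implementation. The Route and PassiveDgd names, the MAX_COST placeholder, and the sample values are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Placeholder for "the maximum possible value"; the real limit is system-defined.
MAX_COST = 10**9


@dataclass
class Route:
    destination: str
    gateway: str
    configured_cost: int            # the user-configured hopcount
    cost: int = field(init=False)   # the cost currently in effect

    def __post_init__(self):
        self.cost = self.configured_cost


class PassiveDgd:
    """Toy model: count consecutive unanswered ARP requests per gateway."""

    def __init__(self, routes, dgd_packets_lost, dgd_retry_time):
        self.routes = routes
        self.dgd_packets_lost = dgd_packets_lost
        self.retry_seconds = dgd_retry_time * 60  # dgd_retry_time is in minutes
        self.misses = {}      # gateway -> consecutive unanswered ARP requests
        self.down_since = {}  # gateway -> time (seconds) it was assumed down

    def arp_result(self, gateway, answered, now):
        if answered:
            self.misses[gateway] = 0  # any answer resets the streak
            return
        self.misses[gateway] = self.misses.get(gateway, 0) + 1
        if (self.misses[gateway] >= self.dgd_packets_lost
                and gateway not in self.down_since):
            self.down_since[gateway] = now
            for route in self.routes:
                if route.gateway == gateway:
                    route.cost = MAX_COST

    def tick(self, now):
        """Restore configured costs once dgd_retry_time minutes have passed."""
        for gateway, since in list(self.down_since.items()):
            if now - since >= self.retry_seconds:
                del self.down_since[gateway]
                self.misses[gateway] = 0
                for route in self.routes:
                    if route.gateway == gateway:
                        route.cost = route.configured_cost

    def best_route(self, destination):
        candidates = [r for r in self.routes if r.destination == destination]
        return min(candidates, key=lambda r: r.cost)
```

With two default routes through hypothetical gateways gw1 (cost 0) and gw2 (cost 1), three unanswered ARP requests to gw1 with dgd_packets_lost set to 3 make best_route select gw2 until tick runs after the retry interval has elapsed.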

Hosts can also be configured to use active dead gateway detection on a per-route basis with the -active_dgd flag of the route command. Active dead gateway detection pings every gateway that is used by a route for which it is enabled once every dgd_ping_time seconds. If no response is received from a gateway, it is pinged more rapidly, up to dgd_packets_lost times. If there is still no response, the costs of all routes that use that gateway are raised. The gateway continues to be pinged, and if a response is eventually received, the costs of the routes are restored to their user-configured values. The dgd_ping_time parameter can be configured by using the no command.
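One probe cycle of active detection can be sketched as follows. This Python model is an assumption-laden illustration, not the system's implementation: the ActiveDgdMonitor class, the injected ping callable, the dictionary layout of a route, and the MAX_COST placeholder are all hypothetical. The caller is assumed to invoke probe once every dgd_ping_time seconds.

```python
# Placeholder for the raised cost; the real value is system-defined.
MAX_COST = 10**9


class ActiveDgdMonitor:
    """Toy model of active dead gateway detection for one gateway."""

    def __init__(self, gateway, routes, ping, dgd_packets_lost):
        self.gateway = gateway
        self.routes = routes              # list of dicts: gateway, configured_cost, cost
        self.ping = ping                  # callable(gateway) -> True if a reply arrived
        self.dgd_packets_lost = dgd_packets_lost
        self.down = False

    def _set_costs(self, raised):
        for route in self.routes:
            if route["gateway"] == self.gateway:
                route["cost"] = MAX_COST if raised else route["configured_cost"]
        self.down = raised

    def probe(self):
        """Run one cycle: a regular ping, then rapid retries if it goes unanswered."""
        if self.ping(self.gateway):
            if self.down:
                self._set_costs(raised=False)   # gateway answered again
            return True
        # No reply to the regular ping: retry up to dgd_packets_lost times.
        for _ in range(self.dgd_packets_lost):
            if self.ping(self.gateway):
                if self.down:
                    self._set_costs(raised=False)
                return True
        if not self.down:
            self._set_costs(raised=True)        # still silent: raise route costs
        return False
```

Because the monitor keeps probing a gateway that it has marked down, the first successful reply in any later cycle restores the user-configured costs.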

Dead gateway detection is most useful for hosts that use static rather than dynamic routing. Passive dead gateway detection causes fewer performance issues and is recommended for use on any network that has redundant gateways. However, passive dead gateway detection is done on a best-effort basis only. Some protocols, such as UDP, do not provide any feedback to the host if a data transmission is failing, and in this case passive dead gateway detection cannot take any action.

Active dead gateway detection is most useful when a host must discover immediately when a gateway goes down. Because it queries each gateway for which it is enabled every few seconds, it adds some network traffic. Active dead gateway detection is recommended only for hosts that provide critical services and on networks with a limited number of hosts.
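A rough estimate shows why the number of hosts matters. The following Python sketch counts only the regular probes that are sent every dgd_ping_time seconds and ignores replies and the faster retries that follow a missed response. The function name and the sample figures are assumptions chosen for illustration.

```python
def active_dgd_pings_per_second(hosts, gateways_per_host, dgd_ping_time):
    """Estimate the steady-state probe rate across the whole network.

    Each host sends one ping to each monitored gateway every
    dgd_ping_time seconds, so the aggregate rate grows linearly
    with the number of hosts.
    """
    if dgd_ping_time <= 0:
        raise ValueError("dgd_ping_time must be positive")
    return hosts * gateways_per_host / dgd_ping_time


# Hypothetical network: 200 hosts, each monitoring 2 gateways every 5 seconds.
rate = active_dgd_pings_per_second(hosts=200, gateways_per_host=2, dgd_ping_time=5)
```

In this hypothetical case the gateways together receive 80 probes per second, and the figure doubles if the number of hosts doubles.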

Note: Dead gateway detection and the routing protocols that are used by the gated and routed daemons perform a similar function by discovering changes in the network configuration and adjusting the routing table accordingly. However, they use different mechanisms to do this, and if they are run at the same time, they might conflict with one another. For this reason, dead gateway detection must not be used on systems that run the gated or routed daemons.

When dead gateway detection detects that the primary route is back online and the dgd_flush_cached_route parameter is enabled, the cached routes of all active connections are flushed. The routes of all active connections are then validated again to find the best route for sending data. The dgd_flush_cached_route parameter can be configured by using the no command. By default, the dgd_flush_cached_route parameter is disabled.
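The effect of the parameter can be modeled in a few lines. The following Python sketch is a simplified illustration, not the system's implementation. The function name, the dictionary layout of connections and routes, and the use of the lowest cost as the definition of "best route" are assumptions.

```python
def on_primary_route_restored(connections, routes, dgd_flush_cached_route):
    """Toy model: flush and revalidate cached routes when a gateway returns.

    Returns the number of connections whose cached route was flushed.
    """
    if not dgd_flush_cached_route:
        return 0  # default behavior: connections keep their cached routes

    flushed = 0
    for conn in connections:
        conn["cached_route"] = None  # flush the cached route
        flushed += 1
        # Revalidate: pick the lowest-cost route to the same destination.
        candidates = [r for r in routes if r["destination"] == conn["destination"]]
        if candidates:
            conn["cached_route"] = min(candidates, key=lambda r: r["cost"])
    return flushed
```

With the parameter disabled, a connection that moved to a backup route while the primary gateway was down stays on the backup route in this model. With the parameter enabled, the connection returns to the primary route as soon as its cost is restored.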

Note: The dgd_flush_cached_route parameter must be enabled only in a stable network environment. Otherwise, bad or unstable hardware routers can cause dead gateway detection to update the routing table frequently, which degrades performance. Frequent flushing of the cached routes can also be expensive.