Node failovers

The node failover process depends on the size of your appliance. Multi-rack systems include spare nodes, so a failover should not degrade performance. Single-rack systems have no spare nodes, so the MLNs are redistributed to share the load fairly among the surviving nodes.

The system experiences a brief outage (up to 10 minutes) while the failover is performed. During this outage, monitoring on the web console is available but limited. During a master node failover, the web console is transitioned to the new master node and is briefly unavailable.

Single-rack systems

Single-rack systems do not provide spare nodes. If a node fails in a single-rack appliance, Multiple Logical Nodes (MLNs) are redistributed among the surviving nodes to bring the system back online. After failover, performance is degraded because more MLNs run on fewer nodes, with fewer resources available. A minimum number of nodes must be operating for high availability to be effective:

Table 1. Minimum number of operating nodes for high availability

Rack size   Number of nodes   Minimum number of operating nodes
1/3         3                 2
2/3         5                 3
Full        7                 4
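
The thresholds in Table 1 and the idea of sharing MLNs fairly among surviving nodes can be sketched as follows. This is only an illustration: the function names are hypothetical, the MLN counts in the usage note are made up, and the even split is an assumption about what "shared fairly" means, not the appliance's documented placement logic.

```python
# Values copied from Table 1: rack size -> (number of nodes, minimum operating nodes).
MIN_OPERATING_NODES = {
    "1/3": (3, 2),
    "2/3": (5, 3),
    "Full": (7, 4),
}


def ha_effective(rack_size, operating_nodes):
    """Return True if enough nodes are operating for high availability (per Table 1)."""
    total, minimum = MIN_OPERATING_NODES[rack_size]
    if not 0 <= operating_nodes <= total:
        raise ValueError(f"a {rack_size} rack has between 0 and {total} operating nodes")
    return operating_nodes >= minimum


def mlns_per_node(total_mlns, surviving_nodes):
    """Split MLNs as evenly as possible across surviving nodes.

    Assumption: "fair" means the per-node counts differ by at most one.
    """
    if surviving_nodes <= 0:
        raise ValueError("at least one surviving node is required")
    base, extra = divmod(total_mlns, surviving_nodes)
    # The first `extra` nodes take one additional MLN each.
    return [base + 1] * extra + [base] * (surviving_nodes - extra)
```

For example, `ha_effective("Full", 4)` is true and `ha_effective("Full", 3)` is false. With a hypothetical 12 MLNs, `mlns_per_node(12, 3)` gives 4 MLNs per node before a failure and `mlns_per_node(12, 2)` gives 6 per node afterwards, which shows why performance degrades after a single-rack failover.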

Multi-rack systems

On multi-rack systems, if a failover happens in an HA-domain, the MLNs (Db2 data partitions) from the failed node are moved to the spare node within that HA-domain. Because the spare node takes over the full set of MLNs, there should not be any performance degradation.
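
As a rough model of this behavior, the sketch below moves all MLNs of a failed node to a spare in the same HA-domain. The dictionary layout, the node names, and the function `fail_over_to_spare` are hypothetical and exist only for illustration; they do not correspond to any appliance command or API.

```python
def fail_over_to_spare(domain, failed_node):
    """Move every MLN from the failed node to a spare node in the same HA-domain.

    `domain` is a hypothetical structure:
        {"nodes": {node_name: [mln_id, ...]}, "spares": [node_name, ...]}
    Returns the name of the spare node that took over.
    """
    if failed_node not in domain["nodes"]:
        raise KeyError(f"{failed_node} is not an active node in this HA-domain")
    if not domain["spares"]:
        raise RuntimeError("no spare node is left in this HA-domain")
    spare = domain["spares"].pop(0)
    # The spare inherits the complete MLN set, so the MLN count per node is unchanged.
    domain["nodes"][spare] = domain["nodes"].pop(failed_node)
    return spare
```

In this model, the number of MLNs on each active node is the same before and after the failover, which matches the expectation of no performance degradation.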