Troubleshooting
Problem
Powering down the active host of an IBM Netezza 1000 system by using the remote power controllers (RPCs) does not cause the system to fail over.
Resolving The Problem
If you use the RPCs to power down the active host on an IBM Netezza 1000 system, the system does not fail over. The use of the RPCs is the key difference: any other power-down method (issuing a shutdown command, pressing the power button, or pulling the power plugs) does cause a failover.
The explanation for this behavior is as follows. Heartbeat wants to fail over, but before it can do so it must guarantee that the failed host is no longer accessing data. Heartbeat cannot reach the failed host over the network, so the only way to get that guarantee is to power-cycle (also known as fence) the host. Fencing is done through the RPCs, and Heartbeat's RPC driver first checks the power port status for the lost host. In this case it sees the ports as 'off' (during normal operation the status is 'on'). Heartbeat will not fence a host whose ports are already off, because power-cycling it would turn it back on. Heartbeat assumes that an administrator wanted the host to be off, so it leaves it off. Because Heartbeat is unable to fence the host, it is unable to fail over.
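The decision described above can be sketched in a few lines of Python. This is an illustrative model only, not Heartbeat's actual RPC driver code: the names PortStatus, Decision, and decide_fencing are hypothetical, and the model covers only the case discussed here, in which the peer host may or may not be reachable over the network.

```python
from enum import Enum


class PortStatus(Enum):
    """Power port status as reported by the RPC (hypothetical model)."""
    ON = "on"
    OFF = "off"


class Decision(Enum):
    NO_ACTION = "peer reachable, nothing to do"
    FENCE_AND_FAIL_OVER = "power-cycle the peer, then fail over"
    LEAVE_OFF_NO_FAILOVER = "ports already off, do not fence, no failover"


def decide_fencing(peer_reachable: bool, port_status: PortStatus) -> Decision:
    """Model of the fencing decision described in this technote."""
    if peer_reachable:
        # The peer still answers on the network, so no fencing is needed.
        return Decision.NO_ACTION
    if port_status is PortStatus.OFF:
        # Power-cycling would turn the host back on. The assumption is
        # that an administrator switched it off on purpose, so it stays
        # off, and without fencing there can be no failover.
        return Decision.LEAVE_OFF_NO_FAILOVER
    # Ports are on but the host is unreachable: fence it, then fail over.
    return Decision.FENCE_AND_FAIL_OVER


if __name__ == "__main__":
    # Host powered down through the RPCs: ports read 'off'.
    print(decide_fencing(False, PortStatus.OFF).value)
    # Host lost by another method (for example, pulled power plugs):
    # ports still read 'on'.
    print(decide_fencing(False, PortStatus.ON).value)
```

In this sketch the only input that separates the two outcomes for an unreachable host is the port status, which matches the explanation that the RPC power-down is the single method that leaves the ports reading 'off'.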
This expected behavior is new to TwinFin. SHEaR systems forcibly power-cycle the host and fail over.
This is not considered to be a bug. It happens only when a user deliberately and manually uses the RPCs to power down a host.
Historical Number
NZ513841
Document Information
Modified date:
17 October 2019
UID
swg21575816