Split policy
A cluster split event can occur between sites when a group of nodes cannot communicate with the remaining nodes in a cluster. For example, in a linked cluster, a split occurs if all communication links between the two sites fail. A cluster split event splits the cluster into two or more partitions.
You can use PowerHA® SystemMirror® to configure a split policy that specifies the response to a cluster split event.
- None
- This option indicates that no action occurs when a cluster split event is detected. Each partition that is created by the cluster split event becomes an independent cluster, and each partition can start a workload independently of the other partition. If shared volume groups are in use, this can lead to data corruption. This option is the default setting because manual configuration is required to establish an alternative policy. Do not use this option if your environment is configured to use HyperSwap® for PowerHA SystemMirror.
- Tie breaker
- You can use this option to specify a disk, an NFS file, or a cloud service as the tie breaker.
If you specify a disk for the tie breaker, each partition attempts to acquire the tie breaker disk by placing a lock on it. The tie breaker disk must be a SCSI disk that is accessible to all nodes in the cluster. The partition that cannot lock the disk is rebooted, as specified in the action plan.
If you specify an NFS file for the tie breaker, the NFS mount must exist on each of the nodes in the cluster from the selected NFS server. The partition that first reserves the NFS file continues to function. The partition that cannot lock the NFS file is rebooted, as specified in the action plan.
Note: The default NFS mount options are vers=4,fg,soft,retry=1,timeo=10. Modifying the default values might lead to failure in acquiring the NFS lock.
Cloud is another tie breaker option, and cloud communication must be available on all nodes of the cluster for this option. During a cluster split event, each partition attempts to acquire a lock by uploading a file to the configured Cloud service. The partition that successfully uploads the file to the configured Cloud service continues to function. The partition that cannot upload the file to the configured Cloud service is rebooted, or the cluster services are restarted, as specified by the chosen action plan in the policy setting. If you use the Cloud option for the split policy, the merge policy must also be configured to use the Cloud option.
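As an illustration, an NFS tie-breaker export might be mounted with the documented default options; the server name and mount point below are hypothetical:

```shell
# Mount the tie-breaker export with the default options listed above.
# Modifying these values might lead to failure in acquiring the NFS lock.
mount -o vers=4,fg,soft,retry=1,timeo=10 nfsserver:/tiebreaker /tiebreaker
```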
- Manual
- This option indicates that you want to manually fix the problem when a cluster split occurs.
Each node in the partition displays a message that prompts you to choose whether to continue running cluster services or to recover cluster services (which restarts the node). With this option, you can specify the number of attempts and the frequency of attempts that require your input. You can also specify a default action that occurs if the number of attempts that require your input is reached and you have not provided any input.
The following message is displayed for a linked cluster that specifies the manual option when a cluster split event occurs:
Broadcast message from root@e08m138.ausprv.stglabs.ibm.com (tty) at 04:09:48 ...

A cluster split has been detected.
You must decide if this side of the partitioned cluster is to continue.
To have it continue, enter
    /usr/es/sbin/cluster/utilities/cl_sm_continue
To have the recovery action - Reboot - taken on all nodes on this partition, enter
    /usr/es/sbin/cluster/utilities/cl_sm_recover
LOCAL_PARTITION 1 e08m138
OTHER_PARTITION 2 e08m140

In this example, you can check from the SMIT menu whether a split event or a merge event is waiting for a manual response.
If you want to use the manual option for stretched clusters and standard clusters, your environment must be running the following versions of software:
- IBM® AIX® 7.2 with Technology Level 1, or later
- PowerHA SystemMirror Version 7.2.1, or later
Note: For any type of cluster that uses the manual option, if the specified number of attempts is reached and you have not provided any input, the partition that has the lowest node ID is chosen as the winning partition.
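The default winner selection described in the note can be sketched as choosing the partition that contains the lowest node ID. The partition names and node IDs below are hypothetical:

```python
def pick_winning_partition(partitions: dict) -> str:
    """Return the name of the partition that contains the lowest node ID.

    Models the default action taken when the manual-response attempts are
    exhausted without input; illustrative only, not the PowerHA
    implementation itself.

    partitions maps a partition name to the list of node IDs it contains.
    """
    return min(partitions, key=lambda name: min(partitions[name]))

# Two partitions named after the sample nodes in the message above:
# pick_winning_partition({"partition1": [1], "partition2": [2]})
# selects "partition1", because it contains node ID 1.
```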