Setting up HA policies
After you set up the KSYS subsystem successfully, set up recovery policies to customize the default configuration settings to suit your high availability preferences.
The VM Recovery Manager HA solution provides the following options that you can customize:
- HA monitoring
- Turns on or turns off HA monitoring for the associated entity. The specified policy at the
lowest resource level is considered first for HA monitoring. If you do not specify this policy for a
resource, the policy of the parent resource is applied to the resource. For example,
if you enable HA monitoring for the host group, HA monitoring is enabled for all virtual machines
within the host group unless you disable HA monitoring for specific virtual machines.
You can enable HA monitoring for virtual machines only after you install the VM agent on each VM and start the VM agent successfully. For details, see the Setting up the VM agent topic. If you do not set up the VM agent, the KSYS subsystem might return error messages for HA monitoring at the VM level.
- To set HA monitoring at the system or virtual machine level, run the following command:
ksysmgr modify system|vm name ha_monitor=enable|disable
- To set HA monitoring at the host group level, run the following command:
ksysmgr modify host_group <name> options ha_monitor=<enable | disable>
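The inheritance rule for HA monitoring can be sketched as a small shell function. This is an illustration only: the function name `effective_ha_monitor` is hypothetical, the lookup order (VM, then host group, then system) is an assumption based on the description above, and the function does not query the KSYS subsystem.

```shell
# Hypothetical helper: resolve the effective ha_monitor setting for one VM.
# Arguments are the values set at the VM, host group, and system levels.
# Pass an empty string for a level where the policy was never specified.
effective_ha_monitor() (
    for setting in "$1" "$2" "$3"; do
        if [ -n "$setting" ]; then
            # The first level that has a value wins (lowest level first).
            echo "$setting"
            exit 0
        fi
    done
    echo "unset"
)

effective_ha_monitor "" "enable" "disable"        # VM inherits from the host group
effective_ha_monitor "disable" "enable" "enable"  # VM-level setting takes effect
```

The first call prints enable because the VM has no setting of its own. The second call prints disable because the VM-level setting is considered first.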
- ProactiveHA monitoring
- ProactiveHA monitors the CPU utilization of every managed VM in the host group and the network packet loss that occurs during virtual machine or host monitor communication. An event is generated when the CPU utilization of a VM exceeds 90% or when network packet loss is detected on each of the VM's adapters during virtual machine or host monitor communication. The threshold for CPU utilization is predefined. By default, the ProactiveHA monitoring option is enabled.
- To enable or disable the ProactiveHA monitoring feature at the system level, run the following
command:
ksysmgr modify system proactiveha=<enable | disable>
- To enable or disable the ProactiveHA monitoring feature at the host level, run the following
command:
ksysmgr modify host <name> proactiveha=<enable | disable>
- To enable or disable the ProactiveHA monitoring feature at the host group level, run the following command:
ksysmgr modify host_group <name> options proactiveha=<enable | disable>
- To enable or disable the ProactiveHA monitoring feature at the VM level, run the following
command:
ksysmgr modify vm <vmname[,vmname2,...]> proactiveha=<enable | disable>
- Configuring network isolation events
- The KSYS subsystem uses the network isolation feature to configure the VIOS netmon file, which is used by IBM Reliable Scalable Cluster Technology (RSCT) to monitor the network status. The KSYS subsystem generates the NETWORK_ISOLATION_SUCCESS and the NETWORK_ISOLATION_ERROR events depending on whether the configuration of the VIOS netmon file succeeded. You can use the ksysmgr command to configure the IP addresses for the VIOS netmon file. After the discovery operation completes, the KSYS subsystem checks the configured IP addresses at the site Resource Control Point (RCP) and generates a put message request for the host monitor to configure the VIOS netmon file.
To add or delete the IP addresses for network isolation detection, run the following command:
ksysmgr modify system [network_isolation=<ip1,ip2,..|ALL> action=<add | delete>]
- Restart policy
- Specifies whether the KSYS subsystem restarts the virtual machines automatically after a failure. This attribute can have the following values:
- auto: If you set this attribute to auto, the KSYS subsystem automatically restarts the virtual machines on the destination hosts. The KSYS subsystem identifies the most suitable host based on free CPUs, memory, and other specified policies. In this case, the KSYS subsystem also notifies the registered contacts about the host or VM failure and the restart operations. This is the default value of the restart_policy attribute.
- advisory_mode: If you set this attribute to advisory_mode, the virtual machines are not restarted automatically after host or VM failures. In this case, the KSYS subsystem notifies the registered contacts about the host or VM failures. The administrator must review the failure and manually restart the VMs on other hosts by using the ksysmgr commands.
To set the restart policy, run the following command:
ksysmgr modify host_group name restart_policy=auto|advisory_mode
- Host failure detection time
- Indicates the time that the KSYS subsystem waits on a non-responsive host before it declares the host to be in an inactive state. This value is measured in seconds. During the specified time, the KSYS subsystem attempts to connect to the host and checks the health of the host before it declares the failure. After this duration, the virtual machines are restarted on another host that is located within the host group. The value of this attribute can be in the range 90 seconds to 600 seconds. The default value is 90 seconds.
To set the host failure detection time, run the following command:
ksysmgr modify system host_failure_detection_time=time_in_seconds
ksysmgr modify system|host_group name options host_failure_detection_time=time_in_seconds
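As a sketch of how the documented range might be checked before the setting is changed, the following shell function validates a value and prints the system-level command instead of running it. The function name `host_fdt_command` is hypothetical, and the error messages are illustrative rather than actual ksysmgr output.

```shell
# Hypothetical wrapper: validate the value, then print the command to run.
# It only prints the command. It does not call ksysmgr.
host_fdt_command() (
    seconds=$1
    case "$seconds" in
        ''|*[!0-9]*)
            echo "error: value must be a whole number of seconds" >&2
            exit 1
            ;;
    esac
    # Documented range: 90 to 600 seconds (the default is 90).
    if [ "$seconds" -lt 90 ] || [ "$seconds" -gt 600 ]; then
        echo "error: value must be in the range 90-600 seconds" >&2
        exit 1
    fi
    echo "ksysmgr modify system host_failure_detection_time=$seconds"
)

host_fdt_command 120
```

A value such as 60 or 601 is rejected with a nonzero exit status, so a script can stop before it attempts the change.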
- VM failure detection speed
- Represents the time that the KSYS subsystem waits before it declares the failure of a VM. You can select one of the following options: fast, normal, or slow. If you select fast, the VM failure is declared in the shortest time. The default value of this attribute is normal. The time (in seconds) that the KSYS subsystem waits is calculated based on the host failure detection time and the option that you specified:
- If you set the vm_failure_detection_speed attribute to fast, the VM failure detection time is calculated as follows:
Host failure detection time + VM threshold
- If you set the vm_failure_detection_speed attribute to normal, the VM failure detection time is calculated as follows:
Host failure detection time + VM threshold*2
- If you set the vm_failure_detection_speed attribute to slow, the VM failure detection time is calculated as follows:
Host failure detection time + VM threshold*3
VM threshold is a hardcoded value of 50 that is set in the KSYS subsystem. You cannot change this value. Therefore, if you modify the host_failure_detection_time attribute, the value of the VM failure detection time also changes.
To set the VM failure detection speed, run the following command:
ksysmgr modify system|host_group|host|vm name vm_failure_detection_speed=fast|normal|slow
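The three formulas for the VM failure detection time can be worked through with a short shell sketch. The function name `vm_failure_detection_time` is hypothetical. The value 50 for the VM threshold and the multipliers come from the description above.

```shell
VM_THRESHOLD=50   # fixed value that is set in the KSYS subsystem

# Hypothetical helper: compute the VM failure detection time in seconds
# from the host failure detection time and the selected speed.
vm_failure_detection_time() (
    host_time=$1
    case "$2" in
        fast)   factor=1 ;;
        normal) factor=2 ;;
        slow)   factor=3 ;;
        *)
            echo "error: speed must be fast, normal, or slow" >&2
            exit 1
            ;;
    esac
    echo $(( host_time + VM_THRESHOLD * factor ))
)

# With the default host failure detection time of 90 seconds:
for speed in fast normal slow; do
    echo "$speed: $(vm_failure_detection_time 90 "$speed") seconds"
done
```

With the default host failure detection time of 90 seconds, the loop prints 140 seconds for fast, 190 seconds for normal, and 240 seconds for slow.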
- Failover priority
- Specifies the order in which multiple VM restart operations are processed. For example, if a host fails and all the VMs must be relocated to other hosts in the host group, the priority of the VM determines which VM is processed first. The supported values for this attribute are High, Medium, or Low. You can set this attribute at the VM level only. You must specify the UUID of the VM if you have two or more VMs with the same name. By default, all VMs in the host group have the priority of Medium.
To set the failover priority, run the following command:
ksysmgr modify vm name1[,name2,...] | filepath=filepath priority=high|medium|low
Note: At the VM level, a restart operation that is triggered manually for multiple VMs from different hosts does not consider the VM priorities that are set. The VM priorities are considered only when the VMs are from a single host.
- Home host
- Specifies the home-host of the virtual machine. By default, the KSYS subsystem initially sets this value to the host where the virtual machine was first discovered. You can change the home-host value of a virtual machine even when the virtual machine is running on another host. In that case, the specified home-host is used for all future operations. This attribute is useful when a host is repaired after a failure and you want to restart the virtual machines on their home-host.
To set the home-host value, run the following command:
ksysmgr modify vm name1[,name2...] homehost=hostname
- Flexible capacity policy
- Modifies the allocation of memory and CPU resources of a virtual machine when the virtual machine is moved from its home-host to another host in the host group. You can set different flexible capacity values for each priority of virtual machines: high, medium, and low. You must specify the flexible capacity values as percentages, and each value must be greater than 1. This policy specifies the percentage of resources that must be allocated to the VM after recovery on the target host as compared to the source host.
The following example describes the percentage-based resource allocation:
If a VM is running on the source host with 10 GB memory and 2 dedicated processors, and if you specify the flexible capacity policy with 50% CPU resource and 70% memory resource, the VM runs with 7 GB memory and 1 processor after it is recovered on the target host. Memory is calculated and rounded off to the nearest multiple of the memory region size of the target host. If a dedicated processor is allocated to a VM, the calculated values are rounded off to the closest decimal.
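The arithmetic in this example can be reproduced with a minimal shell sketch. The function names `scale_capacity` and `round_to_region` are hypothetical, and the memory region size of 256 MB is an assumed value for illustration (the actual region size depends on the target host). The sketch uses integer arithmetic, so it does not model how the KSYS subsystem rounds fractional processor values.

```shell
# Scale a resource value by a flexible capacity percentage.
# Integer division truncates any fractional part.
scale_capacity() (
    echo $(( $1 * $2 / 100 ))
)

# Round a memory value (in MB) to the nearest multiple of the
# memory region size (in MB).
round_to_region() (
    echo $(( ($1 + $2 / 2) / $2 * $2 ))
)

memory_mb=$(scale_capacity 10240 70)   # 10 GB of memory at 70%
round_to_region "$memory_mb" 256       # assumed region size of 256 MB
scale_capacity 2 50                    # 2 dedicated processors at 50%
```

The sketch prints 7168 (7 GB expressed in MB) and 1, which matches the 7 GB memory and 1 processor in the example.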
The flexible capacity policy values can also be configured at the host group level and the system level. If the flexible capacity policy is configured at the host group level, the flexible capacity policy applies to all the managed VMs of the host group. Similarly, if the flexible capacity policy is configured at the system level, the flexible capacity policy applies to all VMs of all host groups at the system level.
- Minimum value
- This value sets the resource value of the target host LPAR (when the LPAR is started on the target host during the recovery operation) to the minimum value that is defined in the LPAR profile on the source host.
- Current desired
- This value sets the resource value of the target host LPAR (when the LPAR is started on the target host during the recovery operation) to the desired value that is defined in the LPAR profile on the source host.
- None
- Resets the existing value. When you set the value to none, VM capacity management does not follow any flexible capacity management. CPU and memory resources across the source host and target host match the values that are defined in the LPAR profile.
Notes:
- When you set the reduced or increased capacity value to a percentage of the existing capacity value, the resulting value must be within the range of the minimum and maximum values that are specified in the LPAR profile in the HMC. Otherwise, the ksysmgr command returns an error.
- In shared processor mode, when you set the reduced or increased processor value to a percentage of an existing processor value, the resulting value must be less than or equal to the virtual processor value that is specified in the LPAR profile in the HMC.
- The priority of the virtual machine is also considered when the virtual machine is moved from the source host to the target host with reduced capacity.
- Flexible capacity policy values are not applicable during Live Partition Mobility (LPM) operations and restore operations.
- The flexible capacity policy configuration must be updated only when a VM is present on its home host. The KSYS subsystem takes the reference of the flexible capacity policy configuration from the HMC when the VM is on its home host.
- The flexible capacity policy of the host group takes precedence over the flexible capacity policy of the system. For example, if you set the value for the memory_capacity attribute at both the system and host group levels as follows:
system: memory_capacity="none"
Host_group1: memory_capacity="50"
The virtual machines of Host_group1 are restarted on the target host with a reduced memory capacity of 50%.
When the processor value is calculated for a migration operation on a virtual machine, the KSYS subsystem takes the allocated processors into consideration so that the virtual processing does not exceed the number of processors that are allocated. During the discovery operation that runs on a virtual machine, if the memory or processor capacity is not sufficient for the migration process, the KSYS subsystem displays the following warning for insufficient capacity:
HSCL145E The operation failed because the new virtual processors or processing units value would cause the processing units to exceed the maximum capacity allowed with the virtual processor setting.
The performance of the virtual machines might be decreased but the virtual machines can be restarted in the target host that has the reduced capacity. You can use the reduced capacity function during planned outages, where the virtual machines are temporarily moved to the target host and then moved back to the source host after the system maintenance or upgrade operation completes. When virtual machines are moved back to the source host, the original capacity settings of the virtual machine are restored.
If you do not want the HMC to check and compare the resource capacity between the source host and the target host during the verify operation, you can set the skip_resource_check parameter to yes. The flexible capacity reduction rules are applied even if adequate resources are available on the target host.
The flexible capacity policy does not consider I/O slots, adapters, and resources that are available in the hosts. You must ensure that all the I/O virtualization requirements of the VMs are met within the host group environment.
To set the flexible capacity policy, run the following command:
ksysmgr modify system | host_group options [memory_capacity=(1-100) | minimum | current_desired | none | default] [priority=low|medium|high] [cpu_capacity=(1-100) | minimum | current_desired | none | default] [priority=low|medium|high]
If you do not want the virtual machines to start automatically after they are moved to the target host, you can set the skip_power_on parameter to no.
- Affinity policies
- Specifies affinity rules for a set of VMs that define how the VMs must be placed within a host group during a relocation. The following affinity policies are supported:
- Collocation: Indicates that the set of VMs must always be placed on the same host after relocation.
To set this option, run the following command:
ksysmgr add collocation name vm=vmname1[,...]
To update this option, run the following command:
ksysmgr modify collocation name policy=add|delete vm=vm1[,...]
- Anticollocation: Indicates that the set of VMs must never be placed on the same host after relocation.
To set this option, run the following command:
ksysmgr add anticollocation name vm=vmname1[,...]
To update this option, run the following command:
ksysmgr modify anticollocation name policy=add|delete vm=vm1[,...]
- Workgroup: Indicates that the set of VMs must be prioritized first based on the assigned priority. For example, if VM1, VM2, and VM3 have the same priority and if you add VM1 to a workgroup, the KSYS policy manager gives VM1 the highest priority while assigning it to the target host.
To set this option, run the following command:
ksysmgr add workgroup name vm=vmname1[,...]
To update this option, run the following command:
ksysmgr modify workgroup name policy=add|delete vm=vm1[,...]
Note: All the VMs in the workgroup must have the same priority.
- Host blocklist: Specifies the list of hosts that must not be used for relocating a specific virtual machine during a failover operation. For a virtual machine, you can add hosts within the host group to the blocklist based on performance and licensing preferences.
To set this option, run the following command:
ksysmgr modify vm vmname blocklist_hosts=hostname[,...] policy=add|delete
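The collocation and anticollocation rules can be illustrated with a sketch that checks a proposed placement. The function names and the vm:host notation are hypothetical and are used only for this illustration. They are not ksysmgr syntax, and the sketch does not reflect how the KSYS policy manager evaluates placements internally.

```shell
# Count the distinct hosts in a list of vm:host placements.
distinct_hosts() (
    printf '%s\n' "$@" | cut -d: -f2 | sort -u | wc -l | tr -d ' '
)

# Collocation holds when every VM in the set is placed on one host.
collocation_ok() (
    [ "$(distinct_hosts "$@")" -eq 1 ]
)

# Anticollocation holds when no two VMs in the set share a host,
# that is, when there are as many distinct hosts as VMs.
anticollocation_ok() (
    [ "$(distinct_hosts "$@")" -eq "$#" ]
)

collocation_ok vm1:hostA vm2:hostA && echo "collocation satisfied"
anticollocation_ok vm1:hostA vm2:hostB && echo "anticollocation satisfied"
```

Both checks succeed for the sample placements. A placement such as vm1:hostA vm2:hostB fails the collocation check because the two VMs are on different hosts.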
- Dependency between applications of virtual machines within a host group
- VM Recovery Manager HA supports parent-child dependency and primary-secondary dependency.
- Parent-child dependency
Defines dependency between applications that have a parent-child structure across VMs. A parent application can have multiple child applications. A child application can also have multiple parent applications. A child application can be a parent application for another child application. In this way, you can define parent-child dependency between applications for up to four levels. Command operations take effect from the parent application to the child applications that are in a parent-child application structure.
For example, if app1 is the parent application for the app2 child application, and if you run a command to shut down the app1 parent application, the app2 child application will not be monitored.
To establish dependency between the parent application and the child application, run the following command:
ksysmgr add app_dependency <name> type=<parent_child> app_list=<vm1:app1,vm2:app2[,...]>
- Primary-secondary dependency
Defines dependency between applications that have a hierarchical primary-secondary structure across VMs.
To establish dependency between the primary application and the secondary application, run the following command:
ksysmgr add app_dependency <name> type=<primary_secondary> app_list=<vm1:app1,vm2:app2>
Note: The app_list attribute must have only two vmname:appname pairs for the primary-secondary structure of applications across VMs.
Note: The App_Role value is updated to primary or secondary when the user sets a primary_secondary application dependency. The value changes from primary to secondary (or vice versa) based on the operation. If the application dependency is deleted, the value is cleared and set to an empty string ("").
- To verify a dependency that you have created, run the following command:
ksysmgr query app_dependency [name]
- To delete a dependency that you have created, run the following command:
ksysmgr delete app_dependency [name]
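Because a primary-secondary dependency accepts exactly two vmname:appname pairs, a script might check the app_list value before it calls ksysmgr. The following sketch shows one way to do that. The function name `valid_primary_secondary_list` is hypothetical, and the check covers only the shape of the value, not whether the VMs or applications exist.

```shell
# Hypothetical pre-check for a primary_secondary app_list value.
# Succeeds only when the value holds exactly two vmname:appname pairs.
valid_primary_secondary_list() (
    IFS=,
    set -- $1                       # split the value on commas
    [ "$#" -eq 2 ] || exit 1        # exactly two pairs are allowed
    for pair in "$@"; do
        case "$pair" in
            ?*:?*) ;;               # non-empty VM name and application name
            *) exit 1 ;;
        esac
    done
    exit 0
)

valid_primary_secondary_list "vm1:app1,vm2:app2" && echo "app_list has two valid pairs"
```

A value with one pair, three pairs, or a pair without an application name returns a nonzero status.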
- Fibre Channel (FC) adapter failure detection
- The KSYS subsystem monitors the status of Fibre Channel (FC) adapters. An event is generated if an adapter failure is detected. To use this feature, you must enable the ProactiveHA feature.