VM Recovery Manager HA overview

High availability (HA) management is a critical part of business continuity plans. Any downtime in the software stack can result in lost revenue and disrupted services. IBM® VM Recovery Manager HA for Power Systems is a high availability solution that is easy to deploy and that automates the recovery of virtual machines (VMs), also known as logical partitions (LPARs).

The VM Recovery Manager HA solution implements recovery of the virtual machines based on the VM restart technology. The VM restart technology relies on an out-of-band monitoring and management component that restarts the VMs on another server when the host infrastructure fails. The VM restart technology is different from the conventional cluster-based technology that deploys redundant hardware and software components for a near real-time failover operation when a component fails.

The VM Recovery Manager HA solution is well suited to ensuring high availability for many VMs. Additionally, the VM Recovery Manager HA solution is easier to manage than a cluster-based solution because it does not have clustering complexities.

The following figure shows the architecture of the VM Recovery Manager HA solution. A set of hosts is grouped so that the hosts act as backups for each other. When a failure is detected, the affected VMs are relocated and restarted on other healthy hosts within the group.

Figure 1. VM Recovery Manager HA solution architecture
The VM Recovery Manager HA solution provides the following capabilities:
Host health monitoring
The VM Recovery Manager HA solution monitors hosts for any failures. If a host fails, the virtual machines in the failed host are automatically restarted on other hosts. The VM Recovery Manager HA solution uses the host monitor module of the VIOS partition in a host to monitor the health of hosts.
VM and application health monitoring
The VM Recovery Manager HA solution monitors the virtual machines, their registered applications, and their hosts for any failures. If a virtual machine or a critical application fails, the corresponding virtual machines are restarted automatically on other hosts. To monitor the health of virtual machines and registered applications, the VM Recovery Manager HA solution uses the VM monitor agent, which must be installed in each virtual machine.
Unplanned HA management
During an unplanned outage, when the VM Recovery Manager HA solution detects a failure in the environment, the virtual machines are restarted automatically on other hosts. You can also change the auto-restart policy to advisory mode. In advisory mode, failed VMs are not relocated automatically. Instead, email or text messages are sent to the administrator, who can then use the interfaces to restart the VMs manually.
Planned HA management
During a planned outage, such as a firmware update for a host, you can use the Live Partition Mobility operation of the VM Recovery Manager HA solution to vacate the host by moving all of its VMs to the remaining hosts in the group. After the update is complete, you can use the VM Recovery Manager HA solution to restore the VMs to their original host in a single operation.
Advanced HA policies
The VM Recovery Manager HA solution provides advanced policies that define relationships between VMs, such as collocation and anti-collocation of VMs, the priority in which the VMs are restarted, and the capacity of VMs during failover operations.
GUI and command-line based management
You can use the GUI or the command-line interface to manage the resources in the VM Recovery Manager HA solution. For the GUI, you install the UI server and then use a web browser to manage the resources. Alternatively, the ksysmgr command and the ksysvmmgr command on the KSYS LPAR provide end-to-end HA management for all resources.
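
To make the interaction between advisory mode, restart priority, and anti-collocation more concrete, the following Python sketch plans restarts for the VMs of a failed host. It is an illustration only and is not the product's implementation or API: the names VM, Host, plan_restarts, and free_slots are hypothetical, the "higher priority value restarts first" convention is an assumption, and capacity is reduced to a simple slot count.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    host: str
    priority: int = 0               # assumed convention: higher value restarts first
    avoid: frozenset = frozenset()  # anti-collocation: VM names that must not share a host


@dataclass
class Host:
    name: str
    healthy: bool = True
    free_slots: int = 0             # simplified stand-in for CPU and memory capacity


def plan_restarts(vms, hosts, failed_host, advisory=False):
    """Return (plan, notices): plan maps VM name to target host name."""
    by_name = {vm.name: vm for vm in vms}
    # Current placement of the VMs that are unaffected by the failure.
    placement = {vm.name: vm.host for vm in vms if vm.host != failed_host}
    # VMs of the failed host, highest priority first.
    failed = sorted((vm for vm in vms if vm.host == failed_host),
                    key=lambda vm: -vm.priority)

    if advisory:
        # Advisory mode: nothing is relocated, the administrator is only notified.
        return {}, [f"advisory: {vm.name} needs a manual restart" for vm in failed]

    free = {h.name: h.free_slots
            for h in hosts if h.healthy and h.name != failed_host}
    plan, notices = {}, []
    for vm in failed:
        target = None
        for host_name in sorted(free):
            if free[host_name] <= 0:
                continue
            residents = [n for n, h in placement.items() if h == host_name]
            # Honor anti-collocation declared on either VM of a pair.
            if any(r in vm.avoid or vm.name in by_name[r].avoid for r in residents):
                continue
            target = host_name
            break
        if target is None:
            notices.append(f"no eligible host for {vm.name}")
            continue
        plan[vm.name] = target
        placement[vm.name] = target
        free[target] -= 1
    return plan, notices


hosts = [Host("hostA", healthy=False), Host("hostB", free_slots=1), Host("hostC", free_slots=2)]
vms = [
    VM("db1", "hostA", priority=10, avoid=frozenset({"db2"})),
    VM("db2", "hostB"),
    VM("web1", "hostA", priority=5),
]
print(plan_restarts(vms, hosts, "hostA"))
print(plan_restarts(vms, hosts, "hostA", advisory=True))
```

In this sketch, db1 is placed before web1 because of its higher priority, and it skips hostB because db2 already runs there. With advisory=True the plan stays empty and only notices are produced, mirroring the manual restart path described for advisory mode.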