Recovering hosts, VMs, and applications
After you configure the resources and policies in the KSYS subsystem, the KSYS continues to monitor the environment for any failures or issues. When any planned or unplanned outage occurs, the KSYS subsystem restarts the virtual machines on another host based on the specified policies.
The KSYS subsystem can be configured to monitor the hosts, virtual machines (VMs), and applications to perform recovery operations. By default, all VMs in the host group are managed. If you unmanage one or more VMs and run the discovery operation, the KSYS subsystem does not monitor and manage those VMs for high availability. Similarly, the KSYS subsystem does not monitor the specified resources for high availability if you disable HA monitoring for the entire system or a host group. However, you can monitor the health of VMs and applications only if you install the VM agent and enable HA monitoring in each VM that requires it.
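The monitoring rules in the preceding paragraph can be summarized as a small decision function. The following Python sketch is only an illustrative model of those rules: the `VmState` fields and the `monitoring_scope` function are hypothetical names and are not part of the KSYS subsystem or the ksysmgr command.

```python
from dataclasses import dataclass


@dataclass
class VmState:
    # All field names are hypothetical; they only mirror the conditions in the text.
    managed: bool = True                # VMs in a host group are managed by default
    system_ha_enabled: bool = True      # HA monitoring for the entire system
    host_group_ha_enabled: bool = True  # HA monitoring for the host group
    vm_agent_installed: bool = False    # VM agent is installed in the VM
    vm_ha_enabled: bool = False         # HA monitoring is enabled in the VM


def monitoring_scope(vm: VmState) -> str:
    """Return which failures can be monitored for this VM, per the rules above.

    Assumes the discovery operation has already run, so that an unmanaged
    VM is no longer monitored.
    """
    if not (vm.managed and vm.system_ha_enabled and vm.host_group_ha_enabled):
        return "none"
    if vm.vm_agent_installed and vm.vm_ha_enabled:
        return "host, VM, and application"
    return "host"
```

For example, a managed VM without the VM agent falls into the "host" scope in this model, which matches the statement that VM and application health is monitored only when the VM agent is installed and HA monitoring is enabled in the VM.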
Recovering virtual machines in an unplanned outage
- Automatic restart of virtual machines
- When a host, VM, or critical application fails and the restart_policy
attribute is set to auto, the KSYS subsystem restarts the virtual machines
automatically on other hosts. The KSYS notifies you about the events; you do not have to take any
actions.
However, if the KSYS subsystem cannot successfully stop the VMs in the source host, the VMs are not restarted automatically. Also, if the KSYS subsystem identifies a problem but cannot determine its cause, the VMs are not restarted on other hosts automatically, to avoid an unnecessary outage caused by false failure detection. In both cases, the KSYS subsystem notifies you about the problem. You must review the notified problem and then manually restart the VMs, if necessary.
- Manual recovery of virtual machines
- When a host, VM, or critical application fails and the restart_policy attribute is set to advisory_mode, the KSYS notifies you about the issue. You can review the issue and manually restart the virtual machines on other hosts.
If you have configured the VM agent in each of your virtual machines, the KSYS notifies you when a virtual machine or a registered critical application fails or stops working correctly. In such cases, you can also restart the virtual machines on other hosts based on the specified policies.
- Restart specific virtual machines or all virtual machines in a host by running one of the following
commands:
ksysmgr [-f] restart vm vmname1[,vmname2,...] [to=hostname|uuid]
Or,
ksysmgr [-f] restart host hostname|uuid [to=hostname|uuid]
After the virtual machines are restarted successfully, the KSYS subsystem automatically cleans the VMs on the source host and on the HMC by removing the LPAR profile from the HMC.
- If the output of the ksysmgr restart command indicates cleanup errors, clean
up the VM details manually in the source host by running the following
command:
ksysmgr cleanup vm vmname host=hostname
- If the restart operations fail, recover the virtual machine in the same host where it is located
currently by running the following
command:
ksysmgr [-f] recover vm vmname
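The manual recovery steps above can be sketched as a set of helper functions that assemble the ksysmgr argument lists and choose the follow-up command. This is a minimal sketch, not a supported tool: the Python function names are hypothetical, the sketch only builds the commands and does not run them, and it uses only the command syntax shown in this topic.

```python
from typing import List, Optional


def _base(force: bool) -> List[str]:
    # "-f" is shown as optional in the command syntax in this topic.
    return ["ksysmgr", "-f"] if force else ["ksysmgr"]


def restart_vm_cmd(vm_names: List[str], to: Optional[str] = None,
                   force: bool = False) -> List[str]:
    """ksysmgr [-f] restart vm vmname1[,vmname2,...] [to=hostname|uuid]"""
    if not vm_names:
        raise ValueError("at least one VM name is required")
    argv = _base(force) + ["restart", "vm", ",".join(vm_names)]
    if to:
        argv.append(f"to={to}")
    return argv


def restart_host_cmd(host: str, to: Optional[str] = None,
                     force: bool = False) -> List[str]:
    """ksysmgr [-f] restart host hostname|uuid [to=hostname|uuid]"""
    argv = _base(force) + ["restart", "host", host]
    if to:
        argv.append(f"to={to}")
    return argv


def cleanup_vm_cmd(vm_name: str, source_host: str) -> List[str]:
    """ksysmgr cleanup vm vmname host=hostname"""
    return ["ksysmgr", "cleanup", "vm", vm_name, f"host={source_host}"]


def recover_vm_cmd(vm_name: str, force: bool = False) -> List[str]:
    """ksysmgr [-f] recover vm vmname"""
    return _base(force) + ["recover", "vm", vm_name]


def follow_up_cmd(vm_name: str, source_host: str, restart_succeeded: bool,
                  cleanup_errors: bool) -> Optional[List[str]]:
    """Pick the follow-up command described above, or None if none is needed."""
    if not restart_succeeded:
        # Restart failed: recover the VM in the host where it is currently located.
        return recover_vm_cmd(vm_name)
    if cleanup_errors:
        # Restart worked, but the source host still needs a manual cleanup.
        return cleanup_vm_cmd(vm_name, source_host)
    return None
```

Each function returns an argument list that could be passed to a process launcher on the KSYS node. Whether the restart succeeded and whether cleanup errors occurred must still be determined by reviewing the output of the ksysmgr restart command.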
Planned migration of virtual machines
The KSYS subsystem uses the HMC-provided Live Partition Mobility (LPM) capability to support planned HA management. You can also use the HMC to perform LPM operations; the KSYS adapts to the movement of the VMs within the host group as part of its regular discovery operation. If you plan a host maintenance or an upgrade operation, you can move all the virtual machines to another host by using the LPM operation, and restore the virtual machines to the original host after the maintenance or the upgrade operation is complete. You can also use LPM validation to test whether the virtual machines can be moved to another host successfully, without actually moving them. This validation helps you avoid errors that might occur during the relocation of virtual machines.
- Validate the LPM operation without migrating the virtual machines by running one of the
following commands:
- To validate the LPM operation for specific virtual machines, run the following
command:
ksysmgr [-f] lpm vm vmname1[,vmname2,..] action=validate
- To validate the LPM operation for all virtual machines in a specific host, run the following
command:
ksysmgr [-f] lpm host hostname|uuid action=validate
- Migrate the virtual machines from the source host to another host by running one of the
following commands:
- To migrate specific virtual machines, run the following
command:
ksysmgr [-f] lpm vm vmname1[,vmname2,..] [to=hostname|uuid]
- To migrate all virtual machines from a specific host, run the following
command:
ksysmgr [-f] lpm host hostname|uuid [to=hostname|uuid]
If you have HMC Version 9 Release 9.3.0, or later, you can view the LPM progress as a percentage value.
- After each LPM operation, update the LPM validation state by running the discovery and verify
operations with the following
command:
ksysmgr discover host_group hg_name verify=true
- After the maintenance or upgrade activities are complete in the source host, restore all virtual
machines by running the following
command:
ksysmgr restore host hostname|uuid
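The planned maintenance steps above can be sketched as an ordered list of commands for a single host. This is a minimal sketch under stated assumptions: the `maintenance_plan` function is a hypothetical helper, and `hostA`, `hostB`, and `hg1` are placeholder names. The sketch only assembles the commands in the order in which they are listed in this topic; it does not run them or check their output.

```python
from typing import List, Optional


def maintenance_plan(host: str, host_group: str,
                     target: Optional[str] = None) -> List[List[str]]:
    """Return the ksysmgr commands for a planned host maintenance, in order."""
    migrate = ["ksysmgr", "lpm", "host", host]
    if target:
        # The "to=" attribute is optional in the syntax shown in this topic.
        migrate.append(f"to={target}")
    return [
        # 1. Validate the LPM operation without migrating the VMs.
        ["ksysmgr", "lpm", "host", host, "action=validate"],
        # 2. Migrate all VMs from the host.
        migrate,
        # 3. Run discovery and verify to update the LPM validation state.
        ["ksysmgr", "discover", "host_group", host_group, "verify=true"],
        # 4. After the maintenance is complete, restore the VMs to the host.
        ["ksysmgr", "restore", "host", host],
    ]


if __name__ == "__main__":
    # Placeholder host and host group names, for illustration only.
    for argv in maintenance_plan("hostA", "hg1", target="hostB"):
        print(" ".join(argv))
```

The maintenance or upgrade work itself takes place between steps 3 and 4, so in practice the restore command is issued only after that work is finished.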