Solving common problems
When you receive errors for any ksysmgr commands, the command output shows the error and the suggested resolution. However, if you are not able to determine the issue, review the following approaches to diagnose the issue.
- The discovery operation failed.
- If a problem occurs during the discovery steps, you must analyze the
ksysmgr.log file for any failures. After the discovery is complete, you can
query the IBM.VMR_LPAR resource class to confirm the successful completion of
discovery operation:
    lsrsrc IBM.VMR_LPAR
The output might be similar to the following sample:
    Name = "xxx"
    LparUuid = "59C8CFxx-4Bxx-43E2-A0CE-F028AEB5Fxxx"
    LparIPList = {}
    SiteCleanupTastList = {}
    ActiveSiteID = 80708xxxx
    LCB = { }
    BootDiskList = {}
    CecUuid = "6ce366c5-f05d-3a12-94f8-94a3fdfcxxxx"
    ErrMsg = ""
    Phase = "READY"
    PhaseDetail = 4194305
    Memory = "4352"
    Processors = "0.1"
    ActivePeerDomain = "vmdr"
In case of any errors in the discovery operation, the Phase field is set as VERIFY and the ErrMsg field indicates the error details. The Phase field is set as READY after a successful discovery operation.
- The discovery operation failed with the getlcb error.
- The cause for this error might be that the virtual machine's Fibre Channel port in the storage
area network (SAN) fabric is zoned with a storage port that does not provide any logical unit
numbers (LUNs) to the virtual machine. You can resolve this error by completing one of the following steps:
- Ensure that the virtual machine is zoned only with those storage ports that provide LUNs to the virtual machines.
- Run the cfgmgr command in the virtual machine where the getlcb failure occurred and then run the discovery operation again.
- The discovery operation failed indicating that the storage disk was already a part of an existing composite group.
- If any of the storage disks in the GDR solution are already a part of an existing composite group, the discovery operation cannot complete successfully. A storage disk in the GDR solution must be associated with a single composite group, which is asynchronously consistent. Remove the older composite groups, and run the discovery operation again.
- The discovery operation failed indicating that the disk group is created in only one site.
- Review the /var/ksys/log/ksys_srdf.log file for any consistency-enabled issue. Ensure all the disks that belong to a Remote Data Facility (RDF) group are also a part of the composite group.
- The verification phase failed.
- After the validation is complete, you can query the IBM.VMR_LPAR resource class
to ensure that the virtual machines are ready to be moved during a disaster
situation:
    lsrsrc IBM.VMR_LPAR
The output might be similar to the following sample:
    Name = "xxx"
    LparUuid = "59C8CFxx-4Bxx-43E2-A0CE-F028AEB5Fxxx"
    LparIPList = {}
    SiteCleanupTastList = {}
    ActiveSiteID = 80708xxxx
    LCB = { }
    BootDiskList = {}
    CecUuid = "6ce366c5-f05d-3a12-94f8-94a3fdfcxxxx"
    ErrMsg = ""
    Phase = "READY_TO_MOVE"
    PhaseDetail = 4194305
    Memory = "4352"
    Processors = "0.1"
    ActivePeerDomain = "vmdr"
In case of any errors in configuration validation, review the error details in the ErrMsg field. The Phase field is set as READY_TO_MOVE after a successful verification operation.
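The Phase and ErrMsg checks that are described for the discovery and verification operations can be scripted. The following is a minimal sketch, not a supported tool. It assumes that the lsrsrc output lists one attribute per line in the Attribute = "value" form. The check_lpar_phase function name is hypothetical.

```shell
#!/bin/sh
# check_lpar_phase EXPECTED_PHASE   (hypothetical helper, not part of KSYS)
# Reads lsrsrc IBM.VMR_LPAR style output on stdin, assuming one
# Attribute = "value" pair per line. Prints one line for each virtual
# machine whose Phase differs from EXPECTED_PHASE or whose ErrMsg is
# not empty. Returns 1 if any such virtual machine is found.
check_lpar_phase() {
    awk -v want="$1" '
        # Trim surrounding blanks and one pair of double quotation marks.
        function clean(s) {
            sub(/^[ \t]+/, "", s); sub(/[ \t]+$/, "", s)
            sub(/^"/, "", s);      sub(/"$/, "", s)
            return s
        }
        # Report the virtual machine that was collected so far, if it fails the check.
        function report() {
            if (name != "" && (phase != want || err != "")) {
                printf "%s: Phase=%s ErrMsg=%s\n", name, phase, err
                bad = 1
            }
        }
        {
            i = index($0, "=")
            if (i == 0) next                      # not an attribute line
            key = substr($0, 1, i - 1); gsub(/[ \t]/, "", key)
            val = clean(substr($0, i + 1))
            if (key == "Name") { report(); name = val; phase = ""; err = "" }
            else if (key == "Phase")  phase = val
            else if (key == "ErrMsg") err = val
        }
        END { report(); exit (bad ? 1 : 0) }
    '
}
```

For example, run lsrsrc IBM.VMR_LPAR | check_lpar_phase READY after a discovery operation, or pass READY_TO_MOVE after a verification operation. No output and a return code of 0 indicate that every listed virtual machine is in the expected phase.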
- The test-discovery step in the DR failover rehearsal operation is failing with the error message: Tertiary storage copy is missing.
- This error occurs when one or more of the tertiary (third copy) disks that are required for cloning the backup storage data are missing. For each backup (second copy) disk, a corresponding tertiary (third copy) disk must exist in the backup site. Check the availability and accessibility of the tertiary disks in the storage subsystem. You can also check the status of the cloning relationship by using commands that are provided by the specific storage vendor.

- The test-discovery step in the DR failover rehearsal operation is failing with the error message: Storage agent is not accessible.
- This error occurs because of a problem in communication between the KSYS subsystem and storage subsystem. Check for any hardware issues. For example, ensure proper connectivity between all the subsystems. Also, identify the issue by analyzing resource manager trace log files.
- The HMC interface indicates that the LPAR has no Resource Monitoring and Control (RMC) connection or the RMC is inactive.
- Check whether the LPAR properties also indicate an RMC issue between the HMC and VIOS. The RMC
connectivity issue can occur because of the security mode that is set in the LPAR. The security mode
for both the HMC and LPAR must be set to the same value. For example, list the security mode for LPAR by running the following command:
    /usr/sbin/rsct/bin/lssecmode
The output might look like the following sample:
    Current Security Mode Configuration
    Compliance Mode : nist_sp800_131a
    Asymmetric Key Type : rsa2048_sha256
    Symmetric Key Type : default
Similarly, list the security mode for the HMC by running the following command:
    /usr/sbin/rsct/bin/lssecmode
The output might look like the following sample:
    Current Security Mode Configuration
    Compliance Mode : none
    Asymmetric Key Type : rsa512
    Symmetric Key Type : default
In this case, the LPAR has the nist_xxx security mode enabled, but the HMC has no security mode. This mismatch can occur if another HMC was connected or a security mode was set before any reset operation was started.
- Errors occur when you register an EMC storage agent.
- Check the /var/symapi/config/netcnfg file to determine whether the configuration contains at least two EMC subsystems.
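For the RMC security mode check that is described earlier, the two lssecmode outputs can be compared with a short script. The following is a minimal sketch. It assumes that the output from the LPAR and from the HMC was saved to two files, and that the output contains a Compliance Mode line as in the samples. The function names and file paths are hypothetical.

```shell
#!/bin/sh
# compliance_mode   (hypothetical helper)
# Prints the value of the "Compliance Mode" line from lssecmode output
# that is read on stdin.
compliance_mode() {
    awk -F: '/Compliance Mode/ {
        v = $2
        gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim surrounding blanks
        print v
        exit
    }'
}

# same_security_mode LPAR_FILE HMC_FILE   (hypothetical helper)
# Compares the compliance mode in two saved lssecmode outputs.
# Returns 0 when the modes match and 1 when they differ.
same_security_mode() {
    lpar_mode=$(compliance_mode < "$1")
    hmc_mode=$(compliance_mode < "$2")
    if [ "$lpar_mode" = "$hmc_mode" ]; then
        echo "match: $lpar_mode"
    else
        echo "mismatch: LPAR=$lpar_mode HMC=$hmc_mode"
        return 1
    fi
}
```

For example, save the output of /usr/sbin/rsct/bin/lssecmode from the LPAR and from the HMC into two files of your choice, and then run same_security_mode with those two file names. The sketch compares only the compliance mode and does not compare the key types.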
- You want to view all the storage disks in the active site and the backup site.
- Run the following command to list all the storage disks:
    # symcfg list
The output might be similar to the following sample:
                              S Y M M E T R I X

                                     Mcode    Cache      Num Phys  Num Symm
    SymmID        Attachment  Model     Version  Size (MB)  Devices   Devices

    000196800508  Local       VMAX100K  5977     217088     65        9964
    000194901326  Remote      VMAX-1SE  5876     28672      0         904
    000196800573  Remote      VMAX100K  5977     217088     0         7275
    000198701861  Remote      VMAX10K   5876     59392      0         255
- The output of ksysmgr query command is not updated for hosts, HMCs, and VIOS even when the entities are updated.
- Sometimes, the ksysmgr query command displays static information even when
the hosts or HMCs are modified. To update the ksysmgr query command output
dynamically, complete the following steps:
- Unpair the hosts across the sites by using the following command:
    ksysmgr pair host host_name pair=none
- Remove the hosts from the site by using the following command:
    ksysmgr remove host host_name
- Add the new HMC to the site by using the following command:
    ksysmgr add hmc hmc_name login=login_name password=login_password ip=ip_address site=site_name
- Add the corresponding managed hosts to the site by using the following command:
    ksysmgr add host host_name
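The preceding steps can be chained in a small wrapper so that they always run in the same order. The following is a minimal sketch that uses only the four commands shown in the steps. The refresh_host and run function names and the DRY_RUN variable are hypothetical. When DRY_RUN is set to 1, the commands are printed instead of being run.

```shell
#!/bin/sh
# run CMD [ARGS...]   (hypothetical helper)
# Prints the command when DRY_RUN=1; otherwise runs it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# refresh_host HOST HMC LOGIN PASSWORD IP SITE   (hypothetical helper)
# Unpairs and removes the host, adds the HMC, and adds the host again.
# Stops at the first command that fails.
refresh_host() {
    host=$1
    hmc=$2
    login=$3
    password=$4
    ip=$5
    site=$6
    run ksysmgr pair host "$host" pair=none &&
    run ksysmgr remove host "$host" &&
    run ksysmgr add hmc "$hmc" login="$login" password="$password" ip="$ip" site="$site" &&
    run ksysmgr add host "$host"
}
```

Set DRY_RUN=1 first to review the generated commands before you run them against a KSYS node. The password is passed as a command-line argument, as in the steps, so it might be visible to other users of the system while the command runs.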
- The disaster recovery move operation failed.
- When a disaster occurs and you initiate a move operation, the KSYS subsystem coordinates the start of the virtual machines on the backup site. During this process, LPAR profiles are created through HMCs on the backup site hosts. If an error during the LPAR profile creation prevents the KSYS subsystem from communicating with the HMCs, the move operation might fail. In that case, any virtual machines that are partially created on the HMC require a manual restart. For the rest of the virtual machines, you can use the ksysmgr recover command to recover and start the virtual machines.
- An unplanned move operation from the active site to the backup site is successfully completed and the specified flexible capacity policy is followed. Later, another unplanned move operation from the backup site to the source site fails. When the virtual machines are recovered to the source site, the virtual machines are started but without any change in the processor and memory values (that is, without following the flexible capacity policy).
- This situation can happen when the active hosts and backup hosts are connected to the same HMC. You must connect the source hosts and target hosts to different HMCs to continue unplanned move operations in case of source HMC failures.