Troubleshooting

Failures are categorized into stages, Stage 0 through Stage 4, based on when they occur during workload migration. Use the following guidelines to handle failures in each stage.

Stage 0

Stage 0 is the initial stage, in which the migration has not yet started. Most compatibility issues that relate to the source and target systems, as well as input payload issues, can be captured and validated at this stage. If validation errors occur, the migration stops and a precise reason is displayed in the response of the REST call. If no message is displayed, check the /var/log/purescale/ipas.server/trace.log file on the target system.
Note:
  • This stage can be turned off by setting the premigration_validate flag to false.
  • By default, validation is turned on.
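
When the REST response carries no message, the trace log mentioned above can be long. The following is a minimal sketch, not an IBM-provided tool, of how the log might be filtered for likely failure lines. The log path comes from the text above; the helper names and the marker strings are assumptions chosen for illustration, so adjust them to what your log actually contains.

```python
from pathlib import Path
from typing import Iterable, List, Sequence

# Path quoted in the text above; it exists only on the target system.
TRACE_LOG = Path("/var/log/purescale/ipas.server/trace.log")

# Assumed markers (hypothetical defaults, not an official list).
DEFAULT_MARKERS = ("ERROR", "Exception", "workload_migrations")


def find_suspect_lines(
    lines: Iterable[str], markers: Sequence[str] = DEFAULT_MARKERS
) -> List[str]:
    """Return the lines that contain at least one marker (case-sensitive)."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(marker in line for marker in markers)
    ]


def scan_trace_log(path: Path = TRACE_LOG, tail: int = 50) -> List[str]:
    """Read the log file and return the last `tail` suspect lines."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        return find_suspect_lines(handle)[-tail:]
```

On the target system, calling `scan_trace_log()` and printing each returned line gives a short list to review before you open the full file.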

Stage 1: Creating request metadata during the instance deployment flow on the target system

In this stage, IBM® Cloud Pak System creates dummy metadata in the database and allocates resources on the target before the workload migration is triggered. A failure occurs if the IBM Cloud Pak System flow is unable to complete any of the following example actions:
  • Create a dummy request to deploy the virtual machine.
  • Insert the data and allocate resources.
  • Add IP address.
  • Use data stores.
Resolution: Because this is an initial stage, all partially generated data can be deleted from the target IBM Cloud Pak System by using the user interface. If the data is not deleted automatically, delete it manually and trigger a fresh migration. To delete the data, complete the following steps:
  1. Go to IBM Workload Deployer > Virtual System Instance.
  2. Select your instance and click the delete icon in its details page.

Stage 2

  • After the initial stages of workload migration complete successfully, the Workload create job is triggered. If errors occur in the API, contact IBM support. VMware® vCenter usually shows the error and status, and the vCenter user interface also shows the VMware® jobs. If the triggered migration fails midway or is only partially complete, the VMware® support team can confirm whether and how it can be recovered.
  • If an instance fails at this stage, the Instance section of the user interface remains in Registering state.
  • If the source and target vCenters have the same instance UUID, the following error might appear because of an error in the VMware® migration utility:
    ERROR: Client received SOAP Fault from server: The object 'vim.ResourcePool:resgroup-6740' has already been deleted or has not been completely created Please see the server log to find more detail regarding exact cause of the failure.
    To verify, see https://<VCIP address>/mob/?moid=ServiceInstance&doPath=content%2eabout
  • If the create job of workload migration fails with the following error, ensure that the compute nodes that are associated with the cloud group of the Virtual System Instance have an IPv4 address assigned, as mentioned in the Prerequisites section:
    pooljvm.1622736025648.13239 [06-03-21 16:03:51] 0073 workload_migrations.workload_migrations_create | java.lang.NullPointerException: Cannot get property 'ip' on null object
    workload_migrations.workload_migrations_rack_helper.getHostNameFromRackSystem(workload_migrations_rack_helper.groovy:1493)
    workload_migrations.workload_migrations_rack_helper.rack2rack_migration_helper_x(workload_migrations_rack_helper.groovy:1064)
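
Two of the conditions above, matching vCenter instance UUIDs and compute nodes without an IPv4 address, can be checked before a retry. The following is a minimal sketch under stated assumptions: the URL pattern is the one quoted above, but the function names and the shape of the input data are hypothetical, and the UUID values are expected to be copied by hand from the page at that URL on each vCenter (the value is typically listed there as instanceUuid).

```python
import ipaddress
from typing import Dict, List, Optional


def mob_about_url(vcenter_address: str) -> str:
    """Build the Managed Object Browser URL quoted in the text above."""
    return (
        f"https://{vcenter_address}"
        "/mob/?moid=ServiceInstance&doPath=content%2eabout"
    )


def same_instance_uuid(source_uuid: str, target_uuid: str) -> bool:
    """Compare two UUID strings, ignoring case and surrounding whitespace."""
    return source_uuid.strip().lower() == target_uuid.strip().lower()


def nodes_missing_ipv4(nodes: Dict[str, Optional[str]]) -> List[str]:
    """Return names of compute nodes whose address is absent or not IPv4.

    `nodes` maps a node name to its address string, or None if unset.
    This mapping is a hypothetical input that you assemble yourself.
    """
    missing = []
    for name, address in nodes.items():
        try:
            ipaddress.IPv4Address(address or "")
        except ValueError:
            # Raised for empty strings, IPv6 literals, and malformed input.
            missing.append(name)
    return sorted(missing)
```

If `same_instance_uuid` returns True for the two vCenters, the SOAP fault described above is a likely outcome. If `nodes_missing_ipv4` returns any names, those nodes are candidates for the NullPointerException shown in the log excerpt.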

Stage 3: Post deployment failure

After stage 2 of the workload migration is complete, the virtual machines are moved to the target vCenter. However, failures can still occur in later IBM Cloud Pak System jobs during post-migration updates.

Resolution: Some post-migration steps run on the virtual machine to switch its data from source to target, including the metadata update of the target. The process then validates that all the data is correct and finally deletes the source instance entries from the source IBM Cloud Pak System. If a failure occurs in these steps, you must manually investigate the root cause and identify possible solutions on the target IBM Cloud Pak System.

Stage 4

Complete the following steps on the target system if the Microsoft Windows machine migration remains in the Launching state for an hour or more:
  1. Log in to the Windows virtual machine.
  2. At the Windows command prompt, run the following command:
     C:\IBM\maestro\maestro.deployment.ui\zero stop
  3. Click Start > Programs > IBM Tivoli Monitoring > Manage Tivoli Monitoring Services.
  4. Right-click Monitoring Agent for Windows OS and select Stop.
  5. Right-click Monitoring Agent for Workloads and select Stop.
  6. Open Task Manager and end the following processes:
    • Right-click the two Python processes individually and click End task.
    • Right-click the two IBM Java processes individually and click End task. To verify the process file location, right-click the process and click Open file location.
  7. Take a backup of the C:\IBM\maestro\agent folder.
  8. Delete the following folders:
    • C:\0config\itlm\foundation
    • C:\0config\safemode
    • C:\IBM\maestro\agent\safemode
  9. Duplicate the file C:\0config\vm_is_installed and name the copy "update". After you duplicate the file, the C:\0config directory must contain both files: vm_is_installed and update.
  10. Restart the virtual system instance (VSI) from the IBM Cloud Pak System Software user interface. Navigate to the Virtual System Instance page, locate the specific virtual system instance, and click Stop and Start in the right panel. Ensure that the status of the virtual system instance displays as "STOPPED" before you start the instance.
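
The file operations in steps 7 to 9 can be sketched as a short script for cases where the procedure must be repeated on several virtual machines. This is an illustrative sketch, not an IBM-provided utility: the function name, the `drive_root` parameter, and the backup location are assumptions. It deletes folders, so run it only after the services and processes in steps 2 to 6 are stopped.

```python
import shutil
from pathlib import Path


def reset_agent_state(drive_root: Path, backup_dir: Path) -> None:
    """Apply steps 7 to 9 under drive_root (for example, Path("C:/")).

    backup_dir must not exist yet; copytree raises an error otherwise,
    which prevents an earlier backup from being overwritten.
    """
    agent = drive_root / "IBM" / "maestro" / "agent"
    config = drive_root / "0config"

    # Step 7: back up the agent folder before anything is deleted.
    shutil.copytree(agent, backup_dir)

    # Step 8: delete the three folders named in the procedure.
    for folder in (
        config / "itlm" / "foundation",
        config / "safemode",
        agent / "safemode",
    ):
        if folder.exists():
            shutil.rmtree(folder)

    # Step 9: duplicate vm_is_installed as "update" in the same directory.
    shutil.copy2(config / "vm_is_installed", config / "update")
```

A call such as `reset_agent_state(Path("C:/"), Path("C:/agent_backup"))` would cover the three steps, where the backup path is a hypothetical choice. Step 10, the restart from the user interface, still has to be done manually.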