IBM Support

How to save a running VM after a LUN is destroyed or snapshots are corrupt

Troubleshooting


Problem

A LUN is sometimes accidentally destroyed by an ESX installation (see VMware KB 1003881, "SAN LUNs damaged after ESX installation, upgrade, or reinstall") or is otherwise damaged so that it becomes unreadable. If nothing can be done to restore that LUN and there are still powered-on VMs on that datastore, this document may help.

The Converter hot clone procedure documented here _may_ work, depending on where on the volume the damage is and where the running VMs' files are located. There is no guarantee that it will work, but it is an option in addition to restoring from backups. You need additional disk space, on a datastore other than the damaged LUN, to hold the converted VMs.

The same process can also be used if a VM has corrupted snapshots that cannot be deleted but the VM still boots. In one case a VM had three disks, one of which had to be removed before the VM would boot. The conversion saved the VM's OS disk and one data disk and, in effect, cleaned up the unrecoverable snapshots, so only one disk had to be restored from backup.

This process is also mentioned in the VMware KB article "Committing snapshots when there are no snapshot entries in the Snapshot Manager" (http://kb.vmware.com/kb/1002310), but this document describes the steps in more detail. You can ignore the title and most of that article: scroll to the "Workaround" section at the bottom for a similar alternative method.

Symptom


The VMs keep running even though the volume (file system/datastore) is destroyed because their vmdk files are located toward the end of the volume, while the ESX installation overwrites only the beginning of the volume. The longer the LUN is left in this state, the worse the corruption becomes.

Cause: installation of ESX over a SAN LUN.

Environment

VMware ESX with attached storage

Diagnosing The Problem

fdisk shows an ESX, Windows, AIX, or other OS installation instead of a VMFS partition on the affected SAN LUN,
or the LUN shows other data corruption.
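
The fdisk check can be sketched as a small script. This is a minimal illustration, not part of the original procedure: it assumes the classic MBR-style `fdisk -l` table layout, in which a VMFS partition carries partition ID `fb`, and the sample output below is made up for the example. The function names are hypothetical.

```python
VMFS_PARTITION_ID = "fb"  # MBR partition type used by VMFS volumes

def parse_fdisk(text):
    """Return (device, partition_id, system) tuples from `fdisk -l` text."""
    rows = []
    for line in text.splitlines():
        if not line.startswith("/dev/"):
            continue  # skip "Disk ..." summaries, table headers, blank lines
        fields = line.split()
        if len(fields) > 1 and fields[1] == "*":
            del fields[1]  # drop the boot-flag column when it is present
        if len(fields) < 6:
            continue  # not a complete partition row
        rows.append((fields[0], fields[4].lower(), " ".join(fields[5:])))
    return rows

def non_vmfs_partitions(text):
    """Partitions whose type is not VMFS; on a datastore LUN these are suspect."""
    return [row for row in parse_fdisk(text) if row[1] != VMFS_PARTITION_ID]

# Illustrative output only: sdb is a healthy datastore LUN, sdc was overwritten.
SAMPLE = """\
Disk /dev/sdb: 107.3 GB, 107374182400 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       13054   104856191   fb  VMware VMFS

Disk /dev/sdc: 107.3 GB, 107374182400 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1          13      104391   83  Linux
/dev/sdc2              14         535     4192965   82  Linux swap
"""

for device, part_id, system in non_vmfs_partitions(SAMPLE):
    print(f"{device}: type {part_id} ({system}) is not VMFS")
```

A LUN that used to hold a datastore and now lists Linux or other non-VMFS partitions matches the symptom described above. A LUN with no partition table at all will simply produce no rows, so an empty result is not proof that the LUN is healthy.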

Resolving The Problem

There are three ways to do this: install Converter inside the VM, use the Converter plug-in that comes with VC (assuming the customer installed Converter), or use Converter Standalone, which is the only option with vSphere 5.x.

The steps below document one specific example of the remote hot clone process using the Converter plug-in.

First, install and enable the Converter plug-in, or start Converter Standalone.

Then select an object in VC, such as a host or cluster, and select the Import Machine option.


IMPORTANT:


Select the Physical Server option, which is how you perform a hot clone (of a powered-on machine). This is the most important part of this fix. If you select the VM option, Converter tells you to power off the VM, which is not possible in a recovery situation such as this. Also, if the vmdk or vmx file is damaged, you cannot use the VM option.

In effect, this is a P2V (physical-to-virtual) conversion of the VM's guest OS image, run from inside the guest. It is the only way to try to save the VM. VMotion will not work, because anything that needs to read the disk from outside the VM will fail.



Enter the IP address or the fully qualified domain name of the VM, along with its credentials. The short host name cannot be used here.
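
When many VMs have to be rescued, the address rule can be checked up front. The following is a rough sketch only: `is_ip_or_fqdn` is a hypothetical helper, and its FQDN test is a simple heuristic (at least two non-empty dot-separated labels), not a full DNS name validation.

```python
import ipaddress

def is_ip_or_fqdn(target):
    """True if target looks like an IP address or a fully qualified name."""
    target = target.strip()
    try:
        ipaddress.ip_address(target)  # accepts IPv4 and IPv6 literals
        return True
    except ValueError:
        pass
    labels = target.rstrip(".").split(".")
    if len(labels) < 2 or any(not label for label in labels):
        return False  # short name such as "vm01", or an empty label
    if all(label.isdigit() for label in labels):
        return False  # looks like a truncated or malformed IPv4 address
    return True

# Hypothetical names, for illustration only.
for name in ("192.168.10.25", "vm01.example.com", "vm01"):
    print(name, "->", "ok" if is_ip_or_fqdn(name) else "use the IP or FQDN")
```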




Remember to turn off the Windows firewall on the VM; otherwise you get an error.
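
A quick reachability test from the machine running Converter can show whether a firewall is still blocking the VM. This sketch is not from the original procedure. The port list is an assumption (Windows file sharing on TCP 139 and 445, and the Converter agent on TCP 9089) and should be verified against the documentation for the Converter version in use; `check_tcp_ports` is a hypothetical helper.

```python
import socket

# Assumption: TCP ports commonly listed for hot cloning a powered-on
# Windows source. Confirm them for your Converter version before relying
# on this list.
ASSUMED_CONVERTER_TCP_PORTS = (139, 445, 9089)

def check_tcp_ports(host, ports=ASSUMED_CONVERTER_TCP_PORTS, timeout=3.0):
    """Return {port: True/False} for whether a TCP connection succeeded."""
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:  # refused, timed out, unreachable, or name lookup failed
            results[port] = False
    return results

# Example call (hypothetical host name):
#   check_tcp_ports("vm01.example.com")
```

Port 9089 is expected to be closed until the Converter agent has been installed on the VM, so a failure on that port alone before the first attempt does not necessarily indicate a firewall problem.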



This will not work for Windows 2000 machines, which require a reboot.



Select the desired disk size:




You must select a LUN other than the damaged one:



The time the clone takes depends on the size of the disk, the speed of the underlying storage, and how busy that storage is. Do _not_ delete any VMs from disk until all imports have completed.
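
For planning purposes, a rough lower bound on the clone time is the disk size divided by the sustained throughput. The sketch below is simple arithmetic under stated assumptions: the throughput figure is a hypothetical input that you would have to measure or estimate for your own storage, and real clones on damaged or busy storage can take considerably longer.

```python
def estimate_clone_hours(disk_size_gb, throughput_mb_per_s):
    """Rough lower bound on clone time in hours (1 GB taken as 1024 MB)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = disk_size_gb * 1024 / throughput_mb_per_s
    return seconds / 3600

# Hypothetical figures: a 100 GB disk at a sustained 50 MB/s.
print(f"{estimate_clone_hours(100, 50):.2f} hours")
```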

NOTE:

1) If a previous conversion failed, you must open Add or Remove Programs in the VM and remove the Converter Agent.

2) If all else fails, the only options are to restore from backup or to contact a third-party data recovery company.

Product: VMware Solutions (Component: ESX). Versions: 3.5, 4.0, 4.1, 5.0, 5.1, 5.5.

Document Information

Modified date:
28 January 2020

UID

isg3T1011670