IBM Support

VMware ESX Datastore (VMFS) data corruption

Preventive Service Planning


Abstract


VMFS data corruption is rare but possible. This document does not cover every possible VMFS corruption scenario, but it does cover some of the more common known issues. Contact your VMware and storage support immediately if you suspect corruption.

1)
Do not present SAN LUNs that hold production VMFS datastores to a server running a non-ESX operating system unless this is explicitly required for VMware Consolidated Backup (VCB). Occasionally a volume is corrupted but the corruption is not discovered until after all ESX hosts have been rebooted, which can make analysis difficult.

2)
Never install or upgrade an ESXi host while shared SAS direct-attached storage that presents multiple LUNs with LUN ID 0 is still attached through the shared SAS controllers. Best practice is to remove access to the shared storage first so that VM data is not affected inadvertently. It is also a best practice to remove access to Fibre Channel based shared storage during an install to be safe.

3)
Faulty hardware on the SAN or faulty hard drives can occasionally cause VMFS data corruption. This is rare, but it is possible.

4)
Some corruption may not be related to VMFS at all. There can also be .vmx file corruption or metadata corruption that affects VMware ESXi hosts and VMs but not the entire datastore.

See the documents below and contact your VMware and Storage support immediately if corruption is suspected.


Content

Depending on the situation, the options may be to restore from backups, to contact a third-party data recovery service, or, if any VMs are still up and running, to do a hot conversion. Contact your VMware and storage support immediately if corruption is suspected. More information is available in the documents below:


Making ESX-controlled storage available to Windows servers to facilitate backups
http://kb.vmware.com/kb/2002227

SAN LUNs damaged after ESX installation, upgrade, or reinstall
http://kb.vmware.com/kb/1003881

See the workaround section at the bottom of the KB article below for more information on doing a hot clone to save a running VM on a corrupted datastore. Do not reboot the VM; use the vCenter Converter instead:

Committing snapshots when there are no snapshot entries in the snapshot manager
http://kb.vmware.com/kb/1002310

VMFS datastores or their contents may become inaccessible
http://kb.vmware.com/kb/1028970

Physical drive failure on a storage array
http://kb.vmware.com/kb/1003499

Example of Windows overwriting the partition table
http://kb.vmware.com/kb/1002168

Best practices for VMware ESX or ESXi when scheduling SAN downtime
http://kb.vmware.com/kb/1002777

If you suspect that corruption has occurred again, follow the steps in these documents to collect the data as soon as possible:

Collecting and applying raw metadata dumps on VMFS volumes using dd
http://kb.vmware.com/kb/1020645

Rescuing a running virtual machine with dd when datastore metadata is inaccessible
http://kb.vmware.com/kb/1007499
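
The two articles above both rely on dd to copy raw data off a device. As a rough illustration only, the sketch below copies the first part of a source device into a dump file. The function name dump_head and its three arguments are hypothetical and are not taken from the KB articles. The correct device path, partition and dump size must come from the KB article or from VMware support. It is assumed here that the dump should be written to a location that is not on the affected datastore.

```shell
#!/bin/sh
# Minimal sketch, not the KB procedure: copy the first COUNT_MB mebibytes
# of a source device (or file) into a dump file using dd.
# dump_head and its arguments are hypothetical names for illustration.
dump_head() {
    src=$1        # device or file to read from (opened read-only by dd)
    dest=$2       # dump file to create; assumed to live on other storage
    count_mb=$3   # number of 1 MiB blocks to copy

    # Refuse to overwrite an existing dump so earlier evidence is kept.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi

    # bs is given in bytes (1048576 = 1 MiB) for portability across dd builds.
    dd if="$src" of="$dest" bs=1048576 count="$count_mb" 2>/dev/null
}
```

On an ESXi host the source would be a device under /vmfs/devices/disks/, for example dump_head /vmfs/devices/disks/<device>:<partition> <path on other storage> <size in MB>, with every value in angle brackets supplied by support.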

Document Information

More support for:
VMware Solutions

Software version:
2.5, 3.0, 3.5, 4.0, 4.1, 5.0, 5.1, 5.5

Operating system(s):
VMware

Document number:
670761

Modified date:
28 January 2020

UID

isg3T1012488
