While creating a virtual machine snapshot in vSphere 5 or 5.1 you may get a “disk is full” error like this one:
The operation on the file "/vmfs/devices/multiextent/3ec865dc-xxxx.vmdk" failed (The file is too large). The file system where disk "/vmfs/devices/multiextent/3ec865dc-xxxx.vmdk" resides is full.
Select Retry to attempt the operation again.
Select Cancel to end the session.
Retry is useless because the error simply happens again, so the only option left is Cancel. When you choose it, you may get this other scary message:
VMware ESX cannot synchronize with the disk before canceling. Disk /vmfs/devices/multiextent/3ec865dc-xxxx.vmdk may be inconsistent.
If your VM is powered on, it will be shut off and you will not be able to start it again.
Deleting the snapshot you have just made is usually enough to be able to start the virtual machine again.
VMware documentation (KB1002929) and other internet sites say the root cause of the problem is that the virtual machine's working directory points to the wrong place in its configuration file (or, according to some of those sites, that no working directory is set at all). This may happen if you have moved the VM, for example. To fix it you only have to edit the VMX file and set the right path. Having no path at all should be fine, since that is the default configuration and all my VMs work fine with it. Then you need to remove the virtual machine from the Inventory and add it back again so that the system re-reads the configuration.
Caution: use Remove from Inventory, NOT Delete from Disk.
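The working directory check can be sketched in a few lines of Python. This is only an illustration, assuming the VMX file stores the setting as a plain workingDir = "path" line. The function names and the sample paths used with them are hypothetical, and the sketch works on the file contents as text, so it never touches a real VM.

```python
import re

# Matches a VMX line of the form: workingDir = "some/path"
# Case is ignored as a precaution.
_WORKING_DIR = re.compile(r'^\s*workingDir\s*=\s*"(.*)"\s*$', re.IGNORECASE)


def get_working_dir(vmx_text):
    """Return the workingDir value found in the VMX text, or None if unset."""
    for line in vmx_text.splitlines():
        match = _WORKING_DIR.match(line)
        if match:
            return match.group(1)
    return None


def set_working_dir(vmx_text, path=None):
    """Return a copy of the VMX text with workingDir set to `path`.

    With path=None the entry is dropped, which leaves the default setup
    (no working directory configured).
    """
    kept = [line for line in vmx_text.splitlines()
            if not _WORKING_DIR.match(line)]
    if path is not None:
        kept.append('workingDir = "%s"' % path)
    return "\n".join(kept) + "\n"
```

For example, get_working_dir() on the contents of a moved VM's VMX file would show the stale path, and set_working_dir(text) with no path would return the contents without the entry. Work on a backup copy of the VMX file, and remember that the VM still has to be removed from the Inventory and added back afterwards.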
But there is another reason for this error that is not so commonly documented on the internet (or at least I have not found it there), although it is documented in the VMware knowledge base. This one was the culprit in my case.
When virtualizing physical equipment it is common to use the vCenter Converter Standalone software. In my case it was the latest version at the time, version 5. Details like the disk controller type are replicated by the conversion tool and carried over to the resulting virtual machine. If you are virtualizing an old server, it is possible that both the controller and the disks are of IDE type. If you are not careful to select a different controller type for the virtual machine when defining the conversion, the converter will choose it for you and will use IDE disks, as in the source. What's more, if the disks are big enough, chances are high that the converter will create a sparse volume for them. This is not a problem by itself, as the resulting virtual machine is able to start and work fine with this configuration. Not with the highest performance, but it works.
The problem arises when you take the first snapshot while the virtual machine is up and running. Now all the elements indicated in the VMware KB come together to trigger the problem.
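The risky combination described above (disks attached to an IDE controller plus a sparse disk format) can be looked for before taking that first snapshot. The following Python sketch is only an illustration. It assumes the VMX file lists disks as lines like ide0:0.fileName = "disk.vmdk" and that the VMDK descriptor file carries a createType line such as createType="twoGbMaxExtentSparse". The function names are hypothetical.

```python
import re

# ide0:0.fileName = "disk.vmdk" -> a virtual disk on an IDE controller.
# CD-ROM images (.iso) on the IDE bus do not match because of the extension.
_IDE_DISK = re.compile(
    r'^\s*(ide\d+:\d+)\.fileName\s*=\s*"(.+\.vmdk)"\s*$', re.IGNORECASE)

# createType="twoGbMaxExtentSparse" in the VMDK descriptor file.
_CREATE_TYPE = re.compile(r'^\s*createType\s*=\s*"(.*)"\s*$', re.IGNORECASE)


def ide_disks(vmx_text):
    """Return {device: vmdk name} for every .vmdk attached to an IDE controller."""
    found = {}
    for line in vmx_text.splitlines():
        match = _IDE_DISK.match(line)
        if match:
            found[match.group(1).lower()] = match.group(2)
    return found


def create_type(descriptor_text):
    """Return the createType value of a VMDK descriptor, or None if absent."""
    for line in descriptor_text.splitlines():
        match = _CREATE_TYPE.match(line)
        if match:
            return match.group(1)
    return None


def is_sparse(descriptor_text):
    """True if the descriptor's createType names a sparse format.

    Assumption: any createType containing "sparse" counts. Snapshot delta
    descriptors would probably match too, so pass base disk descriptors only.
    """
    kind = create_type(descriptor_text)
    return kind is not None and "sparse" in kind.lower()
```

A VM for which ide_disks() returns something and is_sparse() is true for one of those disks would be a candidate for the failure described here. The check is a heuristic under the stated assumptions, not a reproduction of the conditions listed in the KB article.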
As before, deleting the snapshot should bring the virtual machine back to normal. That solution falls short, though, because we can no longer protect the machine while it is live: we cannot take snapshots while it is powered on.
The solution in this case is to change the controller type and/or convert the disks from sparse to another format such as thin or thick, using the vmkfstools tool to clone your disks. Then, as before, you will have to remove the virtual machine from the inventory and add it back so that the changes are read.
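As a rough sketch of the cloning step, the Python helper below only builds the vmkfstools command line and does not run anything. It assumes the usual clone form, vmkfstools -i source target -d format, with thin, zeroedthick or eagerzeroedthick as the target format. The helper name and any paths passed to it are hypothetical.

```python
import shlex

# Target formats this sketch allows for "vmkfstools -d".
ALLOWED_FORMATS = ("thin", "zeroedthick", "eagerzeroedthick")


def clone_command(source_vmdk, target_vmdk, disk_format="thin"):
    """Build the argument list that clones source_vmdk into target_vmdk."""
    if disk_format not in ALLOWED_FORMATS:
        raise ValueError("unsupported disk format: %r" % disk_format)
    if source_vmdk == target_vmdk:
        raise ValueError("source and target must be different files")
    return ["vmkfstools", "-i", source_vmdk, target_vmdk, "-d", disk_format]


def as_shell_line(argv):
    """Render the argument list as one shell-safe line for copy and paste."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

Printing as_shell_line(clone_command(source, target)) gives a line that can be reviewed and then run by hand in the ESXi shell. Keeping the original disk until the VM has started and taken a snapshot from the clone is a sensible precaution.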