
On Fri, 19 Sep 2014, Craig Sanders wrote:
On Thu, Sep 18, 2014 at 10:48:00AM +0200, Michele Bert wrote:
1) Can bad blocks appear on a virtual disk too? Even if it is eventually just a flat file in the host filesystem?
2) Are those bad blocks related to real bad blocks on the physical host file system?
yes and yes and maybe. for example, if the physical disk has bad blocks and the VM's virtual disk uses those blocks then both the VM and the VMWare server could have errors when trying to access those blocks.
it's even possible that the VM will have errors while the VMWare server doesn't, if the VM retries less often or times out the request earlier than the server.
other possibilities include:
- VMWare server overloaded for a long time and unable to service ubuntu VM's request for disk IO
- disk faults
- cabling faults (e.g. loose cables can vibrate and cause transient errors)
- disk controller faults
- RAM faults
- network outages if the disks (virtual and/or physical) are accessed over the network (e.g. iscsi or nfs or whatever)
Ie, look in the vmware logs for that host, as well as alerts and alarms. vmware will tend to drop disk paths well before linux would have a problem with them, in the name of High Availability. Whilst Linux would just log a 120s hangcheck timer alert to the syslog if the disk didn't answer in 120 seconds, vmware might respond to the same disk outage by

*) dropping IO that happened to be in progress on the floor (only symptoms are that 4 of your 250 VMs go spontaneously readonly, and you only notice that if you're looking at your syslogs religiously, because most monitoring sure as hell won't pick up on it)
*) rebooting the VM
*) vmotioning the VM
*) isolating the VM host from the cluster

and a whole bunch of other failures that I've probably seen but subsequently purged from my memory.

--
Tim Connors
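
As a rough illustration of the guest-side symptoms described above (filesystems going read-only, hung-task alerts in syslog), here is a minimal Python sketch of a check a monitoring job could run inside each VM. The function names are hypothetical, and the log patterns are assumptions: the exact kernel wording differs between kernel versions and filesystems, so treat them as a starting point rather than a definitive list.

```python
import re

# ASSUMPTION: typical kernel message wording; adjust for your kernel/filesystem.
HUNG_TASK = re.compile(r"blocked for more than \d+ seconds")
REMOUNT_RO = re.compile(r"remounting filesystem read-only", re.IGNORECASE)


def suspicious_log_lines(lines):
    """Return log lines that look like hung-task or read-only remount messages."""
    return [
        ln.rstrip("\n")
        for ln in lines
        if HUNG_TASK.search(ln) or REMOUNT_RO.search(ln)
    ]


def readonly_mounts(mounts_text, fstypes=("ext2", "ext3", "ext4", "xfs", "btrfs")):
    """Parse /proc/mounts-style text and return mount points mounted read-only.

    Only filesystem types listed in `fstypes` are considered, so pseudo
    filesystems are skipped. The default list is illustrative, not exhaustive.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed or empty line
        _device, mountpoint, fstype, options = fields[:4]
        if fstype in fstypes and "ro" in options.split(","):
            result.append(mountpoint)
    return result
```

On a Linux guest you would feed `readonly_mounts` the contents of `/proc/mounts` and `suspicious_log_lines` the lines of whatever file your syslog daemon writes kernel messages to. Note that a filesystem deliberately mounted read-only will be reported too, so a real check needs an allowlist.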