Re: Root filesystem unexpectedly remounted in read-only

22 Sep 2014

On Fri, 19 Sep 2014, Craig Sanders wrote:
...
On Thu, Sep 18, 2014 at 10:48:00AM +0200, Michele Bert wrote:
...
1) Can bad blocks appear on a virtual disk too, even if it is
ultimately just a flat file in the host filesystem?
2) Are those bad blocks related to real bad blocks on the physical
host file system?

yes and yes and maybe. for example, if the physical disk has bad blocks
and the VM's virtual disk uses those blocks then both the VM and the
VMware server could have errors when trying to access those blocks.
it's even possible that the VM will have errors while the VMware server
doesn't, if the VM retries fewer times or times out the request earlier
than the server.
other possibilities include:
 - VMware server overloaded for a long time and unable to service the
   Ubuntu VM's requests for disk IO
 - disk faults
 - cabling faults (e.g. loose cables can vibrate and cause transient errors)
 - disk controller faults
 - RAM faults
 - network outages if the disks (virtual and/or physical) are accessed over
   the network (e.g. iSCSI or NFS or whatever)

I.e., look in the VMware logs for that host, as well as its alerts and alarms.

VMware will tend to drop disk paths well before Linux would have a problem
with them, in the name of High Availability.  Whilst Linux would just log
a 120s hung-task warning to the syslog if the disk didn't answer in
120 seconds, VMware might respond to the same disk outage by:

*) dropping on the floor any IO that happened to be in progress (the only
symptom is that 4 of your 250 VMs spontaneously go read-only, and you only
notice that if you're reading your syslogs religiously, because most
monitoring sure as hell won't pick up on it)
*) rebooting the VM
*) vmotioning the VM
*) isolating the VM host from the cluster

and a whole bunch of other failures that I've probably seen but
subsequently purged from my memory.
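
Since most monitoring misses a filesystem that has quietly gone
read-only, a check along the following lines may help. This is only a
minimal sketch in Python, assuming a Linux guest where /proc/mounts is
readable; the function name read_only_mounts and its watched parameter
are made up for illustration and do not come from any existing tool.

```python
def read_only_mounts(mounts_text, watched=("/",)):
    """Return the watched mount points that are mounted read-only.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields (device, mount point, fstype, options, ...).
    watched is a hypothetical parameter listing the mount points we care
    about; many pseudo-filesystems are legitimately read-only, so we do
    not flag everything.
    """
    state = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if mount_point in watched:
            # A later entry for the same mount point shadows earlier ones
            # (e.g. "rootfs / rootfs rw" followed by the real root device).
            # Exact match on "ro", so "errors=remount-ro" is not a hit.
            state[mount_point] = "ro" in options
    return sorted(mp for mp, is_ro in state.items() if is_ro)
```

Run from cron or a monitoring agent with the text of /proc/mounts, a
non-empty result means at least one watched filesystem has gone
read-only and is worth an alert.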

-- 
Tim Connors