
On Tue, 25 Mar 2014 15:42:06 Allan Duncan wrote:
On 25/03/14 12:58, Craig Sanders wrote:
On Tue, Mar 25, 2014 at 12:27:04PM +1100, Daniel Jitnah wrote:
What can cause this?
most likely the VM's disk image became unavailable temporarily - possibly due to network problems, or a server being rebooted.
Corruption of storage or RAM is also a possibility. The cheaper cloud servers don't have ECC RAM, and in the past I've had RAM corruption cause filesystem corruption. I suspect that filesystems such as BTRFS and ZFS can get more messed up by corrupt RAM than simpler filesystems like Ext*.

If the system had been running for a while ("working fine for weeks/months") then it may have been due for a fsck on the next boot anyway, in which case whatever errors there were might have been fixed already.

Also, in these situations the kernel message log usually has good information about the problem, but it can't be written to the local disk. So if you don't have syslog going over the network to another machine, logging in via ssh and running "dmesg" is the only way to get the data. By rebooting you probably permanently lost the ability to determine the cause - unless it happens again.
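As an example (the hostnames and filenames here are just placeholders), you could grab the kernel log before rebooting and forward syslog to another machine so the evidence survives next time:

  # capture the kernel ring buffer from the sick VM before rebooting it
  ssh root@broken-vm dmesg > broken-vm-dmesg.txt

  # or forward all syslog messages to another machine over TCP by adding a
  # line like this to /etc/rsyslog.conf on the VM ("loghost" is a placeholder)
  *.*  @@loghost:514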
Assuming you are using ext2/3/4 on your VM's disk, the mount option "errors=remount-ro" says to remount the fs as read-only if the kernel has any errors accessing the filesystem (e.g. if a disk is dead/dying or a cable is loose etc).
Debian at least, and probably other distros, routinely adds "errors=remount-ro" to /etc/fstab for ext filesystems when you build the system.
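For reference, a typical Debian root filesystem line in /etc/fstab looks something like this (the UUID is just an illustration):

  # <file system>  <mount point>  <type>  <options>          <dump> <pass>
  UUID=1234-abcd   /              ext4    errors=remount-ro  0      1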
If it is not set in fstab, look at the superblock with tune2fs -l <device> and see what is set for "Errors behavior:"
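Something like the following will show the current setting, and change it if you really want to (the device name is only an example):

  # show the current error behaviour for the filesystem
  tune2fs -l /dev/vda1 | grep 'Errors behavior'

  # set it explicitly (valid values are continue, remount-ro and panic)
  tune2fs -e remount-ro /dev/vda1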
Note that you probably don't want to change this. Having the filesystem go read-only in case of problems is a good thing. As you still have read access you can log in and run programs like dmesg to diagnose it, and as you have no write access you don't get further corruption.

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/