
On Thu, Apr 05, 2012 at 06:31:49PM +1000, Russell Coker wrote:
On Thu, 5 Apr 2012, Craig Sanders <cas@taz.net.au> wrote:
On Thu, Apr 05, 2012 at 01:44:00PM +1000, Marcus Furlong wrote:
We have issues where the monthly mdadm raid check grinds the system to a halt.
do you find that these monthly cron jobs are actually useful? [...]
deb http://www.coker.com.au squeeze misc
In the above Debian repository for i386 and amd64 I have a version of mdadm patched to send email when the disks have different content. I am seeing lots of errors from all systems; it seems that the RAID code in the kernel reports 128 sectors (64K) of disk space as wrong for every error (all reported numbers are multiples of 128).
if mdadm software raid is doing that, then to me it says "don't use mdadm raid" rather than "stress-test raid every month and hope for the best". however, i've been using mdadm for years without seeing any sign of that (and yes, with the monthly mdadm raid checks enabled. i used to grumble about it slowing my system down but never made the decision to disable it).

first question that occurs to me is: is there a bug in the raid code itself, or is the bug in the raid checking code?
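(for anyone who wants to poke at this by hand: debian's monthly job is basically a wrapper around the kernel's own scrub interface, so you can trigger it, watch it and read the mismatch counter yourself. md0 below is just a placeholder for whatever your array is called, and i'm going from memory, so check the md/mdadm docs before trusting any of it.)

    # what the cron job boils down to: ask md to scrub the array
    echo check > /sys/block/md0/md/sync_action

    # watch progress
    cat /proc/mdstat

    # sectors that didn't compare equal across mirrors/parity
    # (on raid1 this is counted in fairly coarse chunks, which would
    # explain the "multiples of 128 sectors" -- 128 x 512 bytes = 64K)
    cat /sys/block/md0/md/mismatch_cnt

    # if the check is what's grinding the box to a halt, the raid
    # speed sysctls will throttle it, e.g.
    sysctl -w dev.raid.speed_limit_max=10000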
Also I suspect that the Squeeze kernel has a bug in regard to this. I'm still tracking it down.
i never really used squeeze for long on real hardware (as opposed to on VMs)...except in passing when sid was temporarily rather similar to what squeeze became. and i've always used later kernels - either custom-compiled or (more recently) by installing the later linux-image packages.
If you have a RAID stripe that doesn't match then you really want it to be fixed even if replacing a disk is not possible. Having two reads from the same address on a RAID-1 give different results is a bad thing. Having the data on a RAID-5 or RAID-6 array change in the process of recovering from a dead disk is also a bad thing.
true, but as above that's a "don't do that, then" situation. if you are getting symptoms like the above then either your hardware is bad or your kernel version is broken. in either case, don't do that. back up your data immediately and do something else that isn't going to lose your data.
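(if you do want md to make the copies consistent rather than just count the differences, sync_action also takes "repair". note that on raid1 md has no idea which copy is "right" -- it just propagates one of them -- so this makes the array self-consistent, it doesn't guarantee the data is the data you wanted. md0 is again just a placeholder.)

    # rewrite mismatched blocks so the copies agree again
    echo repair > /sys/block/md0/md/sync_action

    # then re-run a check; mismatch_cnt should come back as 0
    echo check > /sys/block/md0/md/sync_action
    cat /sys/block/md0/md/mismatch_cnt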
Now the advantage of DRBD is that it's written with split-brain issues in mind. The Linux software RAID code is written with the idea that it's impossible for the two disks to be separated and used at the same time. In the normal case this is not possible unless a disk is physically removed.
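(for reference, and from memory so check the drbd handbook, the split-brain handling is configurable per-resource in drbd.conf, something like:)

    resource r0 {
      net {
        # what to do after a split-brain is detected, depending on how
        # many nodes were primary at the time:
        after-sb-0pri discard-zero-changes;   # neither node was primary
        after-sb-1pri discard-secondary;      # one node was primary
        after-sb-2pri disconnect;             # both were primary: give up
                                              # and leave it to the admin
      }
    }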
yep, and the re-sync is a pain, even with bitmaps.

this interesting article from 2006 that i just spotted may point to an alternative: ZFS on iscsi

http://www.cuddletech.com/blog/pivot/entry.php?id=566

in short: it's possible to build a zpool using iscsi devices (rough sketch below the sig). whether it's reliable if one of the iscsi devices disappears, i don't know. zfs already copes well with degraded vdevs...with a mirrored vdev, it shouldn't be a problem (and fairly easily repaired with zpool online if it reappears or zpool replace if it's gone for good). with raidz-n, it would depend on how many disappeared and which ones.

and this far more recent post (Aug 2011):

http://cloudcomputingresourcecenter.com/roll-your-own-fail-over-san-cluster-...

in short: zfs and glusterfs, written by someone who'd given up on drbd.

craig

ps: one of the reasons i love virtualisation is that it makes it so easy to experiment with this stuff and get an idea of whether it's worthwhile trying on real hardware. spinning up a few new vms is much less hassle than scrounging parts to build another test system.

-- 
craig sanders <cas@taz.net.au>

BOFH excuse #336:

the xy axis in the trackball is coordinated with the summer solstice
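(rough sketch of the zpool-over-iscsi idea mentioned above. the target names and portal addresses are made up, the device paths will vary on your box, and i haven't tried this myself, so treat it as a starting point rather than a recipe:)

    # log in to two iscsi targets with open-iscsi (names/addresses are examples)
    iscsiadm -m node -T iqn.2012-04.example:disk0 -p 192.0.2.10 --login
    iscsiadm -m node -T iqn.2012-04.example:disk1 -p 192.0.2.11 --login

    # mirror the two imported luns into a pool
    zpool create tank mirror \
        /dev/disk/by-path/ip-192.0.2.10:3260-iscsi-iqn.2012-04.example:disk0-lun-0 \
        /dev/disk/by-path/ip-192.0.2.11:3260-iscsi-iqn.2012-04.example:disk1-lun-0

    # if one target drops out and comes back (device name is whatever
    # zpool status shows for it):
    zpool online tank ip-192.0.2.10:3260-iscsi-iqn.2012-04.example:disk0-lun-0

    # if it's gone for good, swap in a replacement device:
    zpool replace tank <old-device> <new-device>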