
On Thu, 11 Apr 2013, James Harper <james.harper@bendigoit.com.au> wrote:
Does "11008" mismatches mean that 11008 bytes were found to be different, or that 11008 sectors were found to be different? In either case I would suggest to you that you have a serious problem with your servers and that this is not normal. I have many servers running linux md RAID1 and have never seen such a thing.
Linux Software RAID-1 seems to report large numbers of mismatches (in multiples of 64) when nothing appears to be wrong. It happens on all the systems I run.
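For anyone who wants to check their own arrays, the standard sysfs interface looks roughly like this (md0 is just an example device name, substitute your own array):

  # kick off an online consistency check of the array
  echo check > /sys/block/md0/md/sync_action
  # watch progress
  cat /proc/mdstat
  # once the check finishes, read the mismatch count that md reports
  cat /sys/block/md0/md/mismatch_cnt

One commonly cited explanation for harmless mismatches on RAID-1 is swap or memory-mapped data changing in flight, so the two mirrors end up with slightly different contents in blocks that nothing will ever read back.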
On Thu, 11 Apr 2013, Craig Sanders <cas@taz.net.au> wrote:
On Thu, Apr 11, 2013 at 02:10:37AM +0000, James Harper wrote:
with disks (and raid arrays) of that size, you also have to be concerned about data errors as well as disk failures - you're pretty much guaranteed to get some, either unrecoverable errors or, worse, silent corruption of the data.
Guaranteed over what time period?
any time period. it's a function of the quantity of data, not of time.
So far I haven't seen any corruption reported by "btrfs scrub". Admittedly I have less than 3TB of data on BTRFS at the moment.
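For anyone who wants to try it, a scrub run looks something like this (the mount point is just an example path):

  # start a background scrub of the filesystem mounted at /mnt/data
  btrfs scrub start /mnt/data
  # check progress and the error counters afterwards
  btrfs scrub status /mnt/data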
If you say you are "guaranteed to get some" over, say, a 10 year period, then I guess that's fair enough. But as you don't specify a timeframe I can't really contest the point.
you seem to be confusing data corruption with MTBF or similar; it's not like that at all. it's not about disk hardware faults, it's about the sheer size of storage arrays these days making it a mathematical certainty that some corruption will occur - write errors due to, e.g., random bit-flips, controller brain-farts, firmware bugs, cosmic rays, and so on.
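To put a rough number on that point, here is a back-of-the-envelope sketch assuming the unrecoverable read error rate that consumer drive datasheets typically quote (1 per 10^14 bits read) and a 12TB array - both figures are assumptions, substitute your own:

  # expected unrecoverable read errors from reading the whole array once
  awk 'BEGIN { tb = 12; bits = tb * 8 * 1e12; printf "expected UREs: %.2f\n", bits / 1e14 }'

That comes out at about 0.96, i.e. reading a full 12TB once at that quoted rate gives an expected value of nearly one unrecoverable error - and that figure says nothing about silent corruption, which the quoted rate doesn't cover at all.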
I've got a BTRFS filesystem that was corrupted by a RAM error (I discarded a DIMM after doing all the relevant Memtest86+ tests). Currently I have been unable to get btrfsck to work on it and make it usable again. But at least I know the data was corrupted which is better than having the system keep going and make things worse.
Putting the error correction/detection in the filesystem bothers me. Putting it at the block device level would benefit a lot more infrastructure - LVM volumes for VMs, swap partitions, etc.
having used ZFS for quite some time now, it makes perfect sense to me for it to be in the filesystem layer rather than at the block level - it's the filesystem that knows about the data, what/where it is, and whether it's in use or not (so, faster scrubs - it only needs to check blocks in use rather than all blocks).
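For anyone who hasn't used ZFS, the workflow Craig describes boils down to something like this (the pool name "tank" is just a placeholder):

  # scrub walks only the allocated blocks, verifying every checksum in the pool
  zpool scrub tank
  # report progress and any checksum errors that were found
  zpool status tank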
http://etbe.coker.com.au/2012/04/27/btrfs-zfs-layering-violations/

There are real benefits to having separate layers; I've written about this at the above URL. But there are also significant benefits to doing things the way that BTRFS and ZFS do it, and it seems that no-one is interested in developing any other way of doing it (e.g. a version of Linux Software RAID that does something like RAID-Z). Also, if you use ZVOLs then ZFS can be considered an LVM replacement with error checking (as Craig already noted).

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/