
On Thu, Apr 11, 2013 at 02:10:37AM +0000, James Harper wrote:
with disks (and raid arrays) of that size, you also have to be concerned about data errors as well as disk failures - you're pretty much guaranteed to get some, either unrecoverable errors or, worse, silent corruption of the data.
Guaranteed over what time period?
any time period. it's a function of the quantity of data, not of time.
It's easy to fault your logic as I just did a full scan of my array and it came up clean.
no, it's not. your array scan checks for DISK errors. It does not check for data corruption - THAT is the huge advantage of filesystems like ZFS and btrfs: they can detect and correct data errors.
This is the md 'check' function that compares the two copies of the data. If there was corruption in my RAID1 then it's incredibly unlikely that the corruption would have occurred identically on both disks and registered as a match, at least for a disk-based corruption issue.
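For reference, here's a minimal sketch (Python, assuming the array is md0 and the script runs as root) of kicking off that check and reading the result back through sysfs:

    #!/usr/bin/env python3
    # Rough sketch: trigger an md "check" pass and report the mismatch count.
    # Assumes a redundant array named md0 and root privileges for the sysfs write.
    import time

    MD = "/sys/block/md0/md"

    # Ask md to scrub the array by comparing the redundant copies.
    with open(MD + "/sync_action", "w") as f:
        f.write("check\n")

    # Wait for the scrub to finish (sync_action drops back to "idle").
    while True:
        with open(MD + "/sync_action") as f:
            if f.read().strip() == "idle":
                break
        time.sleep(60)

    # mismatch_cnt reports the number of sectors where the copies disagreed.
    with open(MD + "/mismatch_cnt") as f:
        print("mismatched sectors:", f.read().strip())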
If you say you are "guaranteed to get some" over, say, a 10-year period, then I guess that's fair enough. But as you don't specify a timeframe, I can't really contest the point.
you seem to be confusing data corruption with MTBF or similar - it's not like that at all. it's not about disk hardware faults, it's about the sheer size of storage arrays these days making it a mathematical certainty that some corruption will occur - write errors due to, e.g., random bit-flips, controller brain-farts, firmware bugs, cosmic rays, and so on.
e.g. a typical quoted rating of 1 error per 10^14 bits works out to roughly one error per 12.5 terabytes read - i.e. your 4 x 3TB array is practically guaranteed to hit at least one error reading the data through.
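To put rough numbers on that (a back-of-the-envelope sketch in Python, using the 1-in-10^14 figure quoted above and treating errors as independent):

    from math import exp

    # Back-of-the-envelope: expected unrecoverable errors reading a full array,
    # assuming a rate of 1 error per 10^14 bits read and independent errors.
    error_rate = 1e-14          # errors per bit read (typical consumer-drive spec)
    array_bytes = 4 * 3e12      # four 3TB drives
    bits_read = array_bytes * 8

    expected_errors = bits_read * error_rate      # ~0.96 errors per full read
    p_at_least_one = 1 - exp(-expected_errors)    # Poisson approximation, ~62%

    print("expected errors per full read:", round(expected_errors, 2))
    print("P(at least one error): %.0f%%" % (100 * p_at_least_one))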
Not according to my visible history of parity checks of the underlying data (when it was 4 x 1.5TB - last 3TB disk still on order). I will be monitoring it more closely now though!
I can say though that I do monitor the SMART values which do track corrected and uncorrected error rates, and by extrapolating those figures I can say with confidence that there is not a guarantee of unrecoverable errors.
smart values really only tell you about detected errors in the drive itself. they don't tell you *anything* about data corruption problems - for that, you actually need to check the data...and to check the data you need a redundant copy or copies AND a hash of what it's supposed to be.
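As an illustration of that last point, here's a minimal sketch (Python, with hypothetical paths) of doing that check by hand: keep a manifest of hashes, then re-read the data and compare later. This is essentially what ZFS does per block, automatically.

    # Minimal sketch of end-to-end verification: keep a manifest of SHA-256 hashes
    # and re-check the data against it later. Paths are hypothetical examples.
    import hashlib, json, os

    DATA_DIR = "/srv/data"               # assumed data directory
    MANIFEST = "/var/lib/hashes.json"    # assumed location for the stored hashes

    def sha256(path, bufsize=1024 * 1024):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(bufsize), b""):
                h.update(chunk)
        return h.hexdigest()

    def build_manifest():
        manifest = {}
        for root, _, files in os.walk(DATA_DIR):
            for name in files:
                path = os.path.join(root, name)
                manifest[path] = sha256(path)
        with open(MANIFEST, "w") as f:
            json.dump(manifest, f)

    def verify():
        with open(MANIFEST) as f:
            manifest = json.load(f)
        for path, expected in manifest.items():
            if sha256(path) != expected:
                print("CORRUPT:", path)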
Not entirely true. It gives reports of correctable errors, first-read-uncorrectable errors that were corrected on re-read, etc. For an undetected disk read error to occur (e.g. one that still passed ECC or whatever correction codes are used), there would need to be significant quantities of the former, statistically speaking. I wonder if the undetected error rates differ with the 4K-sector disks? That is supposed to be one of their other advantages. Of course that still doesn't detect errors that occur beyond the disk (e.g. PCI, controller or cabling), so I guess your point still stands.
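For completeness, a rough sketch (Python, assuming smartmontools is installed and the drive of interest is /dev/sda) of pulling those error-related counters out of smartctl for that sort of trending:

    # Rough sketch: extract error-related SMART attributes via smartctl for trending.
    # Assumes smartmontools is installed and the drive of interest is /dev/sda.
    import subprocess

    ATTRS_OF_INTEREST = {
        "Raw_Read_Error_Rate",
        "Reallocated_Sector_Ct",
        "Reported_Uncorrect",
        "Hardware_ECC_Recovered",
        "Current_Pending_Sector",
        "Offline_Uncorrectable",
    }

    out = subprocess.check_output(["smartctl", "-A", "/dev/sda"],
                                  universal_newlines=True)
    for line in out.splitlines():
        fields = line.split()
        # Attribute rows: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] in ATTRS_OF_INTEREST:
            print(fields[1], "raw =", fields[9])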
with mdadm, such errors can only be corrected if the data can be rewritten to the same sector or if the drive can remap a spare sector to that spot. with zfs, because it's a COW filesystem, all that needs to be done is to rewrite the data (to a new location).
Correct. It can be detected though.
...
Thanks for taking the time to write out that stuff about ZFS. I'm somewhat wiser about it all now :)

James