
On Mon, 16 Apr 2012, Colin Fee <tfeccles@gmail.com> wrote:
> > Of course it might turn out that RAID-5 is the killer issue. Servers start
> > becoming a lot more expensive if you want more than 8 disks, and even 6
> > disks is a significant price point. An 8 disk RAID-5 gives something like
> > 21TB usable space vs 12TB on a RAID-10, and a 6 disk RAID-5 gives about
> > 15TB vs 9TB on a RAID-10.
> >
> > Anything else I should consider?
> Not that I've got anything to add re ZFS vs BTRFS, having no specialist
> knowledge either way, but in other posts haven't you advocated for RAID-6
> over RAID-5? Or is this something mandated on the client side?
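To put concrete numbers on the capacity comparison quoted above, here's a quick Python sketch (assuming the 3TB disks discussed below; the rest is just arithmetic, ignoring filesystem overhead and decimal vs binary TB):

    # Rough usable capacity for the RAID levels under discussion.
    def usable_tb(disks, disk_tb, level):
        if level == "RAID-5":        # one disk's worth of parity
            return (disks - 1) * disk_tb
        if level == "RAID-6":        # two disks' worth of parity
            return (disks - 2) * disk_tb
        if level == "RAID-10":       # mirrored pairs
            return disks // 2 * disk_tb
        raise ValueError(level)

    for disks in (6, 8):
        for level in ("RAID-5", "RAID-6", "RAID-10"):
            print(disks, "disks,", level, "->", usable_tb(disks, 3, level), "TB")

For 8 disks that gives 21TB, 18TB, and 12TB respectively, and for 6 disks 15TB, 12TB, and 9TB, matching the numbers quoted above.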
If you use Linux software RAID-6 then reconstruction apparently doesn't check the data against both sets of parity, it just regenerates the parity from whatever data is available. So RAID-6 covers you for the case where two disks entirely die, but that is rare - it's still something you want coverage for, but you don't get the potential benefit of double parity being used to identify corrupt blocks. I have no reason to believe that any other RAID implementation which conforms to the basic RAID-6 design does any better, although I acknowledge that there are lots of implementations which aren't well documented, so anything is possible.

http://en.wikipedia.org/wiki/Zfs

If you use ZFS with RAID-5 it will check the hashes on every block and regenerate the data if they don't match. Also it's possible to go back in time and get an earlier copy of the data if there is a corrupted block in the latest copy and no redundancy left (see the above Wikipedia page for more information).

So if you compare Linux software RAID-5, which only properly copes with a disk entirely dying or returning read errors, to ZFS, then ZFS wins in the following situations:

1) A disk entirely dies (or is being replaced due to sporadic errors) and another disk has a single read error during recovery. ZFS can flag an error on its RAID-5 and allow you to get an earlier version of the data. Linux software RAID just loses and leaves corruption for a fsck or an application-level scrub of the data files to find.

2) Two disks in a RAID-5 have a few read errors - a reasonably common failure case, as most drive failures in production involve some read failures rather than a total death. Linux software RAID fails: it kicks out one disk and then you lose when the second disk has a read error. ZFS SHOULD just read from the other disks in the stripe for each error (which is detected by a hash mismatch) and reconstruct the data. NB The only time I've seen two disks in a RAID set fail was with RAID-1, and Linux software RAID lost data then.

3) A disk returns corrupt data for any reason.

Linux software RAID-6 deals with case 1. It also deals with case 2, although if a third disk suddenly gives a few read errors (which could happen due to heat) then you lose. In theory a ZFS RAID-5 (AKA RAID-Z) could cope better with some failure conditions than a Linux software RAID-6!

That said, ZFS supports RAID-6, AKA RAID-Z2. Given the prices of 3TB disks, and the fact that reasonably affordable servers can handle 8 disks, which allows 18TB of RAID-6 storage, a RAID-Z2 with ZFS seems like the clearly better choice for most uses (the copy-on-write design of ZFS apparently removes the worst performance problems of RAID-5 and RAID-6).

Anyway, in my previous message I just wasn't concerned with RAID-5 vs RAID-6. As BTRFS supports neither, ZFS supports both, and the two have very similar amounts of usable capacity in the 8 disk case, it's not an issue at this stage of planning. But I think that another general discussion of RAID technology is a good thing at this time, so your question was a good one and deserved a long answer.

As for my client, I will give them some options with prices and ask how much more they want to pay for reliability. I expect that they will pay for RAID-6, not because of any business analysis of the risk (which they can't do), but because it doesn't cost much and it would really suck to have down-time and data loss just to save such a small amount of money and disk space.
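To illustrate the verify-then-reconstruct idea behind cases 2 and 3 above, here's a toy Python model (my own simplification for illustration, not ZFS's actual format): a single XOR parity block per stripe plus a hash per data block. A plain RAID-5 read has no way of noticing a block that a disk returned silently corrupted, but with the hashes a mismatch identifies the bad block and the stripe can be rebuilt from the surviving blocks plus parity:

    # Toy model: per-block checksums over single-parity striping.
    import hashlib
    from functools import reduce

    def xor(blocks):
        # XOR equal-sized blocks together (single parity, as in RAID-5).
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    def write_stripe(data):
        parity = xor(data)
        sums = [hashlib.sha256(b).digest() for b in data]
        return data, parity, sums

    def read_block(i, data, parity, sums):
        if hashlib.sha256(data[i]).digest() == sums[i]:
            return data[i]                   # checksum OK, trust the disk
        # Checksum mismatch: rebuild block i from the other blocks + parity.
        others = [b for j, b in enumerate(data) if j != i]
        rebuilt = xor(others + [parity])
        assert hashlib.sha256(rebuilt).digest() == sums[i]
        return rebuilt

    data, parity, sums = write_stripe([b"AAAA", b"BBBB", b"CCCC"])
    data[1] = b"BxBB"                        # one disk silently corrupts a block
    print(read_block(1, data, parity, sums)) # -> b'BBBB'

The real thing is more involved - ZFS keeps the checksums in the parent block pointers (a Merkle tree), so a disk that corrupts both a block and its checksum still gets caught - but the recovery logic is the same idea.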
http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery

As an aside, the above page about recovery timeouts for disk read operations should also be of interest to some people here, given the previous discussions about JBOD vs RAID modes for disks.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/