
On Wed, Apr 10, 2013 at 04:48:17AM +0000, James Harper wrote:
On 2013-04-09 02:40, James Harper wrote:
I have a server that had 4 x 1.5TB disks installed in a RAID5 configuration (except /boot is a 'RAID1' across all 4 disks). One of the disks failed recently and so was replaced with a 3TB disk,
I'd be very wary of running RAID5 on disks >2TB.
Remember that, when you have a disk failure, in order to rebuild the array, it needs to scan every sector of every remaining disk, then write to every sector of the replacement disk.
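To put a rough number on that (assuming the commonly quoted consumer-drive spec of one unrecoverable read error per 10^14 bits read - drive datasheets vary, so treat this as illustrative only):

    rebuilding a 4 x 3TB RAID5 means reading the 3 surviving disks = 9TB
    9TB = 7.2 x 10^13 bits
    expected UREs per rebuild = 7.2 x 10^13 / 10^14 = ~0.7
    chance of hitting at least one = 1 - e^(-0.7) = ~50%

By that spec, a rebuild of an array that size is roughly a coin flip.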
Debian does a complete scan of the array every month anyway, and an HP RAID controller will basically be doing the same thing constantly, running a slow background scan during periods of low use.
And a full resync on my 4x3TB array only takes 6 hours, so the window is pretty small.
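For reference, the monthly scan mentioned above is mdadm's checkarray cron job; the same check can be started and watched by hand (md0 below is a placeholder for the array name):

    # start a check, equivalent to what the monthly cron job does
    echo check > /sys/block/md0/md/sync_action

    # watch progress and see any mismatches found
    cat /proc/mdstat
    cat /sys/block/md0/md/mismatch_cnt

    # check/resync speed is bounded by these (KB/s per device); raising the
    # minimum shrinks the rebuild window at the cost of foreground I/O
    cat /proc/sys/dev/raid/speed_limit_min
    cat /proc/sys/dev/raid/speed_limit_max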
With disks (and RAID arrays) of that size, you also have to be concerned about data errors as well as disk failures - you're pretty much guaranteed to get some, either unrecoverable errors or, worse, silent corruption of the data.
Guaranteed over what time period? It's easy to fault that logic, as I just did a full scan of my array and it came up clean. If you'd said you're "guaranteed to get some" over, say, a 10 year period, then I guess that's fair enough, but as you don't specify a timeframe I can't really contest the point. I can say, though, that I do monitor the SMART attributes that track corrected and uncorrected error rates, and extrapolating those figures I can say with confidence that unrecoverable errors are not a guarantee.
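For reference, the counters in question can be pulled with smartctl from smartmontools (sda below is a placeholder, and attribute names vary a little between vendors):

    # full attribute table
    smartctl -A /dev/sda

    # the attributes most relevant to media errors
    smartctl -A /dev/sda | egrep -i 'Reallocated|Pending|Uncorrect|CRC'

    # kick off a long surface self-test, then read the results later
    smartctl -t long /dev/sda
    smartctl -l selftest /dev/sda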
This is why error-detecting and error-correcting filesystems like ZFS and btrfs exist - they're not just a good idea, they're essential with the large disk and storage array sizes common today.
See, for example:
The part that says "not visible to the host software" kind of bothers me. AFAICS these errors are reported via SMART and are entirely visible, with the exception of some poor SMART implementations.
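For what it's worth, the class of error being argued about here is the one the drive itself never reports: data that reads back "successfully" but doesn't match what was written. That's what ZFS's end-to-end checksums and a periodic scrub are meant to catch, roughly like this (pool name "tank" is a placeholder):

    # read every block in the pool and verify it against its checksum,
    # repairing from redundancy where possible
    zpool scrub tank

    # the CKSUM column counts blocks that read back without an I/O error
    # but failed their checksum - i.e. corruption SMART never saw
    zpool status -v tank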
Personally, I wouldn't use RAID-5 (or RAID-6) any more. I'd use ZFS RAID-Z (the RAID-5 equivalent) or RAID-Z2 (the RAID-6 equivalent, with 2 parity disks) instead.
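As a sketch of what that looks like in practice (pool and device names are placeholders; whole-disk by-id names are usually preferred over sdX):

    # raidz2: 4 disks, any 2 can fail (raid6 equivalent)
    zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # raidz (single parity, raid5 equivalent) would instead be:
    # zpool create tank raidz /dev/sdb /dev/sdc /dev/sdd /dev/sde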
Putting the error correction/detection in the filesystem bothers me. Putting it at the block device level would benefit a lot more infrastructure - LVM volumes for VMs, swap partitions, etc. I understand you can run those things on top of a filesystem too, but if you're doing that just to get the benefit of error correction then I think you might be doing it wrong. Actually, when I was checking over this email before hitting send, it occurred to me that maybe I'm wrong about this, knowing next to nothing about ZFS as I do. Is a zpool virtual device like an LVM LV, i.e. can I use it for things other than running ZFS filesystems on?
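For what it's worth, ZFS does have this: a pool can expose "zvols", block devices carved out of the pool much like LVM LVs, with the pool's checksumming and redundancy underneath. Roughly (names are placeholders):

    # a 20G block device for a VM; appears as /dev/zvol/tank/vm1-disk0
    zfs create -V 20G tank/vm1-disk0

    # a block device for swap (works, though swap-on-zvol has its own caveats)
    zfs create -V 4G tank/swap0
    mkswap /dev/zvol/tank/swap0
    swapon /dev/zvol/tank/swap0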
Actually, I wouldn't have used RAID-5 without a good hardware RAID controller with non-volatile write cache - the performance sucks without that - but ZFS allows you to use an SSD as a ZIL (ZFS Intent Log, i.e. a synchronous write cache) and as a read cache.
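A rough sketch of how those get attached (device paths are placeholders; partitioning one SSD for both is common):

    # a small, fast partition as a separate intent log; sync writes are
    # logged here so they can be acknowledged quickly
    zpool add tank log /dev/sdf1

    # another partition as a second-level read cache (L2ARC)
    zpool add tank cache /dev/sdf2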
For anything where performance is a constraint I don't use RAID5 at all. This case is an exception in that it stores backup volumes from Bacula (i.e. streaming writes) and only needs to write as fast as data can come off the 1Gbit/s wire, so disk performance isn't an issue here: my array can easily handle 100MB/s of streaming writes, and backup compression means it never gets sent data that fast anyway.
If performance were more important than capacity, I'd use RAID-1, so-called RAID-"10", or ZFS mirrored disks - a ZFS pool of mirrored pairs is similar to RAID-10 but with all the extra benefits of ZFS (error detection, volume management, snapshots, etc.).
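For comparison, a pool of mirrored pairs looks roughly like this (placeholders again) - data is striped across the mirrors, and more pairs can be added later to grow it:

    # two mirrored pairs, striped - roughly raid-10
    zpool create tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde

    # grow the pool later by adding another pair
    zpool add tank mirror /dev/sdf /dev/sdg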
Yes, I use RAID10 almost exclusively these days.
ZFSonLinux just released version 0.6.1, which is the first release they're happy to say is ready for production use. I've been using prior versions for a year or two now(*) with no problems, and just switched from my locally compiled packages to their release .debs (for amd64 wheezy, although they work fine with sid too).
Despite my reservations mentioned above, ZFS is still on my (long) list of things to look into and learn about, more so given that you say it is now considered stable :)
BTW, btrfs just got raid5/6 emulation support too... in a year or so (after the early-adopter guinea pigs have discovered the bugs), it could be worth considering that as an alternative. My own personal experience with btrfs raid1 & raid10 emulation was quite bad, but some people swear by it, and lots of bugs have been fixed since I last used it. For large disks and large arrays, it's still a better choice than ext3/4 or XFS.
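For completeness, the btrfs raid1/raid10 profiles mentioned (and a scrub) look roughly like this - device names and mount point are placeholders:

    # create a filesystem with data and metadata both in the raid10 profile
    mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # verify checksums of everything on a mounted filesystem
    btrfs scrub start /mnt/backup
    btrfs scrub status /mnt/backup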
As above, but I'll continue to let others find bugs :)

James