
as you suggest bcache and flashcache seem to offer a way around this for mdadm but i've never used either of them - i was already using zfs by the time they became available. i don't think the SATA interface speed is a deal-breaker for them because the only way around that is spending huge amounts of money.
There were some not-too-expensive battery backed PCI ramdisks available a while ago. Not anymore though.
. Battery backed write cache. Bcache/flashcache offer this but they have their shortcomings, in particular that most available cache modules are still on top of the SATA channel.
this is the only real advantage of hardware raid over mdadm. IMO, ZFS's ability to use an SSD or other fast block device as cache completely eliminates this last remaining superiority of hardware raid over software raid.
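for anyone who hasn't done it, adding an SSD as a log or cache device really is a one-liner. a rough sketch (pool name and device paths are made up):

    # add an SSD partition as a dedicated log device (speeds up sync writes)
    zpool add tank log /dev/disk/by-id/ata-EXAMPLE-SSD-part1

    # add another partition as L2ARC read cache
    zpool add tank cache /dev/disk/by-id/ata-EXAMPLE-SSD-part2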
Yes I've now been enlightened on this subject :)
. Online resize/reconfigure
both btrfs and zfs offer this.
Can it seamlessly continue over a reboot? Obviously it can't progress while the system is rebooting the way a hardware raid can, but I'd hope it could pick up where it left off automatically.
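to make the online resize/reconfigure point concrete, both can be grown while mounted and in use. a rough sketch (pool, device and mount point names are made up):

    # zfs: let the pool grow onto larger replacement disks, all while online
    zpool set autoexpand=on tank
    zpool online -e tank sdb

    # btrfs: add a device and rebalance, also fully online
    btrfs device add /dev/sdc /mnt/data
    btrfs balance start /mnt/data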
. BIOS boot support (see recent thread "RAID, again" by me)
this is a misfeature of a crappy BIOS rather than a fault with software raid.
any decent BIOS not only has the ability to choose which disk to boot from (rather than hard-coding it to only boot from whichever disk is plugged into the first disk port) but will also let you specify a boot order so that it will try disk 1 followed by disk 2 and then disk 3 or whatever. they'll also typically let you press F2 or F12 or whatever at boot time to pop up a boot device selection menu.
even server motherboards like supermicro let you choose the boot device and have a boot menu option accessible over IPMI.
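the usual way to make that boot order actually useful with md raid is to install the bootloader on every member disk, so whichever disk the BIOS falls back to can still start the boot. for a hypothetical two-disk mirror that's just:

    # install grub to the MBR of both members of the md mirror
    grub-install /dev/sda
    grub-install /dev/sdb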
This is where a lot of people get this wrong. Once the BIOS has succeeded in reading the bootsector from a boot disk it's committed. If the bootsector reads okay (even after a long time on a failing disk) but anything between the bootsector and the OS fails, your boot has failed. This 'anything between' includes the grub bootstrap, xen hypervisor, linux kernel, and initramfs, so it's a substantial amount of data to read from a disk that may be on its last legs. A good hardware RAID will have long since failed the disk by this point and booting will succeed.

My last remaining reservation on going ahead with some testing is: is there an equivalent of clvm for zfs? Or is that the right approach for zfs?

My main server cluster is:

2 machines each running 2 x 2TB disks with DRBD, with the primary exporting the whole disk as an iSCSI volume

2 machines each importing the iSCSI volume, running lvm (clvm) on top, and using the LVs as backing stores for xen VMs.

How would this best be done using zfs?

Thanks

James
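[Not an answer to the clvm/clustering question, but for comparison the rough ZFS analogue of an LV as a domU backing store is a zvol; names below are made up:

    # create a 40G zvol to use as a domU disk
    zfs create -V 40G tank/vm1-disk

    # the xen domU config would then point at the zvol's device node, e.g.
    # disk = [ 'phy:/dev/zvol/tank/vm1-disk,xvda,w' ]
]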