
I'm in need of a PCIe SATA controller that works well under Linux and supports BTRFS / ZFS RAID arrays, to be used in a home NAS I'm building, and I'm not sure what I'm looking for.
From what I can see, Astrotek have a reasonably priced ($32) CPES6 and Highpoint make a more expensive Rocket R640L ($99).[1]
Aside from the obvious difference in the number of ports, is anyone able to tell me the difference between these two cards, or recommend an alternative that's known to work under Linux? My current need is only for a single additional SATA port, but my preference is for a card that will support future expansion; I'm just not sure two more ports are worth the almost three-fold price differential.

Tim

[1] http://www.pccasegear.com/index.php?main_page=index&cPath=210_385

Tim Hamilton wrote:
[1] http://www.pccasegear.com/index.php?main_page=index&cPath=210_385
I notice all these controllers are SATA III. When I was struggling to understand why my new 4TB HGST_HDN7240_40ALE_640 SATA III hard drive couldn't seem to deliver a much higher data transfer rate on SATA III than on SATA II, one comment which was illuminating to me was "there is no such thing as a SATA III rotating hard-drive", which I took to mean that there is no rotating hard drive whose heads can transfer data on/off the disk at anything like the SATA III maximum of 6Gb/s (~600MB/s), in contrast to say a SATA III SSD, where this is quite possible! Apologies if this is merely repeating old news.

regards
Rohan McLeod

On Sat, 6 Dec 2014, Rohan McLeod <rhn@jeack.com.au> wrote:
[1] http://www.pccasegear.com/index.php?main_page=index&cPath=210_385
I notice all these controllers are SATA III. When I was struggling to understand why my new 4TB HGST_HDN7240_40ALE_640 SATA III hard drive couldn't seem to deliver a much higher data transfer rate on SATA III than on SATA II, one comment which was illuminating to me was "there is no such thing as a SATA III rotating hard-drive", which I took to mean that there is no rotating hard drive whose heads can transfer data on/off the disk at anything like the SATA III maximum of 6Gb/s (~600MB/s), in contrast to say a SATA III SSD, where this is quite possible! Apologies if this is merely repeating old news.
That's the correct interpretation. The best I've seen for a single disk is about 160MB/s for contiguous reads from the outer tracks - a usage pattern that is quite uncommon IRL. In real world use I've seen systems using all available disk IO capacity while transferring less than 10MB/s because of the seek time.

If you had a hardware RAID device that presented itself as a single SATA device with 2 real SATA devices connected then SATA 2 shouldn't be a bottleneck. If your hardware RAID device had 8 disks then for most real world uses of RAID hardware SATA 2 speed wouldn't be a bottleneck. Some SSDs would probably be able to deliver more than 300MB/s, but not the ones I bought.

On Sat, 6 Dec 2014, Brett Pemberton <brett.pemberton@gmail.com> wrote:
Keep in mind, with ZFS for example, you can't do like mdadm, and just add a drive to a RAID5/6 array and reshape, growing to use more devices.
Your best bet is to start off with the maximum number of drives you plan to use, and upgrade capacities, instead of adding more devices. Can't comment on BTRFS.
Yes, ZFS doesn't support adding more disks to a RAID array. You can add more RAID arrays to the pool but as you can never remove them you probably don't want to.

BTRFS is much better for changing things. It supports things like a "RAID-1" array where the disks have different sizes (as long as no single disk is larger than all the others combined there will be no wasted space). Also it allows removing disks while online.

On Sat, 6 Dec 2014, Chris Samuel <chris@csamuel.org> wrote:
Can't comment on BTRFS.
The RAID5/6 code is very experimental and I wouldn't suggest trying to use that functionality for any data you're attached to. Stick to 2 drives, and good backups, for btrfs.
I have servers running BTRFS RAID-1, but so far I haven't even tested BTRFS RAID-5/6. Right at this moment they are merging patches that should make it theoretically usable, but I'd rather have someone else test it first.

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

BTRFS is much better for changing things. It supports things like a "RAID-1" array where the disks have different sizes (as long as no single disk is larger than all the others combined there will be no wasted space). Also it allows removing disks while online.
Is there somewhere that describes how it allocates files to drives? I had an array of 4 bcache volumes - 2 x 500GB + 2 x 2TB, and the 2TB drives were filling up and the 500GB drives were empty. Then I had problems with bcache so I progressively removed each drive and re-added without bcache, with a 2TB drive as the last one, so now it looks like this:

size 448.76GiB used 252.00GiB path /dev/sda3
size 448.76GiB used 251.03GiB path /dev/sdb3
size 1.80TiB used 1.19TiB path /dev/sdd3
size 1.80TiB used 713.00GiB path /dev/sdc3

And I assume it's ultimately aiming for some optimal ratio of disk use, but I'm curious about what that ratio is...
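For reference, per-device figures like those above, and the split between data and metadata allocation, can be checked with something like this (assuming the filesystem is mounted at /mnt):

  btrfs filesystem show /mnt   # per-device size and allocated space
  btrfs filesystem df /mnt     # space broken down by data/metadata profile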
On Sat, 6 Dec 2014, Chris Samuel <chris@csamuel.org> wrote:
Can't comment on BTRFS.
The RAID5/6 code is very experimental and I wouldn't suggest trying to use that functionality for any data you're attached to. Stick to 2 drives, and good backups, for btrfs.
I have servers running BTRFS RAID-1, but so far I haven't even tested BTRFS RAID-5/6. Right at this moment they are merging patches that should make it theoretically usable, but I'd rather have someone else test it first.
I'm kind of surprised that RAID[56] is getting more attention than any SSD caching project, which I see as a necessity before I'd contemplate using striped RAID levels. bcache solves the problem to some extent, but BTRFS really needs to fully control the cache...

Do you know if BTRFS RAID[56] is "proper" RAID, e.g. striped across all the disks, small writes require read-modify-write, etc? Or is it some fancy RAID[56]-like implementation that avoids some of these shortcomings?

I'm using btrfs on all my new home machines now. cp --reflink is really really awesome!

James
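For anyone who hasn't tried it, a reflink copy is a one-liner (file names here are just placeholders):

  cp --reflink=always big-image.iso big-image-copy.iso   # instant CoW copy; extents are shared until modified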

On Sat, Dec 6, 2014 at 9:16 AM, Tim Hamilton <hamilton.tim@gmail.com> wrote:
I'm in need of a PCIe SATA controller that works well under Linux and supports BTRFS / ZFS RAID arrays, to be used in a home NAS I'm building, and I'm not sure what I'm looking for.
My current need is only for a single additional SATA port but my preference is for a card that will support future expansion
Keep in mind, with ZFS for example, you can't do like mdadm, and just add a drive to a RAID5/6 array and reshape, growing to use more devices. Your best bet is to start off with the maximum number of drives you plan to use, and upgrade capacities, instead of adding more devices. Can't comment on BTRFS.

FWIW, I've bought a few of these over the years:
http://www.msy.com.au/vic/pascoevale/peripherals/8896-channel-pes3a-020-pci-...
Work fine under Linux/FreeBSD. No complaints. Cheap.

/ Brett
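For comparison, the mdadm reshape Brett is referring to looks roughly like this (a sketch only; the md and partition names are placeholders, and resize2fs assumes an ext filesystem on top):

  mdadm --add /dev/md0 /dev/sde1            # add the new disk as a spare
  mdadm --grow /dev/md0 --raid-devices=5    # reshape the RAID5/6 across five devices
  resize2fs /dev/md0                        # grow the filesystem once the reshape finishes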

On Sat, 6 Dec 2014 02:02:50 PM Brett Pemberton wrote:
Can't comment on BTRFS.
The RAID5/6 code is very experimental and I wouldn't suggest trying to use that functionality for any data you're attached to. Stick to 2 drives, and good backups, for btrfs.

cheers,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC

On Sat, 6 Dec 2014 02:02:50 PM Brett Pemberton wrote:
Can't comment on BTRFS.
The RAID5/6 code is very experimental and I wouldn't suggest trying to use that functionality for any data you're attached to. Stick to 2 drives, and good backups, for btrfs.
The 'mirror' RAID method of BTRFS is functionally the same as RAID5 from a 'you can lose one disk' point of view. BTRFS makes sure every piece of data is stored on 2 disks, so you can lose any one disk regardless of how big your array is. If there was such a thing as an 'always store 3 copies' mode then you could lose 2 disks from your array, which is functionally the same as RAID6 from that point of view. And depending on your workload this could be an improvement over RAID[56] from a performance point of view too (especially if you had a failed disk - that really hurts!)

James

On Sat, 6 Dec 2014 12:08:37 PM James Harper wrote:
The 'mirror' RAID method of BTRFS is functionally the same as RAID5 from a 'you can lose one disk' point of view. BTRFS makes sure every piece of data is stored on 2 disks, so you can lose any one disk regardless of how big your array is.
The problem is that people assume that btrfs raid1 profile is the same as MD in that all data is on all drives, when it's (sadly) not. With MD you can have a 3 drive mirror and safely lose 2 drives, but with btrfs you can't. It's one thing I'd love to have fixed.

I should mention that Miao Xie from Fujitsu is doing a lot of work on the RAID5/6 code, including posting patchsets such as "Implement device scrub/replace for RAID56", which is important.

All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
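For contrast, the MD three-way mirror Chris describes (every block on all three drives, so any two can fail) would be created with something like this (partition names assumed):

  mdadm --create /dev/md0 --level=1 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1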

Thanks Brett. I wasn't aware of RaidZ's limitation on adding devices. A quick scan of the BTRFS documentation shows that it doesn't have the same limitation. However, RAID 5 and 6 are still listed as having significant issues with data recovery and should be used for testing purposes only. [1]

Given my plan is to increase capacity by adding, rather than replacing, drives it looks like an mdadm managed RAID 5/6 pool might be the better option.

[1] https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices

On Sat, Dec 6, 2014 at 2:02 PM, Brett Pemberton <brett.pemberton@gmail.com> wrote:
On Sat, Dec 6, 2014 at 9:16 AM, Tim Hamilton <hamilton.tim@gmail.com> wrote:
I'm in need of a PCIe SATA controller that works well under Linux and supports BTRFS / ZFS RAID arrays, to be used in a home NAS I'm building, and I'm not sure what I'm looking for.
My current need is only for a single additional SATA port but my preference is for a card that will support future expansion
Keep in mind, with ZFS for example, you can't do like mdadm, and just add a drive to a RAID5/6 array and reshape, growing to use more devices.
Your best bet is to start off with the maximum number of drives you plan to use, and upgrade capacities, instead of adding more devices. Can't comment on BTRFS.
FWIW, I've bought a few of these over the years:
http://www.msy.com.au/vic/pascoevale/peripherals/8896-channel-pes3a-020-pci-...
Work fine under Linux/FreeBSD. No complaints. Cheap.
/ Brett
-- Vote NO in referenda.

On Sat, Dec 6, 2014 at 9:40 PM, Tim Hamilton <hamilton.tim@gmail.com> wrote:
Given my plan is to increase capacity by adding, rather than replacing, drives it looks like an mdadm managed raid 5/6 pool might be the better option.
My method was to decide on my fully populated capacity: 9 drives, and then just source old drives to make that happen. So my initial raidz array was 3 x raidz1 vdevs, each with 3 drives. In my initial state these were 500GB drives I had lying around, or sourced for very cheap. 4 of them were even PATA.

I was then able to replace them with larger drives, 3 at a time. 2TB drives were best cost/capacity at one point, so I replaced 3 drives with those. Next time I needed a storage boost, I did it again with 3 more of the 500GB drives (now free of PATA). Most recently, I replaced the last bunch with 4TB drives. Next time a jump is needed, 3x2TB drives will move to 4TB.

As long as you can source some cheap old (but still relatively reliable) drives, you can still use ZFS in the same way as mdadm.

cheers,
/ Brett
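A rough sketch of the layout Brett describes, with placeholder device names (in practice you'd use /dev/disk/by-id paths):

  zpool create tank raidz1 sda sdb sdc raidz1 sdd sde sdf raidz1 sdg sdh sdi
  # later, upgrade one vdev by swapping its drives for bigger ones, one at a time
  zpool set autoexpand=on tank
  zpool replace tank sda sdj   # repeat for sdb and sdc; the extra space appears
                               # once the whole vdev has been upgraded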

On Sat, Dec 6, 2014 at 10:36 PM, Brett Pemberton <brett.pemberton@gmail.com> wrote:
My method was to decide on my fully populated capacity: 9 drives, and then just source old drives to make that happen.
So my initial raidz array was 3 x raidz1 vdevs, each with 3 drives. In my initial state these were 500GB drives I had lying around, or sourced for very cheap. 4 of them were even PATA. I was then able to replace them with larger drives, 3 at a time.
As long as you can source some cheap old (but still relatively reliable) drives, you can still use ZFS in the same way as mdadm.
This is a scenario I hadn't considered. My box has a max capacity of 8 properly mounted drives. I have 4 x 3TB for a storage pool and a 256GB SSD for the OS and for some regularly used but not so precious data. I also have a bunch of older drives of random sizes lying about which I could use; I don't think any of these are the same size.

If I was to follow your example, I could create 2 raidz1 vdevs of 3 drives each, 1 of these using the 3 x 3TB drives and 1 making use of the other 3TB drive and my 2 biggest-capacity old drives. This second vdev would obviously be limited to the capacity of the smallest of its drives. I could then grow my storage pool when needed by upgrading the two older drives to match the larger 3TB. One downside would be that, for now, I wouldn't be making use of a big chunk of one of the 3TB drives. On the other hand, I don't yet have the data to fill it.

The other obvious path would be to simply use the 4 x 3TB drives in a RAID 5 pool managed by mdadm and forgo ZFS filesystem features, though I did want to make use of snapshots.

Seems I should have posted here for ideas prior to purchasing my storage. If any of you had my hardware, how would you construct your storage layout?

--
Vote NO in referenda.

On Tue, 9 Dec 2014, Tim Hamilton <hamilton.tim@gmail.com> wrote:
If I was to follow your example, I could create 2 raidz1 vdevs of 3 drives each, 1 of these using the 3 x 3TB drives and 1 making use of the other 3TB drive and my 2 biggest-capacity old drives. This second vdev would obviously be limited to the capacity of the smallest of its drives. I could then grow my storage pool when needed by upgrading the two older drives to match the larger 3TB. One downside would be that, for now, I wouldn't be making use of a big chunk of one of the 3TB drives. On the other hand, I don't yet have the data to fill it.
The other obvious path would be to simply use the 4 x 3TB drives in a RAID 5 pool managed by mdadm and forgo ZFS filesystem features, though I did want to make use of snapshots.
Why not just use 4*3TB disks in a RAID-Z array? That gives you 9TB of usable space.

5TB disks are on sale now and aren't particularly expensive, if every time you have a problem with a 3TB disk you replace it with a 5TB disk (or 6TB when they become available) then you'll probably end up with a couple of 5+TB disks in a few years.

ZFS SHOULD work over a mdadm RAID-0 array or a LVM volume that spans multiple physical disks. So if you find yourself with 2*3TB and 2*5TB disks in your array you could replace those 3TB disks with RAID-0 arrays that have 5TB capacity.

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
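As a sketch of that suggestion (device names are placeholders, and the RAID-0-as-pool-member part is the untested "SHOULD work" idea above):

  zpool create tank raidz1 /dev/sdb /dev/sdc /dev/sdd /dev/sde   # 4 x 3TB raidz1, ~9TB usable
  # later, stand in for one 3TB member with two smaller disks striped together
  mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdf /dev/sdg
  zpool replace tank /dev/sdb /dev/md1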

On 9 December 2014 at 20:22, Tim Hamilton <hamilton.tim@gmail.com> wrote:
If any of you had my hardware, how would you construct your storage layout?
The more disks you have, the higher the chance of having a disk failure. The older the disks you have, the higher the chance of having a disk failure. I like low-maintenance, high-reliability solutions where they fit well, so I would aim for a system that uses a few, large, disks, and plan to replace them in 2-3 years.

I'd start by trying to estimate how much storage I might want within the next couple of years. Let's say I want to have 2TB of storage. In that case, I'd purchase two 2TB drives, mirror them in btrfs, done.[1]

mkfs.btrfs -m raid1 -d raid1 /dev/disk/by-id/foo1 /dev/disk/by-id/foo2

Time goes by, the amount of data I'm collecting ramps up hugely, I need more space. So, I'd buy a couple of 4TB drives, and add them to the pool and then perhaps rebalance:

btrfs device add /dev/disk/by-id/foo3 /mnt
btrfs device add /dev/disk/by-id/foo4 /mnt
btrfs balance start /mnt

Later, I'd remove the original drives as they were getting old, and probably look at replacing them with bigger drives.

btrfs device delete /dev/disk/by-id/foo1 /mnt
btrfs device delete /dev/disk/by-id/foo2 /mnt

Anyway, this all works, at least on the 3.17 kernels. An aside: on much earlier ones I noticed that the free space reported by 'df' was always quite wonky for RAID1 btrfs filesystems, but it seemed to get fixed eventually.

Footnotes:
1: I'd actually buy three drives, with the third an externally housed one used for off-site backup.
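A balance on a large filesystem can take a long time; it can be monitored, paused and resumed while the filesystem stays in use (mount point assumed to be /mnt as above):

  btrfs balance status /mnt
  btrfs balance pause /mnt    # and later: btrfs balance resume /mnt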

On Wed, 10 Dec 2014, Toby Corkindale <toby@dryft.net> wrote:
Let's say I want to have 2TB of storage. In that case, I'd purchase two 2TB drives, mirror them in btrfs, done.[1]
mkfs.btrfs -m raid1 -d raid1 /dev/disk/by-id/foo1 /dev/disk/by-id/foo2
Actually buy at least 3TB disks. The MSY prices are $92 and $123 respectively. It's a good idea to pay an extra $30 per disk to delay your need to buy bigger disks (which is expensive and takes some time and effort) and to give better performance (as a rule of thumb bigger drives are faster, especially if you partition it to put your data on the outer tracks). Even buying 4TB disks if you need to store 2TB of data may be a good idea, it's only $178 for 4TB.
Time goes by, the amount of data I'm collecting ramps up hugely, I need more space. So, I'd buy a couple of 4TB drives, and add them to the pool and then perhaps rebalance:
btrfs device add /dev/disk/by-id/foo3 /mnt
btrfs device add /dev/disk/by-id/foo4 /mnt
btrfs balance start /mnt
Later, I'd remove the original drives as they were getting old, and probably look at replacing them with bigger drives.
btrfs device delete /dev/disk/by-id/foo1 /mnt
btrfs device delete /dev/disk/by-id/foo2 /mnt
Note that "btrfs replace" is MUCH faster than a balance or delete operation. Also last time I measured it an idle but spinning SATA disk added about 7W to the total power use of a PC (IE the power taken from the power point). Among other things that increases cooling problems in summer, noise, and as you noted more disks means a greater probability of failure. I've currently got 2*4TB disks in a BTRFS RAID-1 array for my home server. When that gets almost full I'll consider adding a pair of 1TB or 2TB disks to the array if 6TB is still the biggest capacity on sale at that time. But if they have 10TB disks on sale and 6TB disks going cheap (as is likely to be the case) then I'll probably just buy new 6TB disks. I could use a couple of 4TB disks for backups.
Footnotes: 1: I'd actually buy three drives, with the third an externally housed one used for off-site backup.
Buy 4 drives and use 1 for local backup and 1 for off-site. Even when your first line backup is BTRFS snapshots you still want to have more than one removable backup device.

Another possibility is to use an old PC with a few disks in a RAID-5 or RAID-6 array for local backup. I've been considering getting a large tower PC filled with old disks (~1TB capacity) in a BTRFS RAID-5 configuration for local backup. The noise and heat of all the disks wouldn't matter as I'd only turn it on when doing a backup.

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
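A minimal sketch of that kind of setup, assuming /data is a btrfs subvolume and the removable backup drive holds a btrfs filesystem mounted at /backup:

  # read-only snapshot as the first-line backup
  btrfs subvolume snapshot -r /data /data/.snapshots/2014-12-10
  # copy the snapshot to the removable filesystem
  btrfs send /data/.snapshots/2014-12-10 | btrfs receive /backup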

On 10 December 2014 at 13:06, Russell Coker <russell@coker.com.au> wrote:
On Wed, 10 Dec 2014, Toby Corkindale <toby@dryft.net> wrote:
Let's say I want to have 2TB of storage. In that case, I'd purchase two 2TB drives, mirror them in btrfs, done.[1]
mkfs.btrfs -m raid1 -d raid1 /dev/disk/by-id/foo1 /dev/disk/by-id/foo2
Actually buy at least 3TB disks. The MSY prices are $92 and $123 [snip]
I was really just using the size as an arbitrary amount for example's sake. I haven't looked recently to see what sort of sizes are good value, but I would expect people using this advice to do so, yes.
Time goes by, the amount of data I'm collecting ramps up hugely, I need more space. So, I'd buy a couple of 4TB drives, and add them to the pool and then perhaps rebalance:
btrfs device add /dev/disk/by-id/foo3 /mnt
btrfs device add /dev/disk/by-id/foo4 /mnt
btrfs balance start /mnt
Later, I'd remove the original drives as they were getting old, and probably look at replacing them with bigger drives.
btrfs device delete /dev/disk/by-id/foo1 /mnt
btrfs device delete /dev/disk/by-id/foo2 /mnt
Note that "btrfs replace" is MUCH faster than a balance or delete operation.
Thanks Russell, that's interesting to know. Might not be an option for someone if they only have four or five ports, but sounds good if you do have a spare.
Another possibility is to use an old PC with a few disks in a RAID-5 or RAID-6 array for local backup. I've been considering getting a large tower PC filled with old disks (~1TB capacity) in a BTRFS RAID-5 configuration for local backup. The noise and heat of all the disks wouldn't matter as I'd only turn it on when doing a backup.
You start getting into higher failure rates there -- old drives, lots of them, frequently getting spun up and down..

On Wed, 10 Dec 2014, Toby Corkindale <toby@dryft.net> wrote:
On 10 December 2014 at 13:06, Russell Coker <russell@coker.com.au> wrote:
On Wed, 10 Dec 2014, Toby Corkindale <toby@dryft.net> wrote:
Let's say I want to have 2TB of storage. In that case, I'd purchase two 2TB drives, mirror them in btrfs, done.[1]
mkfs.btrfs -m raid1 -d raid1 /dev/disk/by-id/foo1 /dev/disk/by-id/foo2
Actually buy at least 3TB disks. The MSY prices are $92 and $123
[snip]
I was really just using the size as an arbitrary amount for example's sake. I haven't looked recently to see what sort of sizes are good value, but I would expect people using this advice to do so, yes.
I know, but avoiding drive replacements as much as possible is a really good strategy that should be considered.
Note that "btrfs replace" is MUCH faster than a balance or delete operation.
Thanks Russell, that's interesting to know. Might not be an option for someone if they only have four or five ports, but sounds good if you do have a spare.
Even if you have no spare ports it's an option. Transferring data over USB 2.0 is usually a lot faster than a BTRFS balance or remove. So one option would be to put the disk to be replaced in a USB-SATA caddy for the duration of the replace operation. The "btrfs replace" operation has a -r flag to only read from the original disk if the other disks don't have the data. When you use this in the normal situation there will be almost no reads from the original disk.

BTRFS balance and remove are REALLY slow. They're so slow that they're the subject of regular bug reports on the mailing list.
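A sketch of that replace operation (device names and mount point are placeholders):

  # replace the outgoing device while the filesystem stays mounted; -r avoids
  # reading from it unless no other copy of the data exists
  btrfs replace start -r /dev/sdc3 /dev/sde3 /mnt
  btrfs replace status /mnt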
Another possibility is to use an old PC with a few disks in a RAID-5 or RAID-6 array for local backup. I've been considering getting a large tower PC filled with old disks (~1TB capacity) in a BTRFS RAID-5 configuration for local backup. The noise and heat of all the disks wouldn't matter as I'd only turn it on when doing a backup.
You start getting into higher failure rates there -- old drives, lots of them, frequently getting spun up and down..
True. But RAID-5 with checksums should cope with that reasonably well, and RAID-6 is even better. The way ZFS uses multiple copies of metadata is the best, hopefully BTRFS will get that feature soon.

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

On 9 December 2014 at 20:22, Tim Hamilton <hamilton.tim@gmail.com> wrote:
If any of you had my hardware, how would you construct your storage layout?
The more disks you have, the higher the chance of having a disk failure. The older the disks you have, the higher the chance of having a disk failure. I like low-maintenance, high-reliability solutions where they fit well, so I would aim for a system that uses a few, large, disks, and plan to replace them in 2-3 years.
Your 7200RPM disk very likely gets under 100 IOPS - more like 75. If you are aiming for capacity with no regard for performance then fewer, larger disks is likely a good solution, but if you have any performance requirements at all then a greater number of smaller disks is a better option. Performance should scale pretty much linearly with an increasing number of disks for RAID10.

Failing that, get a couple of small SSDs in a RAID1 configuration and run bcache in front of your rotating disks. The performance difference is amazing.

James
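A rough sketch of that arrangement, with assumed device names (/dev/sdb as the big rotating disk or array, /dev/sdc1 and /dev/sdd1 as small SSD partitions):

  # mirror the two SSD partitions, then use the mirror as the bcache cache device
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
  make-bcache -C /dev/md1 -B /dev/sdb
  # the cached device appears as /dev/bcache0; build the filesystem on that
  mkfs.btrfs /dev/bcache0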

On 10 December 2014 at 17:33, James Harper <james@ejbdigital.com.au> wrote:
On 9 December 2014 at 20:22, Tim Hamilton <hamilton.tim@gmail.com> wrote:
If any of you had my hardware, how would you construct your storage layout?
The more disks you have, the higher the chance of having a disk failure. The older the disks you have, the higher the chance of having a disk failure. I like low-maintenance, high-reliability solutions where they fit well, so I would aim for a system that uses a few, large, disks, and plan to replace them in 2-3 years.
Your 7200RPM disk very likely gets under 100 IOPS - more like 75. If you are aiming for capacity with no regard for performance then fewer, larger disks is likely a good solution, but if you have any performance requirements at all then a greater number of smaller disks is a better option. Performance should scale pretty much linearly with an increasing number of disks for RAID10.
Failing that, get a couple of small SSDs in a RAID1 configuration and run bcache in front of your rotating disks. The performance difference is amazing.
I suspect the OP (and most home users) are thinking more of large-file (probably AV media) archives, and for larger files, spinning disks provide adequate performance. I agree, for real performance, SSDs are the way to go. Half-terabyte SSDs are affordable and have been for a while, so I don't bother with bcache or similar; I just go direct to SSD if performance matters.

On Sat, Dec 06, 2014 at 02:02:50PM +1100, Brett Pemberton wrote:
On Sat, Dec 6, 2014 at 9:16 AM, Tim Hamilton <hamilton.tim@gmail.com> wrote:
I'm in need of a PCIe SATA controller that works well under Linux and supports BTRFS / ZFS RAID arrays, to be used in a home NAS I'm building, and I'm not sure what I'm looking for.
My current need is only for a single additional SATA port but my preference is for a card that will support future expansion
Keep in mind, with ZFS for example, you can't do like mdadm, and just add a drive to a RAID5/6 array and reshape, growing to use more devices.
Your best bet is to start off with the maximum number of drives you plan to use, and upgrade capacities, instead of adding more devices.
remember, though, that you can always add another vdev to a ZFS pool. you can't reshape a zpool, or remove a vdev from it (and zfs doesn't have a 'rebalance' feature like btrfs) but adding a vdev works.

e.g. if you have a zpool consisting of one three drive raidz vdev, you could later add another vdev of any kind (e.g. another raidz, a 2-drive mirror, etc. a vdev can even be a single disk but you get no redundancy from that - similar to raid0). a zpool is made up of one or more vdevs, which are each made up of one or more drives.

you can also replace the drives in any vdev with larger capacity drives....when all drives in that vdev are replaced, the extra capacity is available for storage.

craig

ps: performance on a zfs pool is notably improved (especially for random writes) if you have a small (4 or 8GB is plenty) SSD partition as a zfs log device.

--
craig sanders <cas@taz.net.au>
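A sketch of adding another vdev to an existing pool (the pool name "tank" and device names are placeholders):

  zpool add tank raidz1 /dev/sdf /dev/sdg /dev/sdh   # new three-drive raidz1 vdev
  # zpool add tank mirror /dev/sdi /dev/sdj          # or a two-drive mirror instead
  zpool status tank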

On Sat, Dec 06, 2014 at 11:46:29PM +0000, James Harper wrote:
ps: performance on a zfs pool is notably improved (especially for random writes) if you have a small (4 or 8GB is plenty) SSD partition as a zfs log device.
But make sure your SSD partition is itself in a RAID of some sort too :)
not necessarily. redundancy in the ZIL is good, but can be achieved simply by having two ZIL devices.

in fact, it's better to do it this way than to use a raid device - zfs works much better if it has direct knowledge and control of all devices in the pool. in particular, you should never give a raid device to zfs. instead, give the individual drives in the array to zfs and use zfs to construct the raid, otherwise you'll lose most of the benefit of using zfs (including zfs' ability to repair data corruption). see http://en.wikipedia.org/wiki/ZFS#ZFS_and_hardware_RAID

e.g. i have two SSDs in my main system, partitioned to provide an mdadm raid1 mirror for my rootfs (170GB), mdadm raid1 /boot (1GB), two swap partitions, two non-raid ZIL partitions (4GB each), and two non-raid zfs l2arc cache partitions (50GB each).

it's not a perfect setup - if money was irrelevant, i would have separate devices for root+boot, zil, and l2arc....but it's a reasonable compromise on speed, redundancy, and cost. i'd also need at least another 4 SATA ports if i had them on separate devices.

my zpools (two of them, one for daily use and one as a backup pool for all systems on my network) are separate drives on an IBM M1015 SAS controller (around $100 for 8 ports which can handle SAS or SATA drives). i have a similar setup on my mythtv box, which also has a zpool for bulk storage but only a single SSD for root, /boot, zil, and l2arc cache.

craig

--
craig sanders <cas@taz.net.au>
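A sketch of the log/cache part of that layout (partition names are placeholders, e.g. a 4GB and a 50GB partition on each of two SSDs):

  zpool add tank log /dev/sda4 /dev/sdb4      # two separate (un-mirrored) log devices
  zpool add tank cache /dev/sda5 /dev/sdb5    # two L2ARC cache partitions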
participants (8): Brett Pemberton, Chris Samuel, Craig Sanders, James Harper, Rohan McLeod, Russell Coker, Tim Hamilton, Toby Corkindale