
On 14/09/11 22:20, Craig Sanders wrote:
> On Wed, Sep 14, 2011 at 05:06:01PM +1000, Toby Corkindale wrote:
>> All tests were performed with the /var/lib/postgresql directory being a separate partition, with a freshly-created filesystem for the test.
> 'zpool create ...' followed by 'zfs create -o mountpoint=/var/lib/postgresql postgresql' or similar?
Yeah, almost exactly that, I think.
I didn't do any tuning beyond the mount options mentioned in my post, so no, Pg didn't get an 8K block size.
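For reference, a minimal sketch of the kind of setup I mean - the pool and dataset names here are just placeholders, and the recordsize line is the 8K tuning I skipped:

  # striped pool across three whole disks (no redundancy)
  zpool create tank sdb sdc sdd
  # dataset mounted where PostgreSQL keeps its data
  zfs create -o mountpoint=/var/lib/postgresql tank/postgresql
  # optional: match the ZFS recordsize to PostgreSQL's 8K pages
  zfs set recordsize=8k tank/postgresql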
>> I'm not convinced on the 8K vs 4K block size though - some benchmarks
> i have no opinion one way or the other yet. haven't done the testing myself. just reporting what i'd read.
> 4K block-sizes (i.e. ashift=12 rather than default ashift=9 when you create the zpool), on the other hand, are really important. especially if you have some "Advanced Format" 4K sector drives, or there's a chance that you'll be adding some to the pool in the future. which is pretty much a certainty with most 2TB and (i think) all 3TB drives being 4K. I've even got a few 1TB drives that are 4K sectors (2 of them in my zpool).
Yep, I'm aware of the 4K Sector stuff, and do configure it correctly when needed. (Although Ubuntu has auto-detected and handled it correctly itself for a while, I've noticed.) The disks in question here were 1TB disks with standard 512-byte sectors though.
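For anyone following along: the 4K alignment is set at pool-creation time and can't be changed afterwards for an existing vdev. Roughly, on the ZFS-on-Linux builds I've seen (pool and device names are placeholders, and I haven't re-verified the zdb incantation here):

  # ashift=12 means 2^12 = 4096-byte sectors
  zpool create -o ashift=12 tank sdb sdc sdd
  # check what the pool actually ended up with
  zdb -C tank | grep ashift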
>> I noticed that.. I'd like to experiment with those again some time, but as it stands, just having three fast drives striped seems to work pretty well anyway.
> just striped...scary for anything but "I really don't care if i lose it all" data.
> zfs is good, but it's not magic. it can't recover data when a disk dies if there isn't another copy on the other disks in the pool.
> for safety, stick another drive in and use raidz. it's like raid-5 but without (most of) the raid-5 write performance problems.
Oh, don't worry - I know what I'm doing! This was a purpose-built setup for benchmarking these filesystems - basically a stock install of Ubuntu Server - so it's trivial to recreate. Anything important gets stored on RAID 5, 6 or 10. In the case of this benchmark, I wanted to have something more closely approximating real-world DB servers, which usually have several spindles in a RAID-10 configuration.
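To make the layouts we're talking about concrete, they'd be created along these lines (device names purely illustrative):

  # plain stripe - fast, but no redundancy (what I benchmarked)
  zpool create tank sdb sdc sdd
  # raidz - single-parity redundancy, as Craig suggests
  zpool create tank raidz sdb sdc sdd sde
  # striped mirrors - the ZFS equivalent of RAID-10
  zpool create tank mirror sdb sdc mirror sdd sde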
>> Looking at disk stats as I ran the benchmarks, I noticed that the one that managed to generate the most disk I/O (in MB/sec) was ext4-nobarrier, followed by ZFS. Everything else was quite a way behind. ZFS appeared to be using a lot more CPU than ext4 though.. not sure what to make of that.. I guess the extra complexity in the FS has to cause something!
> compression enabled?
Nope. I suspect Chris had it right about all the checksumming being done. (Although btrfs didn't use that much juice.. but it also didn't manage much throughput, so perhaps its CPU usage would have risen if it had.)
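For anyone wanting to check their own setup, the relevant properties are per-dataset and easy to inspect (the dataset name below is a placeholder):

  # see whether compression and checksumming are in play
  zfs get compression,checksum,recordsize tank/postgresql
  # compression is off by default; it can be turned on per-dataset
  zfs set compression=on tank/postgresql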
> ZFS does use a fair amount of CPU power. it does a lot more than most filesystems. also, given that it's designed for "Enterprise" environments, it can make more generous assumptions about hardware capabilities and performance than more consumer-oriented dev teams can get away with.
Agreed - although, wasn't OS X planning to use ZFS at some point? What happened with that?