
On Mon, Jul 23, 2012 at 02:44:26PM +1000, Brett Pemberton wrote:
> I'm doing this by creating a new fs for this section, moving the files into there, and then turning dedup on for that FS. Presume this is the proper method.
almost. turn de-dupe on first: dupe-checking is done at write time, so only data written after dedup is enabled gets de-duped. the same applies to enabling compression (only files written after compression is enabled will be compressed), and to balancing files over all spindles if you add another vdev to a zpool (btrfs has an auto-rebalance option, zfs doesn't. it's the one nice thing btrfs has that zfs doesn't. OTOH zfs can be trusted not to lose your data and btrfs can't yet).

you could manually de-dupe, compress, rebalance, etc. by writing a script to copy & delete each file, but the technical term for that procedure is "a massive PITA" :)
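for example, assuming your pool/filesystem is called "tank/data" (substitute your own names):

    # "tank/data" is just an example dataset name
    zfs set dedup=on tank/data
    zfs set compression=on tank/data

and the "massive PITA" rewrite script would be roughly something like this (untested sketch, regular files only, ignores snapshots and hard links):

    for f in /tank/data/*; do
        [ -f "$f" ] || continue
        # copying creates new (deduped/compressed) blocks, then rename over the original
        cp -p -- "$f" "$f.tmp" && mv -- "$f.tmp" "$f"
    done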
> That section is around 400GB. The machine in question currently has 16GB of RAM, so it'll be interesting to see how things go with it on.
planning to post your findings here? i'd be interested to read how it turns out.
zfs compression, OTOH, is definitely worthwhile if most of the data you're storing is not already compressed.
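once it's enabled and some data has been written, you can check whether it's actually paying off with something like ("tank/data" again being just an example name):

    # compressratio reports the ratio achieved on data written so far
    zfs get compression,compressratio tank/data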
> Again, this will be limited to certain directories. Is the best practice to create separate filesystems for those areas?
you enable/disable compression on a per-filesystem basis, so yes (quick example below). note, however, that the one minor drawback of multiple filesystems vs. just a subdirectory is that even though the files are on the same physical disks, moving files from one zfs filesystem to another is a copy-and-delete operation (so time-consuming), i.e. the same as having multiple filesystems on partitions or LVM volumes.

craig

--
craig sanders <cas@taz.net.au>
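p.s. the per-filesystem setup mentioned above would look something like this ("tank" and the dataset names are made up, adjust to suit):

    # create the new dataset with compression on from the start
    zfs create -o compression=on tank/archive
    # moving the existing files in is a copy+delete even within the same pool
    mv /tank/olddir/* /tank/archive/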