
On 16/07/13 09:45, Petros wrote:
Quoting "Tim Connors" <tconnors@rather.puzzling.org>
Don't ever use more than 80% of your file system? Yeah, I know that's not a very acceptable alternative.
The 80% is a bit of an "old time myth"; I was running ZFS with higher usage under FreeBSD until I hit the "slowness".
Chipping in with a bit of real-world info for you here. Yes, ZFS on Linux still suffers from massive slowdowns once most of the free space is used up. It's more than 80%, I'll grant, but not by a whole lot - maybe 90%?
BTW: I don't use dedup. Firstly because I use cloning a lot, and after that the changes are unique to the ZFS filesystem in question.
I have trouble coming up with a scenario where I would use it, but I am pretty sure someone asked for it. Maybe someone running a big server farm and distributing copies of many gigabytes of data to many VMs on the same box?
Dedup is really painful on zfs. Attempting to use it over a multi-terabyte pool ended in failure; using it on just a subset of the data worked out OK, but we ended up with >50GB of memory in the server to cope. Fine for a big server, but how many of you are running little fileservers at home with that much?

You do save on a fair bit of disk space if you're storing a pile of virtual machine images that contain a lot of similarities.

-Toby
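P.S. If you want to gauge the cost before enabling dedup, zdb can simulate it against an existing pool and report dedup-table statistics. Roughly like this (the pool name is just an example, and the per-entry size is only a rule of thumb):

  # simulate dedup on an existing pool and print a DDT histogram
  # (read-only; it doesn't change anything on the pool)
  zdb -S tank

  # very rough RAM estimate: each in-core dedup-table entry costs on the
  # order of 320 bytes, so multiply the total number of allocated blocks
  # reported above by ~320 to see how much memory the table would want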

On Tue, Jul 16, 2013 at 11:02:16AM +1000, Toby Corkindale wrote:
On 16/07/13 09:45, Petros wrote:
The 80% is a bit of an "old time myth"; I was running ZFS with higher usage under FreeBSD until I hit the "slowness".
Chipping in with a bit of real-world info for you here. Yes, ZFS on Linux still suffers from massive slowdowns once most of the free space is used up. It's more than 80%, I'll grant, but not by a whole lot - maybe 90%?
traditional filesystems like ext4 reserve 5% for root anyway, so there's not much difference between 90% and 95%.
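e.g. a quick way to compare the two, if you want to check (device and pool names here are just examples):

  # ext4: show the reserved block count, and set the reserve to 5%
  tune2fs -l /dev/sdb1 | grep -i 'reserved block count'
  tune2fs -m 5 /dev/sdb1

  # zfs: keep an eye on how full the pool is getting (CAP column)
  zpool list -o name,size,allocated,free,capacity tank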
Dedup is really painful on zfs. Attempting to use it over a multi-terabyte pool ended in failure; using it on just a subset of the data worked out OK, but we ended up with >50GB of memory in the server to cope.
IIRC, de-duping can be enabled on a per subvolume/zvol basis, but the RAM/L2ARC requirement still depends on the size of the pool and NOT the volume. otherwise, i'd enable it on some of the subvolumes on my own pools.

so if you want de-duping with a small memory footprint, make a small zpool dedicated just for that. could be a good way to get economical use of a bunch of, say, 250 or 500GB SSDs for VM images.

hmmm, now that i think of it, that's actually a scenario where the cost-benefit of de-duping would really be worthwhile. and no need for L2ARC or ZIL on that pool, they'd just be a bottleneck (e.g. 4 SSDs accessed directly have more available IOPS than 4 SSDs accessed through 1 SSD as "cache").

use a pool of small, fast SSDs as de-duped "tier-1" storage for VM rootfs images, and a mechanical disk pool for bulk data storage.
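something like this would do it (device names, sizes and mountpoint are just examples, untested):

  # small dedicated pool of SSDs just for de-duped VM images, kept separate
  # from the bulk-storage pool so the dedup table stays small
  zpool create vmpool mirror /dev/disk/by-id/ata-SSD_A /dev/disk/by-id/ata-SSD_B

  # enable dedup (and compression) only on this pool's dataset
  zfs create -o dedup=on -o compression=lz4 \
      -o mountpoint=/var/lib/vmimages vmpool/images

  # check how well it's paying off later on
  zpool get dedupratio vmpool
  zfs get compressratio vmpool/images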
You do save on a fair bit of disk space if you're storing a pile of virtual machine images that contain a lot of similarities.
it also improves performance, as the shared blocks for common programs like sed, grep and many others are more likely to be cached in ARC or L2ARC.

craig

--
craig sanders <cas@taz.net.au>

BOFH excuse #382:

Someone was smoking in the computer room and set off the halon systems.