
On Fri, Feb 10, 2012 at 06:53:23PM -0800, Daniel Pittman wrote:
> > it's in ZFS. the catch is that it takes enormous amounts of
> > memory... about 1GB RAM per TB of disk space IIRC, to store the
> > hashes for each block. I suspect the same catch would apply to other
> > implementations.
>
> FreeBSD report that as ... optimistic, in the real world:
quite probably. my figure of 1GB/TB was no better than a guess based on something half-remembered.
>    There are some resources that suggest that one needs 2GB per TB of
>    storage with deduplication [i] (in fact this is a misinterpretation
>    of the text). In practice with FreeBSD, based on empirical testing
>    and additional reading, it's closer to 5GB per TB.
ouch.
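for what it's worth, the gap between the 1GB/TB guess and FreeBSD's
5GB/TB figure mostly comes down to average block size. here's a
back-of-the-envelope sketch in python, assuming the commonly cited
ballpark of ~320 bytes of core memory per dedup-table (DDT) entry --
treat that constant as an assumption, it varies between ZFS versions:

  # rough ZFS dedup-table (DDT) RAM estimate -- a sketch, not a promise.
  DDT_ENTRY_BYTES = 320  # assumed in-core cost per DDT entry

  def ddt_ram_gb(pool_tb, avg_block_kb):
      """estimate GiB of RAM needed to hold the DDT for a pool."""
      blocks = pool_tb * 2**40 / (avg_block_kb * 2**10)  # total blocks
      return blocks * DDT_ENTRY_BYTES / 2**30

  print(ddt_ram_gb(1, 128))  # ~2.5 GiB/TB at the default 128KB recordsize
  print(ddt_ram_gb(1, 64))   # ~5.0 GiB/TB with 64KB average blocks

at the default 128KB recordsize (big files) you land near the cheap end;
shrink the average block to 64KB -- common with mixed or small files --
and the 5GB/TB figure drops out.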
> I am inclined to take their comment seriously, given reports from the
> few people I know who did try that. (OTOH, 64GB of RAM is US $369 for
> four consumer-grade sticks,
$369 total, or $369 per 16GB stick? I thought 8GB & 16GB sticks were still stupidly expensive.
> so you could realistically use that for a 12TB pool in hardware that is
> reasonably affordable.)
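spelling out the arithmetic behind that 12TB figure -- reusing the
5GB/TB rate from the sketch above, and ignoring the RAM the OS and the
ARC need for everything else:

  ram_gb = 64
  print(ram_gb / 5)  # 12.8 TB of pool at 5GB/TB -- hence "a 12TB pool"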
OTOH, there's probably lots better uses for that RAM on the same system.

de-duping is, IMO, one of those things that sounds nice but has little
practical use. in most cases, it's cheaper and more effective to just
add more disk than to add enough RAM to support dedup.

craig

-- 
craig sanders <cas@taz.net.au>

BOFH excuse #40: not enough memory, go get system upgrade