
On 06/02/12 20:42, Russell Coker wrote:
On Mon, 6 Feb 2012, Toby Corkindale <toby.corkindale@strategicdata.com.au> wrote:
Ah, it might solve 90% of your problem, but it doesn't for most people, for whom a VM image is created from scratch via an ISO image of an installer, and then has lots of patches and upgrades applied over time.
If you had two VMs that had been running for a while and had been repeatedly upgraded, you could shut down node B, make a reflink copy of node A, and then rsync the files from node B onto the new reflink copy. You would end up with an identical set of node B files, but with reflinks for the files that are identical (i.e. everything that's packaged by the distribution).
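A rough sketch of that procedure, assuming each node's files live in a directory tree on a filesystem that supports reflinks, and that GNU cp and rsync are installed. The function names and the example paths are made up for illustration. The sketch only builds the two commands; running them is a separate, explicit step to take once node B is shut down.

```python
import subprocess
from typing import List


def plan_reflink_rebuild(node_a: str, node_b: str, new_b: str) -> List[List[str]]:
    """Build the commands that recreate node B on top of a reflink copy of node A.

    All three arguments are hypothetical directory paths.
    """
    return [
        # Clone node A. --reflink=always makes cp fail instead of silently
        # falling back to a full copy when the filesystem cannot share extents.
        ["cp", "-a", "--reflink=always", node_a, new_b],
        # Overlay node B's files. --checksum compares by content rather than
        # size and mtime; --delete removes files that exist only in node A.
        # The trailing slashes make rsync sync directory contents.
        [
            "rsync", "-a", "--checksum", "--delete",
            node_b.rstrip("/") + "/",
            new_b.rstrip("/") + "/",
        ],
    ]


def run_plan(plan: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in plan:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in plan_reflink_rebuild("/vm/nodeA", "/vm/nodeB", "/vm/nodeB.new"):
        print(" ".join(argv))
```

Files that rsync leaves alone keep sharing their blocks with node A; anything it rewrites gets its own storage.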
Or I could get a filesystem editor and a calculator, go through the blocks one at a time, and manually cross-link them with a hex editor and a copy of the filesystem specification. Or I could buy a bigger hard drive and put up with it for now, until someone implements stable block-based deduplication in an open-source filesystem. :) We already have it in the kernel for memory (as KSM), so I'm surprised we haven't seen something for ext4 already.
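For a sense of what block-based deduplication would buy, here is a minimal sketch that estimates how many fixed-size blocks are shared across a set of images by hashing each block. The 4096-byte block size, the function names, and the in-memory byte strings are assumptions for illustration; this is not how any particular filesystem implements it, and it only counts duplicates rather than linking them.

```python
import hashlib
from typing import Iterable, List, Tuple

BLOCK_SIZE = 4096  # assumed block size, chosen for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Return the SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def dedup_estimate(images: Iterable[bytes], block_size: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Return (total blocks, unique blocks) across all images."""
    total = 0
    unique = set()
    for image in images:
        hashes = block_hashes(image, block_size)
        total += len(hashes)
        unique.update(hashes)
    return total, len(unique)


if __name__ == "__main__":
    # Two toy "images" that share their first block.
    image_a = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
    image_b = b"A" * BLOCK_SIZE + b"C" * BLOCK_SIZE
    total, unique = dedup_estimate([image_a, image_b])
    print(f"{total} blocks in total, {unique} unique")
```

The gap between the two numbers is the storage a block-level deduplicator could in principle reclaim.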