
From: "James Harper"
Rsync integration
Not claimed - no patches yet - Not in kernel yet
Now that we have code to efficiently find newly updated files, we need to tie it into tools such as rsync and dirvish. (For bonus points, we can even allow rsync to use btrfs's builtin checksums and, when a file has changed, tell rsync _which blocks_ inside that file have changed. Would need to work with the rsync developers on that one.)
Update rsync to preserve NOCOW file status.
Means: Make rsync work like btrfs send/receive;-) and put filesystem specific code in it.
I'm just testing out some of the deduplication stuff in btrfs, and was actually a little shocked to find it calculating the hashes itself. btrfs already has checksums, and if nothing else it could have used them to trivially reject blocks that are different before calculating a stronger hash. There is talk about exposing the btrfs checksums to userspace, but of course that puts constraints on further development, as the developers would then have to consider userspace compatibility.
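To make the "trivially reject" idea concrete, here is a minimal sketch. It is not how any btrfs dedup tool actually works: zlib.crc32 only stands in for the filesystem's per-block checksum (which is not exposed to userspace), and find_duplicate_blocks is a hypothetical helper name. The strong hash is computed only for blocks whose cheap checksum matches at least one other block.

```python
import hashlib
import zlib
from collections import defaultdict


def find_duplicate_blocks(blocks):
    """Group indices of identical blocks, using a cheap checksum as a prefilter.

    Returns (groups, strong_hashes_computed).
    zlib.crc32 is only a stand-in for a checksum the filesystem already has.
    """
    # Pass 1: bucket blocks by the cheap checksum.
    by_weak = defaultdict(list)
    for index, block in enumerate(blocks):
        by_weak[zlib.crc32(block)].append(index)

    # Pass 2: compute the strong hash only inside buckets with 2+ members.
    by_strong = defaultdict(list)
    strong_hashes_computed = 0
    for indices in by_weak.values():
        if len(indices) < 2:
            continue  # unique cheap checksum, so the block cannot be a duplicate
        for index in indices:
            digest = hashlib.sha256(blocks[index]).digest()
            strong_hashes_computed += 1
            by_strong[digest].append(index)

    groups = [group for group in by_strong.values() if len(group) > 1]
    return groups, strong_hashes_computed


if __name__ == "__main__":
    sample = [b"a" * 4096, b"b" * 4096, b"a" * 4096, b"c" * 4096]
    print(find_duplicate_blocks(sample))
```

With four sample blocks of which two are identical, only those two get a SHA-256 computed. The other two are rejected on the cheap checksum alone.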
ZFS has feature flags you can test.
I am not sure whether this is a great idea.
Most of the time you will have the same filesystem on both ends. Then you can use zfs/btrfs etc. tools. Or rsync if it's not a COW system.
It is more "code polluting" than it's worth I think.
Maybe. Depends on the speedup. In a lot of cases, the above optimisations would speed up the processing that rsync has to do, but if 90% of the time taken in your rsync was actually moving data then you're never going to get any more than 10% faster. For LAN links though, I normally just use -W for rsync because computing changes just adds overhead. (I mean you have to read the file at both ends anyway, and unless your disk can pull data faster than 1 GByte/second you're not going to saturate your 10 Gbit/second link, so don't bother computing changes.) If you got the change computation "for free", then it's a big win.
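The arithmetic behind that, as a rough sketch. The function names are made up for illustration, and the model is deliberately crude: one fixed share of the run time is data transfer, and everything else is assumed to become free in the best case.

```python
def best_case_time_saved(transfer_fraction):
    """Fraction of total run time saved if all non-transfer work became free.

    transfer_fraction: share of the run spent moving data (0.0 to 1.0).
    """
    if not 0.0 <= transfer_fraction <= 1.0:
        raise ValueError("transfer_fraction must be between 0 and 1")
    return 1.0 - transfer_fraction


def disk_can_saturate_link(disk_bytes_per_s, link_bits_per_s):
    """True if the disk reads faster than the link can carry the data."""
    return disk_bytes_per_s > link_bits_per_s / 8


if __name__ == "__main__":
    # 90% of the time is data transfer: at most about 10% of the run is saved.
    print(best_case_time_saved(0.9))
    # A 500 MByte/s disk against a 10 Gbit/s (1.25 GByte/s) link.
    print(disk_can_saturate_link(500e6, 10e9))
```

If the disk cannot saturate the link, skipping the delta computation (rsync -W) costs nothing in wall-clock terms, which is the case described above.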
Yes, that's what you get from btrfs/zfs send/receive:-) You only need rsync if you have to deal with a different filesystem at the other end and then it may help you only in a few percent of the cases. If both cases have a 10% likelihood you start this "layer blurring" coding for 1% of useful cases.
And if btrfs send/receive isn't stable there is a good chance to implement an unstable rsync as well.
(I think) Russell was supposing that there weren't many bugs reported for send/receive because not many people were using it. I'm not sure how we got from there to "send/receive isn't stable".
From Chris Samuel today:
Watch out with 3.17 then, there are early reports that it's broken btrfs send.
That may need a bit more evaluation. Overall I hear a lot of "btrfs is having this or that issue" reports, which I never followed up properly, but it makes me wary. I have been using ZFS on FreeBSD in production for at least 3 years now (upgrading the kernel regularly when needed) without any issues, apart from the early, well-known problems with nearly full ZFS and memory. On both fronts there seems to be improvement. E.g. I have 92% full ZFS volumes which are still performing (there used to be an 80% "warning"), and I run ZFS on boxes with 3 GB and 4 GB of RAM and do not see issues.
Regards
Peter