
On Thu, 23 May 2013, James Harper <james.harper@bendigoit.com.au> wrote:
http://etbe.coker.com.au/2012/12/17/using-btrfs/
I'm currently using BTRFS snapshots for that sort of thing. On some of my systems I have 100 snapshots stored from 15-minute intervals and another 50 or so stored from daily intervals. The 15-minute snapshots capture the most likely scenario of creating and then accidentally deleting a file. The daily ones cover longer-term mistakes.
That's pretty neat. I do the same with Windows, but it's nice to see that Linux supports this now too. Windows wouldn't support a 15-minute snapshot interval though - the docs say no more than one an hour or something like that. Recovering data under Windows is as simple as right-clicking, choosing "show previous versions", and selecting which snapshot you want to look at. Samba can do this too.
I believe that Samba can be integrated with various snapshot schemes. Last time I did Google searches for such things I saw some documentation about making Samba work with ZFS snapshots and I presume that BTRFS wouldn't be any more difficult (you could make BTRFS use the same directory names as ZFS for snapshots).
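For concreteness, the Samba side of this kind of integration is normally done with the shadow_copy2 VFS module. The fragment below is only a sketch - the share name, path, and snapshot naming scheme are assumptions, not anything from this thread.

```ini
[data]
    path = /data
    vfs objects = shadow_copy2
    ; directory (relative to the share) holding the snapshots;
    ; for ZFS-style naming this would be the .zfs/snapshot directory
    shadow:snapdir = .snapshots
    ; snapshot directory names must match this strftime-style format
    shadow:format = %Y-%m-%d_%H%M
    shadow:sort = desc
```

With something like this in place, Windows clients see the snapshots in the same "Previous Versions" dialog mentioned above.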
How does performance fare with lots of snapshots?
On BTRFS I haven't yet noticed any performance loss during operation. On older versions of BTRFS (such as the one included in Debian/Wheezy) snapshot removal can be quite slow. I once even had a server become unusable because BTRFS snapshot removal took all the IO capacity of the system (after spending an hour prodding it I left it alone for 5 minutes and it came good). On ZFS I also haven't had any problems in operation, although once, due to a scripting mistake, I ended up with about 1,000 snapshots of each of the two main filesystems. That caused massive performance problems when removing snapshots (operations such as listing snapshots took many minutes), but things came good once it was down below about 300 snapshots.
Windows goes with the concept that the snapshot holds the changed data, so a first write becomes read-from-original + write-original-data-to-snapshot-area + write-new-data-to-original[1]. This reduces first-write performance, but subsequent writes suffer no penalty, there is no fragmentation, and throwing a snapshot away is instant. I think LVM actually writes the changed data into the snapshot area (which may still require a read from the original if the write isn't exactly the size of an extent), but I can't remember for sure. If so, it means the first write is faster, but subsequent writes are still redirected to another part of the disk, your data very quickly gets massively fragmented, and recovery in the event of a booboo is a bitch if the LVM metadata goes bad (from experience... I just gave up pretty much immediately and restored from backup when this happened to me[2]!).
How does btrfs do it internally?
BTRFS does all writes as copy-on-write, so every time you write, the data goes to a different location on disk. Keeping multiple versions just involves having pointers to different blocks on disk.
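A toy model may make that concrete. This assumes nothing about BTRFS internals beyond what the paragraph says: a write allocates a new block, only the writer's pointer changes, and a snapshot is just another set of pointers into the same pool of blocks.

```python
# Hypothetical toy model of copy-on-write; "blocks" plays the role of the disk.
blocks = {}        # block id -> data
next_id = 0

def alloc(data):
    """Write data to a freshly allocated block and return its id."""
    global next_id
    blocks[next_id] = data
    next_id += 1
    return next_id - 1

# The live filesystem is a table of pointers: file name -> block id.
live = {"file": alloc("version 1")}

# Taking a snapshot only copies the pointer table, not the data.
snapshot = dict(live)

# A COW write puts the new data in a new block; the old block is untouched,
# so the snapshot still points at the old version.
live["file"] = alloc("version 2")
```

Throwing a snapshot away is then just dropping a pointer table; blocks no one points at can be reclaimed.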
Incremental backups would work better, but so far I've been too lazy to work out a way to filter out the few-GB TV shows that I've watched and deleted and don't want in any incrementals.
On a BTRFS or ZFS system you would use a different subvolume/filesystem for the TV shows which doesn't get the snapshot backups.
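On BTRFS that split might look like the following commands; the paths and subvolume names here are hypothetical, not from this thread.

```shell
# Hypothetical layout - /data is a BTRFS filesystem, names are illustrative.
btrfs subvolume create /data/tv        # TV shows get their own subvolume

# Snapshots don't descend into child subvolumes, so snapshotting the home
# subvolume never captures /data/tv:
btrfs subvolume snapshot -r /data/home /data/.snapshots/home-$(date +%F)
```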
I'm getting more and more excited about BTRFS. I was looking around at ZFS but it didn't end up meeting my needs. I'm still testing Ceph, and XFS is currently recommended for its backend store; BTRFS is faster but has known issues with Ceph (or at least did last time I read the docs) and so is not currently recommended.
What issues would BTRFS have? XFS just provides a regular VFS interface, which BTRFS does as well. I can imagine software supporting ZFS but not BTRFS if it uses special ZFS features, but I'm not aware of XFS having useful features for a file store that BTRFS lacks.

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/