
On Wednesday, 26 August 2020 8:43:30 PM AEST stripes theotoky via luv-main wrote:
Question, do you really need to recover files from the snapshots on a
routine basis?
No, I don't. I have never recovered anything from ZFS and hope never to have to. ZFS is a last-ditch line of defense against some of my co-authors, who have a terrible habit of working on old files and saving them under the same name as the current working file. More than once a changed introduction has destroyed years of work, and only my paranoia in keeping about 6 backups of everything has saved the project.
Storage is getting bigger all the time. 2TB SSDs in 2.5" laptop form factor are affordable now. Sometimes it's easier to just buy more storage than to use storage effectively. As an aside, I'm sure that somewhere under my home directory I still have the megabytes of wasted space that were an annoyance when my laptop had a 3.8G disk, but now with a 160G SSD it's not a problem.
I would suggest that you consider the frequency of the snapshots, and of backups. If you understand what you are doing and take care with your actions, how often do you really need the recovery and undelete that the snapshots enable? If you do not need snapshots quite so frequently, nor so many of them, then you will have more usable space when you transfer to a new drive.
The trouble is I am often working on 4-5 academic papers at the same time; some of them may be neglected for a couple of months, hence errors won't be noticed until I come back to work on them again. In the worst case I had to dig up files from 3 years ago to recover data.
Having 3 years of snapshots of the important stuff is possible. You could have a separate ZFS filesystem, with its own mountpoint, just for that important data. How large are your really important files? 1G? 3 years of archives of that 1G won't take much space on a 2TB SSD.
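To put rough numbers on that, here is a back-of-the-envelope sketch. The daily snapshot schedule and the 5% churn figure are assumptions made up for illustration, not measurements, and the function name is hypothetical. It treats every changed block as kept forever, which is about the worst case for snapshot space and ignores compression.

```python
def snapshot_space_gib(data_gib, changed_fraction_per_snapshot, snapshots):
    """Rough upper bound on space used by the live data plus all snapshots.

    Assumes each snapshot pins a fixed fraction of the data that has since
    been rewritten, and that no snapshot is ever destroyed.
    """
    return data_gib + data_gib * changed_fraction_per_snapshot * snapshots


SNAPSHOTS = 3 * 365          # assumed: one snapshot per day for 3 years
DISK_GIB = 2e12 / 2**30      # a "2TB" SSD expressed in GiB

typical = snapshot_space_gib(1.0, 0.05, SNAPSHOTS)   # assumed 5% churn per day
worst = snapshot_space_gib(1.0, 1.0, SNAPSHOTS)      # everything rewritten daily

print(f"5% daily churn:   {typical:.2f} GiB of {DISK_GIB:.0f} GiB")
print(f"100% daily churn: {worst:.2f} GiB of {DISK_GIB:.0f} GiB")
```

Even the extreme case, where the whole 1G is rewritten every single day, still fits on the 2TB disk under these assumptions.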
I realise that. The reasons for ZFS are outlined above. I know a better strategy would be to use Git, but as my co-authors are very smart in maths and economics but have almost zero computer knowledge, trying to tell them what Git is, let alone why we should be using it, is impossible.
Some systems I run use etckeeper to make a git repository of /etc. On some triggering operations (including Debian package updates) etckeeper will run and commit all changes to git. I presume you could run something similar that uses git for tracking snapshots of changes for the projects in question.
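As a minimal sketch of "something similar", the following commits whatever has changed in a project directory, so the co-authors never have to touch git themselves. It is not how etckeeper is implemented; it assumes git is installed and that the directory has already been set up with `git init`. The function name, commit message and committer identity are all made up for the example.

```python
import subprocess
from pathlib import Path


def commit_if_changed(repo_dir, message="automatic snapshot"):
    """Stage and commit all changes under repo_dir.

    Returns True if a commit was made, False if there was nothing to commit.
    """
    repo = Path(repo_dir)

    def git(*args):
        # Run git inside the repository and return its standard output.
        result = subprocess.run(
            ["git", "-C", str(repo), *args],
            check=True, capture_output=True, text=True,
        )
        return result.stdout

    git("add", "-A")  # stage new, modified and deleted files
    if not git("status", "--porcelain").strip():
        return False  # working tree is clean, nothing to record

    # Hypothetical identity so the commit works without per-user git config.
    git("-c", "user.name=autosnapshot",
        "-c", "user.email=autosnapshot@localhost",
        "commit", "-q", "-m", message)
    return True
```

Something like this could be run from cron, or after each sync of files from co-authors, against each project directory. An overwritten introduction then shows up as an ordinary diff in the history rather than as lost work.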
Much of the data is on external disks located in 3 different countries so hopefully it is safe.
Having things on external disks is safe against some problems. But if a disk just sits around for long periods of time, problems can occur. You need to verify the storage before it rots.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/