
Thank you all for the very helpful advice. If I understand you all, the problem is that ZFS is choking because I don't have enough disk space. The reason I don't have enough disk space is that the files I deleted are still in the old snapshots, plus the disk was already pretty full. This seems like a good way forward, but is it really, and if so how do I do it?

First things first: ZFS is choking due to lack of disk space. However, I have a lot of totally unused disk space on a Windows partition, so how can I shrink the Windows partition and grow the ZFS side to stop it choking? The Windows partition has 560GB, of which I need no more than 200GB max to leave for Windoze. Gparted apparently doesn't like ZFS, but does this matter? I assume I can use it to shrink the Windows partition and free up about 360GB; then how do I expand the ZFS pool to use that free space?

Second, I have checked /etc/cron and find auto-snapshot commands for hourly, daily, weekly and monthly: hourly = 24, daily = 31, weekly = 8, monthly = 6. It seems I could change that to hourly = 24, daily = 7, weekly = 4, monthly = 6 and get pretty much the same coverage with far fewer snapshots. Listing snapshots shows some from 2019. Where are they coming from? With monthly only storing 6 months, there shouldn't be anything older than February 2020, or am I not understanding something here?

The computer is a Lenovo W541 laptop; the longer-term plan is to double the memory to 32GB and put a 2TB SSD in this box. Does that sound sensible? In the meantime, a thorough backup is running to at least two external drives (one incremental and one fresh).

Stripes.

On Sun, 23 Aug 2020 at 06:55, Keith Bainbridge <keithrbaugroups@gmail.com> wrote:
On 23/8/20 10:25 am, Darren Wurf wrote:
I noticed the snapshots follow a grandparent-parent-child pattern; something is managing these snapshots for you, which would explain why there are many recent snapshots and few older ones.
And if that is the case, that manager may be using hard links to reduce the space being used. So deleting older snapshots may not have a big effect.
I know timeshift works that way - the grandparent-parent-child pattern and hard links.
-- Keith Bainbridge
keithrbaugroups@gmail.com or ke1thozgroups@gmx.com
-- Stripes Theotoky -37 .713869 145.050562

Hello Stripes,

On 8/24/20, stripes theotoky via luv-main <luv-main@luv.asn.au> wrote:
Thank you all for the very helpful advice.
If I understand you all, the problem is that ZFS is choking because I don't have enough disk space. The reason I don't have enough disk space is that the files I deleted are still in the old snapshots, plus the disk was already pretty full.
This seems like a good way forward, but is it really, and if so how do I do it?
Question: do you really need to recover files from the snapshots on a routine basis? I would suggest that you consider the frequency of the snapshots, and of the backups. If you truly understand, and take care with, your actions, how much do you need the recover-and-undelete that the snapshots enable? If you do not need such frequent, nor so many, snapshots, then you will have more usable space when you transfer to a new drive.

The other matter is that the snapshots are not a backup strategy; they too are lost along with the current copies if you have a hard drive crash. I would suggest looking at something like a Network Attached Storage device, with multiple drives in a suitable RAID array. There are others on the LUV lists who can better advise on which strategies will actually provide a reasonable measure of security of data. Remember that there are repositories from which you can restore the OS and applications, but that your data, including particular configuration, can only be recovered from a suitable backup on another device, best stored at a separate site.

These are pointers to think about: what is important, and what can you afford to lose, possibly because you can regenerate it, possibly because it is not really that critical. As to life, yes, the data has meaning to each of us, but it is not food and water, even if it can be exchanged for such.
First things first: ZFS is choking due to lack of disk space. However, I have a lot of totally unused disk space on a Windows partition, so how can I shrink the Windows partition and grow the ZFS side to stop it choking? The Windows partition has 560GB, of which I need no more than 200GB max to leave for Windoze.
From this I gather that you find a need to have the Microsoft malware available. I have problems now and then when I have to deal interactively with a Microsoft Word document in the newest format. I usually point out that Word may well be widespread, but it is not universal, and a PDF can be locked to prevent tampering.
Gparted apparently doesn't like ZFS, but does this matter? I assume I can use it to shrink the Windows partition and free up about 360GB; then how do I expand the ZFS pool to use that free space? Second, I have checked /etc/cron and find auto-snapshot commands for hourly, daily, weekly and monthly: hourly = 24, daily = 31, weekly = 8, monthly = 6. It seems I could change that to hourly = 24, daily = 7, weekly = 4, monthly = 6 and get pretty much the same coverage with far fewer snapshots.
Listing snapshots shows some from 2019. Where are they coming from? With monthly only storing 6 months, there shouldn't be anything older than February 2020, or am I not understanding something here?
I would think that if the computer was not in use at the critical times, the old snapshots could remain: cron cannot do things while the computer is off, so the jobs that expire old snapshots may simply not have run. I know that there is an alternative, anacron, which is designed to run, at the next boot, the jobs that fell due while the computer was off. It would be worth checking which is installed and used.
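(On the Gparted question above: Gparted's dislike of ZFS only means it cannot resize the ZFS contents; it can still shrink the NTFS partition. The partition holding the pool can then be enlarged at the partition-table level, for example with parted, and the pool told to claim the new space. A minimal sketch, assuming a pool named rpool on /dev/sda2; these names are illustrative, not from the thread:)

  # After enlarging the partition's table entry, let the pool grow
  # automatically when its underlying device grows.
  zpool set autoexpand=on rpool
  # Have ZFS re-read the device size and expand onto the new space.
  zpool online -e rpool /dev/sda2
  # Confirm the extra capacity is visible.
  zpool list rpool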
The computer is a Lenovo W541 laptop; the longer-term plan is to double the memory to 32GB and put a 2TB SSD in this box. Does that sound sensible?
In the meantime, a thorough backup is running to at least two external drives (one incremental and one fresh).
Again, consider backups on a separate device. Again, look to your use patterns and practices so that you have less need for the snapshots and for recovering deleted files, and as such can use fewer snapshots. Look at what you are trying to achieve, and ask how to be reasonably effective. Try to make the most of the resources you can afford, rather than spending too much to compensate for suboptimal habits and practices.
Stripes.
Regards,
Mark Trickett

Hello Mark,

Question: do you really need to recover files from the snapshots on a
routine basis?
No, I don't. I have never recovered anything from ZFS and hope never to have to. ZFS is a last-ditch line of defense against some of my co-authors, who have a terrible habit of working on old files and saving them out with the same name as the current working file. More than once a changed introduction has destroyed years of work, and it has only been my paranoia in keeping about six backups of everything that has saved the project.
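(For that last-ditch recovery, ZFS keeps a read-only copy of each snapshot under a hidden .zfs/snapshot directory at the root of every dataset. A minimal sketch of pulling back a clobbered file; the dataset mountpoint, snapshot name, and file path are illustrative:)

  # See which snapshots exist and when they were taken.
  zfs list -t snapshot -o name,creation
  # Copy the pre-clobber version out of the hidden snapshot tree.
  cp /home/.zfs/snapshot/zfs-auto-snap_monthly-2020-02-01-0552/stripes/paper/intro.tex \
     /home/stripes/paper/intro-recovered.tex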
I would suggest that you consider the frequency of the snapshots, and of the backups. If you truly understand, and take care with, your actions, how much do you need the recover-and-undelete that the snapshots enable? If you do not need such frequent, nor so many, snapshots, then you will have more usable space when you transfer to a new drive.
The trouble is I am often working on 4-5 academic papers at the same time; some of them may be neglected for a couple of months, so errors won't be noticed until I come back to work on them again. In the worst case I had to dig up files from 3 years ago to recover data.

The other matter is that the snapshots are not a backup strategy; they too are lost along with the current copies if you have a hard drive crash.
I realise that. The reasons for ZFS are outlined above. I know a better strategy would be to use Git, but my co-authors, while very smart in maths and economics, have almost zero computer knowledge; trying to tell them what Git is, let alone why we should be using it, is impossible.
I would suggest looking at something like a Network Attached Storage device, with multiple drives in a suitable RAID array.
This is the ultimate plan: to build a NAS from an HP Microserver. I am leaning towards NAS4Free on an SSD or internal USB, and 3 × 6TB mirrors. This is a project that has to wait, because right now, due to Covid-19 and Brexit, we are not sure where we are. I am here and can't leave but am expecting to be out of work (which won't stop my research); my husband is British/Australian, resident in Austria to avoid Brexit, but is stranded by Covid in Greece. When it all settles down and we have a home again, building this NAS is going to be pretty high on the list of things to do.

There are others on the LUV lists who can better advise on which strategies will actually provide a reasonable measure of security of data. Remember that there are repositories from which you can restore the OS and applications, but that your data, including particular configuration, can only be recovered from a suitable backup on another device, best stored at a separate site.
Much of the data is on external disks located in 3 different countries, so hopefully it is safe.

These are pointers to think about: what is important, and what can you afford to lose, possibly because you can regenerate it, possibly because it is not really that critical. As to life, yes, the data has meaning to each of us, but it is not food and water, even if it can be exchanged for such.
True.
First things first: ZFS is choking due to lack of disk space. However, I have a lot of totally unused disk space on a Windows partition, so how can I shrink the Windows partition and grow the ZFS side to stop it choking? The Windows partition has 560GB, of which I need no more than 200GB max to leave for Windoze.
From this I gather that you find a need to have the Microsoft malware available. I have problems now and then when I have to deal interactively with a Microsoft Word document in the newest format. I usually point out that Word may well be widespread, but it is not universal, and a PDF can be locked to prevent tampering.
Windows certainly is malware. I have it to run Scientific Workplace (SW), as some of my co-authors are unable to deal with LaTeX. In an ideal world I would only use Kile or Eclipse, but sometimes I don't have the hours or days necessary to edit the perverted abomination that Scientific Workplace thinks is a .tex file into an actual .tex file; hence the need to have Windoze to run SW.
Gparted apparently doesn't like ZFS, but does this matter? I assume I can
use it to shrink the Windows partition and free up about 360GB; then how do I expand the ZFS pool to use that free space? Second, I have checked /etc/cron and find auto-snapshot commands for hourly, daily, weekly and monthly: hourly = 24, daily = 31, weekly = 8, monthly = 6. It seems I could change that to hourly = 24, daily = 7, weekly = 4, monthly = 6 and get pretty much the same coverage with far fewer snapshots.
Listing snapshots shows some from 2019. Where are they coming from? With monthly only storing 6 months, there shouldn't be anything older than February 2020, or am I not understanding something here?
I would think that if the computer was not in use at the critical times, the old snapshots could remain: cron cannot do things while the computer is off, so the jobs that expire old snapshots may simply not have run. I know that there is an alternative, anacron, which is designed to run, at the next boot, the jobs that fell due while the computer was off. It would be worth checking which is installed and used.
I didn't know about anacron, but it makes sense that it could be installed. I will have to check.
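(A hedged pointer on checking: on Debian/Ubuntu-style systems the zfs-auto-snapshot package typically installs one small script per interval under the /etc/cron.* directories, each passing a --keep count, and anacron, if present, runs the daily/weekly/monthly directories after boots that missed them. The exact paths below are assumptions about a typical install, not taken from the thread:)

  # See whether cron, anacron, or both are installed.
  dpkg -l cron anacron
  # Find the auto-snapshot jobs and their retention settings.
  grep -r 'keep' /etc/cron.hourly /etc/cron.daily /etc/cron.weekly /etc/cron.monthly
  # Lowering daily retention from 31 to 7 would then mean changing
  # --keep=31 to --keep=7 in /etc/cron.daily/zfs-auto-snapshot.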
The computer is a Lenovo W541 laptop; the longer-term plan is to double the memory to 32GB and put a 2TB SSD in this box. Does that sound sensible?
In the meantime, a thorough backup is running to at least two external drives (one incremental and one fresh).
Again, consider backups on a separate device. Again, look to your use patterns and practices so that you have less need for the snapshots and for recovering deleted files, and as such can use fewer snapshots. Look at what you are trying to achieve, and ask how to be reasonably effective. Try to make the most of the resources you can afford, rather than spending too much to compensate for suboptimal habits and practices.
Good advice. Find new co-authors? LOL. Thanks again for your help.
Stripes.
Regards,
Mark Trickett
-- Stripes Theotoky -37 .713869 145.050562

On Wednesday, 26 August 2020 8:43:30 PM AEST stripes theotoky via luv-main wrote:
Question: do you really need to recover files from the snapshots on a
routine basis?
No, I don't. I have never recovered anything from ZFS and hope never to have to. ZFS is a last-ditch line of defense against some of my co-authors, who have a terrible habit of working on old files and saving them out with the same name as the current working file. More than once a changed introduction has destroyed years of work, and it has only been my paranoia in keeping about six backups of everything that has saved the project.
Storage is getting bigger all the time; 2TB SSDs in 2.5" laptop form factor are affordable now. Sometimes it's easier to just buy more storage than to use storage effectively. As an aside, I'm sure that somewhere under my home directory I still have the megabytes of wasted space that were an annoyance when my laptop had a 3.8G disk, but now, with a 160G SSD, it's not a problem.
I would suggest that you consider the frequency of the snapshots, and of the backups. If you truly understand, and take care with, your actions, how much do you need the recover-and-undelete that the snapshots enable? If you do not need such frequent, nor so many, snapshots, then you will have more usable space when you transfer to a new drive.
The trouble is I am often working on 4-5 academic papers at the same time; some of them may be neglected for a couple of months, so errors won't be noticed until I come back to work on them again. In the worst case I had to dig up files from 3 years ago to recover data.
Having 3 years of snapshots of the important stuff is possible. You could have a separate ZFS mountpoint just for that important data. How large are your really important files? 1G? 3 years of archives of that 1G won't take much space on a 2TB SSD.
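(A minimal sketch of that idea, assuming a pool named rpool and the zfs-auto-snapshot convention of a per-dataset com.sun:auto-snapshot property; the dataset names are illustrative, not from the thread:)

  # A dedicated dataset for the irreplaceable papers.
  zfs create -o mountpoint=/home/stripes/papers rpool/papers
  # Keep automatic snapshots on for it - cheap, since the data is small.
  zfs set com.sun:auto-snapshot=true rpool/papers
  # And off for bulky, regenerable data, to save space.
  zfs set com.sun:auto-snapshot=false rpool/scratch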
I realise that. The reasons for ZFS are outlined above. I know a better strategy would be to use Git, but my co-authors, while very smart in maths and economics, have almost zero computer knowledge; trying to tell them what Git is, let alone why we should be using it, is impossible.
Some systems I run use etckeeper to make a git repository of /etc. On some triggering operations (including Debian package updates) etckeeper will run and commit all changes to git. I presume you could run something similar that uses git for tracking snapshots of changes for the projects in question.
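(In the same spirit, a minimal sketch of an hourly git auto-commit for a papers directory, driven from a user crontab; the path and schedule are illustrative, and the co-authors never need to hear the word git:)

  # One-time setup:
  #   cd /home/stripes/papers && git init
  # Hourly crontab entry (add with 'crontab -e'; % must be escaped in cron):
  0 * * * * cd /home/stripes/papers && git add -A && git commit -q -m "auto $(date +\%F_\%H\%M)" || true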
Much of the data is on external disks located in 3 different countries, so hopefully it is safe.
Having things on external disks is safe against some problems. But if you have a disk just sitting around for long periods of time, problems can occur. You need to verify the storage before it rots.

--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
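(On ZFS, that verification is what a scrub does: it reads every block on the pool and checks it against its checksum. A minimal sketch, assuming the external drive holds a pool named backup:)

  # Plug the drive in, import the pool, and read-verify everything on it.
  zpool import backup
  zpool scrub backup
  # Watch progress and look for checksum errors.
  zpool status backup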

Russell Coker via luv-main wrote:
Having things on external disks is safe against some problems. But if you have a disk just sitting around for long periods of time problems can occur. You need to verify the storage before it rots.
A much underestimated problem, I believe, with half-lives of data on 'shelved' media uncomfortably short; my impression (no link!):
- SSD: noticeable decay after three years
- HDD: noticeable decay after five years
- DVD, home burnt: noticeable decay after five years
- DVD, factory pressed: noticeable decay after 10 years
- M-DISC: noticeable decay after 25 years

regards,
Rohan McLeod

On Wed, Aug 26, 2020 at 10:43:30AM +0000, stripes theotoky wrote:
I would suggest looking at something like a Network Attached Storage device, with multiple drives in a suitable RAID array.
This is the ultimate plan: to build a NAS from an HP Microserver. I am leaning towards NAS4Free on an SSD or internal USB, and 3 × 6TB mirrors. This is a project that has to wait, because right now, due to Covid-19 and Brexit, we are not sure where we are. I am here and can't leave but am expecting to be out of work (which won't stop my research); my husband is British/Australian, resident in Austria to avoid Brexit, but is stranded by Covid in Greece. When it all settles down and we have a home again, building this NAS is going to be pretty high on the list of things to do.
In the meantime, you can use a largish (>= 4 or 6 TB) external USB drive set up to be a ZFS pool for backups. Then 'zfs send' your snapshots to the USB drive, and keep a multi-year snapshot history on them. Aggressively expire the snapshots on your laptop to minimise the amount of space they're taking.

You can have multiple USB backup drives like this - each one has to be initialised with a full backup, but can then be incrementally updated with newer snapshots. Each backup pool should have a different name - like backup1, backup2, etc.

You can automate much of this with some good scripting (see the sketch after this message), but your scripts will need to query the backup destination pool (with 'zfs list') to find out what the latest backup snapshot on it is. Incremental 'zfs send' updates send the difference between two snapshots, so you need to know what the latest snapshot on the backup pool is, AND that snapshot has to still exist on the source pool.

You should use a different snapshot naming scheme for the backup snapshots. If your main snapshots are "@zfs-autosnap-YYYYMMDD" or whatever, then use "@backup-YYYYMMDD". Create that snapshot and use it for a full zfs send, then create new "@backup-YYYYMMDD" snapshots just before each incremental send.

e.g. the initial full backup of a pool called "source" to a pool called "backup", if you had done it yesterday:

  zfs snapshot source@backup-20200829
  zfs send -v -R source@backup-20200829 | /sbin/zfs receive -v -d -F backup

and to do an incremental backup of *everything* (including all snapshots created manually or by zfs-autosnap) from @backup-20200829 to today, between the same pools:

  # source@backup-20200829 already exists from the last backup, no need to create it.
  zfs snapshot source@backup-20200830
  zfs send -R -i source@backup-20200829 source@backup-20200830 | zfs receive -v -u -d backup

** NOTE: @backup-20200829 has to exist on both the source & backup pools **

Unless you need to make multiple backups to different pools, you can delete the source@backup-20200829 snapshot at this point, because the next backup will be from source@backup-20200830 to some future @backup-YYYYMMDD snapshot.

BTW, you don't have to backup to the top level of the backup pool. e.g. to send the same incremental to a dataset called "mylaptop" on pool backup:

  zfs create backup/mylaptop
  zfs send -R -i source@backup-20200829 source@backup-20200830 | zfs receive -v -u -d backup/mylaptop

(you'd do this if you wanted to backup multiple machines to the same backup drive, or if you wanted to use it for backups AND for storage of other stuff like images or videos or audio files).

and, oh yeah, get used to using the '-n' aka '--dry-run' and '-v'/'--verbose' options with both 'zfs send' and 'zfs receive' until you understand how they work and are sure they're going to do what you want.

NOTE: as a single-drive vdev, there will be no redundancy in the USB backup drive, but I'm guessing that since you're using a laptop, it's probably also a single drive and you're only using ZFS for the auto compression and snapshot capabilities. If you want redundancy, you can always plug in two USB drives at a time and set them up as a ZFS mirrored pool, but then you have to label them so that you know which pairs of drives belong together.

This is not as good as a NAS, but it's cheap and easy and a lot better than nothing.

I recommend using USB drive adaptors that allow you to use any drives in them (i.e. USB to SATA adaptors), not pre-made self-contained external drives (just a box with a drive in it and a USB socket or cable). Sometimes you see them with names like "disk docking station", with a power adaptor, a USB socket, and SATA slots for 1, 2, or 4 drives. Other forms include plain cables with a USB plug on one end and a SATA socket on the other.

craig

ps: If your backup pool was on some other machine somewhere on the internet, you can pipe the zfs send over ssh. e.g.

  zfs send -R -i source@backup-20200829 source@backup-20200830 | ssh remote-host zfs receive -u -d poolname/dataset

The pool on your laptop is probably small enough that you could do the initial full backup over the internet too, but I wouldn't want to do a multi-terabyte send from a home connection. Daily 'zfs send's of a few gigabytes or so should be no problem at all.

Also, the data stream from 'zfs send' can be piped into gpg to encrypt it, and then just redirected to a file on a dropbox or google drive or similar account, or sent via ssh to any machine you have a shell account on. To restore from them, decrypt with gpg and pipe into 'zfs receive' to restore to an appropriate pool.

  zfs send -R -i source@backup-20200829 source@backup-20200830 | gpg ... > /path/to/dropbox/backups/20200830.gpg

or

  zfs send -R -i source@backup-20200829 source@backup-20200830 | gpg ... | ssh remotehost cat > ./backups/20200830.gpg

--
craig sanders <cas@taz.net.au>
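(A minimal sketch of the scripting Craig describes, under the same assumptions as his examples - pools named "source" and "backup", and @backup-YYYYMMDD snapshot names. Untested glue rather than a polished tool; try it with the dry-run flags first:)

  #!/bin/sh
  # Incremental ZFS backup: send everything since the newest @backup-* snapshot.
  set -e
  SRC=source
  DST=backup
  TODAY=$(date +%Y%m%d)

  # Newest @backup-* snapshot already on the destination pool.
  LAST=$(zfs list -H -t snapshot -o name -s creation -r "$DST" \
         | grep '@backup-' | tail -n 1 | sed 's/.*@/@/')

  # Snapshot the source, then send the increment from $LAST to today.
  zfs snapshot "$SRC@backup-$TODAY"
  zfs send -R -i "$SRC$LAST" "$SRC@backup-$TODAY" | zfs receive -v -u -d "$DST"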
participants (5):
- Craig Sanders
- Mark Trickett
- Rohan McLeod
- Russell Coker
- stripes theotoky