
On Thu, Aug 20, 2020 at 01:40:03PM +0100, stripes theotoky wrote:
> When we started this discussion we had this
> stripinska@Stripinska:~$ sudo zfs list
> NAME         USED  AVAIL  REFER  MOUNTPOINT
> alexandria   332G  1.06G    96K  /alexandria
> Now having moved 47.3 GB of files to an external drive I have this.
> stripinska@Stripinska:~$ sudo zfs list
> NAME         USED  AVAIL  REFER  MOUNTPOINT
> alexandria   332G   782M    96K  /alexandria
> What is eating my space?
To truly delete files from ZFS (or from anything that supports snapshots), you need to delete not only the file(s) from the current state of the filesystem, but also any snapshots containing them. The space will not be freed until there are no remaining snapshots containing the file(s) you deleted. Note that you can't delete individual files from a snapshot; you can only destroy entire snapshots. This has a significant impact on your backup and recovery strategy.
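You can see exactly where the space has gone with the standard zfs space accounting - e.g. (using your pool name):

  # how much of each dataset's USED is live data vs. held by snapshots
  sudo zfs list -r -o space alexandria

  # list every snapshot, sorted by how much space it holds exclusively
  sudo zfs list -r -t snapshot -o name,used,refer -s used alexandria

Note that a snapshot's USED column only counts the space unique to that snapshot; data shared by several snapshots isn't charged to any one of them, so the space freed by destroying a whole run of snapshots can be much larger than the sum of their USED values.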
> It is not the cache for Firefox as this is only 320M.
Maybe, maybe not. Browser cache directories tend to change a lot, so they end up using far more space in the snapshots than you might think - the snapshots keep all the cached junk that your browser itself has already deleted, but that zfs-auto-snapshot hasn't expired yet.

There really isn't much value in keeping snapshots of cache dirs like this, so try creating a new dataset to hold these caches (and make sure it, or at least sub-directories on it, are writable by your uid). Configure zfs-auto-snapshot to ignore it (i.e. no snapshots), and then configure your browser to use it for caches.

I don't use zfs-auto-snapshot myself, but according to the man page, to exclude a dataset from it you need to set a property called "com.sun:auto-snapshot" on it to false. e.g.

  zfs create alexandria/nosnaps
  zfs set com.sun:auto-snapshot=false alexandria/nosnaps

(BTW, you can also set a quota on a dataset so that it can't use all available space - better to have firefox die because it can't cache extra stuff than to have other random programs fail or crash due to out-of-space errors.)

If you have multiple programs that keep caches like this, you could create one dataset each for them. IMO, it would be better to create just one dataset (call it something like "alexandria/nosnaps") and then create as many sub-directories as you need under it. Make /alexandria/nosnaps/stripes/ readable and writable by user 'stripes', and your programs can create directories and files underneath it as needed. e.g. something like:

  mkdir -p /alexandria/nosnaps/stripes/firefox-cache
  chown -R stripes /alexandria/nosnaps/stripes
  chmod u=rwX /alexandria/nosnaps/stripes

I'm not entirely sure how to change the cache dir in firefox, but I am certain that it can be done, probably somewhere in "about:config". At worst, you can either set the mountpoint of the "nosnaps" dataset to be the cache dir (remember to quit firefox and delete the existing cache first), or symlink the cache dir to a subdir under "nosnaps". The latter is better because it allows multiple different cache dirs under the one nosnaps dataset.

BTW, if you download stuff with deluge or some other torrent client, you should make an alexandria/torrents dataset with a recordsize of 16K instead of the default 128K (bittorrent does a lot of random reads and writes in 16KB blocks, so this is the optimum recordsize). For example:

  zfs create -o recordsize=16k -o mountpoint=/home/stripes/torrents/download alexandria/torrents
  chown stripes /home/stripes/torrents/download
  chmod u=rwx /home/stripes/torrents/download

zfs-auto-snapshot should be configured to ignore this dataset too, and your torrent client should be configured to download torrents into this directory and then move them somewhere else once the download has completed. This avoids wasting space on snapshots of partially downloaded stuff, AND minimises fragmentation (the downloaded torrents will be de-fragmented when they're moved elsewhere).

There are probably other things that don't need to be snapshotted - like /tmp, /var/tmp, maybe (parts of) /var/cache, and other directories containing short-lived transient files. I wouldn't bother doing anything about them unless they waste a lot of disk space.
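As a concrete (hypothetical) example of the quota idea and the symlink approach - the 2G figure and the firefox profile directory name below are placeholders, not something from your system:

  # cap the no-snapshot dataset so a runaway cache can't eat the whole pool
  sudo zfs set quota=2G alexandria/nosnaps

  # check that both settings took effect
  zfs get com.sun:auto-snapshot,quota alexandria/nosnaps

  # with firefox closed, replace its cache dir with a symlink into nosnaps
  # (the profile dir name - shown here as XXXXXXXX.default - differs per profile)
  rm -rf ~/.cache/mozilla/firefox/XXXXXXXX.default/cache2
  ln -s /alexandria/nosnaps/stripes/firefox-cache \
        ~/.cache/mozilla/firefox/XXXXXXXX.default/cache2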
> How do I get it back before the box freezes.
1. For such a small filesystem, I recommend setting a more aggressive snapshot expiration policy for zfs-auto-snapshot. From what you've posted, it looks like it's configured to keep the last 12 months or so of snapshots, but you probably can't afford to keep more than the last three to six months' worth.

zfs-auto-snapshot doesn't seem to have a config file; the snapshot retention is handled by command-line options, which you can see if you look at the zfs-auto-snapshot files in /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly. e.g. edit /etc/cron.monthly/zfs-auto-snapshot and change the --keep=12 to --keep=3 or --keep=6.

Edit these files to change the --keep= option and you'll slowly regain space as the excess older snapshots are expired when the cron jobs run. If you're in a hurry and want the disk space back ASAP, you can probably delete all but the last 6 months' worth of monthly snapshots with commands like:

  zfs-auto-snapshot --dry-run --delete-only --label=monthly --keep=6

Remove the --dry-run option when you're sure it does what you want. You'll need to do similar runs with the "frequent", "hourly", "daily", and "weekly" labels if you want the disk space used by snapshots to be freed up immediately.

Remember that the space used by deleted files will **NOT** be returned until there are NO REMAINING SNAPSHOTS containing those files.

NOTE: as I said, I don't use zfs-auto-snapshot, so take the above with a grain of salt. I may have mis-interpreted the man page, and I may be completely misunderstanding how zfs-auto-snapshot works. Do some reading and verify that what I've said is correct before trusting it. The general idea is correct, the details may be dangerously wrong.

2. Better yet, replace the individual drives in the pool one-by-one with larger drives - when all the drives in a vdev have been replaced, the extra space will become available. It's often a lot faster to replace the entire pool with a larger pool using zfs send (which also has the beneficial side-effects of de-fragmenting files and re-balancing the data across the new pool). The process for that is (very roughly):

  1. Create a new pool, call it alexandria2 or something like that. If the drives in your current alexandria pool are your boot drives, then you'll need to partition the drives for alexandria2 similarly to what you have for alexandria (e.g. with a /boot and/or /EFI partition). This is so that you can later run grub-install or whatever on them.

  2. Make a recursive snapshot of alexandria.

  3. zfs send it to alexandria2 (there's a rough sketch of steps 2-6 just after this list).

  4. Repeatedly run incremental zfs sends to keep alexandria2 up to date until you're ready to do the final changeover. You can repeat this step as often and for as long as you need.

  5. Reboot the system into single-user mode, or with a zfs-capable rescue cd/usb/etc if alexandria is your root fs. An ubuntu live CD or bootable USB should do. I don't know if clonezilla has built-in support for ZFS these days, but several years ago I made my own custom clonezilla which had ZFS kernel modules and the zfs & zpool command-line utilities.

  6. Do a final snapshot and zfs send from alexandria to alexandria2.

  7. Rename alexandria to alexandria-old (a pool is renamed by exporting it and then importing it under the new name with zpool export / zpool import, not with zfs rename).

  8. Rename alexandria2 to alexandria in the same way.

If you need to boot from it, chroot into it, bind-mount /proc, /dev, and /sys, and run grub-install... same as you would when moving any root fs from one drive/partition to another.
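To make steps 2-6 concrete, here's a minimal sketch (run as root, or prefix each command with sudo). The snapshot names @migrate1 and @migrate2 are placeholders I've made up, and I'm assuming the new pool is called alexandria2 as above:

  # steps 2-3: recursive snapshot, then a full replicated send to the new pool.
  # -R sends all child datasets, snapshots and properties; on the receive side,
  # -u stops the received datasets being mounted (so they don't mount over your
  # live filesystems), -d keeps the same dataset layout under the new pool name,
  # and -F allows the target to be rolled back for later incrementals.
  zfs snapshot -r alexandria@migrate1
  zfs send -R alexandria@migrate1 | zfs receive -Fdu alexandria2

  # step 4: incremental catch-up - repeat with new snapshot names as often as needed
  zfs snapshot -r alexandria@migrate2
  zfs send -R -i @migrate1 alexandria@migrate2 | zfs receive -Fdu alexandria2

Step 6 is just one last round of the incremental snapshot + send, done from single-user mode or the rescue environment while nothing is writing to alexandria.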
In fact, this whole process is very similar to how you'd use rsync to move a filesystem to another drive, but using snapshots and zfs send instead of rsync.
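And for that final chroot/grub-install step, a very rough sketch for a Debian/Ubuntu-style system, done from the rescue or live environment after the rename. /mnt and /dev/sdX are placeholders, and depending on your layout you may need to mount the root dataset by hand (e.g. if it's canmount=noauto):

  zpool import -R /mnt alexandria      # the new pool, under its post-rename name
  mount --rbind /dev  /mnt/dev
  mount --rbind /proc /mnt/proc
  mount --rbind /sys  /mnt/sys
  chroot /mnt /bin/bash
  grub-install /dev/sdX                # repeat for each drive you boot from
  update-grub
  exit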
> The more disk space I free the less available space I have.
Not exactly. You're just getting more snapshots created by your zfs-auto-snapshot cron jobs. You're not using disk space any faster than you were before the deletes - it's being used at approximately the same rate as before (a rate which varies according to what is changing on the dataset(s)); the deleted files simply can't be freed while older snapshots still contain them.

craig