In any case, I'm limited to non-ECC RAM by the form factor of my bookshelf.
I wonder if better hardware tests would be an area worth looking into, for example monthly online memory/CPU tests?
I also wonder if there are deterministic tests we can run proactively to catch corruption at a higher level, for example scanning any file type that includes a checksum (e.g. .zip) and comparing the results to previous runs.
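To make the .zip idea concrete, here is a rough sketch in Python of what such a scan could look like. It uses the standard library's `zipfile.ZipFile.testzip()`, which reads every member and verifies its CRC-32. The function names `check_zip` and `scan` and the JSON state file are my own assumptions for illustration, not an existing tool.

```python
import json
import zipfile
import zlib
from pathlib import Path


def check_zip(path):
    """Return 'ok', or a short description of the first problem found."""
    try:
        with zipfile.ZipFile(path) as zf:
            bad = zf.testzip()  # reads every member and verifies its CRC-32
    except (zipfile.BadZipFile, zlib.error, OSError) as exc:
        return f"unreadable: {exc.__class__.__name__}"
    return "ok" if bad is None else f"bad CRC in member: {bad}"


def scan(root, state_file):
    """Check every .zip under root and compare with the previous run.

    Returns the files that were 'ok' last time but are not 'ok' now.
    state_file is a hypothetical JSON file holding the last run's results.
    """
    state_path = Path(state_file)
    previous = json.loads(state_path.read_text()) if state_path.exists() else {}
    current = {str(p): check_zip(p) for p in sorted(Path(root).rglob("*.zip"))}
    regressions = [p for p, status in current.items()
                   if status != "ok" and previous.get(p) == "ok"]
    state_path.write_text(json.dumps(current, indent=2))
    return regressions
```

Run from a monthly cron job, something like `scan("/data", "/var/lib/zipscan/state.json")` (both paths are placeholders) would list archives that passed last time and fail now. Note this only catches corruption of the archive as stored. If the data was already bad in RAM when the zip was written, the CRC will happily match the bad data.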
It seems the problem is uncaught hardware failure. If we minimize the window during which the failure goes undetected, we increase the chance of being able to compare the source to backups and recover information.