
Hi Russell,

I would assume that the resilvering is related to the checksum errors. From the zpool(8) manpage:

    Scrubbing and resilvering are very similar operations. The difference
    is that resilvering only examines data that ZFS knows to be out of date
    (for example, when attaching a new device to a mirror or replacing an
    existing device), whereas scrubbing examines all data to discover silent
    errors due to hardware faults or disk failure.

For the messages: FreeBSD has a sysctl vfs.zfs.debug. My Google 'research' (e.g. http://askubuntu.com/questions/228386/how-do-you-apply-performance-tuning-se...) indicates that this sysctl approach was ported to Linux, so you may be able to use it under Linux too.

BTW: There is a Nagios/Icinga check_zfs plugin. I did not know about "mon" before... How does it compare to Nagios/Icinga?

Regards
Peter

On Thu, Sep 22, 2016 at 10:54 PM, Russell Coker via luv-main <luv-main@luv.asn.au> wrote:
Below is part of the output of "zpool status". It seems that sdr is defective: it has a steadily increasing number of checksum errors.
Would the "resilvered 763M" part be about the 121 checksum errors? If so, does that mean each checksum error required resilvering about 6M of data on average?
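The arithmetic behind that "6M" figure can be checked with a few lines of Python. This is only a sketch: it assumes the 763M resilver really was caused by the 121 checksum errors, which the status output does not confirm, and the function name is made up for illustration.

```python
def avg_resilvered_per_error(resilvered_m, checksum_errors):
    """Average amount resilvered per checksum error, in the same unit
    as resilvered_m (here the "M" that zpool status reports)."""
    return resilvered_m / checksum_errors


if __name__ == "__main__":
    # Figures taken from the zpool status output quoted in this thread.
    avg = avg_resilvered_per_error(763, 121)
    print(f"about {avg:.1f}M resilvered per checksum error")
```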
The kernel message log has NOTHING about this. I'm used to Ext* and BTRFS, which give kernel message log entries about filesystem errors. Can ZFS be configured to give similar logging?
As an aside I've written a mon module for monitoring for such ZFS errors. I'll release it sometime soon. But I'd be happy to give a version that's quite usable although not ready for full release to anyone who wants it.
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: resilvered 763M in 0h0m with 0 errors on Thu Aug 18 14:48:53 2016
config:

        NAME        STATE     READ WRITE CKSUM
        server      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sdj     ONLINE       0     0     0
            sdk     ONLINE       0     0     0
            sdl     ONLINE       0     0     0
            sdm     ONLINE       0     0     0
            sdn     ONLINE       0     0     0
            sdo     ONLINE       0     0     0
            sdp     ONLINE       0     0     0
            sdq     ONLINE       0     0     0
            sdr     ONLINE       0     0   121
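Since the kernel log stays silent, one way to notice a device like sdr is to scan the config table of "zpool status" for non-zero error counters. The following is a minimal sketch of that idea, not the mon module mentioned above. The function name, the list of device states, and the shortened sample text are assumptions made for illustration. Counters are compared as strings because zpool may abbreviate large values (e.g. with a K suffix).

```python
import re

# A config row looks like: NAME STATE READ WRITE CKSUM.
# The list of states is an assumption; extend it if your pool shows others.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)"
)


def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) for every row of the
    config table whose error counters are not all "0"."""
    bad = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header, blank line, or non-table text
        name, state, read, write, cksum = m.groups()
        if (read, write, cksum) != ("0", "0", "0"):
            bad.append((name, state, read, write, cksum))
    return bad


# Shortened sample in the layout printed by "zpool status".
SAMPLE = """\
config:

        NAME        STATE     READ WRITE CKSUM
        server      ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sdq     ONLINE       0     0     0
            sdr     ONLINE       0     0   121
"""

if __name__ == "__main__":
    for row in devices_with_errors(SAMPLE):
        print(row)
```

In practice the text would come from running "zpool status" (for example via subprocess.run with capture_output=True), and a monitoring wrapper would alert whenever the returned list is non-empty.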
-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/