
On Thu, Jul 25, 2013 at 06:31:54PM +1000, Russell Coker wrote:
> I'm getting some errors on a zpool scrub operation.  It's obvious that
> sdd has the problem both from the below output of zpool status and from
> the fact that the kernel message log has read errors about sdd.
what sort of controller is it on, and do you have standby/spin-down enabled? that can cause drives to be booted from a raid array or zpool if they don't respond fast enough. that's the reason for using IT mode firmware rather than RAID mode firmware in LSI and similar cards - it's far more forgiving of consumer drives and their slow responses when waking up from standby. RAID mode firmware pretty much expects enterprise drives with standby disabled.
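a quick way to check whether standby/spin-down is actually in play on a
given drive (assuming a SATA disk that answers hdparm and smartctl; the
device name sdd here is just the one from the report below):

  # hdparm -C /dev/sdd                 (current power state: active/idle vs standby)
  # hdparm -B /dev/sdd                 (APM level; values from 1 to 127 allow spin-down)
  # smartctl -i -n standby /dev/sdd    (query the drive, but skip it rather than wake it if asleep)

if the drive is spinning down on its own, its wake-up latency is a likely
cause of the timeouts the controller is complaining about.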
> But how can I get detail about what has gone wrong?  Has sdd given
> corrupt data in addition to read failures?  Presumably the "(repairing)"
> string will disappear as soon as the scrub finishes; when that happens,
> how would I determine that sdd was to blame without the kernel error log?
> # zpool status
>   pool: tank
>  state: ONLINE
>   scan: scrub in progress since Thu Jul 25 16:38:01 2013
>     1.01T scanned out of 10.3T at 164M/s, 16h26m to go
>     1.40M repaired, 9.80% done
> config:
>
>     NAME        STATE     READ WRITE CKSUM
>     tank        ONLINE       0     0     0
>       raidz1-0  ONLINE       0     0     0
>         sda     ONLINE       0     0     0
>         sdb     ONLINE       0     0     0
>         sdc     ONLINE       0     0     0
>         sdd     ONLINE       0     0     0  (repairing)
corrupt data on sdd will have been detected and corrected by zfs. if the
block read doesn't match the (sha256 IIRC) hash then it will be corrected
from the redundant copies on the other drives in the pool.

zpool status will tell you how much data was corrected when it has
finished. e.g. on my backup pool, zpool status says this:

  scan: scrub repaired 160K in 4h21m with 0 errors on Sat Jul 20 06:03:58 2013

also the numbers in the READ WRITE and CKSUM columns will show you the
number of errors detected for each drive.

craig

-- 
craig sanders <cas@taz.net.au>

BOFH excuse #202:

kernel panic: write-only-memory (/dev/wom0) capacity exceeded.
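for reference, the per-device READ WRITE CKSUM counters are not reset when
the scrub finishes, only by "zpool clear" (or a reboot), so the blame can
still be assigned afterwards. a rough sketch, assuming ZFS on Linux
("zpool events" may not be available on other platforms):

  # zpool status -v tank      (per-device error counters; -v also lists any
                               files with unrecoverable errors)
  # zpool events tank         (the zfs module's own event log for the pool)
  # smartctl -a /dev/sdd      (the drive's SMART error log, independent of
                               the kernel ring buffer)
  # zpool clear tank sdd      (reset sdd's counters once you're satisfied)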