
On Fri, 26 Jul 2013 14:18:44 +1000 Craig Sanders <cas@taz.net.au> wrote:
On Fri, Jul 26, 2013 at 01:00:30PM +1000, Russell Coker wrote:
also the numbers in the READ WRITE and CKSUM columns will show you the number of errors detected for each drive.
However those numbers are all 0 for me.
as i said, i interpret that as indicating that there's no real problem with the drive - unless the kernel is retrying successfully before zfs notices the drive is having problems? is that the case?
No, the very first message in this thread included the zpool status output which stated that 1.4M of data had been regenerated.
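Both figures come from zpool status: the per-device READ/WRITE/CKSUM counters and the repaired total on the scan/scrub line. Something like the following (a sketch, using the pool name from the replace command quoted below) shows them, and zpool clear resets the counters once the cause has been dealt with:

    zpool status -v tank    # per-device error counters plus the amount of data repaired
    zpool clear tank        # reset the READ/WRITE/CKSUM counters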
I'm now replacing the defective disk. I've attached a sample of iostat output; it seems to be reading from all disks and then reconstructing the parity for the new disk, which is surprising. I had expected it to just read the old disk and write to the new disk.
there's (at least) two reasons for that.
the first is that raidz is only similar to raid5/6, not exactly the same. the data and parity for a block can be laid out anywhere on any of the drives in the vdev (the stripes are variable width), so it's not just a straight dd-style copy from the old drive to the new.
the second is that when you're replacing a drive, the old one may not be reliable or trustworthy, or may even be absent from the system.
zpool replace tank \
    sdd /dev/disk/by-id/ata-ST4000DM000-1F2168_Z300MHWF-part2

In this case the old disk was online. I ran the above replace command, so ZFS should know that the new disk needs to be an exact copy of the old, but instead I get a scrub as well as the "resilver".
that's odd. what makes you say that?
I've attached the zpool status output. It shows the disk as being replaced, but according to the iostat output I attached previously all the disks are being accessed.
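As an aside, resilver progress and the estimated completion time show up on the scan line of zpool status, so repeating something like this while the replace runs is an easy way to keep an eye on it (a sketch, with the pool name taken from the replace command above):

    watch -n 60 zpool status tank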
and, of course, with raidz (or raid5) writes are always going to be limited to the speed of, at best, a single drive.
Actually, for contiguous writes a RAID-5 array can be expected to exceed the performance of a single disk. It's not difficult to demonstrate this in real life.
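One way to demonstrate it is to time a large sequential write to the array and compare it with the same write aimed at a filesystem on a single disk. A sketch, with hypothetical mount points, conv=fdatasync so the timing includes flushing to disk, and /dev/zero only being a sensible data source if compression is off:

    dd if=/dev/zero of=/mnt/raid5/ddtest bs=1M count=8192 conv=fdatasync
    dd if=/dev/zero of=/mnt/single/ddtest bs=1M count=8192 conv=fdatasync

On an N-disk RAID-5 a full-stripe write lands on N-1 data disks in parallel and the parity can be computed without reading anything back, so the first number should come out well above the second.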
the SATA controller is also a factor; many (most?) aren't capable of running four or more drives at full speed simultaneously. even a cheap-but-midrange SAS card like my LSI cards couldn't run all 8 ports at full speed with 6Gbps SSDs going flat out (since i'm only running hard disks and not SSDs on them, i'll never be limited by that, so i don't care).
Yes, that's always been an issue, dating back to IDE days.
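A back-of-envelope check, using rough numbers and assuming a typical 8-port PCIe 2.0 x8 HBA such as the LSI SAS2008 boards:

    8 ports x ~600 MB/s usable per 6Gbps link   =  ~4.8 GB/s
    PCIe 2.0 x8 host interface                  =  ~4.0 GB/s

so eight SSDs running flat out could indeed saturate the card, while eight hard disks at roughly 150 MB/s each (about 1.2 GB/s total) come nowhere near it.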
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.69    0.00   20.23    3.79    0.00   74.29

Device:   rrqm/s   wrqm/s     r/s     w/s    rsec/s    wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda       373.90     0.40  298.30    5.80  92344.00     75.20   303.91     1.36    4.48   2.23  67.96
sdb       195.90     0.40  502.70    5.80  89902.40     75.20   176.95     1.66    3.27   1.25  63.72
sdc       374.20     0.60  300.30    6.00  92286.40     76.80   301.54     1.41    4.59   2.38  72.84
sdd       175.10     0.60  539.30    6.00  89230.40     76.80   163.78     1.78    3.27   1.24  67.76
sdl         0.00   174.30    0.00  681.10      0.00  88107.10   129.36     6.40    9.39   1.32  89.72
hmm. does iostat know about 4K sectors yet? maybe try that with -m for megabytes/sec rather than rsec/s.
also, what does 'zpool iostat' (or 'zpool iostat -v') and 'zpool status' say?
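For numbers that are directly comparable, both commands want the same sampling interval, so that neither is reporting a long-term average (since boot for iostat, since pool import for zpool iostat); a sketch, with the interval chosen arbitrarily:

    iostat -mx 10             # extended per-device stats in MB/s, 10 second samples
    zpool iostat -v tank 10   # per-vdev read/write bandwidth over the same 10 seconds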
I've attached the zpool iostat output, and it claims that the only real activity is reading from the old disk at 38MB/s and writing to the new disk at the same speed. I've also attached the iostat -m output, which shows that all disks are being accessed at a speed just over 45MB/s. I guess that the difference between 38 and 45 would be due to some random variation; zpool iostat gives an instant response based on past data while iostat runs in real time and gives more current data.

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/