Re: RAID-1 synchronisation

10 Feb 2012


      On Sat, 11 Feb 2012, Robin Humble <robin.humble@anu.edu.au> wrote:
...
IMHO the main purpose of check/scrub on sw or hw raids isn't to detect
"right now" problems, but to shake out unreadable sectors and bad disks
so that they don't cause major drama later.
serious problems (eg. array failure) can occur during raid rebuild if
the raid code tries to read from a second unrecoverably bad disk.
Surely if there is a bad sector when doing a rebuild then it will only result 
in at most some corrupt data in one stripe.  Surely no RAID implementation 
would be stupid enough to eject a second disk from a RAID-5 or a third disk 
from a RAID-6 because of a few errors!
...
we lose a few disks every time we do a md 'check' over our 104 md
raid6's, but many more of the arrays do routine rewrites and fixup bad
disk sectors and make things far safer in the long term. we also have
rewrites happening ~daily in normal operation as bad disk sectors are
found during reads and remapped automatically by writes done by the
raid6 code.
That would be only unrecoverable read errors though wouldn't it?  Not sectors 
that quietly have bogus data.  AFAIK the MD driver doesn't support reading the 
entire stripe for every read to detect quiet corruption.
...
in the home context, bad disk sectors and the ability of the md code to
hide and remap these automatically is probably the best reason to make a
home raid instead of just putting a single 'big enough' disk in something.
Except that if you have a RAID-1 then you can quietly lose the data unless you 
read through all the logcheck messages because mdadm doesn't report it when 
stripes don't match up.  To actually get this benefit it seems that you need 
either RAID-6 (which almost no-one wants in their home network) or a BTRFS 
RAID-1 (which isn't yet ready for production).
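For anyone who would rather not dig through logcheck mail, here is a minimal Python sketch of polling the mismatch counter directly. It assumes the kernel exposes the result of the last check in /sys/block/mdX/md/mismatch_cnt; the function name and the sysfs_root parameter are my own, added only so the code can be pointed at a test directory.

```python
from pathlib import Path


def read_mismatch_counts(sysfs_root="/sys/block"):
    """Return {md device name: mismatch_cnt} for every MD array found.

    sysfs_root is a hypothetical parameter for testing; on a real system
    the default /sys/block is assumed to be where the md attributes live.
    """
    counts = {}
    for path in sorted(Path(sysfs_root).glob("md*/md/mismatch_cnt")):
        # path looks like <sysfs_root>/md0/md/mismatch_cnt
        device = path.parent.parent.name
        try:
            counts[device] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or unexpected content: skip rather than guess.
            continue
    return counts


if __name__ == "__main__":
    for device, count in read_mismatch_counts().items():
        status = "OK" if count == 0 else "MISMATCH"
        print(f"{device}: mismatch_cnt={count} [{status}]")
```

Run from cron after a check, this would at least put a non-zero count somewhere you will see it. It can't tell you which copy of the data is the good one.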
...
if one sector goes bad in that single disk then it's pretty much restore
from backup time as one part of the fs will be forever unreadable until
you find and write over the bad block.
No, you generally just lose 1 file, or maybe 1 directory has its files go to 
lost+found.
...
the fs can also shutdown or go
read-only if it finds something unreadable. whereas if it's in a raid5/6
you likely won't care or notice the problem, and if the raid code
doesn't auto remap the sector for you then you can do a check/scrub or
kick out the disk and dd over it at your leisure.
But if it's RAID-5 then the current state of play is that you won't notice it 
if one disk returns bogus data and the RAID scrub of a RAID-5 will probably 
cause corruption to spread to another sector.
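To illustrate why single parity can't help here, a toy Python sketch follows. It assumes a stripe of three data blocks plus one XOR parity block, and it assumes the repair policy is to trust the data blocks and recompute parity (that is my understanding of what a RAID-5 repair does, not something I have verified in the MD source). All the names are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity):
    """True if the parity block matches the XOR of the data blocks."""
    return xor_blocks(data_blocks) == parity


def recompute_parity(data_blocks):
    """Assumed repair policy: trust the data blocks, rewrite the parity."""
    return xor_blocks(data_blocks)


good = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(good)

# The second disk quietly returns bogus data without reporting a read error.
bad = [good[0], b"BxBB", good[2]]

# The scrub sees that something is wrong, but not which block is wrong.
assert not stripe_is_consistent(bad, parity)

# If the first disk dies now, its block is rebuilt from the bogus block
# and the old parity, so the corruption spreads to the rebuilt sector.
assert xor_blocks([bad[1], bad[2], parity]) != good[0]

# If the scrub "repairs" the stripe instead, the parity is rewritten to
# agree with the bogus data and the original block can't be recovered.
new_parity = recompute_parity(bad)
assert stripe_is_consistent(bad, new_parity)
assert xor_blocks([bad[0], bad[2], new_parity]) == b"BxBB"
```

With a second, independent syndrome as in RAID-6 there is enough information to work out which block is the odd one out, which is why I say RAID-5 doesn't give you this benefit.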
...
On Mon, Feb 06, 2012 at 08:41:41AM +1100, Matthew Cengia wrote:
...
1 adam mdadm: RebuildFinished event detected on md device
      /dev/md/1, component device  mismatches found: 10496
what sort of raid is it? 1,10,5,6?
I may have missed that info in this thread...
That was a MD RAID-1 with 10M of random data dumped on one disk.
...
if raid1/10 then /usr/sbin/raid-check (on fedora at least) doesn't email
about problems ->
# Due to the fact that raid1/10 writes in the kernel are unbuffered,
# a raid1 array can have non-0 mismatch counts even when the
# array is healthy.
The only way a filesystem can be healthy in such a situation is if the journal 
covers it.  If the filesystem is something like Ext3 then the journal replay 
will result in writes to the data sectors which fixes that problem.  So the 
only way I can imagine this not being a problem is if the scrub happens on an 
unmounted filesystem that has a journal in need of replay or if the sectors in 
question correlate to an already committed section of the journal or 
unallocated disk space.
...
# These non-0 counts will only exist in transient data areas
# where they don't pose a problem.  However, since we can't tell the
# difference between a non-0 count that is just in transient data or a
# non-0 count that signifies a real problem, simply don't check the
# mismatch_cnt on raid1 devices as it's providing far too many false
# positives.  But by leaving the raid1 device in the check list and
# performing the check, we still catch and correct any bad sectors
# there might be in the device.
Since we can't tell if it's a problem or not we will just pretend that it's 
not a problem.
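Reduced to its logic, the policy in that comment looks something like the Python sketch below. This is not the raid-check code (which is a shell script); the function names are hypothetical, and the level strings are assumed to match what /sys/block/mdX/md/level reports.

```python
def should_report_lenient(level, mismatch_cnt):
    """The policy as described in the quoted comment: never report
    mismatches on raid1/raid10, report any non-zero count elsewhere."""
    if level in ("raid1", "raid10"):
        return False
    return mismatch_cnt > 0


def should_report_strict(level, mismatch_cnt):
    """The alternative: report every non-zero count and let the admin
    decide whether it was transient data or a real problem."""
    return mismatch_cnt > 0


# The array from earlier in this thread: RAID-1 with 10496 mismatches.
print(should_report_lenient("raid1", 10496))  # False
print(should_report_strict("raid1", 10496))   # True
```

The strict version will produce false positives on a busy RAID-1, but I would rather read a false positive than have 10M of random data go unreported.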

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/

Russell Coker