
On 10/04/12 15:58, Tim Connors wrote:
> On Tue, 10 Apr 2012, Toby Corkindale wrote:
>> Also worth noting that the md layer and the iSCSI layer didn't interact all that well -- if the iSCSI target dropped out, I seem to remember the md layer didn't respond quickly. It wasn't like with local disks, where it'd kick out a non-responding disk promptly and keep going -- instead it'd hang for aeons. This might have improved in more recent kernels...
> I don't think it's improved. It's a pain that on RAID1, or other levels with some redundancy, if a read from one disk is taking a long time because of contention, spin-up, etc., Linux RAID doesn't immediately reissue the same command to one of the other available disks whose queues are empty.
> When I was RAIDing two external drives that I *wanted* to spin down, because they were usually only accessed twice a day, I still had to wait for it to spin up both drives consecutively before I could start browsing my backups. It would send a read command to one disk, wait for the response, then stripe a read command to the other disk -- almost as if the queue depth were only one, in which case there's no point striping reads across both disks at all, since it has to wait for the response from each one anyway! All that does is ensure the second disk's heads won't be in the right place for a consecutive read. It should only have had to wait for one drive to spin up. Or it could send read commands to both disks simultaneously. Or stripe properly, since readahead should realise there's more data to come from the second disk. Who knows...
I think the RAID10 code might be better -- and you can actually configure it to work with just two disks, so it's just like RAID1. It can do smarter things, like scattering data at opposite ends of the disks so your average seek time is reduced (enable the "far" layout).

As to your case -- there is the --write-mostly option for RAID1, which says that one of your disks should only be used for writes, not reads (unless the primary disk fails).
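For what it's worth, here's a minimal sketch of both setups with mdadm (the device names /dev/sda1 and /dev/sdb1 and the md numbers are placeholders, so check them against your own layout):

  # Two-disk RAID10 with the "far 2" layout: RAID1-style redundancy,
  # but sequential reads can be striped across both disks.
  mdadm --create /dev/md0 --level=10 --layout=f2 --raid-devices=2 \
      /dev/sda1 /dev/sdb1

  # RAID1 where the second disk is marked write-mostly, so reads go
  # to the first disk unless it fails.
  mdadm --create /dev/md1 --level=1 --raid-devices=2 \
      /dev/sda1 --write-mostly /dev/sdb1

Note that --write-mostly applies to the devices listed after it, so order matters on that command line.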
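On the md-over-iSCSI hangs Toby mentioned up-thread: I suspect much of the delay is the open-iscsi initiator queueing I/O for a long time before failing it up the stack, so md never gets an error to act on. One possible mitigation (untested by me) is lowering the replacement timeout in /etc/iscsi/iscsid.conf so errors surface quickly enough for md to kick the member:

  # /etc/iscsi/iscsid.conf -- default is 120 seconds. I/O to a dead
  # session is failed after this long, instead of hanging.
  node.session.timeo.replacement_timeout = 15

The trade-off is that a short network blip will now fail the member and force a resync, so pick a value to suit.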