
Now I have another RAID problem. Each disk is partitioned like this:

  /dev/sd[abcd]1 - bios boot partition
  /dev/sd[abcd]2 - md0 - /boot (I believe grub can now see inside lvm, but this config predates that)
  /dev/sd[abcd]3 - md1 - lvm - /root + others

I have been progressively replacing disks. sdd was fine, sdc was fine, then I replaced sda to test that booting worked (changed the bios to boot from sdb, added the new sda, added it to the raid, etc.). That didn't work because I had mucked up the gpt bios boot partition. Having fixed that, I rebooted to test booting again while md1 was still rebuilding onto sda3, and the machine drops into the initramfs shell on boot: mdadm first sees sda and tries to assemble from sda3, then tries to add sd[bcd]3, but because each of those disks says that sda3 is not consistent the assembly fails, leaving just sda3 present as a 'spare' drive.

From the initramfs console I just manually stop the array, add sd[bcd]3, then sda3, and it's all good again, but I didn't think this was the way it was supposed to work. On the most recent boot I zeroed the superblock of sda3 before re-adding it, thinking maybe it had picked up some bitrot somewhere along the way, but I'm not confident that was the issue.

Because the server is responsible for running backups, and last night's backups didn't run because it was down, I'm not going to touch it again until tonight's backups are complete, but any hints on how to make this work more smoothly next time would be appreciated!

Just to recap: rebooting a RAID comprising /dev/sd[abcd]3 while /dev/sda3 is being rebuilt results in a boot that drops into the initramfs shell, because mdadm appears to try /dev/sda3 first and then rejects the other three disks because they say /dev/sda3 is inconsistent (which it is).

Thanks
James
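
P.S. For reference, this is roughly what I end up running at the initramfs prompt to get the array going again. It's from memory, so treat the exact device names and options as illustrative rather than exact:

  # stop the half-assembled array that has only sda3 in it as a spare
  mdadm --stop /dev/md1

  # re-assemble from the three good members; --run starts it degraded
  mdadm --assemble --run /dev/md1 /dev/sdb3 /dev/sdc3 /dev/sdd3

  # (on the most recent boot I also wiped sda3's metadata first,
  # in case it was stale)
  # mdadm --zero-superblock /dev/sda3

  # add sda3 back so the rebuild restarts
  mdadm --manage /dev/md1 --add /dev/sda3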