
I've had a few disks fail with uncorrectable read errors recently, and in the past my process has been that any disk with any sort of error gets discarded and replaced, especially in a server. I did some reading though (see previous emails about SMART vs actual disk failures) and learned that simply writing back over those sectors is often enough to clear the error and allow them to be remapped, possibly extending the life of the disk, depending on the cause of the error. In fact, after writing over the entire failed disk with /dev/zero the other day (rough commands in the P.S. below), all the SMART attributes are showing a healthy disk - no pending reallocations and no reallocated sectors yet - so maybe it wrote over the bad sector and the drive determined it was good again without requiring a remap.

I'm deliberately using some old hardware to test ceph and see how it behaves in various failure scenarios, and it has been pretty good so far despite 3 failed disks over the few weeks I've been testing.

What can cause these unrecoverable read errors? Is losing power mid-write enough to cause this? Or maybe a knock while writing? I grabbed these 1TB disks out of a few old PCs and NASes I had lying around the place, so their history is entirely uncertain - I definitely can't tell whether the errors were already present before I started using ceph on them.

Is Linux MD software RAID smart enough to rewrite a bad sector with good data to clear this type of error (while keeping track of error counts so it knows when to eject the disk from the array)? What about btrfs/zfs? It seems trickier with something like ceph, where ceph runs on top of a filesystem which isn't itself redundant...

Thanks
James
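
P.S. In case it's useful to anyone trying the same thing, the overwrite-and-recheck I mention above was roughly along these lines - sdX is just a placeholder for the failed disk, and I'm assuming smartmontools for reading the SMART side:

    # Write zeros over the whole disk, which gives the drive a chance
    # to remap (or re-verify) any pending sectors on the write path.
    dd if=/dev/zero of=/dev/sdX bs=1M status=progress

    # Then re-read the SMART attributes; the ones I was watching are
    #   5   Reallocated_Sector_Ct
    # 197   Current_Pending_Sector
    # 198   Offline_Uncorrectable
    smartctl -A /dev/sdX

And on the MD question, by "rewrite a bad sector with good data" I mean the sort of thing the periodic scrub does - assuming I've understood the sysfs interface correctly, and with mdX standing in for whichever array:

    # Ask md to read every sector; when it hits an unreadable one it
    # should recompute the data from the other devices and rewrite it.
    echo check > /sys/block/mdX/md/sync_action
    cat /proc/mdstat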