RE: strange SATA errors

22 May 2015

...
[  477.971022] ata1.00: exception Emask 0x50 SAct 0xc00 SErr 0x290b02 action 0xe frozen
[  477.971107] ata1.00: irq_stat 0x01400000, PHY RDY changed
[  477.971166] ata1: SError: { RecovComm UnrecovData Persist HostInt PHYRdyChg 10B8B BadCRC }
[  477.971243] ata1.00: failed command: READ FPDMA QUEUED
[  477.971305] ata1.00: cmd 60/a8:50:00:13:ed/00:00:05:00:00/40 tag 10 ncq 86016 in
[  477.971308]          res 40/00:58:a8:13:ed/00:00:05:00:00/40 Emask 0x50 (ATA bus error)
[  477.971449] ata1.00: status: { DRDY }
[  477.971501] ata1.00: failed command: READ FPDMA QUEUED
I was using dd to copy /dev/sda3 to /dev/sdb3 on a system that is usually
running Windows but doesn't appear to have hardware problems.  Then I saw the
above message about a READ error on ata1.00 in the kernel log followed
immediately by the below message about a WRITE error on ata2.00.  Any ideas
about what might be happening here?  I've attached the entire kernel message
log.

Before you do anything else, do smartctl -H /dev/sda (and then repeat for /dev/sdb). If that tells you one of the disks is failing then that's probably your problem. I don't know if BadCRC above refers to a media or an interface error.
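
For what it's worth, the names inside the SError braces look like a plain bit-by-bit decode of the SErr value on the exception line. Here is a rough sketch of that decode. The function name decode_serror is made up for illustration, and the bit positions are my assumption based on the SATA SError register layout as libata reports it, so check them against the kernel source before trusting them. As far as I understand that layout, bits 16 and up are link/PHY diagnostics, which would put BadCRC on the interface side rather than the media, but treat that as unconfirmed.

```shell
#!/bin/sh
# Decode a SATA SError value into the flag names seen in the kernel log.
# ASSUMPTION: bit positions follow the SATA SError register layout as
# reported by libata; verify against your kernel source before relying on it.
decode_serror() {
    value=$(( $1 ))
    out=""
    for entry in \
        0:RecovData 1:RecovComm 8:UnrecovData 9:Persist 10:Proto 11:HostInt \
        16:PHYRdyChg 17:PHYInt 18:CommWake 19:10B8B 20:Dispar 21:BadCRC \
        22:Handshk 23:LinkSeq 24:TrStaTrns 25:UnrecFIS 26:DevExch
    do
        bit=${entry%%:*}     # text before the colon: bit number
        name=${entry#*:}     # text after the colon: flag name
        # Append the name if this bit is set in the register value.
        if [ $(( ($value >> $bit) & 1 )) -eq 1 ]; then
            out="$out $name"
        fi
    done
    echo "${out# }"          # drop the leading space
}
```

Calling decode_serror 0x290b02 should give back the same seven names as the SError line in the quoted log.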

Also run smartctl -t short /dev/sda (and again, same for /dev/sdb), then smartctl -l selftest /dev/sda (likewise for /dev/sdb) after the prescribed amount of time to check the results.

I've never seen a long selftest show an error that a short selftest didn't also pick up, but maybe run a long selftest in the absence of any other suggestions.

Maybe also post the output of smartctl -a for each of the disks.
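
If it saves some typing, the checks above could be wrapped up roughly like this. It is only a sketch: the function names smart_start and smart_results and the SMARTCTL override variable are made up for illustration, and it assumes smartctl is on the PATH and that you run it as root.

```shell
#!/bin/sh
# SMARTCTL can be overridden (for example with "echo" for a dry run);
# by default the real tool is used. The variable name is hypothetical.
SMARTCTL=${SMARTCTL:-smartctl}

# Print the overall health verdict and start a short self-test
# on each device given as an argument.
smart_start() {
    for dev in "$@"; do
        $SMARTCTL -H "$dev"
        $SMARTCTL -t short "$dev"
    done
}

# Run this once the wait time reported by "smartctl -t short" has passed:
# shows the self-test log, then the full SMART report for each device.
smart_results() {
    for dev in "$@"; do
        $SMARTCTL -l selftest "$dev"
        $SMARTCTL -a "$dev"
    done
}
```

Usage would be something like smart_start /dev/sda /dev/sdb, wait the few minutes smartctl asks for, then smart_results /dev/sda /dev/sdb and post what comes out.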

If it's working fine on Windows then it's probably not a hardware issue as you say, but SMART makes it so easy to do cursory checks, it doesn't make any sense not to start there.

James

James Harper