When a disk starts to fail, the kernel tries to warn the sysadmin, but its messages are not very readable:

[12270071.931301] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
[12270071.931421] ata1.00: irq_stat 0x40000008
[12270071.931524] ata1.00: failed command: WRITE FPDMA QUEUED
[12270071.931634] ata1.00: cmd 61/02:00:67:99:07/00:00:00:00:00/40 tag 0 ncq 1024 out
[12270071.931635] res 41/10:01:67:99:07/00:00:00:00:00/40 Emask 0x481 (invalid argument) <F>
[12270071.931918] ata1.00: status: { DRDY ERR }
[12270071.932021] ata1.00: error: { IDNF }
[12270071.934760] ata1.00: configured for UDMA/133
[12270071.934775] ata1: EH complete
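In the "ataN.MM" prefix of these messages, N is the ATA port number and MM is the device on that port, so the log above points at port 1. As a rough sketch, the port numbers that reported a failed command could be pulled out of the kernel log like this. The function name failed_ata_ports is a hypothetical helper, not an existing tool, and it assumes the "failed command" wording shown in the log above.

```shell
# Hypothetical helper: print the ATA port numbers that logged a failed command.
# Reads kernel log text on stdin (for example, piped from dmesg).
failed_ata_ports() {
    # Keep only "ataN.MM: failed command" lines, extract N, then de-duplicate.
    grep -E 'ata[0-9]+\.[0-9]+: failed command' |
        sed -E 's/.*ata([0-9]+)\.[0-9]+: failed command.*/\1/' |
        sort -un
}
```

Typical use would be: dmesg | failed_ata_ports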

How can I tell which device should be replaced?

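This script can help. Set fail to the port number N from the "ataN" prefix in the kernel message (1 in the log above); it looks up the SCSI host whose unique_id matches that port and prints the name of the block device attached to it.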

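# Set fail to the port number N from "ataN" in the kernel message.
fail=1
# Find the SCSI host whose unique_id is exactly $fail (so port 1 does not also match 10, 11, ...).
host=$(grep -H "^${fail}\$" /sys/class/scsi_host/host*/unique_id | cut -d "/" -f 5)
# Print the sd* devices whose sysfs path goes through that host.
ls -l /sys/block/sd* | grep "/${host}/" | rev | cut -d "/" -f 1 | rev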
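The same lookup can also be written as a small reusable function. This is only a sketch under a few assumptions: the name ata_port_to_disks and its optional second argument (an alternative sysfs root, handy for trying it against a fake directory tree) are made up for illustration, and it relies on /sys/block/sd* being symlinks whose targets contain the hostN component, as the ls -l approach above already does.

```shell
# Hypothetical helper: map an ATA port number (N in "ataN") to block device names.
# Usage: ata_port_to_disks PORT [SYSFS_ROOT]   (SYSFS_ROOT defaults to /sys)
# The body runs in a subshell so its variables do not leak into the caller.
ata_port_to_disks() (
    port=$1
    sysfs=${2:-/sys}
    for id_file in "$sysfs"/class/scsi_host/host*/unique_id; do
        [ -r "$id_file" ] || continue
        # Require an exact match so that port 1 does not also select port 10 or 11.
        [ "$(cat "$id_file")" = "$port" ] || continue
        host=$(basename "$(dirname "$id_file")")
        for dev in "$sysfs"/block/sd*; do
            [ -L "$dev" ] || continue
            # The symlink target contains ".../hostN/..." for the owning host.
            case "$(readlink "$dev")" in
                */"$host"/*) basename "$dev" ;;
            esac
        done
    done
)
```

For the log above, the call would be: ata_port_to_disks 1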
