Opened 14 years ago
Closed 14 years ago
#121 closed defect (worksforme)
smartd fails to report disk failure if a disk doesn't respond anymore
Reported by: | kaluscha | Owned by: | Christian Franke |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | smartd | Version: | 5.39 |
Keywords: | | Cc: | |
Description
I had a self test running on disk hdb:
smartd: Device: /dev/hdb, self-test in progress, 10% remaining
The disk encountered problems, see /var/log/messages:
kernel: hdb: dma_timer_expiry: dma status == 0x61
kernel: hdb: DMA timeout error
kernel: hdb: dma timeout error: status=0xd0 { Busy }
kernel: ide: failed opcode was: unknown
kernel: hda: DMA disabled
kernel: hdb: DMA disabled
kernel: ide0: reset: success
There were several kernel IDE resets until the drive didn't respond anymore:
kernel: hdb: drive not ready for command
smartd wrote messages:
smartd: Device: /dev/hdb, failed to read Temperature
However, smartd had been configured to send e-mails in case of trouble (/dev/hdb -a -I 194 -W 4,40,42 -R 5 -m myamil). In this case, it failed to do so.
In my opinion this is a major problem, as smartd should inform the admins that a disk is completely offline, i.e. no longer responds to requests on the IDE bus.
Change History (3)
comment:1 by , 14 years ago
Keywords: | linux disk failure added |
---|
comment:2 by , 14 years ago
Keywords: | linux disk failure removed |
---|---|
Milestone: | → Release 5.41 |
Owner: | changed from | to
Status: | new → accepted |
comment:3 by , 14 years ago
Milestone: | Release 5.41 |
---|---|
Resolution: | → worksforme |
Status: | accepted → closed |
smartd sends a warning email "failed to read SMART Attribute Data" in the above situation, see smartd.cpp. No additional email "failed to read Temperature" is sent because temperature info is part of the attribute data.
If the smartd option -s, --savestates is used, see also ticket #35.
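For reference, the directive line quoted in the description can be read as sketched below. This is an annotated illustration, not the reporter's actual file: the address admin@example.com is a placeholder, and the -M test directive is an addition that is not in the original configuration. It makes smartd send a single test email at startup, which is one way to check that mail delivery works before a real failure occurs.

```
# /etc/smartd.conf (sketch; address is a placeholder, -M test is an added assumption)
# -a           monitor health status, error and self-test logs, and attributes
# -I 194       ignore attribute 194 (temperature) when tracking attribute changes
# -W 4,40,42   report temperature changes of 4 degrees or more, log at 40, warn at 42 (Celsius)
# -R 5         report whenever the raw value of attribute 5 changes
# -m ADDRESS   send warning emails to ADDRESS
# -M test      send one test email when smartd starts
/dev/hdb -a -I 194 -W 4,40,42 -R 5 -m admin@example.com -M test
```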