#303 closed defect (wontfix)
In smart test captive mode, extend the timeout as described by the ATA device
Reported by: | gwendal1 | Owned by: | Christian Franke |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | smartctl | Version: | 5.42 |
Keywords: | Cc: |
Description
When we run smartctl -C -t long /dev/sdX, the ATA SMART command is sent with the usual 20 s timeout.
This is not enough: the drive usually needs several minutes to complete the test.
On the command line:
smartctl -C -t long /dev/sda
smartctl 5.42 2011-10-20 r3458 [x86_64-linux-3.8.11] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in captive mode".
Drive command "Execute SMART Extended self-test routine immediately in captive mode" successful.
Testing has begun.
Please wait 10 minutes for test to complete.
Test will complete after Tue Oct 22 16:44:49 2013
In /var/log/messages, with SCSI debug log enabled:
[ 122.121623] sd 0:0:0:0: [sda] sd_ioctl: disk=sda, cmd=0x2285
[ 122.121640] scsi_block_when_processing_errors: rtn: 1
[ 122.121655] sd 0:0:0:0: [sda] Send:
[ 122.121662] 0xffff88015c86d300
[ 122.121672] sd 0:0:0:0: [sda] CDB:
[ 122.121679] ATA command pass through(16): 85 06 0c 00 d4 00 00 00 82 00 4f 00 c2 00 b0 00
[ 122.121772] buffer = 0x (null), bufflen = 0, queuecommand 0xffffffff9df1d70c
[ 122.121785] leaving scsi_dispatch_cmnd()
[ 142.735081] sd 0:0:0:0: [sda] Done:
[ 142.735102] 0xffff88015c86d300 TIMEOUT
[ 142.735121] sd 0:0:0:0: [sda]
[ 142.735134] Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[ 142.735150] sd 0:0:0:0: [sda] CDB:
[ 142.735162] ATA command pass through(16): 85 06 0c 00 d4 00 00 00 82 00 4f 00 c2 00 b0 00
[ 142.735267] sd 0:0:0:0: [sda] scsi host busy 1 failed 0
[ 142.735287] Waking error handler thread
[ 142.735329] Error handler scsi_eh_0 waking up
[ 142.735365] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 142.735386] ata1.00: failed command: SMART
[ 142.735407] ata1.00: cmd b0/d4:00:82:4f:c2/00:00:00:00:00/00 tag 0
[ 142.735407] res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 142.735436] ata1.00: status: { DRDY }
[ 142.735459] ata1: hard resetting link
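As a cross-check of the log above, the following Python sketch decodes the ATA PASS-THROUGH (16) CDB and measures the gap between the "Send" and "Done ... TIMEOUT" entries. The helper names `decode_ata16` and `elapsed_seconds` are made up for this illustration and are not part of smartmontools. The byte offsets follow the SAT layout of the 16-byte CDB.

```python
# CDB and timestamps copied verbatim from the kernel log in this ticket.
CDB_HEX = "85 06 0c 00 d4 00 00 00 82 00 4f 00 c2 00 b0 00"
SEND_TS = 122.121623   # "[sda] Send:" entry
DONE_TS = 142.735081   # "[sda] Done:" entry (followed by TIMEOUT)


def decode_ata16(cdb_hex):
    """Extract the ATA taskfile fields from an ATA PASS-THROUGH (16) CDB."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not an ATA PASS-THROUGH (16) CDB")
    return {
        "protocol": (cdb[1] >> 1) & 0x0F,  # 3 = non-data
        "features": cdb[4],                # 0xD4 = SMART EXECUTE OFF-LINE IMMEDIATE
        "lba_low": cdb[8],                 # 0x82 = extended self-test, captive mode
        "lba_mid": cdb[10],                # 0x4F, first half of the SMART signature
        "lba_high": cdb[12],               # 0xC2, second half of the SMART signature
        "command": cdb[14],                # 0xB0 = SMART
    }


def elapsed_seconds(start, end):
    """Time between two kernel log timestamps, in seconds."""
    return end - start


fields = decode_ata16(CDB_HEX)
gap = elapsed_seconds(SEND_TS, DONE_TS)
```

The decoded fields match the `cmd b0/d4:00:82:4f:c2` line printed by libata, and the gap between the two timestamps is a little over 20 seconds.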
The command times out after 20 s. Instead, it should have a timeout of 10 + x minutes, so that the device can complete the command before the error handler kicks in. We know from the SMART data that the test will last 10 minutes.
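A minimal sketch of the "10 + x minutes" arithmetic, in Python. The function name `captive_timeout_ms` and the default margin of 2 minutes (the "x") are assumptions chosen for illustration. The result is in milliseconds because that is the unit of the `timeout` field in the Linux SG_IO header.

```python
def captive_timeout_ms(poll_minutes, margin_minutes=2):
    """Return a command timeout in milliseconds.

    poll_minutes:   test duration reported by the drive's SMART data
    margin_minutes: extra slack (the "x"); 2 is an arbitrary assumption
    """
    if poll_minutes < 0 or margin_minutes < 0:
        raise ValueError("durations must be non-negative")
    return (poll_minutes + margin_minutes) * 60 * 1000


# For the 10-minute extended test in this ticket:
timeout_ms = captive_timeout_ms(10)
```

With the 10-minute test from the smartctl output and the assumed 2-minute margin, this gives 720000 ms, compared with the 20000 ms that caused the timeout in the log.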
However, looking at the code, there currently seems to be no way to pass the desired timeout with an ATA pass-through command.
Change History (3)
comment:1 by , 11 years ago
Component: | all → smartctl |
---|---|
Owner: | changed from | to
Priority: | major → minor |
Status: | new → accepted |
comment:2 by , 11 years ago
Resolution: | → wontfix |
---|---|
Status: | accepted → closed |
I can use the offline mode, but I just wanted to point out that captive mode will not work if the test takes longer than 20 s.
From your explanation, I understand that this is too difficult to fix properly.
comment:3 by , 11 years ago
Thanks for the info. I will probably address this in a future release for Linux SG_IO and other frequently used I/O-controls which actually support extended timeouts.
ATA pass-through I/O-controls are platform and controller-specific. There is no portable way to set the command timeout. Even if an I/O-control supports this parameter, the implementation may ignore it or set a command-specific timeout itself.
Why do you need the captive mode?
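For the Linux SG_IO case mentioned in this comment, the per-command timeout is the `timeout` field (in milliseconds) of `struct sg_io_hdr` from `<scsi/sg.h>`. The Python sketch below mirrors that structure with `ctypes` and fills it for a non-data command. It is an illustration only, not how smartmontools implements this: `SgIoHdr`, `build_request`, and `send` are hypothetical names, and, as noted above, the driver may ignore or override the requested timeout.

```python
import ctypes
import fcntl

SG_IO = 0x2285       # ioctl request number; the same cmd=0x2285 seen in the log
SG_DXFER_NONE = -1   # no data phase (non-data ATA protocol)


class SgIoHdr(ctypes.Structure):
    """ctypes mirror of struct sg_io_hdr from <scsi/sg.h>."""
    _fields_ = [
        ("interface_id", ctypes.c_int),
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),
        ("cmdp", ctypes.c_void_p),
        ("sbp", ctypes.c_void_p),
        ("timeout", ctypes.c_uint),      # milliseconds
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),
    ]


def build_request(cdb, timeout_ms):
    """Fill an SG_IO header for a non-data command with a custom timeout.

    Returns the header plus the CDB and sense buffers; the caller must keep
    all three alive until the ioctl has returned.
    """
    cdb_buf = (ctypes.c_ubyte * len(cdb)).from_buffer_copy(cdb)
    sense = (ctypes.c_ubyte * 32)()
    hdr = SgIoHdr()
    hdr.interface_id = ord("S")
    hdr.dxfer_direction = SG_DXFER_NONE
    hdr.cmd_len = len(cdb)
    hdr.mx_sb_len = len(sense)
    hdr.cmdp = ctypes.addressof(cdb_buf)
    hdr.sbp = ctypes.addressof(sense)
    hdr.timeout = timeout_ms
    return hdr, cdb_buf, sense


def send(fd, hdr):
    """Issue the request on an open device file descriptor (needs privileges)."""
    fcntl.ioctl(fd, SG_IO, hdr)


# CDB from this ticket, with a 12-minute timeout instead of 20 seconds.
cdb = bytes.fromhex("85 06 0c 00 d4 00 00 00 82 00 4f 00 c2 00 b0 00")
hdr, cdb_buf, sense = build_request(cdb, 12 * 60 * 1000)
```

The sketch only builds the request. Calling `send` would start a captive self-test on a real drive, during which the drive does not service other commands, so it is deliberately left out of the example.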