#871 closed enhancement (fixed)
cciss: Add option to disable SAT auto detection
Reported by: | Stanislav Brabec | Owned by: | Christian Franke |
---|---|---|---|
Priority: | major | Milestone: | Release 7.0 |
Component: | all | Version: | 6.5 |
Keywords: | cciss freebsd linux | Cc: |
Description
Some newer HPSA devices reply to basic SAT commands and return INQUIRY data that contains "ATA ".
This makes the sat variable in sat_device::autodetect_open() become true, so
even if cciss is explicitly specified by
smartctl -d cciss,0 -H /dev/sda
smartctl switches to sat:
/dev/sda [cciss_disk_00] [SAT]: Device open changed type from 'sat,auto' to 'sat'
As a result, the command fails:
SMART STATUS RETURN: incomplete response, ATA output registers missing
REPORT-IOCTL: Device=/dev/sda Command=SMART STATUS CHECK returned -1 errno=38 [Function not implemented]
The attached patch disables the auto-switch to the "better" driver for cciss.
Note that I do not have a test report from the customer for that patch yet, but setting sat = 0 was already confirmed to prevent this bug.
Note that smart_interface::autodetect_sat_device() contains similar code, but I am not sure whether it needs a fix as well.
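The behavior described above can be sketched roughly as follows. This is an illustration in Python, not the smartmontools code: the function names and the allow_sat_switch parameter are hypothetical. It relies only on the standard INQUIRY layout, where the vendor identification occupies bytes 8..15 and SAT layers report "ATA" padded with spaces.

```python
# Sketch only: all names here are hypothetical, not smartmontools code.

SAT_VENDOR_ID = b"ATA".ljust(8)  # 8-byte vendor field reported by SAT layers


def inquiry_vendor(inquiry: bytes) -> bytes:
    """Return the vendor identification (bytes 8..15) of a standard
    SCSI INQUIRY response."""
    if len(inquiry) < 36:
        raise ValueError("standard INQUIRY data is at least 36 bytes")
    return inquiry[8:16]


def select_type(requested: str, inquiry: bytes,
                allow_sat_switch: bool = True) -> str:
    """Pick the device type after the INQUIRY has been read.

    With allow_sat_switch=True this mimics the reported behavior: an
    "ATA" vendor field switches the device to "sat" even when "cciss"
    was requested. With False, the requested type is kept.
    """
    if allow_sat_switch and inquiry_vendor(inquiry) == SAT_VENDOR_ID:
        return "sat"
    return requested


def make_inquiry(vendor: str, product: str, revision: str) -> bytes:
    """Build a minimal 36-byte INQUIRY response for experiments."""
    header = bytes(8)  # peripheral type, version, etc. left as zero
    return (header
            + vendor.ljust(8).encode("ascii")[:8]
            + product.ljust(16).encode("ascii")[:16]
            + revision.ljust(4).encode("ascii")[:4])
```

With an INQUIRY response whose vendor field is "ATA", select_type("cciss", ...) yields "sat" unless the switch is disabled, which is the effect the patch aims for.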
Attachments (2)
Change History (14)
comment:1 by , 7 years ago
Keywords: | cciss freebsd linux added |
---|---|
Milestone: | → undecided |
follow-up: 4 comment:2 by , 7 years ago
Yes, the drive is OK and works perfectly if connected directly. Also the HPSA array is OK.
The problem is caused by the new HPSA firmware that responds to SAT/SCSI inquiry. Although it responds to the inquiry, it does not respond to SMART STATUS CHECK, SMART ATA attribute, or SCSI temperature queries. To get these values, the CCISS passthrough protocol has to be used.
This firmware behavior caused more problems than this one. For example: https://www.smartmontools.org/ticket/817
My intention was a fix that does the following: once -d cciss is specified, never fall back to the SAT/SCSI protocol. Only CCISS passthrough should be used.
Notes:
- The device behind the HPSA array can still be SAS or SATA, so the code has to pick a correct CCISS passthrough protocol.
- cciss auto detection is not implemented yet https://www.smartmontools.org/ticket/345
I just got a reply from the customer. The attached patch does not work; it still switches to sat later, generating the same error. I will post a new patch once it is confirmed.
I will try reverting the referenced patches and let you know the result.
by , 7 years ago
Attachment: | smartmontools-cciss-not-sat.patch added |
---|
New version of the patch. Confirmed to fix the issue.
comment:3 by , 7 years ago
The new version of the patch disables the inquiry-based switch from cciss to sat. The customer confirmed that it fixes the problem.
The customer also confirmed that reverting r3564 and r3565 fixes the problem as well.
As I do not have full insight into the code, some things remain uncertain:
- Is it correct to call hide_scsi() for cciss devices?
- Should autodetect_sat_device() be modified in the same way?
follow-up: 5 comment:4 by , 7 years ago
Replying to sbrabec:
- The device behind the HPSA array can still be SAS or SATA, so the code has to pick a correct CCISS passthrough protocol.
It already does. If SATA is detected, SAT ATA_PASS_THROUGH commands are issued via CCISS passthrough protocol to address the SAT layer in CCISS driver or firmware.
New version of the patch. Confirmed to fix the issue.
Sorry, no. Disabling -d sat,auto for CCISS in the generic SAT code after it has been added in CCISS specific code does not make much sense. The correct way is to undo the latter (r3564, r3565).
There are three alternatives:
- Convince the customer that the "incomplete response, ATA output registers missing" message reports a driver/firmware limitation and not a disk problem.
- Undo r3564 and r3565 and require all other smartmontools users relying on this 5+ year old behavior to change -d cciss,N to -d sat,auto+cciss,N in all monitoring scripts and smartd.conf files.
- Add a new -d noauto[+TYPE] prefix which disables any controller/platform specific auto-detection. Then your customer could change -d cciss,N to -d noauto+cciss,N. The customer will possibly realize then that the smartctl output has limited value for SATA drives. The SAT layer typically translates very limited diagnostic info (temperature, health status) to the SCSI/SAS view of the drive. Other interesting parts are no longer visible then.
follow-up: 7 comment:5 by , 7 years ago
There are three alternatives:
- Convince the customer that the "incomplete response, ATA output registers missing" message reports a driver/firmware limitation and not a disk problem.
In case of -d sat I would agree. If this happens with -d cciss, then I will not agree. If -d cciss is used, then the user explicitly requests the CCISS-pass-through protocol. smartctl should never switch back to sat.
Additionally, one work-around was already added for failing temperature reading after switching to sat from -d cciss.
Note that -d sat,auto+cciss,N will not work with these modern HPSA devices, as it will behave exactly as -d sat.
- Add a new -d noauto[+TYPE] prefix which disables any controller/platform specific auto-detection. Then your customer could change -d cciss,N to -d noauto+cciss,N. The customer will possibly realize then that the smartctl output has limited value for SATA drives. The SAT layer typically translates very limited diagnostic info (temperature, health status) to the SCSI/SAS view of the drive. Other interesting parts are no longer visible then.
Then -d cciss would be usable only for the legacy CCISS and HPSA devices, not the new ones that respond to SAT inquiry.
I have two more ideas:
- Do an extended inquiry check. For example: if the inquiry ID is ATA EK000400GWEPE and the version is HPG0, then never use sat.
- In CCISS/auto mode, try the sat command. If it fails, try CCISS-pass-through.
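The two ideas above could be combined along the following lines. This is a rough Python sketch under stated assumptions, not a proposal for the actual C++ code: NO_SAT_MODELS, sat_denied() and read_health() are hypothetical names, the deny-list holds only the example model from this comment, and the SAT and CCISS operations are passed in as plain callables so the control flow can be tried without hardware.

```python
from typing import Callable, Set, Tuple

# Hypothetical deny-list of (product, revision) pairs. The single entry is
# the example from the comment above, not a verified list of devices.
NO_SAT_MODELS: Set[Tuple[str, str]] = {("EK000400GWEPE", "HPG0")}


def sat_denied(product: str, revision: str) -> bool:
    """Idea 1: extended inquiry check against known problem models."""
    return (product.strip(), revision.strip()) in NO_SAT_MODELS


def read_health(product: str, revision: str,
                try_sat: Callable[[], str],
                try_cciss: Callable[[], str]) -> Tuple[str, str]:
    """Idea 2: try SAT first, fall back to CCISS pass-through on failure.

    Returns (path_used, result). try_sat is expected to raise OSError
    when the SAT layer does not implement the command (for example
    errno 38, "Function not implemented").
    """
    if not sat_denied(product, revision):
        try:
            return "sat", try_sat()
        except OSError:
            pass  # SAT layer could not answer; fall through to CCISS
    return "cciss", try_cciss()
```

The deny-list is checked first so that a known problem model never receives the SAT command at all. The fallback covers devices that are not on the list yet.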
by , 7 years ago
Attachment: | scsiata-scsi_only.patch added |
---|
Patch adds '-d scsi+TYPE' prefix to disable auto-detection of TYPE
follow-up: 8 comment:6 by , 7 years ago
With the attached patch, smartctl -d scsi+cciss,0 ... should disable SAT auto-detection. Please test if possible.
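To make the intended semantics of the prefix concrete, here is a minimal Python sketch of how a type string such as cciss,0 or scsi+cciss,0 might be interpreted. It is not the parser from the patch: DeviceType and parse_device_type() are hypothetical names, and only the scsi+ prefix and the cciss,N form mentioned in this ticket are handled.

```python
from typing import NamedTuple, Optional


class DeviceType(NamedTuple):
    sat_autodetect: bool   # may the tool switch to SAT after the INQUIRY?
    controller: str        # e.g. "cciss"
    disk: Optional[int]    # e.g. 0 for "cciss,0", None if no number given


def parse_device_type(arg: str) -> DeviceType:
    """Parse 'TYPE[,N]' or 'scsi+TYPE[,N]' (hypothetical parser).

    A leading 'scsi+' is taken to mean: treat the device as plain SCSI
    and do not auto-detect SAT behind it.
    """
    sat_autodetect = True
    prefix = "scsi+"
    if arg.startswith(prefix):
        sat_autodetect = False
        arg = arg[len(prefix):]
    controller, sep, number = arg.partition(",")
    if not controller:
        raise ValueError("missing device type")
    disk = int(number) if sep else None
    return DeviceType(sat_autodetect, controller, disk)
```

Under these assumptions, -d cciss,0 keeps SAT auto-detection enabled while -d scsi+cciss,0 turns it off for the same controller and disk.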
comment:7 by , 7 years ago
Replying to comment 5:
In case of -d sat I would agree. If this happens with -d cciss, then I will not agree. If -d cciss is used, then user explicitly requests CCISS-pass-through protocol. smartctl should never switch back to sat.
It doesn't switch back to SAT via SG_IO protocol. It still sends SCSI (in particular SAT) commands via CCISS-pass-through protocol.
comment:8 by , 7 years ago
Replying to chrfranke:
With the attached patch, smartctl -d scsi+cciss,0 ... should disable SAT auto-detection. Please test if possible.
Thanks for the patch. I made a test package and sent it to the customer with the affected hardware.
comment:9 by , 7 years ago
The customer just confirmed that your patch scsiata-scsi_only.patch works perfectly on the customer's hardware with -d scsi+cciss,0. Thanks.
comment:10 by , 7 years ago
Milestone: | undecided → Release 6.7 |
---|---|
Owner: | set to |
Status: | new → accepted |
Summary: | [PATCH] cciss: Never switch cciss device back to sat → cciss: Add option to disable SAT auto detection |
Type: | defect → enhancement |
SAT auto detection for '-d cciss' was added 5+ years ago as suggested by Don Brace, see ticket #202.
This message does not indicate disk problems. It is the usual result from buggy/incomplete SAT layers which do not properly return ATA output registers in SCSI sense data (ATA Return Descriptor).
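For readers unfamiliar with where the ATA output registers come from, the following Python sketch shows the general idea: locate the ATA Return Descriptor (descriptor code 0x09) in descriptor-format sense data and read the LBA mid/high registers that encode the SMART RETURN STATUS result. It is an illustration only, not the smartmontools implementation; the function names and the returned strings are assumptions, and make_sense() merely builds test input.

```python
from typing import Optional

ATA_RETURN_DESCRIPTOR = 0x09  # descriptor code of the ATA Return Descriptor


def find_ata_return_descriptor(sense: bytes) -> Optional[bytes]:
    """Return the ATA Return Descriptor from descriptor-format sense
    data (response code 0x72/0x73), or None if it is absent."""
    if len(sense) < 8 or (sense[0] & 0x7F) not in (0x72, 0x73):
        return None  # not descriptor-format sense data
    end = min(len(sense), 8 + sense[7])  # byte 7: additional sense length
    pos = 8
    while pos + 2 <= end:
        code, length = sense[pos], sense[pos + 1]
        desc = sense[pos:min(end, pos + 2 + length)]
        if (code == ATA_RETURN_DESCRIPTOR and length >= 12
                and len(desc) == 2 + length):
            return desc
        pos += 2 + length
    return None


def smart_status(sense: bytes) -> str:
    """Interpret SMART RETURN STATUS from the returned ATA registers."""
    desc = find_ata_return_descriptor(sense)
    if desc is None:
        # What a buggy/incomplete SAT layer produces: no registers at all.
        return "incomplete response, ATA output registers missing"
    lba_mid, lba_high = desc[9], desc[11]
    if (lba_mid, lba_high) == (0x4F, 0xC2):
        return "PASSED"
    if (lba_mid, lba_high) == (0xF4, 0x2C):
        return "FAILED"
    return "unexpected register values"


def make_sense(lba_mid: int, lba_high: int) -> bytes:
    """Build descriptor-format sense data with one ATA Return Descriptor."""
    header = bytes([0x72, 0x01, 0x00, 0x1D, 0, 0, 0, 14])
    desc = bytearray(14)
    desc[0], desc[1] = ATA_RETURN_DESCRIPTOR, 0x0C
    desc[9], desc[11] = lba_mid, lba_high
    return header + bytes(desc)
```

When the SAT layer omits the descriptor, no register values are available to interpret, so the health status cannot be determined even though the disk itself may be fine.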
The attached patch probably does not work. It only changes the info texts. It does not change the actual ATA/SCSI interface selection.
To disable implicit SAT auto detection for -d cciss, simply revert the get_sat_device("sat,auto", ...) additions from r3564 and r3565.