#939 closed defect (fixed)
drivedb correction: Innolite Satadom D150QV-L
Reported by: | Stoat | Owned by: | |
---|---|---|---|
Priority: | minor | Milestone: | Release 7.0 |
Component: | drivedb | Version: | 6.6 |
Keywords: | Cc: |
Description
After reading the full datasheet (pdf available on request) I've come up with the following changes:
    # diff -u /usr/local/share/smartmontools/drivedb.h /usr/local/share/smartmontools/drivedb.h.new
    --- /usr/local/share/smartmontools/drivedb.h  2017-11-14 02:36:32.359088000 +0000
    +++ /usr/local/share/smartmontools/drivedb.h.new  2017-11-14 02:33:58.020915000 +0000
    @@ -736,7 +736,7 @@
         "-v 241,raw48,Host_Writes"
       },
       { "InnoDisk InnoLite SATADOM D150QV-L SSDs", // tested with InnoLite SATADOM D150QV-L/120319
    -    "InnoLite SATADOM D150QV-L",
    +    "InnoLite SATADOM D150QV",
         "", "",
       //"-v 1,raw48,Raw_Read_Error_Rate "
       //"-v 2,raw48,Throughput_Performance "
    @@ -744,18 +744,18 @@
       //"-v 5,raw16(raw16),Reallocated_Sector_Ct "
       //"-v 7,raw48,Seek_Error_Rate " // from InnoDisk iSMART Linux tool, useless for SSD
       //"-v 8,raw48,Seek_Time_Performance "
    -  //"-v 9,raw24(raw8),Power_On_Hours "
    +  //"-v 9,raw48,Power_On_Hours "
       //"-v 10,raw48,Spin_Retry_Count "
       //"-v 12,raw48,Power_Cycle_Count "
         "-v 168,raw48,SATA_PHY_Error_Count "
    -    "-v 170,raw48,Bad_Block_Count "
    -    "-v 173,raw48,Erase_Count "
    +    "-v 170,raw16,Bad_Block_Count_New/Tot "
    +    "-v 173,raw16,Erase_Count_Max/Avg "
         "-v 175,raw48,Bad_Cluster_Table_Count "
         "-v 192,raw48,Unexpect_Power_Loss_Ct "
       //"-v 194,tempminmax,Temperature_Celsius "
       //"-v 197,raw48,Current_Pending_Sector "
         "-v 229,hex48,Flash_ID "
    -    "-v 235,raw48,Later_Bad_Block "
    +    "-v 235,raw16,Lat_Bad_Blk_Era/Wri/Rea "
         "-v 236,raw48,Unstable_Power_Count "
         "-v 240,raw48,Write_Head"
       },
Notes:
This device is labelled on the outside as a D150QV-L, but reports itself as a D150QV, which is the family name according to the spec sheet. Suffixes indicate power and temperature variants.
"235 - Later bad block": this is the count of bad blocks detected after leaving the factory, split by the operation (erase/write/read) in which they tested faulty. I'm not quite sure which order the write and read counts are in, as this was extracted from the iSMART documentation, but it appears to be correct.
"170 - bad block count" lists the later bad blocks and the total including bad blocks from the factory. The actual format is 0x64 0x64 0x00 0x00 [Total lsb msb] [later lsb msb] (raw16 gives a bogus trailing zero).
"173 - erase count" has the max/avg order reversed from the raw16(avg16) format. raw16 gives a bogus leading zero.
Attributes 9, 12, 168, 175 and 192 are set at raw48, but only the bottom 2 bytes are actually used (0x6464 lsb msb 0000 0000).
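The word layout described in these notes can be checked against the raw values in the stock smartctl output further down this ticket. The following is only a sketch: `raw16_words` is a hypothetical helper name, and it assumes that raw16 prints the 48-bit raw value as three 16-bit words, most significant first.

```python
def raw16_words(raw48):
    """Split a 48-bit SMART raw value into three 16-bit words,
    most significant word first (the order raw16 appears to print them)."""
    if not 0 <= raw48 < (1 << 48):
        raise ValueError("raw value must fit in 48 bits")
    return ((raw48 >> 32) & 0xFFFF, (raw48 >> 16) & 0xFFFF, raw48 & 0xFFFF)


# Raw values copied from the stock (raw48) smartctl output in this ticket.
new_bad, total_bad, _ = raw16_words(373674344448)   # attribute 170
_, erase_max, erase_avg = raw16_words(305992140)    # attribute 173
print(new_bad, total_bad, erase_max, erase_avg)     # 87 186 4669 4556
```

The discarded words are the bogus trailing zero on 170 and the bogus leading zero on 173.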
Bogons:
Attribute 01 is fixed at 0x64 0x64 0xff 0xff 0xff 0x00 0x00 0x00.
Attributes 02, 03, 05, 07, 08, 10, 197 and 240 are all fixed at zero.
Attribute 194 has the usual temperature format, but byte 7 (the temperature) is never reported.
As such these could (and probably should!) all be filtered, particularly the reallocated and pending sector counts, as these will never move away from zero and that could be highly misleading to the casual observer (it certainly fooled our vendor!)
Comment:
I hope these help a bit. Trying to figure out what the "unknown attributes" really were was a bit of an adventure.
These devices are far more fragile than their 3000-cycle write endurance indicates. They're optimised as read-only industrial controller drives (i.e. don't write logs back to them and don't RAID1 them) but have been widely deployed as RAID1 boot pairs in production NASes (certified by Nexenta, Open-E and iXsystems amongst other vendors) - where they break after a couple of years.
Thankfully they appear to be out of production, but even when they were sold there were higher-spec devices made by Innodisk - however getting hold of those devices was difficult as distributors took the attitude that "all models are the same".
Attachments (2)
Change History (10)
comment:1 by , 7 years ago
Component: | all → drivedb |
---|---|
Milestone: | → Release 6.7 |
comment:2 by , 7 years ago
comment:3 by , 7 years ago
The fixed structures are codified in the extended datasheet (pages attached), which is over 6 years old in its last revision.
The chance of a firmware update occurring is vanishingly low. The devices are long gone from Innodisk's catalog _and_ support pages.
I've attached the white paper which allowed deduction of the Later Bad Block structure too. (image on page 4)
by , 7 years ago
Attachment: | innodisk_error_correction_detection_and_bad_block_management_white_paper_ver1 0_te.pdf added |
---|
Innodisk bad block management white paper.
comment:5 by , 7 years ago
Stock smartctl on FreeBSD 10.3
    # smartctl -x /dev/ada0
    smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Device Model:     InnoLite SATADOM D150QV
    Serial Number:    20141231AA1005224296
    Firmware Version: 120319
    User Capacity:    32,017,047,552 bytes [32.0 GB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    Solid State Device
    Form Factor:      2.5 inches
    Device is:        Not in smartctl database [for details use: -P showall]
    ATA Version is:   ATA8-ACS (minor revision not indicated)
    SATA Version is:  SATA 2.6, 3.0 Gb/s
    Local Time is:    Tue Nov 14 18:30:41 2017 GMT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    AAM feature is:   Unavailable
    APM feature is:   Disabled
    Rd look-ahead is: Enabled
    Write cache is:   Enabled
    ATA Security is:  Unavailable
    Wt Cache Reorder: Unavailable

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    General SMART Values:
    Offline data collection status: (0x00) Offline data collection activity was never started.
      Auto Offline Data Collection: Disabled.
    Total time to complete Offline data collection: ( 30) seconds.
    Offline data collection capabilities: (0x00) Offline data collection not supported.
    SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
      Supports SMART auto save timer.
    Error logging capability: (0x00) Error logging NOT supported.
      No General Purpose Logging support.

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate     PO-R--   100   100   050    -    16777215
      2 Throughput_Performance  P-S---   100   100   050    -    0
      3 Spin_Up_Time            POS---   100   100   050    -    0
      5 Reallocated_Sector_Ct   PO--C-   100   100   050    -    0
      7 Unknown_SSD_Attribute   PO-R--   100   100   050    -    0
      8 Unknown_SSD_Attribute   P-S---   100   100   050    -    0
      9 Power_On_Hours          -O--C-   100   100   000    -    18843
     10 Unknown_SSD_Attribute   PO--C-   100   100   050    -    0
     12 Power_Cycle_Count       -O--C-   100   100   000    -    122
    168 Unknown_Attribute       -O--C-   100   100   000    -    0
    175 Program_Fail_Count_Chip PO----   100   100   010    -    0
    192 Power-Off_Retract_Count -O--C-   100   100   000    -    0
    194 Temperature_Celsius     -O---K   000   100   000    -    0 (Min/Max 0/100)
    197 Current_Pending_Sector  -O--C-   100   100   000    -    0
    240 Unknown_SSD_Attribute   PO--C-   100   100   050    -    0
    170 Unknown_Attribute       PO----   100   100   010    -    373674344448
    173 Unknown_Attribute       -O--C-   100   100   000    -    305992140
    229 Unknown_Attribute       -O----   100   100   000    -    727108061228
    236 Unknown_Attribute       -O----   100   100   000    -    0
    235 Unknown_Attribute       -O----   100   000   000    -    5701719
                                ||||||_ K auto-keep
                                |||||__ C event count
                                ||||___ R error rate
                                |||____ S speed/performance
                                ||_____ O updated online
                                |______ P prefailure warning

    Read SMART Log Directory failed: Input/output error
    General Purpose Log Directory not supported
    SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
    SMART Error Log not supported
    SMART Extended Self-test Log (GP Log 0x07) not supported
    SMART Self-test Log not supported
    Selective Self-tests/Logging not supported
    SCT Commands not supported
    Device Statistics (GP/SMART Log 0x04) not supported
    SATA Phy Event Counters (GP Log 0x11) not supported
With the updated drivedb.h
    # smartctl -B /usr/local/share/smartmontools/drivedb.h.new -x /dev/ada0
    smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Model Family:     InnoDisk InnoLite SATADOM D150QV-L SSDs
    Device Model:     InnoLite SATADOM D150QV
    Serial Number:    20141231AA1005224296
    Firmware Version: 120319
    User Capacity:    32,017,047,552 bytes [32.0 GB]
    Sector Size:      512 bytes logical/physical
    Rotation Rate:    Solid State Device
    Form Factor:      2.5 inches
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   ATA8-ACS (minor revision not indicated)
    SATA Version is:  SATA 2.6, 3.0 Gb/s
    Local Time is:    Tue Nov 14 18:41:50 2017 GMT
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled
    AAM feature is:   Unavailable
    APM feature is:   Disabled
    Rd look-ahead is: Enabled
    Write cache is:   Enabled
    ATA Security is:  Unavailable
    Wt Cache Reorder: Unavailable

    === START OF READ SMART DATA SECTION ===
    SMART overall-health self-assessment test result: PASSED

    General SMART Values:
    Offline data collection status: (0x00) Offline data collection activity was never started.
      Auto Offline Data Collection: Disabled.
    Total time to complete Offline data collection: ( 30) seconds.
    Offline data collection capabilities: (0x00) Offline data collection not supported.
    SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
      Supports SMART auto save timer.
    Error logging capability: (0x00) Error logging NOT supported.
      No General Purpose Logging support.

    SMART Attributes Data Structure revision number: 16
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate     PO-R--   100   100   050    -    16777215
      2 Throughput_Performance  P-S---   100   100   050    -    0
      3 Spin_Up_Time            POS---   100   100   050    -    0
      5 Reallocated_Sector_Ct   PO--C-   100   100   050    -    0
      7 Unknown_SSD_Attribute   PO-R--   100   100   050    -    0
      8 Unknown_SSD_Attribute   P-S---   100   100   050    -    0
      9 Power_On_Hours          -O--C-   100   100   000    -    18843
     10 Unknown_SSD_Attribute   PO--C-   100   100   050    -    0
     12 Power_Cycle_Count       -O--C-   100   100   000    -    122
    168 SATA_PHY_Error_Count    -O--C-   100   100   000    -    0
    175 Bad_Cluster_Table_Count PO----   100   100   010    -    0
    192 Unexpect_Power_Loss_Ct  -O--C-   100   100   000    -    0
    194 Temperature_Celsius     -O---K   000   100   000    -    0 (Min/Max 0/100)
    197 Current_Pending_Sector  -O--C-   100   100   000    -    0
    240 Write_Head              PO--C-   100   100   050    -    0
    170 Bad_Block_Count_New/Tot PO----   100   100   010    -    87 186 0
    173 Erase_Count_Max/Avg     -O--C-   100   100   000    -    0 4669 4556
    229 Flash_ID                -O----   100   100   000    -    0x00a94b04882c
    236 Unstable_Power_Count    -O----   100   100   000    -    0
    235 Lat_Bad_Blk_Era/Wri/Rea -O----   100   000   000    -    0 87 87
                                ||||||_ K auto-keep
                                |||||__ C event count
                                ||||___ R error rate
                                |||____ S speed/performance
                                ||_____ O updated online
                                |______ P prefailure warning

    Read SMART Log Directory failed: Input/output error
    General Purpose Log Directory not supported
    SMART Extended Comprehensive Error Log (GP Log 0x03) not supported
    SMART Error Log not supported
    SMART Extended Self-test Log (GP Log 0x07) not supported
    SMART Self-test Log not supported
    Selective Self-tests/Logging not supported
    SCT Commands not supported
    Device Statistics (GP/SMART Log 0x04) not supported
    SATA Phy Event Counters (GP Log 0x11) not supported
Attribute 170 has a trailing bogon 0 and 173 has a leading bogon 0 in the "raw value" column.
Both use raw16, as I couldn't see a better way to get the numbers out in a readable format whilst also suppressing the 0.
I've got a few of these things in 32/64GB, with between 3 and 7 years of power-on time. smartctl gives the same results on Linux (RHEL and Ubuntu).
They really are quite crufty devices with a _very_ low sequential write speed (18/20/40/40MB/s for 8/16/32/64GB) and an endurance of 3000 cycles.
Advertising sheets quote them at 110MB/s, but that's the read speed only.
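To put those figures in context, here is a rough back-of-envelope sketch. It assumes 1 GB = 1000 MB and that the average erase count from attribute 173 is directly comparable to the rated 3000 cycles; neither assumption is confirmed by the datasheet excerpts in this ticket, and the function names are made up for illustration.

```python
RATED_CYCLES = 3000  # endurance figure quoted in this ticket


def wear_ratio(avg_erase_count, rated_cycles=RATED_CYCLES):
    """Fraction of the rated erase cycles consumed (assumes the
    attribute 173 average is comparable to the rated figure)."""
    return avg_erase_count / rated_cycles


def full_write_seconds(capacity_gb, write_mb_s):
    """Time for one sequential pass over the device, taking 1 GB = 1000 MB."""
    return capacity_gb * 1000 / write_mb_s


# 32GB unit from this ticket: average erase count 4556, 40MB/s sequential write.
print(f"{wear_ratio(4556):.2f}")      # 1.52
print(full_write_seconds(32, 40))     # 800.0
```

Under those assumptions the 32GB unit shown above is already well past its rated cycle count.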
One of our vendors has commented that they tend to simply stop working with little-to-no warning - but none of the vendor appliances have been monitoring the bad block parameters, on the assumption that the reallocated sector count was valid. That's why I recommended filtering these returns.
What we saw was a dramatic slowdown in write speed, to something less than 1MB/s, when one of the 64GB units hit 256 bad blocks. None of the units has changed the 0x64 "value" field for either bad block attribute no matter how many blocks have failed, and I'm going to push one through a few write cycles to see if they ever do (I suspect not).
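Since the normalised "value" field never moves, any monitoring has to look at the raw counts instead. A minimal sketch of that idea follows, assuming the attribute table is in the `-x` layout shown above and uses the raw16 entry for attribute 170. The 256-block limit comes from the single 64GB unit mentioned here, treating it as the total count is an assumption, and the 75% margin and function names are arbitrary illustrations.

```python
def bad_blocks(smartctl_text):
    """Return (new, total) bad block counts from the attribute 170 row,
    or None if no row in the expected raw16 layout is found."""
    for line in smartctl_text.splitlines():
        fields = line.split()
        # ID NAME FLAGS VALUE WORST THRESH FAIL + three raw16 words = 10 fields
        if len(fields) == 10 and fields[0] == "170":
            return int(fields[7]), int(fields[8])
    return None


def needs_attention(total_bad, limit=256, margin=0.75):
    """Flag a unit whose bad block total is approaching the assumed limit."""
    return total_bad >= limit * margin


row = "170 Bad_Block_Count_New/Tot PO---- 100 100 010 - 87 186 0"
new_bad, total_bad = bad_blocks(row)
print(new_bad, total_bad, needs_attention(total_bad))  # 87 186 False
```

A row printed with the stock database (a single raw48 number) has fewer fields and is deliberately ignored.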
Thank you for the update and clarification. I will merge your changes into the drivedb. However, I would prefer not to filter out values, as there is a [small] chance they will be fixed in the next firmware release.