2010-11-26, updated 2010-11-30: The information has been published on [http://www.heise.de/newsticker/meldung/SMART-Tool-beschaedigt-Daten-auf-Samsung-Festplatte-Update-1143120.html heise online news] (German).

----

2010-11-30: We were able to reproduce the problem.

Tested on an Intel-based system with a P35 chipset under Linux ([http://grml.org/ grml] 2010.04 Live CD). NCQ and the disk write cache are enabled.

{{{
# uname -a
Linux grml.somewhere 2.6.33-grml #1 SMP PREEMPT Fri Apr 2 10:16:25 UTC 2010 i686 GNU/Linux

# smartctl -i -q noserial /dev/sda
smartctl 5.40 2010-10-16 r3189 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG HD204UI
Firmware Version: 1AQ10001
User Capacity:    2,000,398,934,016 bytes
...

# cat /sys/block/sda/device/queue_depth
31

# hdparm -W /dev/sda
/dev/sda:
 write-caching =  1 (on)
}}}

First, run one of these commands in another terminal window:
{{{
# watch -n 1 smartctl -i /dev/sda
}}}
or:
{{{
# watch -n 1 hdparm -I /dev/sda
}}}

With one of the above commands running concurrently, the problem can be reproduced as follows:
{{{
# dd if=/dev/zero of=/dev/sda count=1000000
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 12.7394 s, 40.2 MB/s

# badblocks -vw -b 512 -t 0x55 /dev/sda 1000000
Checking for bad blocks in read-write mode
From block 0 to 1000000
Testing with pattern 0x55: done
Reading and comparing: 36608
...
36671
107200
...
107263
169984
...
170047
245824
...
245887
321216
...
343615
606336
...
606399
875520
...
875583
done
Pass completed, 256 bad blocks found.

# od -A x -x -N 1000000b /dev/sda
000000 5555 5555 5555 5555 5555 5555 5555 5555
*
1180000 0000 0000 0000 0000 0000 0000 0000 0000
*
1188000 5555 5555 5555 5555 5555 5555 5555 5555
*
a7c0000 0000 0000 0000 0000 0000 0000 0000 0000
*
a7c8000 5555 5555 5555 5555 5555 5555 5555 5555
*
12810000 0000 0000 0000 0000 0000 0000 0000 0000
*
12818000 5555 5555 5555 5555 5555 5555 5555 5555
*
1ab80000 0000 0000 0000 0000 0000 0000 0000 0000
*
1ab88000 5555 5555 5555 5555 5555 5555 5555 5555
*
1e848000
}}}

The above suggests that the disk sometimes discards a pending 64-sector write command when an IDENTIFY DEVICE command is received. This data loss occurs silently: there is no error message in the kernel log, the SMART Error log, the NCQ Command Error log page, or the SATA Phy Event Counters log page.
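
The 64-sector figure can be checked against the `od` listing with a little arithmetic. The following is a minimal sketch, not part of the original test: the region offsets are copied by hand from the `od` output above, and the 512-byte sector size is assumed from the `-b 512` option passed to `badblocks`. The function name `zeroed_regions` is made up for this illustration.

```shell
#!/bin/sh
# Zeroed regions as start:end byte offsets (hex), copied from the od output above.
REGIONS="1180000:1188000 a7c0000:a7c8000 12810000:12818000 1ab80000:1ab88000"
SECTOR_SIZE=512   # assumption: 512-byte sectors, as in "badblocks -b 512"

# Convert each zeroed byte range into a sector range and sum the lengths.
zeroed_regions() {
    total=0
    for r in $REGIONS; do
        start=$((0x${r%%:*}))   # text before the colon, read as hex
        end=$((0x${r##*:}))     # text after the colon, read as hex
        count=$(( ($end - $start) / $SECTOR_SIZE ))
        first=$(( $start / $SECTOR_SIZE ))
        echo "sectors $first-$(( $first + $count - 1 )): $count sectors"
        total=$(( $total + $count ))
    done
    echo "total: $total sectors"
}

zeroed_regions
```

Each region comes out as exactly 64 sectors, and the four regions add up to 256 sectors, which agrees with the "256 bad blocks found" line reported by `badblocks`. The last two ranges (606336-606399 and 875520-875583) also appear in the `badblocks` listing.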

The problem could '''not''' be reproduced with the above test if any of the following conditions is met:

* The disk write cache is disabled.

* NCQ is disabled. This may not hold in general, since the c't lab also reported problems with NCQ disabled.

* A modified test version of smartctl that does not issue IDENTIFY DEVICE commands is used. In that case, all other SMART and non-SMART commands used by smartctl work without any data loss.
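
The first two conditions can be set from a Linux shell. The sketch below only prints the commands instead of running them, since they need root and change device behaviour. The function name `print_mitigations` and the device argument are made up for this illustration. It assumes a libata-managed SATA disk, where a queue depth of 1 effectively turns NCQ off. These settings describe the test conditions above and are not a confirmed fix; both may be lost after a reboot or power cycle.

```shell
#!/bin/sh
# Print (do not execute) commands that set up the first two conditions above.
# The device path is a placeholder; review the output before running it as root.
print_mitigations() {
    dev=$1             # e.g. /dev/sda
    name=${dev##*/}    # strip the directory part: sda
    # hdparm -W0 switches the drive's write cache off.
    echo "hdparm -W0 $dev"
    # Assumption: under libata, a queue depth of 1 effectively disables NCQ.
    echo "echo 1 > /sys/block/$name/device/queue_depth"
}

print_mitigations /dev/sda
```

Afterwards, `hdparm -W /dev/sda` and `cat /sys/block/sda/device/queue_depth` (both used in the test setup above) show whether the settings took effect.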

Christian Franke