Opened 13 years ago

Last modified 10 years ago

#215 new enhancement

Read xerror and append new error(s) to log file

Reported by: John Peterson Owned by: somebody
Priority: minor Milestone: undecided
Component: all Version: 5.42
Keywords: Cc:

Description

The Bad block HOWTO for smartmontools shows how to locate a pending sector and write zeros to it.

I would appreciate an automation of this task.

This would be especially helpful for an NTFS disk, as the example is for another file system.

Thanks!

Change History (8)

comment:1 by Christian Franke, 13 years ago

Priority: major → minor

There are other tools available which can do a full read scan and overwrite the bad blocks then:

A full read scan is needed to get the logical block addresses of all bad sectors. ATA/SATA devices do not provide a command to read the list of all addresses of pending sectors. Self-test and error log list only a subset of these sectors. The old SMART logs do not support 48-bit LBA addresses.

To detect the affected files on NTFS, you could use

comment:2 by John Peterson, 13 years ago

Summary: Automate pending sector zero write → Read and append new xerror

OK, thanks. That's too bad! Since unstable sectors are serious, I'm surprised my disk is designed to save only the last 24 errors. I would have saved many more than that, since each error is only around 600 bytes including the call stack.

My first inclination is obviously to reread only the marked sectors, but since that's not possible, I see my best option as running

ddrescue --force /dev/sda /dev/null

from Cygwin, setting ddrescue.exe to low I/O priority to disturb other disk operations as little as possible. It's a 2 TB USB 2.0 drive, so a full read will take 77 hrs assuming 10 MB/s (USB max throughput is 20 MB/s, but file sharing tasks continuously use the disk).

Since the disk doesn't maintain a list of all marked errors, I'm changing my enhancement request. Can you provide a script that runs smartctl --log xerror and appends only new errors to a specified log file? I can then schedule this script in Task Scheduler so that I can maintain a complete log of marked sectors. Preferably it would be accompanied by a script that reads the log and performs a read of the marked sectors with dd.
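A minimal sketch of the requested append-only-new-errors step, in Python. It assumes each entry in the smartctl --log xerror output starts with a header line of the form "Error N [M] occurred at disk power-on lifetime: H hours"; that format, and the function names, are assumptions for illustration and not part of smartmontools.

```python
import re

# Assumed header of one entry in `smartctl --log xerror` output.
HEADER = re.compile(
    r"^Error (\d+) \[\d+\] occurred at disk power-on lifetime: (\d+) hours"
)


def split_errors(text):
    """Map (error number, power-on hours) to the text of each error entry."""
    entries = {}
    key, lines = None, []
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            if key is not None:
                entries[key] = "\n".join(lines).rstrip()
            key = (int(match.group(1)), int(match.group(2)))
            lines = [line]
        elif key is not None:
            lines.append(line)
    if key is not None:
        entries[key] = "\n".join(lines).rstrip()
    return entries


def new_errors(smartctl_output, existing_log):
    """Return entries present in smartctl_output but not in existing_log."""
    seen = split_errors(existing_log)
    fresh = split_errors(smartctl_output)
    return [fresh[k] for k in sorted(fresh) if k not in seen]


def append_new(smartctl_output, log_path):
    """Append unseen entries to log_path and return how many were added."""
    try:
        with open(log_path) as handle:
            existing = handle.read()
    except FileNotFoundError:
        existing = ""
    blocks = new_errors(smartctl_output, existing)
    with open(log_path, "a") as handle:
        for block in blocks:
            handle.write(block + "\n\n")
    return len(blocks)
```

A scheduled job could capture the smartctl output (for example with subprocess.run and capture_output=True) and pass it to append_new together with the log path. Entries are keyed on error number plus power-on hours, so rerunning the job on unchanged output adds nothing.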

I also have a separate question: is there ever, or always, overlap between Current_Pending_Sector and Offline_Uncorrectable? I have 32 pending and 29 uncorrectable sectors; do I have 61, 32, or another number of marked sectors? Here's my smartctl --xall output: http://pastebin.com/iqKFPKSM.

Thanks!

comment:3 by John Peterson, 13 years ago

Or rather, since I have LBA_of_first_error 3575199272 I should use

ddrescue --force --input-position=1830502027264 /dev/sdb /dev/null

to skip the first 1.8 TB.
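The byte offset used above follows from multiplying the LBA by the sector size. A small Python sketch of that arithmetic, assuming 512-byte logical sectors (the helper name is only illustrative):

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def lba_to_bytes(lba, sector_size=SECTOR_SIZE):
    """Convert a logical block address to a byte offset."""
    return lba * sector_size


offset = lba_to_bytes(3575199272)
print(offset)  # 1830502027264, about 1.8 TB
```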

comment:4 by Christian Franke, 13 years ago

Yes, makes sense. You can pass the LBA unchanged if 'b' is appended. Always use a ddrescue log file to record good/bad/undone status for disk ranges. With a log file you can interrupt ddrescue at any time and resume it later at the same point:

ddrescue -v --force --input-position=3575199272b /dev/sdb /dev/null disk.log

By using --max-retries=N later it may be possible to force sector reallocation without losing data.
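For reference, the ddrescue log file (called a mapfile in newer releases) is plain text and can be inspected after a run. The sketch below lists the ranges marked as bad sectors. It assumes the documented layout: comment lines start with '#', the first data line holds the current status, and each following line holds position, size, and status in hexadecimal, with '-' meaning bad sector. The function name is hypothetical.

```python
def bad_ranges(mapfile_text):
    """Return (position, size) byte ranges that ddrescue marked '-' (bad)."""
    ranges = []
    status_line_seen = False
    for line in mapfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not status_line_seen:
            status_line_seen = True  # first data line is the current status
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "pos size status" block line
        pos, size, status = fields[:3]
        if status == "-":
            ranges.append((int(pos, 16), int(size, 16)))
    return ranges
```

Dividing each position by the sector size gives the LBA of the bad range, which can be compared with the addresses in the SMART error log.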

SMART attributes are not standardized at all. The exact meaning of Current_Pending_Sector and Offline_Uncorrectable is vendor specific. Offline_Uncorrectable may count bad sectors found during SMART self-test.

comment:5 by John Peterson, 13 years ago

Summary: Read and append new xerror → Read xerror and append new error(s) to log file

Cool, thanks.

So LBA_of_first_error is the first error counting from LBA zero during the self-test, not the first one discovered? An extended self-test is supposed to be a complete disk read, so its LBA_of_first_error should give the first error from zero.

How would that move a sector to Reallocated_Sector_Ct? It never writes any data to the infile, right?

OK, I get it. I thought the difference lay in 'pending' versus 'uncorrectable', and I misunderstood 'uncorrectable' as failing a certain number of reads. Now I understand that uncorrectable means uncorrectable by ECC, which is the same error as for pending sectors, and that the distinction is in the UPDATED status, which is either Offline (self-test) or Always.

By the way, please add phpBB to Hosted Apps!

Thanks!

comment:6 by Christian Franke, 13 years ago

Normally an extended self-test reports the first bad sector from zero as LBA_of_first_error. The self-test is typically aborted then. You could use selective self-tests to test the remaining sectors.
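As an illustration of the selective self-test suggestion, the sketch below builds a smartctl command that tests everything after the reported bad sector. It assumes the select,N-max span syntax of smartctl -t; the helper name and the choice to start one sector past the error are assumptions for illustration.

```python
def selective_test_command(first_error_lba, device):
    """Build a smartctl command testing from just past the bad LBA to the end."""
    start = first_error_lba + 1  # skip the sector that aborted the last test
    return ["smartctl", "-t", f"select,{start}-max", device]


print(" ".join(selective_test_command(3575199272, "/dev/sdb")))
# smartctl -t select,3575199273-max /dev/sdb
```

If this test also aborts on a bad sector, the same step can be repeated with the newly reported LBA_of_first_error.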

In some cases ddrescue --max-retries=N ... is able to finally read a bad sector after many retries. Then the firmware should reallocate the sector and write the old data to the spare sector.

comment:7 by John Peterson, 13 years ago

That's interesting, I would not have guessed that the self-test ends because of a read error. Perhaps the wiki page could carry some information about the extended test too. I suggest that the https://sourceforge.net/wiki/selftest_short and https://sourceforge.net/wiki/test_offline articles be merged and that information about the extended test be added.

That has not occurred with my drive: ddrescue shows 8 unreadable sectors after a complete disk read (down from a higher number before a retry), but Reallocated_Sector_Ct remains at zero (and Current_Pending_Sector increased from 32 to 35).

I'm also happy to report that I've written the requested tools, and they can be found at http://code.google.com/p/file-management-tools/. Example usage:

php smarterr.php "smartctl --log xerror,99 p:" smarterr.log
php smartest.php "ddrescue -vfdM /dev/sdb /dev/null ddrescue.log" smarterr.log

and an example of scheduling the S.M.A.R.T. error log update to run regularly:

schtasks /Create /TN "User\smarterr" /F /SC DAILY /ST 03:00 /TR "php C:\...\smarterr.php \"smartctl -d sat --log xerror,99 p:\" \"C:\...\smarterr.log\""

Since ddrescue remembers its progress, there's no reason to skip duplicate sectors in the smarterr log; smartest therefore sends all read errors to ddrescue.

I believe these features can be added to this project suite; if they don't fit in smartctl, they could go in a new executable called, for example, smartest.

Thanks!

comment:8 by Christian Franke, 10 years ago

Milestone: undecided