Changes between Version 9 and Version 10 of BadBlockHowto


Timestamp:
Mar 25, 2017, 4:58:39 PM (8 years ago)
Author:
Gabriele Pohl
Comment:

Bad block reassignment

  • BadBlockHowto

    v9 v10  
    384384[Step 0] The SMART selftest/error log (see `smartctl -l selftest`) indicated there was a problem with block address (i.e. the 512 byte sector at) `58656333`. The partition table (e.g. see `sfdisk -luS /dev/hda` or `fdisk -ul /dev/hda`) indicated that this block was in the `/dev/hda3` partition which contained a `ReiserFS` file system. That partition started at block address `54781650`.
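
The arithmetic implied by Step 0 can be sketched as below. The two block addresses come from the text; the function names are illustrative, and the 4096-byte file system block size is only an assumption until Step 1 reports the real value.

```python
SECTOR_SIZE = 512  # bytes per sector, as stated in Step 0


def sector_offset_in_partition(bad_lba: int, partition_start_lba: int) -> int:
    """Offset of the bad sector from the start of its partition, in sectors."""
    if bad_lba < partition_start_lba:
        raise ValueError("bad block lies before the start of the partition")
    return bad_lba - partition_start_lba


def fs_block_number(bad_lba: int, partition_start_lba: int, fs_block_size: int) -> int:
    """File system block (counted from the partition start) holding the bad sector."""
    offset_bytes = sector_offset_in_partition(bad_lba, partition_start_lba) * SECTOR_SIZE
    return offset_bytes // fs_block_size


# Values from Step 0: bad block 58656333, /dev/hda3 starts at 54781650.
offset = sector_offset_in_partition(58656333, 54781650)
# ASSUMPTION: 4096-byte file system blocks; substitute the size found in Step 1.
fs_block = fs_block_number(58656333, 54781650, 4096)
print(offset, fs_block)
```

The integer division rounds down, which selects the file system block that contains the damaged sector rather than the one after it.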
    385385
    386 While doing the initial analysis it may also be useful to take a copy of the disk attributes returned by `smartctl -A /dev/hda`. Specifically the values associated with the `Reallocated_Sector_Ct` and `Reallocated_Event_Count` attributes (for `ATA` disks, the grown list (`GLIST`) length for `SCSI` disks). If these are incremented at the end of the procedure it indicates that the disk has re-allocated one or more sectors.
     386While doing the initial analysis it may also be useful to take a copy of the disk attributes returned by `smartctl -A /dev/hda`. Specifically the values associated with the `Reallocated_Sector_Ct` and `Reallocated_Event_Count` attributes (for `ATA` disks, the grown list (`GLIST`) length for SCSI disks). If these are incremented at the end of the procedure it indicates that the disk has re-allocated one or more sectors.
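
One way to compare the attribute copies taken before and after the procedure is sketched below. It assumes the usual ten-column attribute table printed by `smartctl -A` for ATA disks; the helper names and the sample rows are made up for illustration and are not real output from any disk.

```python
WATCHED = ("Reallocated_Sector_Ct", "Reallocated_Event_Count")


def reallocation_counters(smartctl_a_output: str) -> dict:
    """Extract the raw values of the watched attributes from `smartctl -A` text."""
    counters = {}
    for line in smartctl_a_output.splitlines():
        fields = line.split()
        # Assumed columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] in WATCHED:
            counters[fields[1]] = int(fields[9])
    return counters


def increments(before: dict, after: dict) -> dict:
    """Attributes whose raw value grew between the two snapshots."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] > before[name]
    }


# Illustrative rows only (hypothetical values).
SAMPLE_BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
"""
SAMPLE_AFTER = SAMPLE_BEFORE.replace("-       0", "-       1")

print(increments(reallocation_counters(SAMPLE_BEFORE),
                 reallocation_counters(SAMPLE_AFTER)))
```

A non-empty result suggests the disk re-allocated at least one sector during the procedure.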
    387387
    388388[Step 1] Get the file system's block size:
     
    637637Et voilà !
    638638
     639=== Bad block reassignment ===
     640
     641The SCSI disk command set and associated disk architecture are assumed in this section. SCSI disks have their own logical to physical mapping allowing a damaged sector (usually carrying 512 bytes of data) to be remapped irrespective of the operating system, file system or software RAID being used.
     642
     643The terms ''block'' and ''sector'' are used interchangeably, although ''block'' tends to be used in higher-level or more abstract contexts such as a ''logical block''.
     644
     645When a SCSI disk is formatted, defective sectors identified during the manufacturing process (the so-called primary list: PLIST), those found during the format itself (the certification list: CLIST), those given explicitly to the format command (the DLIST) and optionally the previous grown list (GLIST) are not used in the logical block map. The number (and low level addresses) of the unmapped sectors can be found with the `READ DEFECT DATA` SCSI command.
     646
     647SCSI disks tend to be divided into zones which have spare sectors and perhaps spare tracks, to support the logical block address mapping process. The idea is that if a logical block is remapped, the heads do not have to move a long way to access the replacement sector. Note that spare sectors are a scarce resource.
     648
     649Once a SCSI disk format has completed successfully, other problems may appear over time. These fall into two categories:
     650
     651* recoverable: the Error Correction Codes (ECC) detect a problem but it is small enough to be corrected. Optionally other strategies such as retrying the access may retrieve the data.
     652* unrecoverable: try as it may, the disk logic and ECC algorithms cannot recover the data. This is often reported as a ''medium error''.
     653
     654Other things can go wrong, typically in the transport, and they are reported using a term other than ''medium error''. For example, a disk may decide a read operation was successful, but a computer's host bus adapter (HBA) checking the incoming data detects a CRC error due to a bad cable or termination.
     655
     656Depending on the disk vendor, recoverable errors can be ignored. After all, some disks have up to 68 bytes of ECC above the payload size of 512 bytes so why use up spare sectors which are limited in number [#footnote8 [8]] ? If the disk can recover the data and does decide to re-allocate (reassign) a sector, then first it checks the settings of the `ARRE` and `AWRE` bits in the read-write error recovery mode page. Usually these bits are set [#footnote9 [9]] enabling automatic (read or write) re-allocation. The automatic re-allocation may also fail if the zone (or disk) has run out of spare sectors.
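
As a rough illustration of where these bits live, the sketch below decodes `AWRE` and `ARRE` from a raw copy of the read-write error recovery mode page (page code 01h), where they are assumed to occupy bits 7 and 6 of byte 2 as in the SCSI block command standards. The function name and the sample page bytes are hypothetical; fetching the page from a real disk is outside the scope of the sketch.

```python
RW_ERROR_RECOVERY_PAGE = 0x01
AWRE_MASK = 0x80  # automatic write re-allocation enabled
ARRE_MASK = 0x40  # automatic read re-allocation enabled


def auto_reallocation_flags(page: bytes) -> dict:
    """Decode AWRE/ARRE from a raw read-write error recovery mode page."""
    if len(page) < 3:
        raise ValueError("mode page too short")
    # Byte 0 carries the PS and SPF flags in its top two bits; the rest is the page code.
    if (page[0] & 0x3F) != RW_ERROR_RECOVERY_PAGE:
        raise ValueError("not the read-write error recovery mode page")
    return {
        "AWRE": bool(page[2] & AWRE_MASK),
        "ARRE": bool(page[2] & ARRE_MASK),
    }


# Hypothetical 12-byte page with both bits set (byte 2 = 0xC0).
sample_page = bytes.fromhex("810ac0000000000000000000")
print(auto_reallocation_flags(sample_page))
```

With both flags reported as set, the disk is permitted to re-allocate automatically on reads and writes; with both cleared, re-allocation is left to the host or RAID controller.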
     657
     658Another consideration with RAIDs, and applications that require a high data rate without pauses, is that the controller logic may not want a disk to spend too long trying to recover an error.
     659
     660Unrecoverable errors will cause a ''medium error'' sense key, perhaps with some useful additional sense information. If the extended background self test includes a full disk read scan, one would expect the self test log to list the bad block, as shown in section [#Repairsinafilesystem Repairs in a file system]. Recent SCSI disks with a periodic background scan should also list unrecoverable read errors (and some recoverable errors as well). The advantage of the background scan is that it runs to completion while self tests will often terminate at the first serious error.
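
To make the sense key and the additional sense information more concrete, here is a minimal sketch that decodes them from raw SCSI sense data. It assumes the standard fixed (70h/71h) and descriptor (72h/73h) sense formats; the function name and the sample buffer are illustrative, not captured from a real disk.

```python
SENSE_KEY_NAMES = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}


def decode_sense(sense: bytes) -> tuple:
    """Return (sense key name, ASC, ASCQ) from raw sense data."""
    if not sense:
        raise ValueError("empty sense buffer")
    response_code = sense[0] & 0x7F
    if response_code in (0x70, 0x71):    # fixed format
        if len(sense) < 14:
            raise ValueError("fixed format sense data too short")
        key, asc, ascq = sense[2] & 0x0F, sense[12], sense[13]
    elif response_code in (0x72, 0x73):  # descriptor format
        if len(sense) < 4:
            raise ValueError("descriptor format sense data too short")
        key, asc, ascq = sense[1] & 0x0F, sense[2], sense[3]
    else:
        raise ValueError("unknown sense response code")
    return SENSE_KEY_NAMES.get(key, "key 0x%X" % key), asc, ascq


# Hypothetical fixed format buffer: sense key 3h, ASC 11h, ASCQ 00h
# (the additional sense code for an unrecovered read error).
sample = bytes([0xF0, 0, 0x03, 0, 0, 0, 0, 0x0A, 0, 0, 0, 0, 0x11, 0x00, 0, 0, 0, 0])
print(decode_sense(sample))
```

A recoverable error handled by the disk would instead show up with the `RECOVERED ERROR` key, if the disk is configured to report it at all.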
     661
     662SCSI disks expect unrecoverable errors to be fixed manually using the `REASSIGN BLOCKS` SCSI command, since loss of data is involved. It is possible that an operating system or a file system could issue the `REASSIGN BLOCKS` command itself but the authors are unaware of any examples. The `REASSIGN BLOCKS` command will reassign one or more blocks: it attempts to (partially ?) recover the data (a forlorn hope at this stage), fetches an unused spare sector from the current zone and adds the damaged old sector to the GLIST (hence the name ''grown'' list). The contents of the GLIST may not be that interesting but `smartctl` prints out the number of entries in the grown list and if that number grows quickly, the disk may be approaching the end of its useful life.
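
For readers curious about what the command carries, the sketch below builds the 6-byte command descriptor block and the short-form parameter list of `REASSIGN BLOCKS` as they are understood from the SCSI block command standards (opcode 07h, a 4-byte header followed by one big-endian address per block). It only constructs bytes and sends nothing to a disk; the function names are hypothetical, and the address reused from Step 0 is purely an example. In practice a utility such as `sg_reassign` from the sg3_utils package would normally be used instead of hand-built commands.

```python
import struct

REASSIGN_BLOCKS_OPCODE = 0x07


def reassign_blocks_cdb(long_lba: bool = False, long_list: bool = False) -> bytes:
    """6-byte CDB; LONGLBA is bit 1 and LONGLIST is bit 0 of byte 1."""
    flags = (0x02 if long_lba else 0) | (0x01 if long_list else 0)
    return bytes([REASSIGN_BLOCKS_OPCODE, flags, 0, 0, 0, 0])


def reassign_blocks_param_list(lbas, long_lba: bool = False) -> bytes:
    """Short header (2 reserved bytes, 2-byte defect list length) plus the LBAs."""
    lba_format = ">Q" if long_lba else ">I"  # 8-byte or 4-byte big-endian addresses
    body = b"".join(struct.pack(lba_format, lba) for lba in lbas)
    return struct.pack(">HH", 0, len(body)) + body


# Example only: one 4-byte logical block address.
cdb = reassign_blocks_cdb()
params = reassign_blocks_param_list([58656333])
print(cdb.hex(), params.hex())
```

Note that `struct.pack` raises an error if an address does not fit in the chosen width, which is a reminder to switch to the long address form for very large disks.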
     663
     664Here is an alternative brute force technique to consider: if the data on the SCSI or ATA disk has all been backed up (e.g. it is held on the other disks in a RAID 5 enclosure), then simply reformatting the disk may be the least cumbersome approach.
     665
    639666== Footnotes ==
    640667
     
    652679
    653680[=#footnote7 [7]] Thanks to Manfred Schwarb for the information about storing partition table(s) beforehand.
     681
     682[=#footnote8 [8]] Detecting and fixing an error with ECC ''on the fly'', without going the further step of reassigning the block in question, may explain why some disks have large numbers in their read error counter log. Various worried users have reported large numbers in the `errors corrected without substantial delay` counter field, which is in the `Errors corrected by ECC fast` column in the `smartctl -l error` output.
     683
     684[=#footnote9 [9]] Often disks inside a hardware RAID have the ARRE and AWRE bits cleared (disabled) so the RAID controller can do things manually or flag the disk for replacement.