Changes between Initial Version and Version 1 of Ticket #658, comment 12
- Timestamp: Feb 26, 2016, 3:32:19 PM
Ticket #658, comment 12
initial  v1
9        9    Hard disks with error recovery control (ERC), also known as time-limited error recovery (TLER) from Western Digital, or command completion time limit (CCTL) from Samsung/Hitachi, let you configure the amount of time a drive's firmware may spend attempting to recover from a read or write error.
10       10
11            The error recovery (ERC) time of a drive *must* be shorter than the system's controller timeout. Otherwise errors will cause a controller reset and the loss of all unwritten data.
         11   The error recovery (ERC) time of a drive *must* be shorter than the system's controller timeout. Otherwise errors will cause a controller reset and the loss of all unwritten data. Unfortunately, many drives by default have a very long recovery timeout or have ERC disabled.
12       12
13       13   With redundant RAID hardware or software configurations this is equally important. Here, resetting an entire drive instead of just retrying the failed block causes the entire drive to be marked as unusable, reducing redundancy and performance. Furthermore, errors are likely to occur during the re-sync of a drive (seldom-used areas are read), and a drive reset during the re-sync can render the entire array unusable. Limiting the drives' recovery timeout also allows for improved error handling in hardware or software RAID environments. Instead of waiting for one drive to recover the requested data, the data can quickly be read from another (redundant) drive.
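
The rule that the ERC time must be shorter than the controller timeout can be checked mechanically. The following is only a minimal sketch, assuming a system with smartmontools installed, where ERC limits are given in tenths of a second (as `smartctl -l scterc` expects) and the controller timeout is known in seconds. The function names and the 7-second default are illustrative assumptions, not part of this ticket.

```python
from typing import List, Optional

DECISECONDS_PER_SECOND = 10


def erc_is_safe(erc_deciseconds: Optional[int], controller_timeout_s: float) -> bool:
    """Return True if the drive stops recovering before the controller times out.

    erc_deciseconds is the drive's ERC limit in tenths of a second.
    None or 0 is treated as "ERC disabled", i.e. recovery time is unbounded.
    """
    if not erc_deciseconds:
        return False
    return erc_deciseconds / DECISECONDS_PER_SECOND < controller_timeout_s


def smartctl_set_erc_command(device: str, read_ds: int = 70, write_ds: int = 70) -> List[str]:
    """Build (but do not run) a smartctl command that sets the read/write ERC limits.

    The default of 70 (7 seconds) is an assumed example value, not a recommendation
    taken from the ticket.
    """
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]
```

For example, `erc_is_safe(70, 30)` accepts a 7-second ERC limit against a 30-second controller timeout, while `erc_is_safe(None, 30)` rejects a drive with ERC disabled. The command list from `smartctl_set_erc_command` could be passed to `subprocess.run` with root privileges; whether a given drive accepts the setting, and whether it persists across power cycles, depends on the drive.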