Opened 8 years ago
Closed 8 years ago
#788 closed task (worksforme)
Add temperature raw value in syslog, only log if normalized "health" value is below 100%
Reported by: | thomas303 | Owned by: | Christian Franke |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | all | Version: | 6.5 |
Keywords: | Cc: |
Description
Forwarding from https://bugs.launchpad.net/ubuntu/+source/smartmontools/+bug/1653560
syslog entries like
Jan 2 20:22:27 server smartd[876]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 110 to 112
should be less confusing, and by default logging should only take place if there is something worth warning about.
That said, a "health" value below 100% (e.g. 98%) should trigger the logging, because then the health status as specified by the vendor is no longer perfect.
And the output could be more verbose and less confusing. I suggest:
Jan 2 20:22:27 server smartd[876]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius: Thermal health changed from 110% (40°C) to 112% (38°C)
Given that normalization is specified by vendors, smartmontools could also take into account that e.g. health below 90% is critical (for WD drives that would be 60°C) and should also be reported as critical (WARNING, etc.).
Change History (3)
comment:1 by , 8 years ago
Description: | modified (diff) |
---|
comment:2 by , 8 years ago
Owner: | set to |
---|---|
Status: | new → accepted |
comment:3 by , 8 years ago
Resolution: | → worksforme |
---|---|
Status: | accepted → closed |
Use the -W DIFF[,INFO[,CRIT]] directive to track temperature (it also works with SCSI/SAS and NVMe devices). Add -I 194 to suppress the above messages.
For example, -W 2,50,60 results in LOG_INFO messages when the temperature has changed by at least 2°C since the last report or is at least 50°C, and in LOG_CRIT messages (and warning emails) once it is at least 60°C.
Alternatively, add -r 194 to log the raw value along with the normalized value.
See man smartd.conf for details.
Note that the mapping of the raw temperature value to the normalized value is vendor and device specific. Various (undocumented) mappings exist in practice: 100-temperature, 150-temperature (as in the example above), temperature unchanged, or something else.
The normalized value should not be interpreted as a health percentage.
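To illustrate why the normalized value is not a percentage, here is a minimal Python sketch that checks which of the mappings named in this comment fits the two readings quoted in the description (110 at 40°C, 112 at 38°C). It takes the reporter's temperatures at face value. The names CANDIDATE_MAPPINGS and matching_mappings are made up for this example and are not part of smartmontools.

```python
# Candidate raw-to-normalized mappings for attribute 194, as listed in this
# comment. Which one a given drive uses is vendor specific and usually
# undocumented, so this table is illustrative, not exhaustive.
CANDIDATE_MAPPINGS = {
    "100-temperature": lambda celsius: 100 - celsius,
    "150-temperature": lambda celsius: 150 - celsius,
    "temperature unchanged": lambda celsius: celsius,
}


def matching_mappings(samples):
    """Return the names of candidate mappings consistent with every sample.

    samples: iterable of (normalized_value, temperature_celsius) pairs.
    """
    samples = list(samples)
    return [
        name
        for name, to_normalized in CANDIDATE_MAPPINGS.items()
        if all(to_normalized(celsius) == normalized
               for normalized, celsius in samples)
    ]


# The two readings quoted in the ticket description.
ticket_samples = [(110, 40), (112, 38)]
print(matching_mappings(ticket_samples))  # ['150-temperature']
```

Under that mapping a cooler drive simply yields a larger number, which is why values above 100 appear and why a drop below 100 does not by itself signal a problem.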