Opened 14 months ago

Closed 14 months ago

Last modified 14 months ago

#1751 closed defect (fixed)

SMART Test never ending and cannot be stopped

Reported by: Felix
Owned by:
Priority: major
Milestone:
Component: smartctl
Version: 7.3
Keywords: scsi
Cc:

Description

Hi,
I have accumulated some 30 or more HGST HUC101212CSS600 disks over the years which have served in different environments.
Most of them have been replaced by larger SSDs by now.

However, since I was somewhat lazy about documenting the disks' health before decommissioning them from their previous environments, I have to do this now to see which of them can still be used as spares where needed.

The test server is an HP DL380 Gen8 with a SmartArray in HBA mode.
I had already tested over 20 disks.
The test scenario was always the same:

Insert the disks, enumerate them, check that they are available to the system (dmesg, ssacli), and then start the long self-test using

smartctl -t long -d cciss,N /dev/sdX
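
The routine above could be scripted roughly as follows. This is only a sketch, assuming a POSIX shell; the function name `start_long_tests` and the `DRY_RUN` variable are hypothetical helpers and not part of smartmontools. Only the `smartctl -t long -d cciss,N` invocation is taken from the report.

```shell
# Start a long self-test on each given cciss slot behind one device node.
# Usage: start_long_tests DEVICE SLOT [SLOT...]
# Hypothetical helper: set DRY_RUN=1 to print the commands instead of running them.
start_long_tests() {
  dev=$1
  shift
  for slot in "$@"; do
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty.
    ${DRY_RUN:+echo} smartctl -t long -d "cciss,$slot" "$dev" || return 1
  done
}
```

Running it with `DRY_RUN=1` first, for example `DRY_RUN=1 start_long_tests /dev/sda 0 1 2`, shows which commands would be issued before any drive is touched.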

All went well and I was able to identify about 6 disks which did not pass the long self-test.

The last 5 disks are giving me a headache:

I followed my usual routine and launched smartctl -t long -d cciss,N /dev/sdX on every disk.

When I issued the command to the last of those 5 disks, the server became unresponsive. I was not even able to use SysRq to restart it and had to force a hard shutdown via the power button.

So I rebooted the host, removing all but the boot disk and repeated the whole procedure.
Same result, the last disk killed the server. So, "One down, 4 to go" I thought.

So I rebooted again, removed the obviously damaged disk completely, and re-plugged the others once the server was up again.
Then I checked the SMART status:

All 4 remaining disks now have one or two long tests pending, which I cannot stop:
for x in 0 1 2 3 4; do smartctl -a -d cciss,$x /dev/sda|grep -e "Product\|Vendor\|Serial\|Self test in progress"; done|less
Vendor: HGST
Product: HUC101212CSS600
Serial number: L0G9PVTG
Vendor (Seagate Cache) information
# 1 Background long Self test in progress ... 7 NOW - [- - -]
Vendor: HGST
Product: HUC101212CSS600
Serial number: L0GXKNLH
Vendor (Seagate Cache) information
# 2 Background long Self test in progress ... 7 NOW - [- - -]
Vendor: HGST
Product: HUC101212CSS600
Serial number: L0G9KZPG
Vendor (Seagate Cache) information
# 2 Background long Self test in progress ... 7 NOW - [- - -]
Vendor: HGST
Product: HUC101212CSS600
Serial number: L0G2E74H
Vendor (Seagate Cache) information
# 1 Background long Self test in progress ... 7 NOW - [- - -]
# 2 Background long Self test in progress ... 7 NOW - [- - -]
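
To see at a glance how many tests each disk reports as running, the same output could be condensed per serial number. This is a sketch under assumptions: `summarize_in_progress` is a hypothetical helper name, and it relies only on the "Serial number:" and "Self test in progress" lines appearing as they do in the listing above.

```shell
# Read `smartctl -a` output for one or more disks on stdin and print
# "<serial> <count>" per disk, where count is the number of log entries
# marked "Self test in progress".
summarize_in_progress() {
  awk '
    /^Serial number:/ {
      # A new disk starts: flush the previous one, if any.
      if (serial != "") print serial, count
      serial = $3
      count = 0
    }
    /Self test in progress/ { count++ }
    END { if (serial != "") print serial, count }
  '
}
```

It would be fed from the same loop, e.g. `for x in 0 1 2 3 4; do smartctl -a -d cciss,$x /dev/sda; done | summarize_in_progress`.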

Issuing smartctl -X -d cciss,2 /dev/sdd returns:

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.2.16-12-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

Self Test returned without error

Re-checking with -a still shows those one or two self-tests in progress.
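
The abort-and-recheck step could be wrapped like this. It is a sketch only: `abort_selftest`, `has_test_in_progress` and `DRY_RUN` are hypothetical names, and whether the drive actually honors the abort is up to its firmware.

```shell
# Ask the drive in the given cciss slot to abort its running self-test.
# Usage: abort_selftest DEVICE SLOT
# Hypothetical helper: set DRY_RUN=1 to print the command instead of running it.
abort_selftest() {
  dev=$1
  slot=$2
  ${DRY_RUN:+echo} smartctl -X -d "cciss,$slot" "$dev"
}

# Exit status 0 if the smartctl output on stdin still lists a test in progress.
has_test_in_progress() {
  grep -q "Self test in progress"
}
```

After `abort_selftest /dev/sdd 2`, the state can be re-checked with `smartctl -l selftest -d cciss,2 /dev/sdd | has_test_in_progress && echo "still running"`.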

BTW: no scheduled tests that could interfere are defined in /etc/smartd.conf.
BTW2: I left the server running overnight in the hope that the disks would finish their tests on their own, but no luck either!

My questions are:
1) Is there any other way to abort those self-tests that claim to be in progress?
2) If not, is this a sign that all of those disks are ready for disposal? Or can they still be saved?

Many thanks for any help on this!

Felix

Change History (5)

comment:1 by Christian Franke, 14 months ago

Keywords: scsi added
Priority: critical → major

Self-test problems cannot be fixed by smartctl. Self-tests are simply started by smartctl and then controlled by drive firmware. See related FAQ entries.

> My questions are:
> 1) is there any other way to abort those self-tests that pretend to be in progress
> 2) if not, is this a sign, that all of those disks are ready for disposal? Or can they still be saved?

I don't know, sorry.

PS: This is a bug tracker, not a support forum. For future support questions, please use the smartmontools-support mailing list instead. Thanks.

comment:2 by Christian Franke, 14 months ago

Milestone: undecided

comment:3 by Felix, 14 months ago

Thank you for this clarification.
I will try to find some utility on the UBCD as mentioned in the FAQs.
Sorry, I missed the correct platform for this.
However, thank you so far.
Regards,
Felix

comment:4 by Felix, 14 months ago

Resolution: fixed
Status: new → closed

comment:5 by Christian Franke, 14 months ago

Milestone: undecided