Opened 5 years ago
Closed 4 years ago
#1215 closed defect (invalid)
smartctl selects the wrong NVMe device
Reported by: | bendreth | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | all | Version: | |
Keywords: | nvme linux | Cc: |
Description
In the current nightly build, smartctl selects the wrong device when it is used on /dev/nvmeX rather than /dev/nvmeXn1 and two identical NVMe modules are installed. Example:
    # ls -l nvme-Samsung_SSD_970_EVO_2TB_S464NB0M200088N | cut -c40-
    nvme-Samsung_SSD_970_EVO_2TB_S464NB0M200088N -> ../../nvme0n1
    # ls -l nvme-Samsung_SSD_970_EVO_2TB_S464NB0M200161Y | cut -c40-
    nvme-Samsung_SSD_970_EVO_2TB_S464NB0M200161Y -> ../../nvme1n1
    # ./smartctl -a /dev/nvme0n1 | sed -n '1p;5,7p'
    smartctl 7.1 2019-07-01 r4934 [x86_64-linux-4.18.0-24-generic] (local build)
    Model Number: Samsung SSD 970 EVO 2TB
    Serial Number: S464NB0M200088N
    Firmware Version: 2B2QEXE7
    # ./smartctl -a /dev/nvme1n1 | sed -n '1p;5,7p'
    smartctl 7.1 2019-07-01 r4934 [x86_64-linux-4.18.0-24-generic] (local build)
    Model Number: Samsung SSD 970 EVO 2TB
    Serial Number: S464NB0M200161Y
    Firmware Version: 2B2QEXE7
    # ./smartctl -a /dev/nvme0 | sed -n '1p;5,7p'
    smartctl 7.1 2019-07-01 r4934 [x86_64-linux-4.18.0-24-generic] (local build)
    Model Number: Samsung SSD 970 EVO 2TB
    Serial Number: S464NB0M200161Y
    Firmware Version: 2B2QEXE7
    # ./smartctl -a /dev/nvme1 | sed -n '1p;5,7p'
    smartctl 7.1 2019-07-01 r4934 [x86_64-linux-4.18.0-24-generic] (local build)
    Model Number: Samsung SSD 970 EVO 2TB
    Serial Number: S464NB0M200088N
    Firmware Version: 2B2QEXE7
Change History (5)
comment:1 by , 5 years ago
Description: | modified (diff) |
---|---|
Keywords: | nvme linux added |
Milestone: | → undecided |
comment:2 by , 5 years ago
It looks the same on older versions. Here is the output:
    # smartctl -? | head -n1
    smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.18.0-24-generic] (local build)
    # smartctl -a /dev/nvme0n1 | sed -n '6p'
    Serial Number: S464NB0M200088N
    # smartctl -a /dev/nvme1n1 | sed -n '6p'
    Serial Number: S464NB0M200161Y
    # smartctl -a /dev/nvme0 | sed -n '6p'
    Serial Number: S464NB0M200161Y
    # smartctl -a /dev/nvme1 | sed -n '6p'
    Serial Number: S464NB0M200088N
    # ls -l /dev/nvme*
    crw------- 1 root root 241, 0 Jun 25 02:29 /dev/nvme0
    brw-rw---- 1 root disk 259, 1 Jun 25 02:29 /dev/nvme0n1
    brw-rw---- 1 root disk 259, 4 Jun 25 02:29 /dev/nvme0n1p1
    brw-rw---- 1 root disk 259, 5 Jun 25 02:29 /dev/nvme0n1p9
    crw------- 1 root root 241, 1 Jun 25 02:29 /dev/nvme1
    brw-rw---- 1 root disk 259, 0 Jun 25 02:29 /dev/nvme1n1
    brw-rw---- 1 root disk 259, 2 Jun 25 02:29 /dev/nvme1n1p1
    brw-rw---- 1 root disk 259, 3 Jun 25 02:29 /dev/nvme1n1p9
    # egrep ':|nvme' /proc/devices
    Character devices:
    241 nvme
    Block devices:
    # ./smartctl -d nvme,0x1 -a /dev/nvme0 | sed -n '1p;5,7p'
    smartctl 7.1 2019-07-01 r4934 [x86_64-linux-4.18.0-24-generic] (local build)
    Model Number: Samsung SSD 970 EVO 2TB
    Serial Number: S464NB0M200161Y
    Firmware Version: 2B2QEXE7
This is on Ubuntu 18.04.2.
comment:3 by , 5 years ago
259 is:
    Block devices:
    259 blkext
The kernel version is:
    $ uname -a
    Linux neptune 4.18.0-24-generic #25~18.04.1-Ubuntu SMP Thu Jun 20 11:13:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
According to https://people.canonical.com/~kernel/info/kernel-version-map.html this corresponds to a mainline kernel 4.18.20.
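For anyone repeating the major-number lookup from comment:2 and comment:3, here is a minimal sketch. The helper name `major_name` is made up for illustration and is not part of smartmontools. It reads text in `/proc/devices` format on standard input and prints the driver name registered for a given major number. The sample input is the two entries quoted in this ticket. Major numbers such as 241 are assigned dynamically and may differ on other machines.

```shell
# Print the driver name registered for a major number.
# $1: section to search, "char" or "block"
# $2: major number
# stdin: text in /proc/devices format
major_name() {
    awk -v want="$1" -v major="$2" '
        /^Character devices:/ { section = "char";  next }
        /^Block devices:/     { section = "block"; next }
        section == want && $1 == major { print $2 }
    '
}

# Sample input assembled from the entries quoted in this ticket.
major_name block 259 <<'EOF'
Character devices:
241 nvme

Block devices:
259 blkext
EOF
```

On a live system the same function can be fed the real table with `major_name char 241 < /proc/devices`.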
comment:4 by , 5 years ago
The entries are mixed up in /sys:
    $ ls -l /sys/dev/block/259* /sys/dev/char/241* | cut -c39-
    /sys/dev/block/259:0 -> ../../devices/pci0000:00/0000:00:1b.0/0000:01:00.0/nvme/nvme0/nvme1n1
    /sys/dev/block/259:1 -> ../../devices/pci0000:00/0000:00:1b.4/0000:02:00.0/nvme/nvme1/nvme0n1
    /sys/dev/block/259:2 -> ../../devices/pci0000:00/0000:00:1b.0/0000:01:00.0/nvme/nvme0/nvme1n1/nvme1n1p1
    /sys/dev/block/259:3 -> ../../devices/pci0000:00/0000:00:1b.0/0000:01:00.0/nvme/nvme0/nvme1n1/nvme1n1p9
    /sys/dev/block/259:4 -> ../../devices/pci0000:00/0000:00:1b.4/0000:02:00.0/nvme/nvme1/nvme0n1/nvme0n1p1
    /sys/dev/block/259:5 -> ../../devices/pci0000:00/0000:00:1b.4/0000:02:00.0/nvme/nvme1/nvme0n1/nvme0n1p9
    /sys/dev/char/241:0 -> ../../devices/pci0000:00/0000:00:1b.0/0000:01:00.0/nvme/nvme0
    /sys/dev/char/241:1 -> ../../devices/pci0000:00/0000:00:1b.4/0000:02:00.0/nvme/nvme1
nvme0 points to 0000:01:00.0, but nvme1/nvme0n1 definitely looks wrong, and points to the other device.
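To see the pairing between controllers and namespaces without reading the symlinks by hand, something like the following sketch could be used. It assumes the sysfs layout visible in the listing above (namespace directories sitting directly beneath the controller directory) and a `serial` attribute under `/sys/class/nvme/<controller>/`. The function name `list_nvme_topology` and its optional root argument are hypothetical; the argument exists only so the function can be pointed at a test tree.

```shell
# List each NVMe controller with its serial number and the namespace
# block devices found beneath it in sysfs.
# $1: sysfs root (defaults to /sys; overridable for testing only)
list_nvme_topology() {
    root="${1:-/sys}"
    for ctrl in "$root"/class/nvme/nvme*; do
        [ -d "$ctrl" ] || continue      # glob did not match: no controllers
        name=${ctrl##*/}
        # The serial attribute is space-padded; strip trailing whitespace.
        serial=$(sed 's/[[:space:]]*$//' "$ctrl/serial" 2>/dev/null) || serial=unknown
        for ns in "$ctrl"/nvme*n*; do
            [ -d "$ns" ] || continue    # controller without namespaces
            echo "$name serial=$serial namespace=${ns##*/}"
        done
    done
}

list_nvme_topology
```

On a system laid out like the one in this ticket, the output would be expected to pair controller nvme0 with namespace nvme1n1 and controller nvme1 with nvme0n1, which matches the serial numbers smartctl reports for /dev/nvme0 and /dev/nvme1.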
comment:5 by , 4 years ago
Milestone: | undecided |
---|---|
Resolution: | → invalid |
Status: | new → closed |
The above output suggests that this is a Linux kernel issue, which cannot be fixed in smartmontools.
Did it work with an older release on same machine?
smartctl does not explicitly select a device; it simply accesses the pass-through I/O control behind the specified device node. Are the major/minor device numbers correctly set for these nodes?
Please provide output of the following commands: