* Two M.2 NVMe drives with same NQN, one gets removed
@ 2018-11-26 13:38 James Dingwall
  2018-11-26 15:31 ` Keith Busch
  0 siblings, 1 reply; 11+ messages in thread
From: James Dingwall @ 2018-11-26 13:38 UTC


Hi,

We have also encountered this issue but the devices have a customised
firmware from Lenovo:

# nvme id-ctrl /dev/nvme0
NVME Identify Controller:
vid : 0x8086
ssvid : 0x8086
sn : BTHH82250N1X1P0E
mn : INTEL SSDPEKKF010T8L
fr : L08P

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1803692?comments=all

I tried to force install the Intel firmware but that did not work.  I
was fobbed off by Lenovo support with a 'we don't support Linux' even
after pointing out the upstream Intel fix.  Is there any possibility
of being able to work around this in the nvme code?

Thanks,
James


* Two M.2 NVMe drives with same NQN, one gets removed
  2018-11-26 13:38 Two M.2 NVMe drives with same NQN, one gets removed James Dingwall
@ 2018-11-26 15:31 ` Keith Busch
  2018-11-27  7:54   ` Christoph Hellwig
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2018-11-26 15:31 UTC


On Mon, Nov 26, 2018 at 01:38:39PM +0000, James Dingwall wrote:
> Hi,
> 
> We have also encountered this issue but the devices have a customised
> firmware from Lenovo:
> 
> # nvme id-ctrl /dev/nvme0
> NVME Identify Controller:
> vid : 0x8086
> ssvid : 0x8086
> sn : BTHH82250N1X1P0E
> mn : INTEL SSDPEKKF010T8L
> fr : L08P
> 
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1803692?comments=all
> 
> I tried to force install the Intel firmware but that did not work.  I
> was fobbed off by Lenovo support with a 'we don't support Linux' even
> after pointing out the upstream Intel fix.  Is there any possibility
> of being able to work around this in the nvme code?
> 
> Thanks,
> James

Hi James,

According to the resolution discussion here:

  https://downloadcenter.intel.com/download/28320/Known-Issue-Intel-SSD-760p-Pro-7600p-Series-SubNQN-Conflict-on-Linux

  "
  If you purchased your Intel® SSD from an OEM, your firmware version may
  have a different naming. Contact your local OEM representative for
  latest firmware revisions.
  "

Sounds like Lenovo will need to merge the Intel update into their
specific firmware if they haven't already done so.

Thanks,
Keith


* Two M.2 NVMe drives with same NQN, one gets removed
  2018-11-26 15:31 ` Keith Busch
@ 2018-11-27  7:54   ` Christoph Hellwig
  2018-11-27 10:25     ` James Dingwall
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2018-11-27  7:54 UTC


On Mon, Nov 26, 2018 at 08:31:11AM -0700, Keith Busch wrote:
> According to the resolution discussion here:
> 
>   https://downloadcenter.intel.com/download/28320/Known-Issue-Intel-SSD-760p-Pro-7600p-Series-SubNQN-Conflict-on-Linux
> 
>   "
>   If you purchased your Intel® SSD from an OEM, your firmware version may
>   have a different naming. Contact your local OEM representative for
>   latest firmware revisions.
>   "
> 
> Sounds like Lenovo will need to merge the Intel update into their
> specific firmware if they haven't already done so.

OEMs are notoriously bad at picking up firmware fixes.  If the problems
with this drive persist we should probably add a quirk that ignores the
drive-provided NQN and builds one based on the legacy
model/serial number algorithm instead.
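
As a rough illustration of the legacy algorithm mentioned above, the sketch
below concatenates a fixed prefix, the PCI vendor and subsystem vendor IDs,
and the space-padded serial and model number fields.  It is a userspace
approximation, not the driver code: the function names, the pad_field()
helper and the 223-byte buffer size are assumptions made for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_NQN_SIZE 223 /* assumed buffer size for this sketch */
#define SN_LEN 20           /* width of the serial number field */
#define MN_LEN 40           /* width of the model number field */

/* Copy a C string into a fixed-width, space-padded field (no NUL). */
void pad_field(char *dst, size_t width, const char *src)
{
	size_t n = strlen(src);

	if (n > width)
		n = width;
	memcpy(dst, src, n);
	memset(dst + n, ' ', width - n);
}

/*
 * Build prefix + VID + SSVID + serial + model into out.
 * sn and mn are fixed-width fields that are not NUL-terminated.
 * Returns the length written, or -1 if out is too small.
 */
int build_fallback_nqn(char *out, size_t outlen, uint16_t vid, uint16_t ssvid,
		       const char *sn, const char *mn)
{
	int off = snprintf(out, outlen, "nqn.2014.08.org.nvmexpress:%04x%04x",
			   (unsigned int)vid, (unsigned int)ssvid);

	if (off < 0 || (size_t)off + SN_LEN + MN_LEN + 1 > outlen)
		return -1;
	memcpy(out + off, sn, SN_LEN);
	off += SN_LEN;
	memcpy(out + off, mn, MN_LEN);
	off += MN_LEN;
	memset(out + off, 0, outlen - off); /* NUL-fill the remainder */
	return off;
}
```

With vid and ssvid 0x8086, serial BTHH82250N1X1P0E and model
INTEL SSDPEKKF010T8L (the values reported earlier in this thread), this
produces a 95-byte name.  Because the serial number is part of the name,
two drives of the same model end up with different NQNs.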


* Two M.2 NVMe drives with same NQN, one gets removed
  2018-11-27  7:54   ` Christoph Hellwig
@ 2018-11-27 10:25     ` James Dingwall
  2018-11-27 15:01       ` Keith Busch
  0 siblings, 1 reply; 11+ messages in thread
From: James Dingwall @ 2018-11-27 10:25 UTC


Hi,
On Mon, Nov 26, 2018 at 11:54:27PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 26, 2018 at 08:31:11AM -0700, Keith Busch wrote:
> > According to the resolution discussion here:
> > 
> >   https://downloadcenter.intel.com/download/28320/Known-Issue-Intel-SSD-760p-Pro-7600p-Series-SubNQN-Conflict-on-Linux
> > 
> >   "
> >   If you purchased your Intel® SSD from an OEM, your firmware version may
> >   have a different naming. Contact your local OEM representative for
> >   latest firmware revisions.
> >   "
> > 
> > Sounds like Lenovo will need to merge the Intel update into their
> > specific firmware if they haven't already done so.
> 
> OEMs are notoriously bad at picking up firmware fixes.  If the problems
> with this drive persist we should probably add a quirk that ignores the
> drive-provided NQN and builds one based on the legacy
> model/serial number algorithm instead.

Would something like this be the way to go if an appropriate entry is
created in the nvme_id_table in pci.c?  The example subnqn shown in
http://lists.infradead.org/pipermail/linux-nvme/2018-November/021154.html
looks like the fallback entry so I suppose the firmware fix has been to
just blank the relevant field.  (Should 2014.08.org be 2014-08.org
in core.c when generating the fake name?)
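
To make the idea of a per-device table entry concrete, here is a small
standalone sketch of matching a PCI vendor/device pair to a set of quirk
bits.  The struct and lookup function are hypothetical stand-ins for the
kernel's pci_device_id machinery; only the two bit values and the
0x8086/0xf1a6 pair come from this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Quirk bits; both values are the ones that appear in the patch. */
enum nvme_quirks {
	NVME_QUIRK_MEDIUM_PRIO_SQ	= (1 << 7),
	NVME_QUIRK_IGNORE_DEV_SUBNQN	= (1 << 8),
};

/* Hypothetical, simplified stand-in for struct pci_device_id. */
struct quirk_entry {
	uint16_t vendor;
	uint16_t device;
	unsigned long quirks;
};

static const struct quirk_entry quirk_table[] = {
	{ 0x8086, 0xf1a6, NVME_QUIRK_IGNORE_DEV_SUBNQN }, /* Intel 760p/Pro 7600p */
};

/* Return the quirk bits for a device, or 0 if it has no table entry. */
unsigned long lookup_quirks(uint16_t vendor, uint16_t device)
{
	size_t i;

	for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++) {
		if (quirk_table[i].vendor == vendor &&
		    quirk_table[i].device == device)
			return quirk_table[i].quirks;
	}
	return 0;
}
```

A match on vendor and device alone applies the quirk to every firmware
revision of that device, since the lookup key carries no firmware version.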

Thanks,
James

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 559d567693b8..d4ace74237d9 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2076,10 +2076,12 @@ static void nvme_init_subnqn(struct nvme_subsystem *subsys, struct nvme_ctrl *ct
 	size_t nqnlen;
 	int off;
 
-	nqnlen = strnlen(id->subnqn, NVMF_NQN_SIZE);
-	if (nqnlen > 0 && nqnlen < NVMF_NQN_SIZE) {
-		strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
-		return;
+	if(!(ctrl->quirks & NVME_QUIRK_IGNORE_DEV_SUBNQN)) {
+		nqnlen = strnlen(id->subnqn, NVMF_NQN_SIZE);
+		if (nqnlen > 0 && nqnlen < NVMF_NQN_SIZE) {
+			strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
+			return;
+		}
 	}
 
 	if (ctrl->vs >= NVME_VS(1, 2, 1))
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index cee79cb388af..a07155c05328 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -90,6 +90,11 @@ enum nvme_quirks {
 	 * Set MEDIUM priority on SQ creation
 	 */
 	NVME_QUIRK_MEDIUM_PRIO_SQ		= (1 << 7),
+
+	/*
+	 * Ignore device provided subnqn.
+	 */
+	NVME_QUIRK_IGNORE_DEV_SUBNQN		= (1 << 8),
 };
 
 /*


* Two M.2 NVMe drives with same NQN, one gets removed
  2018-11-27 10:25     ` James Dingwall
@ 2018-11-27 15:01       ` Keith Busch
  2018-11-30 11:50         ` James Dingwall
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2018-11-27 15:01 UTC


On Tue, Nov 27, 2018 at 02:25:54AM -0800, James Dingwall wrote:
> Hi,
> On Mon, Nov 26, 2018 at 11:54:27PM -0800, Christoph Hellwig wrote:
> > On Mon, Nov 26, 2018 at 08:31:11AM -0700, Keith Busch wrote:
> > > According to the resolution discussion here:
> > > 
> > >   https://downloadcenter.intel.com/download/28320/Known-Issue-Intel-SSD-760p-Pro-7600p-Series-SubNQN-Conflict-on-Linux
> > > 
> > >   "
> > >   If you purchased your Intel® SSD from an OEM, your firmware version may
> > >   have a different naming. Contact your local OEM representative for
> > >   latest firmware revisions.
> > >   "
> > > 
> > > Sounds like Lenovo will need to merge the Intel update into their
> > > specific firmware if they haven't already done so.
> > 
> > OEMs are notoriously bad at picking up firmware fixes.  If the problems
> > with this drive persist we should probably add a quirk that ignores the
> > drive-provided NQN and builds one based on the legacy
> > model/serial number algorithm instead.
> 
> Would something like this be the way to go if an appropriate entry is
> created in the nvme_id_table in pci.c?  The example subnqn shown in
> http://lists.infradead.org/pipermail/linux-nvme/2018-November/021154.html
> looks like the fallback entry so I suppose the firmware fix has been to
> just blank the relevant field.  (Should 2014.08.org be 2014-08.org
> in core.c when generating the fake name?)
> 
> Thanks,
> James

Looks correct, and just needs the nvme_id_table updated as you
mentioned. I also recommend skipping the NVME_VS(1, 2, 1) check if the
quirk is set so we're not warning users.


> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 559d567693b8..d4ace74237d9 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2076,10 +2076,12 @@ static void nvme_init_subnqn(struct nvme_subsystem *subsys, struct nvme_ctrl *ct
>  	size_t nqnlen;
>  	int off;
>  
> -	nqnlen = strnlen(id->subnqn, NVMF_NQN_SIZE);
> -	if (nqnlen > 0 && nqnlen < NVMF_NQN_SIZE) {
> -		strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
> -		return;
> +	if(!(ctrl->quirks & NVME_QUIRK_IGNORE_DEV_SUBNQN)) {
> +		nqnlen = strnlen(id->subnqn, NVMF_NQN_SIZE);
> +		if (nqnlen > 0 && nqnlen < NVMF_NQN_SIZE) {
> +			strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
> +			return;
> +		}
>  	}
>  
>  	if (ctrl->vs >= NVME_VS(1, 2, 1))
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index cee79cb388af..a07155c05328 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -90,6 +90,11 @@ enum nvme_quirks {
>  	 * Set MEDIUM priority on SQ creation
>  	 */
>  	NVME_QUIRK_MEDIUM_PRIO_SQ		= (1 << 7),
> +
> +	/*
> +	 * Ignore device provided subnqn.
> +	 */
> +	NVME_QUIRK_IGNORE_DEV_SUBNQN		= (1 << 8),
>  };
>  
>  /*


* Two M.2 NVMe drives with same NQN, one gets removed
  2018-11-27 15:01       ` Keith Busch
@ 2018-11-30 11:50         ` James Dingwall
  0 siblings, 0 replies; 11+ messages in thread
From: James Dingwall @ 2018-11-30 11:50 UTC


On Tue, Nov 27, 2018 at 08:01:02AM -0700, Keith Busch wrote:
> On Tue, Nov 27, 2018 at 02:25:54AM -0800, James Dingwall wrote:
> > Hi,
> > On Mon, Nov 26, 2018 at 11:54:27PM -0800, Christoph Hellwig wrote:
> > > On Mon, Nov 26, 2018 at 08:31:11AM -0700, Keith Busch wrote:
> > > > According to the resolution discussion here:
> > > > 
> > > >   https://downloadcenter.intel.com/download/28320/Known-Issue-Intel-SSD-760p-Pro-7600p-Series-SubNQN-Conflict-on-Linux
> > > > 
> > > >   "
> > > >   If you purchased your Intel® SSD from an OEM, your firmware version may
> > > >   have a different naming. Contact your local OEM representative for
> > > >   latest firmware revisions.
> > > >   "
> > > > 
> > > > Sounds like Lenovo will need to merge the Intel update into their
> > > > specific firmware if they haven't already done so.
> > > 
> > > OEMs are notoriously bad at picking up firmware fixes.  If the problems
> > > with this drive persist we should probably add a quirk that ignores the
> > > drive-provided NQN and builds one based on the legacy
> > > model/serial number algorithm instead.
> > 
> > Would something like this be the way to go if an appropriate entry is
> > created in the nvme_id_table in pci.c?  The example subnqn shown in
> > http://lists.infradead.org/pipermail/linux-nvme/2018-November/021154.html
> > looks like the fallback entry so I suppose the firmware fix has been to
> > just blank the relevant field.  (Should 2014.08.org be 2014-08.org
> > in core.c when generating the fake name?)
> > 
> > Thanks,
> > James
> 
> Looks correct, and just needs the nvme_id_table updated as you
> mentioned. I also recommend skipping the NVME_VS(1, 2, 1) check if the
> quirk is set so we're not warning users.
> 
I have tested this patch in a build of the Ubuntu 4.15.0-39-generic
kernel, adjusted only for a strncpy / strlcpy difference in the core.c
change.  I have moved the NVME_VS(1, 2, 1) check so that it is skipped
if the quirk is set.  I'm not sure how to restrict the nvme_id_table
entry to specific firmware revisions, but perhaps that doesn't matter if
(at least in this case) the fixed firmware no longer supplies a value.

james@nvmetest:~$ cat /sys/class/block/nvme[01]n1/device/subsysnqn
nqn.2014.08.org.nvmexpress:80868086BTHH82250N1X1P0E    INTEL SSDPEKKF010T8L                    
nqn.2014.08.org.nvmexpress:80868086BTHH82250N261P0E    INTEL SSDPEKKF010T8L                    
james@nvmetest:~$ uname -r
4.15.0-39-generic
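
The behaviour described above (quirk set: always generate, never warn;
quirk clear: use the device value if valid, otherwise warn on newer spec
versions and generate) can be modelled as a small pure function.  This is
only a sketch for reasoning about the cases: the type and function names
and the 223-byte limit are assumptions, and the real logic lives in
nvme_init_subnqn().

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NQN_SIZE 223			/* assumed limit for this sketch */
#define QUIRK_IGNORE_DEV_SUBNQN (1UL << 8)	/* bit value from the patch */

enum nqn_source { NQN_FROM_DEVICE, NQN_GENERATED };

struct nqn_decision {
	enum nqn_source source;
	bool warn;	/* would "missing or invalid SUBNQN field" be logged? */
};

/* Length of s, examining at most max bytes (same idea as strnlen). */
size_t bounded_len(const char *s, size_t max)
{
	size_t n = 0;

	while (n < max && s[n] != '\0')
		n++;
	return n;
}

/* Decide where the subsystem NQN comes from and whether to warn. */
struct nqn_decision decide_subnqn(unsigned long quirks, const char *dev_subnqn,
				  bool spec_ge_1_2_1)
{
	struct nqn_decision d = { NQN_GENERATED, false };

	if (!(quirks & QUIRK_IGNORE_DEV_SUBNQN)) {
		size_t len = bounded_len(dev_subnqn, SKETCH_NQN_SIZE);

		if (len > 0 && len < SKETCH_NQN_SIZE) {
			d.source = NQN_FROM_DEVICE;
			return d;
		}
		/* Only controllers claiming 1.2.1 or later trigger the warning. */
		d.warn = spec_ge_1_2_1;
	}
	return d;
}
```

With the quirk bit set, the device-supplied string is never inspected, so
a duplicate value reported by the firmware cannot reach the subsystem
lookup.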

I have put the patch as a single commit on nvme-quirk-subnqn of
https://github.com/JKDingwall/linux.git.  I can split it if it makes
sense.

Thanks,
James


commit e6a9e9cc96aefbf4c984844f19b23095f225d5c0
Author: James Dingwall <james at dingwall.me.uk>
Date:   Tue Nov 27 16:11:23 2018 +0000

    nvme: introduce NVME_QUIRK_IGNORE_DEV_SUBNQN
    
    If a device provides an NQN it is expected to be globally unique.
    Unfortunately some firmware revisions for Intel 760p/Pro 7600p devices did
    not satisfy this requirement.  In these circumstances if a system has >1
    affected device then only one device is enabled.  If this quirk is enabled
    then the device supplied subnqn is ignored and we fall back to generating
    one as if the field was empty.  In this case we also suppress the version
    check so we don't print a warning when the quirk is enabled.
    
    Signed-off-by: James Dingwall <james at dingwall.me.uk>

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 559d567693b8..54bd23bba17d 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2076,14 +2076,16 @@ static void nvme_init_subnqn(struct nvme_subsystem *subsys, struct nvme_ctrl *ct
 	size_t nqnlen;
 	int off;
 
-	nqnlen = strnlen(id->subnqn, NVMF_NQN_SIZE);
-	if (nqnlen > 0 && nqnlen < NVMF_NQN_SIZE) {
-		strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
-		return;
-	}
+	if(!(ctrl->quirks & NVME_QUIRK_IGNORE_DEV_SUBNQN)) {
+		nqnlen = strnlen(id->subnqn, NVMF_NQN_SIZE);
+		if (nqnlen > 0 && nqnlen < NVMF_NQN_SIZE) {
+			strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
+			return;
+		}
 
-	if (ctrl->vs >= NVME_VS(1, 2, 1))
-		dev_warn(ctrl->device, "missing or invalid SUBNQN field.\n");
+		if (ctrl->vs >= NVME_VS(1, 2, 1))
+			dev_warn(ctrl->device, "missing or invalid SUBNQN field.\n");
+	}
 
 	/* Generate a "fake" NQN per Figure 254 in NVMe 1.3 + ECN 001 */
 	off = snprintf(subsys->subnqn, NVMF_NQN_SIZE,
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index cee79cb388af..a07155c05328 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -90,6 +90,11 @@ enum nvme_quirks {
 	 * Set MEDIUM priority on SQ creation
 	 */
 	NVME_QUIRK_MEDIUM_PRIO_SQ		= (1 << 7),
+
+	/*
+	 * Ignore device provided subnqn.
+	 */
+	NVME_QUIRK_IGNORE_DEV_SUBNQN		= (1 << 8),
 };
 
 /*
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index c33bb201b884..f05cdf4802f7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2696,6 +2696,8 @@ static const struct pci_device_id nvme_id_table[] = {
 	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
 		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
 				NVME_QUIRK_MEDIUM_PRIO_SQ },
+	{ PCI_VDEVICE(INTEL, 0xf1a6),	/* Intel 760p/Pro 7600p */
+		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
 	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
 		.driver_data = NVME_QUIRK_IDENTIFY_CNS, },
 	{ PCI_DEVICE(0x1bb1, 0x0100),   /* Seagate Nytro Flash Storage */


* Two M.2 NVMe drives with same NQN, one gets removed
  2018-11-19 17:00     ` Keith Busch
@ 2018-11-21 21:19       ` John Van Bockel
  0 siblings, 0 replies; 11+ messages in thread
From: John Van Bockel @ 2018-11-21 21:19 UTC


Hi Keith,

I updated to version 004X of the 760p firmware and all appears to be working.
Thanks a bunch for steering me in the right direction.  Some results follow, in
case you're interested.

Thanks
jvbockel at gmail.com

# dmesg | grep -i nvme
[    3.475727] nvme nvme0: pci function 0000:72:00.0
[    3.475798] nvme nvme1: pci function 0000:73:00.0
[    3.585705] nvme nvme1: missing or invalid SUBNQN field.
[    3.588561]  nvme0n1: p1 p2
[    3.688492] nvme nvme0: missing or invalid SUBNQN field.
[    3.690815]  nvme1n1: p1 p2 p3 p4

# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     BTHH81850BX31P0E     INTEL SSDPEKKW010T8                      1           1.02  TB /   1.02  TB    512   B +  0 B   004X
/dev/nvme1n1     BTHH81850C8W1P0E     INTEL SSDPEKKW010T8                      1           1.02  TB /   1.02  TB    512   B +  0 B   004X

# nvme list-subsys
nvme-subsys0 - NQN=nqn.2014.08.org.nvmexpress:80868086BTHH81850BX31P0E    INTEL SSDPEKKW010T8
\
 +- nvme1 pcie 0000:73:00.0
nvme-subsys1 - NQN=nqn.2014.08.org.nvmexpress:80868086BTHH81850C8W1P0E    INTEL SSDPEKKW010T8
\
 +- nvme0 pcie 0000:72:00.0


On Mon, Nov 19, 2018 at 11:04 AM Keith Busch <keith.busch@intel.com> wrote:
>
> On Sun, Nov 18, 2018 at 10:55:11AM -0800, John Van Bockel wrote:
> > That's good news and very much appreciate the information and all
> > that you do for the open source community.  Is this an update for the
> > 760p NVMe drives or instead a UEFI update for the NUC8i7HVK?
> > Either way, good news.  Just wondering which I'm keeping an eye out
> > for.
>
> It's a 760p update, and looks like it was released earlier this month:
>
>   https://downloadcenter.intel.com/download/28320/Known-Issue-Intel-SSD-760p-Pro-7600p-Series-SubNQN-Conflict-on-Linux
>
> Sorry I missed that update, otherwise I could have sent you that link
> last week.


* Two M.2 NVMe drives with same NQN, one gets removed
       [not found]   ` <CAGmwwfc=g+4h12qMBvRVJoE66Z70kzzfkabjyxLC=d7LZp627A@mail.gmail.com>
@ 2018-11-19 17:00     ` Keith Busch
  2018-11-21 21:19       ` John Van Bockel
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2018-11-19 17:00 UTC


On Sun, Nov 18, 2018 at 10:55:11AM -0800, John Van Bockel wrote:
> That's good news and very much appreciate the information and all 
> that you do for the open source community.  Is this an update for the 
> 760p NVMe drives or instead a UEFI update for the NUC8i7HVK?  
> Either way, good news.  Just wondering which I'm keeping an eye out 
> for.  

It's a 760p update, and looks like it was released earlier this month:

  https://downloadcenter.intel.com/download/28320/Known-Issue-Intel-SSD-760p-Pro-7600p-Series-SubNQN-Conflict-on-Linux

Sorry I missed that update, otherwise I could have sent you that link
last week.


* Two M.2 NVMe drives with same NQN, one gets removed
  2018-11-17  1:33 ` Keith Busch
@ 2018-11-18 19:08   ` John Van Bockel
       [not found]   ` <CAGmwwfc=g+4h12qMBvRVJoE66Z70kzzfkabjyxLC=d7LZp627A@mail.gmail.com>
  1 sibling, 0 replies; 11+ messages in thread
From: John Van Bockel @ 2018-11-18 19:08 UTC


Hi Keith,

That's good news and very much appreciate the information and all
that you do for the open source community.  Is this an update for the
760p NVMe drives or instead a UEFI update for the NUC8i7HVK?
Either way, good news.  Just wondering which I'm keeping an eye out
for.

Other than this minor problem, I sure am liking this NUC.  I'm finding
that contemporary versions of Linux work very nicely, even the Radeon
Vega M GPU.

Thanks again.  I'll patiently keep an eye out for the FW update.  Easily
worked around for the moment.

Cheers
jvbockel at gmail.com

On Fri, Nov 16, 2018 at 7:37 PM Keith Busch <keith.busch@intel.com> wrote:
>
> On Fri, Nov 16, 2018 at 06:17:22PM -0600, John Van Bockel wrote:
> > Hi,
> >
> > I have an Intel NUC8i7HVK mini-computer and a pair of Intel 760p 1TB M.2 NVMe
> > SSD drives.  With Fedora 29 and Ubuntu 18.10 (both 4.18 kernels), the nvme
> > kernel module insists upon disabling one of the two NVMe drives when it
> > notices that both have been assigned the same NVMe Qualified Name (NQN).
> > The one that gets removed by the nvme module is not always the same.
> > Whichever remains enabled performs perfectly.
> >
> > [root@NUCnFutz ~]# dmesg | grep -i nvme
> >
> > [    3.372662] nvme nvme0: pci function 0000:72:00.0
> > [    3.372710] nvme nvme1: pci function 0000:73:00.0
> > [    3.484113]  nvme0n1: p1 p2 p3 p4
> > [    3.584531] nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2017-12.org.nvmexpress:uuid:11111111-2222-3333-4444-555555555555).
> > [    3.584533] nvme nvme1: Removing after probe failure status: -22
>
> This is a firmware bug. The maker is aware and has a fix undergoing
> validation. I am not sure what is gating its release so I've pinged the
> management for an update and will respond as soon as I hear a reply.
>
> Thanks,
> Keith


* Two M.2 NVMe drives with same NQN, one gets removed
  2018-11-17  0:17 John Van Bockel
@ 2018-11-17  1:33 ` Keith Busch
  2018-11-18 19:08   ` John Van Bockel
       [not found]   ` <CAGmwwfc=g+4h12qMBvRVJoE66Z70kzzfkabjyxLC=d7LZp627A@mail.gmail.com>
  0 siblings, 2 replies; 11+ messages in thread
From: Keith Busch @ 2018-11-17  1:33 UTC


On Fri, Nov 16, 2018 at 06:17:22PM -0600, John Van Bockel wrote:
> Hi,
> 
> I have an Intel NUC8i7HVK mini-computer and a pair of Intel 760p 1TB M.2 NVMe
> SSD drives.  With Fedora 29 and Ubuntu 18.10 (both 4.18 kernels), the nvme
> kernel module insists upon disabling one of the two NVMe drives when it
> notices that both have been assigned the same NVMe Qualified Name (NQN).
> The one that gets removed by the nvme module is not always the same.
> Whichever remains enabled performs perfectly.
> 
> [root@NUCnFutz ~]# dmesg | grep -i nvme
> 
> [    3.372662] nvme nvme0: pci function 0000:72:00.0
> [    3.372710] nvme nvme1: pci function 0000:73:00.0
> [    3.484113]  nvme0n1: p1 p2 p3 p4
> [    3.584531] nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2017-12.org.nvmexpress:uuid:11111111-2222-3333-4444-555555555555).
> [    3.584533] nvme nvme1: Removing after probe failure status: -22

This is a firmware bug. The maker is aware and has a fix undergoing
validation. I am not sure what is gating its release so I've pinged the
management for an update and will respond as soon as I hear a reply.

Thanks,
Keith


* Two M.2 NVMe drives with same NQN, one gets removed
@ 2018-11-17  0:17 John Van Bockel
  2018-11-17  1:33 ` Keith Busch
  0 siblings, 1 reply; 11+ messages in thread
From: John Van Bockel @ 2018-11-17  0:17 UTC


Hi,

I have an Intel NUC8i7HVK mini-computer and a pair of Intel 760p 1TB M.2 NVMe
SSD drives.  With Fedora 29 and Ubuntu 18.10 (both 4.18 kernels), the nvme
kernel module insists upon disabling one of the two NVMe drives when it
notices that both have been assigned the same NVMe Qualified Name (NQN).
The one that gets removed by the nvme module is not always the same.
Whichever remains enabled performs perfectly.

[root@NUCnFutz ~]# dmesg | grep -i nvme

[    3.372662] nvme nvme0: pci function 0000:72:00.0
[    3.372710] nvme nvme1: pci function 0000:73:00.0
[    3.484113]  nvme0n1: p1 p2 p3 p4
[    3.584531] nvme nvme1: ignoring ctrl due to duplicate subnqn (nqn.2017-12.org.nvmexpress:uuid:11111111-2222-3333-4444-555555555555).
[    3.584533] nvme nvme1: Removing after probe failure status: -22
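
The log above is consistent with a uniqueness check on the subsystem NQN at
probe time.  The following is a deliberately simplified userspace model of
such a check, not the driver's actual code: the registry structure, its
size limits and the multi_ctrl flag are assumptions for illustration.  On
Linux EINVAL is 22, which matches the "status: -22" in the last line.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_SUBSYS 8		/* arbitrary capacity for this sketch */
#define SKETCH_NQN_SIZE 223	/* assumed NQN buffer size */

struct registry {
	char nqn[MAX_SUBSYS][SKETCH_NQN_SIZE];
	int count;
};

/*
 * Register a controller's subsystem NQN.
 * Returns 0 on success, -EINVAL if the NQN is already registered and the
 * controller does not claim to belong to a multi-controller subsystem,
 * or -ENOSPC if the registry is full.
 */
int register_ctrl(struct registry *r, const char *subnqn, bool multi_ctrl)
{
	int i;

	for (i = 0; i < r->count; i++) {
		if (strcmp(r->nqn[i], subnqn) == 0)
			return multi_ctrl ? 0 : -EINVAL;
	}
	if (r->count == MAX_SUBSYS)
		return -ENOSPC;
	snprintf(r->nqn[r->count], SKETCH_NQN_SIZE, "%s", subnqn);
	r->count++;
	return 0;
}
```

In this model, whichever controller registers first wins and the second is
rejected, which would explain why the drive that gets removed is not always
the same one.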

If I disable one in the UEFI BIOS, the other works perfectly with Linux.  It
does not matter which I disable, the enabled drive works perfectly.  Windows 10
has no trouble performing I/O against both drives when both are enabled.  I've
tried various combinations of settings in the UEFI to see if I could alter the
result of both being given the same NQN.  Firmware versions for both the
NUC8i7HVK and 760p SSD drives are at the latest 51 and 004C versions,
respectively.

With the first M.2 slot enabled in UEFI, "nvme list" & "nvme list-subsys" yield:
/dev/nvme0n1 BTHH81850C8W1P0E INTEL SSDPEKKW010T8 1 1.02 TB / 1.02 TB 512 B + 0 B 004C
nvme-subsys0 - NQN=nqn.2017-12.org.nvmexpress:uuid:11111111-2222-3333-4444-555555555555
+- nvme0 pcie 0000:72:00.0

With the second M.2 slot enabled, "nvme list" & "nvme list-subsys" produce:
/dev/nvme0n1 BTHH81850BX31P0E INTEL SSDPEKKW010T8 1 1.02 TB / 1.02 TB 512 B + 0 B 004C
nvme-subsys0 - NQN=nqn.2017-12.org.nvmexpress:uuid:11111111-2222-3333-4444-555555555555
+- nvme0 pcie 0000:73:00.0

I did have one boot in which both remained enabled.  Slot one was given a device
name of /dev/nvme0n1 with a 0000:72:00.0 path and slot two was given the
/dev/nvme0n2 name with its 0000:73:00.0 path.  I was excited by this result but
was unable to repeat it.  I wish I had thought to see if both continued
to share the same NQN in that one fleeting success.

Intel regards the combination of NUC8i7HVK and 2TB 760p NVMe drives as supported.
Mine are the 1TB version of the 760p drives, so I am deviating ever so slightly.
Also, Intel doesn't claim any responsibility for running Linux on the NUCs and
says to contact the distribution.  The problem seems to be independent of
distribution: I have tried both Fedora 29 (kernel 4.18.16 & 4.18.18) and
Ubuntu 18.10 (4.18.?).

I do not know how the NQN is being assigned to the drives: whether the UEFI
BIOS is to blame, or whether it is instead being assigned by a Linux kernel
module.  If this is instead a defect in the UEFI, please let me know and I'll
move on to communicate with Intel Support.

Thanks
jvbockel at gmail.com


