linux-nvme.lists.infradead.org archive mirror
* [PATCH] nvme: add error message on mismatching controller ids
@ 2019-11-21 17:58 James Smart
  2019-11-21 19:28 ` Ewan D. Milne
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: James Smart @ 2019-11-21 17:58 UTC (permalink / raw)
  To: linux-nvme; +Cc: James Smart

We've seen a few devices that return different controller
id's to the Fabric Connect command vs the Identify(controller)
command. It's currently hard to identify this failure by
existing error messages. It comes across as a (re)connect
attempt in the transport that fails with a -22 (-EINVAL)
status. The issue is compounded by older kernels not having
the controller id check or had the identify command overwrite
the fabrics controller id value before it checked. Both resulted
in cases where the devices appeared fine until more recent kernels.

Clarify the reject by adding an error message on controller
id mismatches.

Signed-off-by: James Smart <jsmart2021@gmail.com>
---
 drivers/nvme/host/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9696404a6182..c272afb084d1 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2824,6 +2824,10 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
 		 * admin connect
 		 */
 		if (ctrl->cntlid != le16_to_cpu(id->cntlid)) {
+			dev_err(ctrl->device,
+				"Mismatching cntlid: Connect %u vs Identify "
+				"%u, rejecting\n",
+				ctrl->cntlid, le16_to_cpu(id->cntlid));
 			ret = -EINVAL;
 			goto out_free;
 		}
-- 
2.13.7
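
For readers following along outside the kernel tree, the check that this
patch annotates can be sketched as a small userspace C fragment. This is an
illustration only, not the kernel code: le16_decode(), check_cntlid() and
demo_cntlid_check() are hypothetical names standing in for le16_to_cpu() and
the comparison inside nvme_init_identify(), and fprintf(stderr, ...) stands
in for dev_err().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the kernel's le16_to_cpu(): decode a
 * little-endian 16-bit field from raw bytes, whatever the host byte order.
 */
static uint16_t le16_decode(const uint8_t b[2])
{
	return (uint16_t)(b[0] | ((uint16_t)b[1] << 8));
}

/*
 * Hypothetical helper mirroring the check in the patch: compare the
 * controller id from the Fabric Connect response (already in host order)
 * with the little-endian cntlid field of the Identify Controller data.
 * Returns 0 on a match and -EINVAL, after logging, on a mismatch.
 */
static int check_cntlid(uint16_t connect_cntlid,
			const uint8_t identify_cntlid_le[2])
{
	uint16_t identify_cntlid = le16_decode(identify_cntlid_le);

	if (connect_cntlid != identify_cntlid) {
		fprintf(stderr,
			"Mismatching cntlid: Connect %u vs Identify %u, rejecting\n",
			(unsigned)connect_cntlid, (unsigned)identify_cntlid);
		return -EINVAL;
	}
	return 0;
}

/* Usage example: one matching and one mismatching Identify payload. */
void demo_cntlid_check(void)
{
	const uint8_t identify_same[2]  = { 0x34, 0x12 }; /* 0x1234 on the wire */
	const uint8_t identify_other[2] = { 0x35, 0x12 }; /* 0x1235 on the wire */

	assert(check_cntlid(0x1234, identify_same) == 0);
	assert(check_cntlid(0x1234, identify_other) == -EINVAL);
}
```

Without the message, the second case would surface only as the -EINVAL
return value, which is the hard-to-diagnose behaviour the commit message
describes.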


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* Re: [PATCH] nvme: add error message on mismatching controller ids
  2019-11-21 17:58 [PATCH] nvme: add error message on mismatching controller ids James Smart
@ 2019-11-21 19:28 ` Ewan D. Milne
  2019-11-21 20:16 ` Chaitanya Kulkarni
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Ewan D. Milne @ 2019-11-21 19:28 UTC (permalink / raw)
  To: linux-nvme

On Thu, 2019-11-21 at 09:58 -0800, James Smart wrote:
> We've seen a few devices that return different controller
> id's to the Fabric Connect command vs the Identify(controller)
> command. It's currently hard to identify this failure by
> existing error messages. It comes across as a (re)connect
> attempt in the transport that fails with a -22 (-EINVAL)
> status. The issue is compounded by older kernels not having
> the controller id check or had the identify command overwrite
> the fabrics controller id value before it checked. Both resulted
> in cases where the devices appeared fine until more recent kernels.
> 
> Clarify the reject by adding an error message on controller
> id mismatches.
> 
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> ---
>  drivers/nvme/host/core.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 9696404a6182..c272afb084d1 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2824,6 +2824,10 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
>  		 * admin connect
>  		 */
>  		if (ctrl->cntlid != le16_to_cpu(id->cntlid)) {
> +			dev_err(ctrl->device,
> +				"Mismatching cntlid: Connect %u vs Identify "
> +				"%u, rejecting\n",
> +				ctrl->cntlid, le16_to_cpu(id->cntlid));
>  			ret = -EINVAL;
>  			goto out_free;
>  		}

Yes please.  More than one storage vendor has run into this.

Reviewed-by: Ewan D. Milne <emilne@redhat.com>



* Re: [PATCH] nvme: add error message on mismatching controller ids
  2019-11-21 17:58 [PATCH] nvme: add error message on mismatching controller ids James Smart
  2019-11-21 19:28 ` Ewan D. Milne
@ 2019-11-21 20:16 ` Chaitanya Kulkarni
  2019-11-22 14:40 ` Hannes Reinecke
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Chaitanya Kulkarni @ 2019-11-21 20:16 UTC (permalink / raw)
  To: James Smart, linux-nvme

Looks good to me. Just one nit with the commit message, which can be
fixed up when the patch is applied.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>

On 11/21/2019 09:58 AM, James Smart wrote:
> We've seen a few devices that return different controller
> id's to the Fabric Connect command vs the Identify(controller)
> command. It's currently hard to identify this failure by
> existing error messages. It comes across as a (re)connect
> attempt in the transport that fails with a -22 (-EINVAL)
> status. The issue is compounded by older kernels not having
> the controller id check or had the identify command overwrite
> the fabrics controller id value before it checked. Both resulted
> in cases where the devices appeared fine until more recent kernels.
>
> Clarify the reject by adding an error message on controller
> id mismatches.

When I applied this patch I found that the commit message has room to be
wrapped at 72 columns (unless the shorter wrapping was intentional).
I've only re-wrapped the lines, without changing the description:

We've seen a few devices that return different controller id's to the
Fabric Connect command vs the Identify(controller) command. It's
currently hard to identify this failure by existing error messages.
It comes across as a (re)connect attempt in the transport that fails
with a -22 (-EINVAL) status. The issue is compounded by older kernels
not having the controller id check or had the identify command
overwrite the fabrics controller id value before it checked. Both
resulted in cases where the devices appeared fine until more recent
kernels.

Clarify the reject by adding an error message on controller id
mismatches.

>
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> ---
>   drivers/nvme/host/core.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 9696404a6182..c272afb084d1 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2824,6 +2824,10 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
>   		 * admin connect
>   		 */
>   		if (ctrl->cntlid != le16_to_cpu(id->cntlid)) {
> +			dev_err(ctrl->device,
> +				"Mismatching cntlid: Connect %u vs Identify "
> +				"%u, rejecting\n",
> +				ctrl->cntlid, le16_to_cpu(id->cntlid));
>   			ret = -EINVAL;
>   			goto out_free;
>   		}
>



* Re: [PATCH] nvme: add error message on mismatching controller ids
  2019-11-21 17:58 [PATCH] nvme: add error message on mismatching controller ids James Smart
  2019-11-21 19:28 ` Ewan D. Milne
  2019-11-21 20:16 ` Chaitanya Kulkarni
@ 2019-11-22 14:40 ` Hannes Reinecke
  2019-11-26 16:51 ` Christoph Hellwig
  2019-11-26 17:39 ` Keith Busch
  4 siblings, 0 replies; 6+ messages in thread
From: Hannes Reinecke @ 2019-11-22 14:40 UTC (permalink / raw)
  To: linux-nvme

On 11/21/19 6:58 PM, James Smart wrote:
> We've seen a few devices that return different controller
> id's to the Fabric Connect command vs the Identify(controller)
> command. It's currently hard to identify this failure by
> existing error messages. It comes across as a (re)connect
> attempt in the transport that fails with a -22 (-EINVAL)
> status. The issue is compounded by older kernels not having
> the controller id check or had the identify command overwrite
> the fabrics controller id value before it checked. Both resulted
> in cases where the devices appeared fine until more recent kernels.
> 
> Clarify the reject by adding an error message on controller
> id mismatches.
> 
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> ---
>  drivers/nvme/host/core.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 9696404a6182..c272afb084d1 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2824,6 +2824,10 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
>  		 * admin connect
>  		 */
>  		if (ctrl->cntlid != le16_to_cpu(id->cntlid)) {
> +			dev_err(ctrl->device,
> +				"Mismatching cntlid: Connect %u vs Identify "
> +				"%u, rejecting\n",
> +				ctrl->cntlid, le16_to_cpu(id->cntlid));
>  			ret = -EINVAL;
>  			goto out_free;
>  		}
> 
Indeed, we've seen them too.

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      Teamlead Storage & Networking
hare@suse.de			                  +49 911 74053 688
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), GF: Felix Imendörffer


* Re: [PATCH] nvme: add error message on mismatching controller ids
  2019-11-21 17:58 [PATCH] nvme: add error message on mismatching controller ids James Smart
                   ` (2 preceding siblings ...)
  2019-11-22 14:40 ` Hannes Reinecke
@ 2019-11-26 16:51 ` Christoph Hellwig
  2019-11-26 17:39 ` Keith Busch
  4 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2019-11-26 16:51 UTC (permalink / raw)
  To: James Smart; +Cc: linux-nvme

Looks good (modulo the commit message nitpicks):

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH] nvme: add error message on mismatching controller ids
  2019-11-21 17:58 [PATCH] nvme: add error message on mismatching controller ids James Smart
                   ` (3 preceding siblings ...)
  2019-11-26 16:51 ` Christoph Hellwig
@ 2019-11-26 17:39 ` Keith Busch
  4 siblings, 0 replies; 6+ messages in thread
From: Keith Busch @ 2019-11-26 17:39 UTC (permalink / raw)
  To: James Smart; +Cc: linux-nvme

Applied to nvme/for-5.5 with the minor word-wrap adjustment on the
commit log.
