* [PATCH] nvme: fix possible hang when ns scanning fails during error recovery
From: Sagi Grimberg @ 2020-05-06 22:44 UTC
  To: linux-nvme, Christoph Hellwig, Keith Busch; +Cc: Anton Eidelman

When the controller is reconnecting, the host fails I/O and admin
commands as the host cannot reach the controller. ns scanning may
revalidate namespaces during that period and it is wrong to remove
namespaces due to these failures as we may hang (see 205da2434301).

One command that may fail is nvme_identify_ns_descs. Because the ns
identification descriptor list is optional, we return success even when
that command fails, go on to compare ns identifiers in
nvme_revalidate_disk, find a mismatch, and return -ENODEV to
nvme_validate_ns, which then removes the namespace.

Exactly what we don't want to happen.

Fixes: 22802bf742c2 ("nvme: Namepace identification descriptor list is optional")
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 2df6eb4dfe5c..fd81115edb82 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1125,7 +1125,7 @@ static int nvme_identify_ns_descs(struct nvme_ctrl *ctrl, unsigned nsid,
 		  * Don't treat an error as fatal, as we potentially already
 		  * have a NGUID or EUI-64.
 		  */
-		if (status > 0)
+		if (status > 0 && !(status & NVME_SC_DNR))
 			status = 0;
 		goto free_data;
 	}
-- 
2.20.1
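
For readers following the one-line change, here is a minimal standalone
sketch of the patched condition, written outside the kernel. The helper
name nvme_ns_descs_filter_status is hypothetical and exists only for
illustration; the constant values are assumed to match
include/linux/nvme.h. It relies on the usual convention that a positive
status is an NVMe status code and a negative one is an errno.

```c
/* Values assumed to match include/linux/nvme.h. */
#define NVME_SC_INVALID_FIELD 0x2    /* example NVMe status code */
#define NVME_SC_DNR           0x4000 /* "Do Not Retry" bit in the status */

/*
 * Hypothetical helper mirroring the patched check in
 * nvme_identify_ns_descs():
 *   - positive status without DNR: treated as non-fatal, cleared to 0
 *   - positive status with DNR:    passed through to the caller
 *   - zero or negative (errno):    passed through unchanged
 */
int nvme_ns_descs_filter_status(int status)
{
	if (status > 0 && !(status & NVME_SC_DNR))
		status = 0;
	return status;
}
```

Before the patch, the condition was just "status > 0", so every NVMe
status code, with or without DNR, was cleared to 0 at this point.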


_______________________________________________
linux-nvme mailing list
linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

* Re: [PATCH] nvme: fix possible hang when ns scanning fails during error recovery
From: Keith Busch @ 2020-05-08 14:38 UTC
  To: Sagi Grimberg; +Cc: Anton Eidelman, Christoph Hellwig, linux-nvme

On Wed, May 06, 2020 at 03:44:02PM -0700, Sagi Grimberg wrote:
> When the controller is reconnecting, the host fails I/O and admin
> commands as the host cannot reach the controller. ns scanning may
> revalidate namespaces during that period and it is wrong to remove
> namespaces due to these failures as we may hang (see 205da2434301).
> 
> One command that may fail is nvme_identify_ns_descs. Because the ns
> identification descriptor list is optional, we return success even when
> that command fails, go on to compare ns identifiers in
> nvme_revalidate_disk, find a mismatch, and return -ENODEV to
> nvme_validate_ns, which then removes the namespace.
> 
> Exactly what we don't want to happen.

Looks fine,

Reviewed-by: Keith Busch <kbusch@kernel.org>

