* [PATCH] nvme-pci: Fix rapid add remove sequence
From: Keith Busch @ 2019-01-24  1:46 UTC


A surprise removal may fail to tear down request queues if it is racing
with the initial asynchronous probe. If that happens, the remove path
won't see the queue resources to tear down, and the controller reset
path may create a new request queue on a removed device, but will not
be able to make forward progress, deadlocking the pci removal.

Protect setting up non-blocking resources from a shutdown by holding the
same mutex, and transition to the CONNECTING state after these resources
are initialized so the probe path may see the dead controller state
before dispatching new IO.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202081
Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/nvme/host/pci.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 9bc585415d9b..022ea1ee63f8 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2557,16 +2557,7 @@ static void nvme_reset_work(struct work_struct *work)
 	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
 		nvme_dev_disable(dev, false);
 
-	/*
-	 * Introduce CONNECTING state from nvme-fc/rdma transports to mark the
-	 * initializing procedure here.
-	 */
-	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) {
-		dev_warn(dev->ctrl.device,
-			"failed to mark controller CONNECTING\n");
-		goto out;
-	}
-
+	mutex_lock(&dev->shutdown_lock);
 	result = nvme_pci_enable(dev);
 	if (result)
 		goto out;
@@ -2585,6 +2576,17 @@ static void nvme_reset_work(struct work_struct *work)
 	 */
 	dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1;
 	dev->ctrl.max_segments = NVME_MAX_SEGS;
+	mutex_unlock(&dev->shutdown_lock);
+
+	/*
+	 * Introduce CONNECTING state from nvme-fc/rdma transports to mark the
+	 * initializing procedure here.
+	 */
+	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) {
+		dev_warn(dev->ctrl.device,
+			"failed to mark controller CONNECTING\n");
+		goto out;
+	}
 
 	result = nvme_init_identify(&dev->ctrl);
 	if (result)
-- 
2.14.4
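
For context on why holding dev->shutdown_lock here is sufficient: the
remove side already runs its teardown under the same lock (nvme_dev_disable()
takes dev->shutdown_lock), and a surprise removal marks the controller dead
before flushing the reset work. A simplified sketch of that ordering
(paraphrased, not a verbatim copy of drivers/nvme/host/pci.c; the *_sketch
names are only for illustration):

static void nvme_dev_disable_sketch(struct nvme_dev *dev, bool shutdown)
{
	mutex_lock(&dev->shutdown_lock);	/* same lock taken by the hunk above */
	/* quiesce and tear down whatever queues exist at this point */
	mutex_unlock(&dev->shutdown_lock);
}

static void nvme_remove_sketch(struct pci_dev *pdev)
{
	struct nvme_dev *dev = pci_get_drvdata(pdev);

	nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING);
	if (!pci_device_is_present(pdev)) {
		/* surprise removal: mark the controller dead ... */
		nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD);
		nvme_dev_disable_sketch(dev, true);
	}
	/*
	 * ... so when the reset work later attempts its CONNECTING
	 * transition, the state change fails and it bails out before
	 * dispatching IO to the missing device.
	 */
	flush_work(&dev->ctrl.reset_work);
	/* remaining teardown elided */
}

With queue setup in nvme_reset_work() serialized on the same mutex, the
disable path either runs first (and the late CONNECTING check stops the
reset) or it sees the fully set-up queues and can tear them down.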


* [PATCH] nvme-pci: Fix rapid add remove sequence
From: Alex_Gagniuc @ 2019-01-30 17:22 UTC


On 1/23/19 7:47 PM, Keith Busch wrote:
> A surprise removal may fail to tear down request queues if it is racing
> with the initial asynchronous probe. If that happens, the remove path
> won't see the queue resources to tear down, and the controller reset
> path may create a new request queue on a removed device, but will not
> be able to make forward progress, deadlocking the pci removal.
> 
> Protect setting up non-blocking resources from a shutdown by holding the
> same mutex, and transition to the CONNECTING state after these resources
> are initialized so the probe path may see the dead controller state
> before dispatching new IO.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=202081
> Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com>
> Signed-off-by: Keith Busch <keith.busch@intel.com>

Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com>

> ---
>   drivers/nvme/host/pci.c | 22 ++++++++++++----------
>   1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 9bc585415d9b..022ea1ee63f8 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2557,16 +2557,7 @@ static void nvme_reset_work(struct work_struct *work)
>   	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
>   		nvme_dev_disable(dev, false);
>   
> -	/*
> -	 * Introduce CONNECTING state from nvme-fc/rdma transports to mark the
> -	 * initializing procedure here.
> -	 */
> -	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) {
> -		dev_warn(dev->ctrl.device,
> -			"failed to mark controller CONNECTING\n");
> -		goto out;
> -	}
> -
> +	mutex_lock(&dev->shutdown_lock);
>   	result = nvme_pci_enable(dev);
>   	if (result)
>   		goto out;
> @@ -2585,6 +2576,17 @@ static void nvme_reset_work(struct work_struct *work)
>   	 */
>   	dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1;
>   	dev->ctrl.max_segments = NVME_MAX_SEGS;
> +	mutex_unlock(&dev->shutdown_lock);
> +
> +	/*
> +	 * Introduce CONNECTING state from nvme-fc/rdma transports to mark the
> +	 * initializing procedure here.
> +	 */
> +	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) {
> +		dev_warn(dev->ctrl.device,
> +			"failed to mark controller CONNECTING\n");
> +		goto out;
> +	}
>   
>   	result = nvme_init_identify(&dev->ctrl);
>   	if (result)
> 


* [PATCH] nvme-pci: Fix rapid add remove sequence
From: Sagi Grimberg @ 2019-01-31 20:54 UTC



> A surprise removal may fail to tear down request queues if it is racing
> with the initial asynchronous probe. If that happens, the remove path
> won't see the queue resources to tear down, and the controller reset
> path may create a new request queue on a removed device, but will not
> be able to make forward progress, deadlocking the pci removal.

Doesn't pci removal flush the reset work before making forward
progress? Perhaps what is needed is for removal to flush it earlier
instead of serializing with the shutdown lock?


* [PATCH] nvme-pci: Fix rapid add remove sequence
From: Keith Busch @ 2019-01-31 21:05 UTC


On Thu, Jan 31, 2019 at 12:54:03PM -0800, Sagi Grimberg wrote:
> 
> > A surprise removal may fail to tear down request queues if it is racing
> > with the initial asynchronous probe. If that happens, the remove path
> > won't see the queue resources to tear down, and the controller reset
> > path may create a new request queue on a removed device, but will not
> > be able to make forward progress, deadlocking the pci removal.
> 
> Doesn't pci removal flush the reset work before making forward
> progress? Perhaps what is needed is for removal to flush it earlier
> instead of serializing with the shutdown lock?

Removal does flush the reset work, but that doesn't help with this
particular issue. It's pretty timing sensitive to trigger.

Before flushing reset on a surprise removal, we do an ungraceful device
teardown first in order to unblock any IO that the reset work is waiting on.

In the case Alex discovered, though, the surprise removal happens
just before the nvme driver has set up the admin and io tagsets, so
removal doesn't find any tagsets to kill and proceeds with flushing
the reset work.

The reset work, though, allocates brand new tagsets right after that,
so they look like they are good to use, and it dispatches an admin
command to a device that's gone.

You might expect nvme_timeout() to trigger 60 seconds later, but we
can't rely on that when the pci channel is not in a normal state. I
wouldn't want to wait 60 seconds either, so the removal task needs to
get things unblocked itself.
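
To make that interleaving concrete, here is a small userspace model of
the race and of how the patch closes it; it uses pthreads and made-up
names (change_state, reset_work, removal), so treat it as an
illustration of the lock/state-machine idea rather than driver code:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified controller states; only the transitions we care about. */
enum ctrl_state { CTRL_NEW, CTRL_CONNECTING, CTRL_LIVE, CTRL_DELETING };

static enum ctrl_state state = CTRL_NEW;
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t shutdown_lock = PTHREAD_MUTEX_INITIALIZER; /* plays dev->shutdown_lock */
static bool queues_allocated;

/* Like nvme_change_ctrl_state(): CONNECTING is refused once we are DELETING. */
static bool change_state(enum ctrl_state new_state)
{
	bool ok;

	pthread_mutex_lock(&state_lock);
	ok = !(new_state == CTRL_CONNECTING && state == CTRL_DELETING);
	if (ok)
		state = new_state;
	pthread_mutex_unlock(&state_lock);
	return ok;
}

/* Models the start of nvme_reset_work() with the patch applied. */
static void *reset_work(void *arg)
{
	(void)arg;

	pthread_mutex_lock(&shutdown_lock);	/* setup serialized with teardown */
	queues_allocated = true;		/* enable device, allocate tagsets */
	pthread_mutex_unlock(&shutdown_lock);

	/* CONNECTING only after setup, so a dead controller is seen here. */
	if (!change_state(CTRL_CONNECTING)) {
		printf("reset: controller going away, bail before dispatching IO\n");
		return NULL;			/* real driver tears down via its out: label */
	}
	printf("reset: safe to dispatch admin commands\n");
	return NULL;
}

/* Models the surprise-removal path racing with the async probe/reset. */
static void *removal(void *arg)
{
	(void)arg;

	change_state(CTRL_DELETING);		/* remove marks the controller first */

	pthread_mutex_lock(&shutdown_lock);	/* the disable/teardown side */
	if (queues_allocated)
		printf("remove: queues exist, killing them\n");
	queues_allocated = false;
	pthread_mutex_unlock(&shutdown_lock);
	return NULL;				/* followed by flushing the reset work */
}

int main(void)
{
	pthread_t r, d;

	pthread_create(&r, NULL, reset_work, NULL);
	pthread_create(&d, NULL, removal, NULL);
	pthread_join(d, NULL);
	pthread_join(r, NULL);
	return 0;
}

However the two threads interleave, the reset side either bails at the
CONNECTING check or completes setup under the lock so the removal side
finds the queues and kills them; in neither case is IO left stuck on a
device that's gone.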


* [PATCH] nvme-pci: Fix rapid add remove sequence
From: Christoph Hellwig @ 2019-02-05  8:51 UTC


Thanks,

applied to nvme-5.0.


Thread overview (5 messages):
2019-01-24  1:46 [PATCH] nvme-pci: Fix rapid add remove sequence Keith Busch
2019-01-30 17:22 ` Alex_Gagniuc
2019-01-31 20:54 ` Sagi Grimberg
2019-01-31 21:05   ` Keith Busch
2019-02-05  8:51 ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.