* [PATCHv2] NVMe: Fix reset/remove race
@ 2016-03-22 21:49 Keith Busch
  2016-03-23  7:44 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Keith Busch @ 2016-03-22 21:49 UTC (permalink / raw)


This fixes a scenario where the device is present and being reset, but a
request to unbind the driver occurs.

A previous patch series addressing a device failure removal scenario
flushed reset_work after controller disable to unblock reset_work waiting
on a completion that wouldn't occur. This isn't safe as-is. The broken
scenario can potentially be induced with:

  modprobe nvme && modprobe -r nvme

To fix, the reset work is flushed immediately after setting the controller
removing flag, and any subsequent reset will not proceed with controller
initialization if the flag is set.

The controller status must be polled while active, so the watchdog timer
is also left active until the controller is disabled to clean up requests
that may be stuck during namespace removal.

[Fixes: ff23a2a15a2117245b4599c1352343c8b8fb4c43]
Signed-off-by: Keith Busch <keith.busch at intel.com>
---
v1->v2:
  Removed the untested change on IO timeout handling that skipped queueing
  reset work.

 drivers/nvme/host/pci.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 24ccda3..660ec84 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1859,6 +1859,9 @@ static void nvme_reset_work(struct work_struct *work)
 	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
 		nvme_dev_disable(dev, false);
 
+	if (test_bit(NVME_CTRL_REMOVING, &dev->flags))
+		goto out;
+
 	set_bit(NVME_CTRL_RESETTING, &dev->flags);
 
 	result = nvme_pci_enable(dev);
@@ -2078,11 +2081,10 @@ static void nvme_remove(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
 
-	del_timer_sync(&dev->watchdog_timer);
-
 	set_bit(NVME_CTRL_REMOVING, &dev->flags);
 	pci_set_drvdata(pdev, NULL);
 	flush_work(&dev->async_work);
+	flush_work(&dev->reset_work);
 	flush_work(&dev->scan_work);
 	nvme_remove_namespaces(&dev->ctrl);
 	nvme_uninit_ctrl(&dev->ctrl);
-- 
2.7.2
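
The ordering the patch relies on can be sketched outside the kernel. The following is a minimal single-threaded userspace model, not the driver code: `fake_dev`, `fake_reset_work`, `fake_remove`, the `FAKE_*` bits, and the `reset_pending` field are all hypothetical names invented for illustration, and "flushing" is modelled simply by running any pending reset to completion. It only shows that once the removing flag is set before the flush, a reset that was already queued still disables the controller but skips re-initialization.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's flag bits; not the kernel's values. */
#define FAKE_REMOVING  (1u << 0)
#define FAKE_RESETTING (1u << 1)

struct fake_dev {
	atomic_uint flags;
	bool enabled;        /* models NVME_CC_ENABLE being set */
	bool reset_pending;  /* models reset_work sitting on a workqueue */
	int init_count;      /* counts completed re-initializations */
};

/* Models nvme_reset_work() after the patch. */
void fake_reset_work(struct fake_dev *dev)
{
	if (dev->enabled)
		dev->enabled = false;   /* stands in for nvme_dev_disable() */

	if (atomic_load(&dev->flags) & FAKE_REMOVING)
		return;                 /* the patch's "goto out" */

	atomic_fetch_or(&dev->flags, FAKE_RESETTING);
	dev->enabled = true;            /* stands in for nvme_pci_enable() etc. */
	dev->init_count++;
	atomic_fetch_and(&dev->flags, ~FAKE_RESETTING);
}

/* Models the start of nvme_remove() after the patch. */
void fake_remove(struct fake_dev *dev)
{
	/* Set the flag first so any reset that runs from here on bails out. */
	atomic_fetch_or(&dev->flags, FAKE_REMOVING);

	/* Stands in for flush_work(&dev->reset_work): run pending work now. */
	if (dev->reset_pending) {
		dev->reset_pending = false;
		fake_reset_work(dev);
	}
	dev->enabled = false;  /* assumed: later teardown disables the controller */
}
```

Calling `fake_remove()` on a device with `reset_pending` set leaves `init_count` at 0, whereas calling `fake_reset_work()` on a device that is not being removed increments it. A real workqueue adds concurrency that this model deliberately leaves out.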


* [PATCHv2] NVMe: Fix reset/remove race
From: Christoph Hellwig @ 2016-03-23  7:44 UTC (permalink / raw)


This looks ok:

Reviewed-by: Christoph Hellwig <hch at lst.de>

But this just shows we'll finally need a real state machine for the
NVMe controller instead of the flags.  Sagi did a really nice one
for the Fabrics driver, and I'll plan to lift it to common code for
4.7.


* [PATCHv2] NVMe: Fix reset/remove race
From: Johannes Thumshirn @ 2016-03-23  7:56 UTC (permalink / raw)


On Tuesday, 22 March 2016 15:49:35 CET Keith Busch wrote:
> This fixes a scenario where device is present and being reset, but a
> request to unbind the driver occurs.
> 
> A previous patch series addressing a device failure removal scenario
> flushed reset_work after controller disable to unblock reset_work waiting
> on a completion that wouldn't occur. This isn't safe as-is. The broken
> scenario can potentially be induced with:
> 
>   modprobe nvme && modprobe -r nvme
> 
> To fix, the reset work is flushed immediately after setting the controller
> removing flag, and any subsequent reset will not proceed with controller
> initialization if the flag is set.
> 
> The controller status must be polled while active, so the watchdog timer
> is also left active until the controller is disabled to cleanup requests
> that may be stuck during namespace removal.
> 
> [Fixes: ff23a2a15a2117245b4599c1352343c8b8fb4c43]
> Signed-off-by: Keith Busch <keith.busch at intel.com>
> ---
> v1->v2:
>   Removed the untested change on IO timeout handling that skipped queueing
>   reset work.
> 
>  drivers/nvme/host/pci.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 24ccda3..660ec84 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1859,6 +1859,9 @@ static void nvme_reset_work(struct work_struct *work)
>  	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
>  		nvme_dev_disable(dev, false);
> 
> +	if (test_bit(NVME_CTRL_REMOVING, &dev->flags))
> +		goto out;
> +
>  	set_bit(NVME_CTRL_RESETTING, &dev->flags);
> 
>  	result = nvme_pci_enable(dev);
> @@ -2078,11 +2081,10 @@ static void nvme_remove(struct pci_dev *pdev)
>  {
>  	struct nvme_dev *dev = pci_get_drvdata(pdev);
> 
> -	del_timer_sync(&dev->watchdog_timer);
> -
>  	set_bit(NVME_CTRL_REMOVING, &dev->flags);
>  	pci_set_drvdata(pdev, NULL);
>  	flush_work(&dev->async_work);
> +	flush_work(&dev->reset_work);
>  	flush_work(&dev->scan_work);
>  	nvme_remove_namespaces(&dev->ctrl);
>  	nvme_uninit_ctrl(&dev->ctrl);

Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de>

-- 
Johannes Thumshirn                                          Storage
jthumshirn at suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


* [PATCHv2] NVMe: Fix reset/remove race
From: sagig @ 2016-04-03 16:38 UTC (permalink / raw)




On 22/03/16 23:49, Keith Busch wrote:
> This fixes a scenario where device is present and being reset, but a
> request to unbind the driver occurs.
>
> A previous patch series addressing a device failure removal scenario
> flushed reset_work after controller disable to unblock reset_work waiting
> on a completion that wouldn't occur. This isn't safe as-is. The broken
> scenario can potentially be induced with:
>
>    modprobe nvme && modprobe -r nvme
>
> To fix, the reset work is flushed immediately after setting the controller
> removing flag, and any subsequent reset will not proceed with controller
> initialization if the flag is set.
>
> The controller status must be polled while active, so the watchdog timer
> is also left active until the controller is disabled to cleanup requests
> that may be stuck during namespace removal.
>
> [Fixes: ff23a2a15a2117245b4599c1352343c8b8fb4c43]
> Signed-off-by: Keith Busch <keith.busch at intel.com>
> ---
> v1->v2:
>    Removed the untested change on IO timeout handling that skipped queueing
>    reset work.
>
>   drivers/nvme/host/pci.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 24ccda3..660ec84 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1859,6 +1859,9 @@ static void nvme_reset_work(struct work_struct *work)
>   	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
>   		nvme_dev_disable(dev, false);
>   
> +	if (test_bit(NVME_CTRL_REMOVING, &dev->flags))
> +		goto out;
> +
>   	set_bit(NVME_CTRL_RESETTING, &dev->flags);
>   
>   	result = nvme_pci_enable(dev);
> @@ -2078,11 +2081,10 @@ static void nvme_remove(struct pci_dev *pdev)
>   {
>   	struct nvme_dev *dev = pci_get_drvdata(pdev);
>   
> -	del_timer_sync(&dev->watchdog_timer);
> -
>   	set_bit(NVME_CTRL_REMOVING, &dev->flags);
>   	pci_set_drvdata(pdev, NULL);
>   	flush_work(&dev->async_work);
> +	flush_work(&dev->reset_work);

Do we need the same for scan_work? AFAICT it can still
sneak in while we're removing...


* [PATCHv2] NVMe: Fix reset/remove race
From: Keith Busch @ 2016-04-04 19:14 UTC (permalink / raw)


On Sun, Apr 03, 2016 at 07:38:46PM +0300, sagig wrote:
> On 22/03/16 23:49, Keith Busch wrote:
> >  drivers/nvme/host/pci.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> >diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> >index 24ccda3..660ec84 100644
> >--- a/drivers/nvme/host/pci.c
> >+++ b/drivers/nvme/host/pci.c
> >@@ -1859,6 +1859,9 @@ static void nvme_reset_work(struct work_struct *work)
> >  	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
> >  		nvme_dev_disable(dev, false);
> >+	if (test_bit(NVME_CTRL_REMOVING, &dev->flags))
> >+		goto out;
> >+
> >  	set_bit(NVME_CTRL_RESETTING, &dev->flags);
> >  	result = nvme_pci_enable(dev);
> >@@ -2078,11 +2081,10 @@ static void nvme_remove(struct pci_dev *pdev)
> >  {
> >  	struct nvme_dev *dev = pci_get_drvdata(pdev);
> >-	del_timer_sync(&dev->watchdog_timer);
> >-
> >  	set_bit(NVME_CTRL_REMOVING, &dev->flags);
> >  	pci_set_drvdata(pdev, NULL);
> >  	flush_work(&dev->async_work);
> >+	flush_work(&dev->reset_work);
> 
> Do we need the same for scan_work? AFAICT it can still
> sneak in while we're removing...

It is flushed in the very next line not included in your reply. :)

scan_work isn't queued again when the "NVME_CTRL_REMOVING" flag is set,
so we're safe from seeing that queue again.
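
As a rough illustration of that guard, here is a userspace sketch under stated assumptions: `sketch_ctrl`, `sketch_queue_scan`, `sketch_remove`, and `SKETCH_REMOVING` are hypothetical names, and a boolean stands in for the work item being on a queue. It is not the driver's queueing code and, being single-threaded, says nothing about check-then-queue races; it only shows why flushing once after setting the flag is enough when every queueing site checks the flag.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit standing in for NVME_CTRL_REMOVING. */
#define SKETCH_REMOVING (1u << 0)

struct sketch_ctrl {
	atomic_uint flags;
	bool scan_queued;  /* models scan_work being on a workqueue */
};

/* Returns true if the scan was queued, false if refused during removal. */
bool sketch_queue_scan(struct sketch_ctrl *ctrl)
{
	if (atomic_load(&ctrl->flags) & SKETCH_REMOVING)
		return false;
	ctrl->scan_queued = true;
	return true;
}

/* Models the remove path: set the flag, then flush what is already queued. */
void sketch_remove(struct sketch_ctrl *ctrl)
{
	atomic_fetch_or(&ctrl->flags, SKETCH_REMOVING);
	ctrl->scan_queued = false;  /* stands in for flush_work(&dev->scan_work) */
}
```

After `sketch_remove()` returns, any later `sketch_queue_scan()` call is refused, so the queue stays empty for the rest of teardown.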

