* [PATCH] NVMe: Unbind driver on failure
@ 2016-03-28 21:42 Keith Busch
  2016-03-29  7:03 ` Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Keith Busch @ 2016-03-28 21:42 UTC


Instead of removing the PCI device from the kernel's topology on
controller failure, this patch simply requests unbinding the device
from the driver. This avoids running PCI removal concurrently with the
hot plug event, which has been reported to be problematic when multiple
surprise events occur nearly simultaneously.

The other benefit is that we will have PCI config and memory space
available to poke around for debugging a failed controller, assuming
the device was not physically removed.

The downside occurs if the platform and/or kernel do not support any
type of surprise hot removal. The device will remain visible through
sysfs (and therefore lspci), and some manual work is necessary to get
the logical topology corrected. But if your platform and/or kernel don't
support surprise removal, you probably shouldn't be doing that anyway.

Signed-off-by: Keith Busch <keith.busch at intel.com>
---
 drivers/nvme/host/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 660ec84..5acd6e4 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1916,7 +1916,7 @@ static void nvme_remove_dead_ctrl_work(struct work_struct *work)
 
 	nvme_kill_queues(&dev->ctrl);
 	if (pci_get_drvdata(pdev))
-		pci_stop_and_remove_bus_device_locked(pdev);
+		device_release_driver(&pdev->dev);
 	nvme_put_ctrl(&dev->ctrl);
 }
 
-- 
2.7.2
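
For readers who want to try the same two operations from userspace, here
is a minimal sketch, not part of the patch. It assumes the standard PCI
sysfs attributes: drivers/<driver>/unbind (the userspace way to request a
driver unbind, roughly what the patch now asks for), devices/<BDF>/remove
(the "manual work" needed to drop a stale device from the logical
topology), and the first four bytes of devices/<BDF>/config (vendor and
device ID, little-endian). All function names and the sysfs_root parameter
are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Write a short string to a sysfs-style attribute. Returns 0 or -errno. */
int sketch_write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ret = 0;

	if (!f)
		return -errno;
	if (fputs(value, f) == EOF)
		ret = -EIO;
	if (fclose(f) == EOF && ret == 0)
		ret = -errno;
	return ret;
}

/* Request that driver `drv` be unbound from PCI device `bdf`.
 * The device itself stays in the topology (and in lspci). */
int sketch_pci_unbind(const char *sysfs_root, const char *drv, const char *bdf)
{
	char path[256];
	int n = snprintf(path, sizeof(path), "%s/bus/pci/drivers/%s/unbind",
			 sysfs_root, drv);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;
	return sketch_write_attr(path, bdf);
}

/* Manually remove PCI device `bdf` from the logical topology. */
int sketch_pci_remove(const char *sysfs_root, const char *bdf)
{
	char path[256];
	int n = snprintf(path, sizeof(path), "%s/bus/pci/devices/%s/remove",
			 sysfs_root, bdf);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;
	return sketch_write_attr(path, "1");
}

/* Decode vendor and device ID from the first four config-space bytes. */
void sketch_parse_ids(const unsigned char hdr[4], uint16_t *vendor,
		      uint16_t *device)
{
	*vendor = (uint16_t)(hdr[0] | (hdr[1] << 8));
	*device = (uint16_t)(hdr[2] | (hdr[3] << 8));
}
```

On a real system one would call sketch_pci_unbind("/sys", "nvme", bdf)
with the failed controller's address in domain:bus:device.function form,
and sketch_pci_remove("/sys", bdf) only if the device should also
disappear from sysfs. Both writes need root. Passing the sysfs root as a
parameter keeps the helpers easy to exercise against a scratch directory.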


* [PATCH] NVMe: Unbind driver on failure
From: Christoph Hellwig @ 2016-03-29  7:03 UTC


This looks reasonable:

Reviewed-by: Christoph Hellwig <hch at lst.de>


* [PATCH] NVMe: Unbind driver on failure
From: sagig @ 2016-04-03 16:33 UTC


Looks good,

Reviewed-by: Sagi Grimberg <sagi at grimberg.me>


* [PATCH] NVMe: Unbind driver on failure
From: Keith Busch @ 2016-04-26 20:15 UTC


ping

On Mon, Mar 28, 2016 at 03:42:39PM -0600, Keith Busch wrote:
> Instead of removing the PCI device from the kernel's topology on
> controller failure, this patch simply requests unbinding the device
> from the driver. This avoids running PCI removal concurrently with the
> hot plug event, which has been reported to be problematic when multiple
> surprise events occur nearly simultaneously.
> 
> The other benefit is that we will have PCI config and memory space
> available to poke around for debugging a failed controller, assuming
> the device was not physically removed.
> 
> The downside occurs if the platform and/or kernel do not support any
> type of surprise hot removal. The device will remain visible through
> sysfs (and therefore lspci), and some manual work is necessary to get
> the logical topology corrected. But if your platform and/or kernel don't
> support surprise removal, you probably shouldn't be doing that anyway.
> 
> Signed-off-by: Keith Busch <keith.busch at intel.com>
> ---
>  drivers/nvme/host/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 660ec84..5acd6e4 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1916,7 +1916,7 @@ static void nvme_remove_dead_ctrl_work(struct work_struct *work)
>  
>  	nvme_kill_queues(&dev->ctrl);
>  	if (pci_get_drvdata(pdev))
> -		pci_stop_and_remove_bus_device_locked(pdev);
> +		device_release_driver(&pdev->dev);
>  	nvme_put_ctrl(&dev->ctrl);
>  }
>  
> -- 
> 2.7.2
