All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] NVMe: Lock out shutdown during pci init
@ 2016-02-11 23:00 Keith Busch
  2016-02-11 23:00 ` [PATCH 2/2] NVMe: NULL pointer check for admin queues Keith Busch
  2016-02-13  9:58 ` [PATCH 1/2] NVMe: Lock out shutdown during pci init Christoph Hellwig
  0 siblings, 2 replies; 8+ messages in thread
From: Keith Busch @ 2016-02-11 23:00 UTC (permalink / raw)


An unexpected driver unbind needs to be prevented from unmapping the
pci nvme registers while they're being initialized.

Signed-off-by: Keith Busch <keith.busch at intel.com>
---
 drivers/nvme/host/pci.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 1b0e447..24a7972 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1927,13 +1927,19 @@ static void nvme_reset_work(struct work_struct *work)
 
 	set_bit(NVME_CTRL_RESETTING, &dev->flags);
 
+	/*
+	 * Lock out shutdown so an unexpected driver removal won't unmap
+	 * registers while reset work is setting them up.
+	 */
+	mutex_lock(&dev->shutdown_lock);
 	result = nvme_dev_map(dev);
 	if (result)
-		goto out;
+		goto out_unlock;
 
 	result = nvme_configure_admin_queue(dev);
 	if (result)
-		goto out;
+		goto out_unlock;
+	mutex_unlock(&dev->shutdown_lock);
 
 	nvme_init_queue(dev->queues[0], 0);
 	result = nvme_alloc_admin_tags(dev);
@@ -1970,6 +1976,8 @@ static void nvme_reset_work(struct work_struct *work)
 	clear_bit(NVME_CTRL_RESETTING, &dev->flags);
 	return;
 
+ out_unlock:
+	mutex_unlock(&dev->shutdown_lock);
  out:
 	nvme_remove_dead_ctrl(dev, result);
 }
-- 
2.6.2.307.g37023ba

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] NVMe: NULL pointer check for admin queues
  2016-02-11 23:00 [PATCH 1/2] NVMe: Lock out shutdown during pci init Keith Busch
@ 2016-02-11 23:00 ` Keith Busch
  2016-02-13  9:59   ` Christoph Hellwig
  2016-02-22  8:44   ` Sagi Grimberg
  2016-02-13  9:58 ` [PATCH 1/2] NVMe: Lock out shutdown during pci init Christoph Hellwig
  1 sibling, 2 replies; 8+ messages in thread
From: Keith Busch @ 2016-02-11 23:00 UTC (permalink / raw)


Admin commands don't have a namespace, so don't check the state when
there isn't one.

An admin command that does happen to make it to a disabled queue can
failed immediatly: Initialization commands should not be requeued
across resets as they would mess up the controller if/when it is
reinitialized. If a user command made hits this condition, it should
be able to handle this error as that's the existing result for commands
when the controller is reset with outstanding admin commands.

Signed-off-by: Keith Busch <keith.busch at intel.com>
---
 drivers/nvme/host/pci.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 24a7972..44ccd23 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -678,7 +678,12 @@ static int nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	spin_lock_irq(&nvmeq->q_lock);
 	if (unlikely(nvmeq->cq_vector < 0)) {
-		ret = test_bit(NVME_NS_DEAD, &ns->flags) ?
+		/*
+		 * Admin commands need to fail immediatly when device is
+		 * disabled: initialization commands should not requeue
+		 * across resets.
+		 */
+		ret = !ns || test_bit(NVME_NS_DEAD, &ns->flags) ?
 					BLK_MQ_RQ_QUEUE_ERROR :
 					BLK_MQ_RQ_QUEUE_BUSY;
 		spin_unlock_irq(&nvmeq->q_lock);
-- 
2.6.2.307.g37023ba

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 1/2] NVMe: Lock out shutdown during pci init
  2016-02-11 23:00 [PATCH 1/2] NVMe: Lock out shutdown during pci init Keith Busch
  2016-02-11 23:00 ` [PATCH 2/2] NVMe: NULL pointer check for admin queues Keith Busch
@ 2016-02-13  9:58 ` Christoph Hellwig
  2016-02-16 18:40   ` Keith Busch
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2016-02-13  9:58 UTC (permalink / raw)


On Thu, Feb 11, 2016@04:00:25PM -0700, Keith Busch wrote:
> An unexpected driver unbind needs to be prevented from unmapping the
> pci nvme registers while they're being initialized.
> 
> Signed-off-by: Keith Busch <keith.busch at intel.com>
> ---
>  drivers/nvme/host/pci.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 1b0e447..24a7972 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1927,13 +1927,19 @@ static void nvme_reset_work(struct work_struct *work)
>  
>  	set_bit(NVME_CTRL_RESETTING, &dev->flags);
>  
> +	/*
> +	 * Lock out shutdown so an unexpected driver removal won't unmap
> +	 * registers while reset work is setting them up.
> +	 */
> +	mutex_lock(&dev->shutdown_lock);
>  	result = nvme_dev_map(dev);
>  	if (result)
> -		goto out;
> +		goto out_unlock;
>  
>  	result = nvme_configure_admin_queue(dev);
>  	if (result)
> -		goto out;
> +		goto out_unlock;
> +	mutex_unlock(&dev->shutdown_lock);

Seems like we should be holding the lock over the call
to nvme_dev_disable to get protection for the whole execution,
e.g. by adding a __nvme_dev_disable variant that expects the lock
to be held.

Maybe it's also worth renaming shutdown_lock to register_mutex
or similar to better document what it protects.

>  
>  	nvme_init_queue(dev->queues[0], 0);
>  	result = nvme_alloc_admin_tags(dev);
> @@ -1970,6 +1976,8 @@ static void nvme_reset_work(struct work_struct *work)
>  	clear_bit(NVME_CTRL_RESETTING, &dev->flags);
>  	return;
>  
> + out_unlock:
> +	mutex_unlock(&dev->shutdown_lock);
>   out:
>  	nvme_remove_dead_ctrl(dev, result);
>  }
> -- 
> 2.6.2.307.g37023ba
> 
> 
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme
---end quoted text---

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 2/2] NVMe: NULL pointer check for admin queues
  2016-02-11 23:00 ` [PATCH 2/2] NVMe: NULL pointer check for admin queues Keith Busch
@ 2016-02-13  9:59   ` Christoph Hellwig
  2016-02-22  8:44   ` Sagi Grimberg
  1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2016-02-13  9:59 UTC (permalink / raw)


On Thu, Feb 11, 2016@04:00:26PM -0700, Keith Busch wrote:
> Admin commands don't have a namespace, so don't check the state when
> there isn't one.
> 
> An admin command that does happen to make it to a disabled queue can
> failed immediatly: Initialization commands should not be requeued
> across resets as they would mess up the controller if/when it is
> reinitialized. If a user command made hits this condition, it should
> be able to handle this error as that's the existing result for commands
> when the controller is reset with outstanding admin commands.

Looks fine, but I'd say fold this directly into the respin of the patch
introducing the check.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] NVMe: Lock out shutdown during pci init
  2016-02-13  9:58 ` [PATCH 1/2] NVMe: Lock out shutdown during pci init Christoph Hellwig
@ 2016-02-16 18:40   ` Keith Busch
  2016-02-17  7:59     ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Keith Busch @ 2016-02-16 18:40 UTC (permalink / raw)


On Sat, Feb 13, 2016@01:58:51AM -0800, Christoph Hellwig wrote:
> Seems like we should be holding the lock over the call
> to nvme_dev_disable to get protection for the whole execution,
> e.g. by adding a __nvme_dev_disable variant that expects the lock
> to be held.
> 
> Maybe it's also worth renaming shutdown_lock to register_mutex
> or similar to better document what it protects.

Perhaps we shouldn't unmap the registers on a reset. That'd fix several
problems, including this one and writing queue doorbells.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] NVMe: Lock out shutdown during pci init
  2016-02-16 18:40   ` Keith Busch
@ 2016-02-17  7:59     ` Christoph Hellwig
  2016-02-17 13:40       ` Keith Busch
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2016-02-17  7:59 UTC (permalink / raw)


On Tue, Feb 16, 2016@06:40:47PM +0000, Keith Busch wrote:
> On Sat, Feb 13, 2016@01:58:51AM -0800, Christoph Hellwig wrote:
> > Seems like we should be holding the lock over the call
> > to nvme_dev_disable to get protection for the whole execution,
> > e.g. by adding a __nvme_dev_disable variant that expects the lock
> > to be held.
> > 
> > Maybe it's also worth renaming shutdown_lock to register_mutex
> > or similar to better document what it protects.
> 
> Perhaps we shouldn't unmap the registers on a reset. That'd fix several
> problems, including this one and writing queue doorbells.

I always thought there was a deep reason we'd need to unmap it.  But
if there isn't getting rid of the unmap should make the driver a lot
more robust.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] NVMe: Lock out shutdown during pci init
  2016-02-17  7:59     ` Christoph Hellwig
@ 2016-02-17 13:40       ` Keith Busch
  0 siblings, 0 replies; 8+ messages in thread
From: Keith Busch @ 2016-02-17 13:40 UTC (permalink / raw)


On Tue, Feb 16, 2016@11:59:13PM -0800, Christoph Hellwig wrote:
> I always thought there was a deep reason we'd need to unmap it.  But
> if there isn't getting rid of the unmap should make the driver a lot
> more robust.

Remapping after each reset was prepping for a non-existent feature.
We'd like to create domains safe for the PCI core to reassign bus numbers
and memory windows on a live system for various hot plug scenarios,
but I don't see that happening soon.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 2/2] NVMe: NULL pointer check for admin queues
  2016-02-11 23:00 ` [PATCH 2/2] NVMe: NULL pointer check for admin queues Keith Busch
  2016-02-13  9:59   ` Christoph Hellwig
@ 2016-02-22  8:44   ` Sagi Grimberg
  1 sibling, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2016-02-22  8:44 UTC (permalink / raw)


Looks good,

Reviewed-by: Sagi Grimberg <sagig at mellanox.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-02-22  8:44 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-11 23:00 [PATCH 1/2] NVMe: Lock out shutdown during pci init Keith Busch
2016-02-11 23:00 ` [PATCH 2/2] NVMe: NULL pointer check for admin queues Keith Busch
2016-02-13  9:59   ` Christoph Hellwig
2016-02-22  8:44   ` Sagi Grimberg
2016-02-13  9:58 ` [PATCH 1/2] NVMe: Lock out shutdown during pci init Christoph Hellwig
2016-02-16 18:40   ` Keith Busch
2016-02-17  7:59     ` Christoph Hellwig
2016-02-17 13:40       ` Keith Busch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.