* [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
@ 2018-01-18 10:10 Jianchao Wang
  2018-01-18 10:10 ` [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure Jianchao Wang
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Jianchao Wang @ 2018-01-18 10:10 UTC (permalink / raw)
  To: keith.busch, axboe, hch, sagi, maxg, james.smart; +Cc: linux-nvme, linux-kernel

Hello

Please consider the following scenario.
nvme_reset_ctrl
  -> set state to RESETTING
  -> queue reset_work       
    (scheduling)
nvme_reset_work
  -> nvme_dev_disable
    -> quiesce queues
    -> nvme_cancel_request 
       on outstanding requests
-------------------------------_boundary_
  -> nvme initializing (issue request on adminq)

Before the _boundary_, we not only quiesce the queues but also cancel
all the outstanding requests.

A request could expire while the ctrl state is RESETTING.
 - If the timeout occurs before the _boundary_, the expired requests
   are from the previous work.
 - Otherwise, the expired requests are from the controller initializing
   procedure, such as sending cq/sq create commands to the adminq to set
   up the io queues.
In the current implementation, nvme_timeout cannot identify the _boundary_,
so it only handles the second case above.
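
For reference, the check in nvme_timeout before this series looks
roughly like this (reconstructed from the hunk in patch 2/2):

	if (dev->ctrl.state == NVME_CTRL_RESETTING) {
		dev_warn(dev->ctrl.device,
			 "I/O %d QID %d timeout, disable controller\n",
			 req->tag, nvmeq->qid);
		nvme_dev_disable(dev, false);
		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
		return BLK_EH_HANDLED;
	}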

In fact, after Sagi's commit (nvme-rdma: fix concurrent reset and
reconnect), both nvme-fc/rdma have the following pattern:
RESETTING    - quiesce blk-mq queues, teardown and delete queues/
               connections, clear out outstanding IO requests...
RECONNECTING - establish new queues/connections and some other
               initializing things.
Introduce RECONNECTING to the nvme-pci transport to mark the same point.
Then we get a coherent state definition among the nvme pci/rdma/fc
transports, and nvme_timeout can identify the _boundary_.
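
With that state in place, the timeout handler can tell the two cases
apart by branching on the controller state, roughly like this (a
condensed sketch of the idea; the actual change is in patch 2/2):

	switch (dev->ctrl.state) {
	case NVME_CTRL_RESETTING:
		/* the expired request is from the previous work */
		nvme_req(req)->status = NVME_SC_ABORT_REQ;
		return BLK_EH_HANDLED;
	case NVME_CTRL_RECONNECTING:
		/* the expired request is from the initializing procedure */
		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
		return BLK_EH_HANDLED;
	default:
		break;
	}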

V5:
 - discard RESET_PREPARE and introduce RESETTING into nvme-pci
 - change the 1st patch's name and comment
 - other misc changes

V4:
 - rebase patches on Jens' for-next
 - let RESETTING equal to RECONNECTING in terms of work procedure
 - change the 1st patch's name and comment
 - other misc changes

V3:
 - fix wrong reference in loop.c
 - other misc changes

V2:
 - split NVME_CTRL_RESETTING into NVME_CTRL_RESET_PREPARE and
   NVME_CTRL_RESETTING. Introduce new patch based on this.
 - distinguish the requests based on the new state in nvme_timeout
 - change comments of patch

drivers/nvme/host/core.c |  2 +-
drivers/nvme/host/pci.c  | 43 ++++++++++++++++++++++++++++++++-----------
2 files changed, 33 insertions(+), 12 deletions(-)

Thanks
Jianchao


* [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure
  2018-01-18 10:10 [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing Jianchao Wang
@ 2018-01-18 10:10 ` Jianchao Wang
  2018-01-18 10:17   ` Max Gurtovoy
  2018-01-18 15:23   ` James Smart
  2018-01-18 10:10 ` [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing Jianchao Wang
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 20+ messages in thread
From: Jianchao Wang @ 2018-01-18 10:10 UTC (permalink / raw)
  To: keith.busch, axboe, hch, sagi, maxg, james.smart; +Cc: linux-nvme, linux-kernel

After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect),
both nvme-fc/rdma have the following pattern:
RESETTING    - quiesce blk-mq queues, teardown and delete queues/
               connections, clear out outstanding IO requests...
RECONNECTING - establish new queues/connections and some other
               initializing things.
Introduce RECONNECTING to the nvme-pci transport to mark the same point.
Then we get a coherent state definition among the nvme pci/rdma/fc
transports.

Suggested-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/nvme/host/core.c |  2 +-
 drivers/nvme/host/pci.c  | 19 +++++++++++++++++--
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 230cc09..23b3e53 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -242,7 +242,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 	switch (new_state) {
 	case NVME_CTRL_ADMIN_ONLY:
 		switch (old_state) {
-		case NVME_CTRL_RESETTING:
+		case NVME_CTRL_RECONNECTING:
 			changed = true;
 			/* FALLTHRU */
 		default:
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 45f843d..05344be 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1138,9 +1138,14 @@ static bool nvme_should_reset(struct nvme_dev *dev, u32 csts)
 	 */
 	bool nssro = dev->subsystem && (csts & NVME_CSTS_NSSRO);
 
-	/* If there is a reset ongoing, we shouldn't reset again. */
-	if (dev->ctrl.state == NVME_CTRL_RESETTING)
+	/* If there is a reset/reinit ongoing, we shouldn't reset again. */
+	switch (dev->ctrl.state) {
+	case NVME_CTRL_RESETTING:
+	case NVME_CTRL_RECONNECTING:
 		return false;
+	default:
+		break;
+	}
 
 	/* We shouldn't reset unless the controller is on fatal error state
 	 * _or_ if we lost the communication with it.
@@ -2304,6 +2309,16 @@ static void nvme_reset_work(struct work_struct *work)
 	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
 		nvme_dev_disable(dev, false);
 
+	/*
+	 * Introduce RECONNECTING state from nvme-fc/rdma transports to mark the
+	 * initializing procedure here.
+	 */
+	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RECONNECTING)) {
+		dev_warn(dev->ctrl.device,
+			"failed to mark controller RECONNECTING\n");
+		goto out;
+	}
+
 	result = nvme_pci_enable(dev);
 	if (result)
 		goto out;
-- 
2.7.4


* [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing
  2018-01-18 10:10 [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing Jianchao Wang
  2018-01-18 10:10 ` [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure Jianchao Wang
@ 2018-01-18 10:10 ` Jianchao Wang
  2018-01-19  4:59   ` Keith Busch
  2018-01-18 15:34 ` [PATCH V5 0/2] nvme-pci: fix " James Smart
  2018-01-19  8:01 ` Keith Busch
  3 siblings, 1 reply; 20+ messages in thread
From: Jianchao Wang @ 2018-01-18 10:10 UTC (permalink / raw)
  To: keith.busch, axboe, hch, sagi, maxg, james.smart; +Cc: linux-nvme, linux-kernel

Look at the following scenario.
nvme_reset_ctrl
  -> set state to RESETTING
  -> queue reset_work
     (scheduling)
nvme_reset_work
  -> nvme_dev_disable
    -> quiesce queues
    -> nvme_cancel_request
       on outstanding requests
    -> set state to RECONNECTING
    -> nvme initializing
       (will issue request on adminq)
A request could expire after the ctrl state is set to RESETTING
and the reset_work is queued.
 - If the timeout occurs before the state changes to RECONNECTING,
   the expired requests are from the previous work.
 - Otherwise, the expired requests are from the controller
   initializing procedure, such as sending cq/sq create commands
   to the adminq to set up the io queues.
In the current implementation, nvme_timeout only handles the second
case above.

To fix this, when the state is RESETTING, handle the expired
requests as nvme_cancel_request does; when the state is RECONNECTING,
handle them as before but without the nvme_dev_disable there, since
nvme_reset_work will see the error and act on it itself.

Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
---
 drivers/nvme/host/pci.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 05344be..a9b2359 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1210,18 +1210,24 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 	}
 
 	/*
-	 * Shutdown immediately if controller times out while starting. The
-	 * reset work will see the pci device disabled when it gets the forced
-	 * cancellation error. All outstanding requests are completed on
-	 * shutdown, so we return BLK_EH_HANDLED.
+	 * - When the ctrl.state is NVME_CTRL_RESETTING, the expired
+	 *   request should come from the previous work and we handle
+	 *   it as nvme_cancel_request.
+	 * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired
+	 *   request should come from the initializing procedure such as
+	 *   setup io queues, because all the previous outstanding
+	 *   requests should have been cancelled.
 	 */
-	if (dev->ctrl.state == NVME_CTRL_RESETTING) {
-		dev_warn(dev->ctrl.device,
-			 "I/O %d QID %d timeout, disable controller\n",
-			 req->tag, nvmeq->qid);
-		nvme_dev_disable(dev, false);
+	switch (dev->ctrl.state) {
+	case NVME_CTRL_RESETTING:
+		nvme_req(req)->status = NVME_SC_ABORT_REQ;
+		return BLK_EH_HANDLED;
+	case NVME_CTRL_RECONNECTING:
+		WARN_ON_ONCE(nvmeq->qid);
 		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
 		return BLK_EH_HANDLED;
+	default:
+		break;
 	}
 
 	/*
-- 
2.7.4


* Re: [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure
  2018-01-18 10:10 ` [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure Jianchao Wang
@ 2018-01-18 10:17   ` Max Gurtovoy
  2018-01-19  9:49     ` jianchao.wang
  2018-01-18 15:23   ` James Smart
  1 sibling, 1 reply; 20+ messages in thread
From: Max Gurtovoy @ 2018-01-18 10:17 UTC (permalink / raw)
  To: Jianchao Wang, keith.busch, axboe, hch, sagi, james.smart
  Cc: linux-nvme, linux-kernel



On 1/18/2018 12:10 PM, Jianchao Wang wrote:
> After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect),
> both nvme-fc/rdma have following pattern:
> RESETTING    - quiesce blk-mq queues, teardown and delete queues/
>                 connections, clear out outstanding IO requests...
> RECONNECTING - establish new queues/connections and some other
>                 initializing things.

I guess we can call it NVME_CTRL_CONNECTING in a later patch (a more
suitable name for this state now).

> Introduce RECONNECTING to nvme-pci transport to do the same mark.
> Then we get a coherent state definition among nvme pci/rdma/fc
> transports.
> 
> Suggested-by: James Smart <james.smart@broadcom.com>
> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> ---
>   drivers/nvme/host/core.c |  2 +-
>   drivers/nvme/host/pci.c  | 19 +++++++++++++++++--
>   2 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 230cc09..23b3e53 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -242,7 +242,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
>   	switch (new_state) {
>   	case NVME_CTRL_ADMIN_ONLY:
>   		switch (old_state) {
> -		case NVME_CTRL_RESETTING:
> +		case NVME_CTRL_RECONNECTING:
>   			changed = true;
>   			/* FALLTHRU */
>   		default:
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 45f843d..05344be 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1138,9 +1138,14 @@ static bool nvme_should_reset(struct nvme_dev *dev, u32 csts)
>   	 */
>   	bool nssro = dev->subsystem && (csts & NVME_CSTS_NSSRO);
>   
> -	/* If there is a reset ongoing, we shouldn't reset again. */
> -	if (dev->ctrl.state == NVME_CTRL_RESETTING)
> +	/* If there is a reset/reinit ongoing, we shouldn't reset again. */
> +	switch (dev->ctrl.state) {
> +	case NVME_CTRL_RESETTING:
> +	case NVME_CTRL_RECONNECTING:
>   		return false;
> +	default:
> +		break;
> +	}
>   
>   	/* We shouldn't reset unless the controller is on fatal error state
>   	 * _or_ if we lost the communication with it.
> @@ -2304,6 +2309,16 @@ static void nvme_reset_work(struct work_struct *work)
>   	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
>   		nvme_dev_disable(dev, false);
>   
> +	/*
> +	 * Introduce RECONNECTING state from nvme-fc/rdma transports to mark the
> +	 * initializing procedure here.
> +	 */
> +	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RECONNECTING)) {
> +		dev_warn(dev->ctrl.device,
> +			"failed to mark controller RECONNECTING\n");
> +		goto out;
> +	}
> +
>   	result = nvme_pci_enable(dev);
>   	if (result)
>   		goto out;
> 


* Re: [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure
  2018-01-18 10:10 ` [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure Jianchao Wang
  2018-01-18 10:17   ` Max Gurtovoy
@ 2018-01-18 15:23   ` James Smart
  1 sibling, 0 replies; 20+ messages in thread
From: James Smart @ 2018-01-18 15:23 UTC (permalink / raw)
  To: Jianchao Wang, keith.busch, axboe, hch, sagi, maxg
  Cc: linux-nvme, linux-kernel

On 1/18/2018 2:10 AM, Jianchao Wang wrote:
> After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect),
> both nvme-fc/rdma have following pattern:
> RESETTING    - quiesce blk-mq queues, teardown and delete queues/
>                 connections, clear out outstanding IO requests...
> RECONNECTING - establish new queues/connections and some other
>                 initializing things.
> Introduce RECONNECTING to nvme-pci transport to do the same mark.
> Then we get a coherent state definition among nvme pci/rdma/fc
> transports.
>
> Suggested-by: James Smart <james.smart@broadcom.com>
> Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
> ---
>   drivers/nvme/host/core.c |  2 +-
>   drivers/nvme/host/pci.c  | 19 +++++++++++++++++--
>   2 files changed, 18 insertions(+), 3 deletions(-)
>

Reviewed-by: James Smart  <james.smart@broadcom.com>


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-18 10:10 [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing Jianchao Wang
  2018-01-18 10:10 ` [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure Jianchao Wang
  2018-01-18 10:10 ` [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing Jianchao Wang
@ 2018-01-18 15:34 ` James Smart
  2018-01-19  8:01 ` Keith Busch
  3 siblings, 0 replies; 20+ messages in thread
From: James Smart @ 2018-01-18 15:34 UTC (permalink / raw)
  To: Jianchao Wang, keith.busch, axboe, hch, sagi, maxg
  Cc: linux-nvme, linux-kernel

Jianchao,

This looks very coherent to me. Thank You.

-- james



On 1/18/2018 2:10 AM, Jianchao Wang wrote:
> Hello
>
> Please consider the following scenario.
> nvme_reset_ctrl
>    -> set state to RESETTING
>    -> queue reset_work
>      (scheduling)
> nvme_reset_work
>    -> nvme_dev_disable
>      -> quiesce queues
>      -> nvme_cancel_request
>         on outstanding requests
> -------------------------------_boundary_
>    -> nvme initializing (issue request on adminq)
>
> Before the _boundary_, not only quiesce the queues, but only cancel
> all the outstanding requests.
>
> A request could expire when the ctrl state is RESETTING.
>   - If the timeout occur before the _boundary_, the expired requests
>     are from the previous work.
>   - Otherwise, the expired requests are from the controller initializing
>     procedure, such as sending cq/sq create commands to adminq to setup
>     io queues.
> In current implementation, nvme_timeout cannot identify the _boundary_
> so only handles second case above.
>
> In fact, after Sagi's commit (nvme-rdma: fix concurrent reset and
> reconnect), both nvme-fc/rdma have following pattern:
> RESETTING    - quiesce blk-mq queues, teardown and delete queues/
>                 connections, clear out outstanding IO requests...
> RECONNECTING - establish new queues/connections and some other
>                 initializing things.
> Introduce RECONNECTING to nvme-pci transport to do the same mark
> Then we get a coherent state definition among nvme pci/rdma/fc
> transports and nvme_timeout could identify the _boundary_.
>
> V5:
>   - discard RESET_PREPARE and introduce RESETTING into nvme-pci
>   - change the 1st patch's name and comment
>   - other misc changes
>
> V4:
>   - rebase patches on Jens' for-next
>   - let RESETTING equal to RECONNECTING in terms of work procedure
>   - change the 1st patch's name and comment
>   - other misc changes
>
> V3:
>   - fix wrong reference in loop.c
>   - other misc changes
>
> V2:
>   - split NVME_CTRL_RESETTING into NVME_CTRL_RESET_PREPARE and
>     NVME_CTRL_RESETTING. Introduce new patch based on this.
>   - distinguish the requests based on the new state in nvme_timeout
>   - change comments of patch
>
> drivers/nvme/host/core.c |  2 +-
> drivers/nvme/host/pci.c  | 43 ++++++++++++++++++++++++++++++++-----------
> 2 files changed, 33 insertions(+), 12 deletions(-)
>
> Thanks
> Jianchao


* Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing
  2018-01-18 10:10 ` [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing Jianchao Wang
@ 2018-01-19  4:59   ` Keith Busch
  2018-01-19  5:55     ` jianchao.wang
  0 siblings, 1 reply; 20+ messages in thread
From: Keith Busch @ 2018-01-19  4:59 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote:
> +	 * - When the ctrl.state is NVME_CTRL_RESETTING, the expired
> +	 *   request should come from the previous work and we handle
> +	 *   it as nvme_cancel_request.
> +	 * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired
> +	 *   request should come from the initializing procedure such as
> +	 *   setup io queues, because all the previous outstanding
> +	 *   requests should have been cancelled.
>  	 */
> -	if (dev->ctrl.state == NVME_CTRL_RESETTING) {
> -		dev_warn(dev->ctrl.device,
> -			 "I/O %d QID %d timeout, disable controller\n",
> -			 req->tag, nvmeq->qid);
> -		nvme_dev_disable(dev, false);
> +	switch (dev->ctrl.state) {
> +	case NVME_CTRL_RESETTING:
> +		nvme_req(req)->status = NVME_SC_ABORT_REQ;
> +		return BLK_EH_HANDLED;
> +	case NVME_CTRL_RECONNECTING:
> +		WARN_ON_ONCE(nvmeq->qid);
>  		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
>  		return BLK_EH_HANDLED;
> +	default:
> +		break;
>  	}

The driver may be giving up on the command here, but that doesn't mean
the controller has. We can't just end the request like this because that
will release the memory the controller still owns. We must wait until
after nvme_dev_disable clears bus master because we can't say for sure
the controller isn't going to write to that address right after we end
the request.


* Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing
  2018-01-19  4:59   ` Keith Busch
@ 2018-01-19  5:55     ` jianchao.wang
  2018-01-19  6:05       ` Keith Busch
  0 siblings, 1 reply; 20+ messages in thread
From: jianchao.wang @ 2018-01-19  5:55 UTC (permalink / raw)
  To: Keith Busch; +Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

Hi Keith

Thanks for your kind response and guidance.

On 01/19/2018 12:59 PM, Keith Busch wrote:
> On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote:
>> +	 * - When the ctrl.state is NVME_CTRL_RESETTING, the expired
>> +	 *   request should come from the previous work and we handle
>> +	 *   it as nvme_cancel_request.
>> +	 * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired
>> +	 *   request should come from the initializing procedure such as
>> +	 *   setup io queues, because all the previous outstanding
>> +	 *   requests should have been cancelled.
>>  	 */
>> -	if (dev->ctrl.state == NVME_CTRL_RESETTING) {
>> -		dev_warn(dev->ctrl.device,
>> -			 "I/O %d QID %d timeout, disable controller\n",
>> -			 req->tag, nvmeq->qid);
>> -		nvme_dev_disable(dev, false);
>> +	switch (dev->ctrl.state) {
>> +	case NVME_CTRL_RESETTING:
>> +		nvme_req(req)->status = NVME_SC_ABORT_REQ;
>> +		return BLK_EH_HANDLED;
>> +	case NVME_CTRL_RECONNECTING:
>> +		WARN_ON_ONCE(nvmeq->qid);
>>  		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
>>  		return BLK_EH_HANDLED;
>> +	default:
>> +		break;
>>  	}
> 
> The driver may be giving up on the command here, but that doesn't mean
> the controller has. We can't just end the request like this because that
> will release the memory the controller still owns. We must wait until
> after nvme_dev_disable clears bus master because we can't say for sure
> the controller isn't going to write to that address right after we end
> the request.
> 
Yes, but the controller is going to be reset or shut down at that moment;
even if the controller accesses a bad address and goes wrong, everything will
be OK after the reset or shutdown. :)

Thanks
Jianchao  


* Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing
  2018-01-19  5:55     ` jianchao.wang
@ 2018-01-19  6:05       ` Keith Busch
  2018-01-19  6:53         ` jianchao.wang
  0 siblings, 1 reply; 20+ messages in thread
From: Keith Busch @ 2018-01-19  6:05 UTC (permalink / raw)
  To: jianchao.wang
  Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

On Fri, Jan 19, 2018 at 01:55:29PM +0800, jianchao.wang wrote:
> On 01/19/2018 12:59 PM, Keith Busch wrote:
> > On Thu, Jan 18, 2018 at 06:10:02PM +0800, Jianchao Wang wrote:
> >> +	 * - When the ctrl.state is NVME_CTRL_RESETTING, the expired
> >> +	 *   request should come from the previous work and we handle
> >> +	 *   it as nvme_cancel_request.
> >> +	 * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired
> >> +	 *   request should come from the initializing procedure such as
> >> +	 *   setup io queues, because all the previous outstanding
> >> +	 *   requests should have been cancelled.
> >>  	 */
> >> -	if (dev->ctrl.state == NVME_CTRL_RESETTING) {
> >> -		dev_warn(dev->ctrl.device,
> >> -			 "I/O %d QID %d timeout, disable controller\n",
> >> -			 req->tag, nvmeq->qid);
> >> -		nvme_dev_disable(dev, false);
> >> +	switch (dev->ctrl.state) {
> >> +	case NVME_CTRL_RESETTING:
> >> +		nvme_req(req)->status = NVME_SC_ABORT_REQ;
> >> +		return BLK_EH_HANDLED;
> >> +	case NVME_CTRL_RECONNECTING:
> >> +		WARN_ON_ONCE(nvmeq->qid);
> >>  		nvme_req(req)->flags |= NVME_REQ_CANCELLED;
> >>  		return BLK_EH_HANDLED;
> >> +	default:
> >> +		break;
> >>  	}
> > 
> > The driver may be giving up on the command here, but that doesn't mean
> > the controller has. We can't just end the request like this because that
> > will release the memory the controller still owns. We must wait until
> > after nvme_dev_disable clears bus master because we can't say for sure
> > the controller isn't going to write to that address right after we end
> > the request.
> > 
> Yes, but the controller is going to be reseted or shutdown at the moment,
> even if the controller accesses a bad address and goes wrong, everything will
> be ok after reset or shutdown. :)

Hm, I don't follow. DMA access after free is never okay.


* Re: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing
  2018-01-19  6:05       ` Keith Busch
@ 2018-01-19  6:53         ` jianchao.wang
  0 siblings, 0 replies; 20+ messages in thread
From: jianchao.wang @ 2018-01-19  6:53 UTC (permalink / raw)
  To: Keith Busch; +Cc: sagi, james.smart, linux-nvme, linux-kernel, axboe, maxg, hch

Hi Keith

Thanks for your kind reminder.

On 01/19/2018 02:05 PM, Keith Busch wrote:
>>> The driver may be giving up on the command here, but that doesn't mean
>>> the controller has. We can't just end the request like this because that
>>> will release the memory the controller still owns. We must wait until
>>> after nvme_dev_disable clears bus master because we can't say for sure
>>> the controller isn't going to write to that address right after we end
>>> the request.
>>>
>> Yes, but the controller is going to be reseted or shutdown at the moment,
>> even if the controller accesses a bad address and goes wrong, everything will
>> be ok after reset or shutdown. :)
> Hm, I don't follow. DMA access after free is never okay.
Yes, this may cause unexpected memory corruption.

Thanks
Jianchao


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-18 10:10 [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing Jianchao Wang
                   ` (2 preceding siblings ...)
  2018-01-18 15:34 ` [PATCH V5 0/2] nvme-pci: fix " James Smart
@ 2018-01-19  8:01 ` Keith Busch
  2018-01-19  8:14   ` jianchao.wang
  3 siblings, 1 reply; 20+ messages in thread
From: Keith Busch @ 2018-01-19  8:01 UTC (permalink / raw)
  To: Jianchao Wang
  Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

On Thu, Jan 18, 2018 at 06:10:00PM +0800, Jianchao Wang wrote:
> Hello
> 
> Please consider the following scenario.
> nvme_reset_ctrl
>   -> set state to RESETTING
>   -> queue reset_work       
>     (scheduling)
> nvme_reset_work
>   -> nvme_dev_disable
>     -> quiesce queues
>     -> nvme_cancel_request 
>        on outstanding requests
> -------------------------------_boundary_
>   -> nvme initializing (issue request on adminq)
> 
> Before the _boundary_, not only quiesce the queues, but only cancel
> all the outstanding requests.
> 
> A request could expire when the ctrl state is RESETTING.
>  - If the timeout occur before the _boundary_, the expired requests
>    are from the previous work.
>  - Otherwise, the expired requests are from the controller initializing
>    procedure, such as sending cq/sq create commands to adminq to setup
>    io queues.
> In current implementation, nvme_timeout cannot identify the _boundary_ 
> so only handles second case above.

Bear with me a moment, as I'm only just now getting a real chance to look
at this, and I'm not quite sure I follow what problem this is solving.

The nvme_dev_disable routine makes forward progress without depending on
timeout handling to complete expired commands. Once controller disabling
completes, there can't possibly be any started requests that can expire.
So we don't need nvme_timeout to do anything for requests above the
boundary.


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-19  8:01 ` Keith Busch
@ 2018-01-19  8:14   ` jianchao.wang
  2018-01-19  8:42     ` Keith Busch
  0 siblings, 1 reply; 20+ messages in thread
From: jianchao.wang @ 2018-01-19  8:14 UTC (permalink / raw)
  To: Keith Busch; +Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

Hi Keith

Thanks for taking the time to look into this.

On 01/19/2018 04:01 PM, Keith Busch wrote:
> On Thu, Jan 18, 2018 at 06:10:00PM +0800, Jianchao Wang wrote:
>> Hello
>>
>> Please consider the following scenario.
>> nvme_reset_ctrl
>>   -> set state to RESETTING
>>   -> queue reset_work       
>>     (scheduling)
>> nvme_reset_work
>>   -> nvme_dev_disable
>>     -> quiesce queues
>>     -> nvme_cancel_request 
>>        on outstanding requests
>> -------------------------------_boundary_
>>   -> nvme initializing (issue request on adminq)
>>
>> Before the _boundary_, not only quiesce the queues, but only cancel
>> all the outstanding requests.
>>
>> A request could expire when the ctrl state is RESETTING.
>>  - If the timeout occur before the _boundary_, the expired requests
>>    are from the previous work.
>>  - Otherwise, the expired requests are from the controller initializing
>>    procedure, such as sending cq/sq create commands to adminq to setup
>>    io queues.
>> In current implementation, nvme_timeout cannot identify the _boundary_ 
>> so only handles second case above.
> 
> Bare with me a moment, as I'm only just now getting a real chance to look
> at this, and I'm not quite sure I follow what problem this is solving.
> 
> The nvme_dev_disable routine makes forward progress without depending on
> timeout handling to complete expired commands. Once controller disabling
> completes, there can't possibly be any started requests that can expire.
> So we don't need nvme_timeout to do anything for requests above the
> boundary.
> 
Yes, once controller disabling completes, any started requests will be handled and cannot expire.
But before the _boundary_, an nvme_timeout context could be running in parallel with nvme_dev_disable.
If a timeout path grabs a request, then nvme_dev_disable cannot get and cancel it.
So even though nvme_dev_disable completes, there could still be a request stuck in an nvme_timeout context.

The worst case is:
nvme_timeout                              nvme_reset_work
if (ctrl->state == RESETTING )              nvme_dev_disable
    nvme_dev_disable                        initializing procedure

i.e. the nvme_dev_disable in the timeout path runs in parallel with the reinit procedure in nvme_reset_work.


Thanks
Jianchao


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-19  8:14   ` jianchao.wang
@ 2018-01-19  8:42     ` Keith Busch
  2018-01-19  9:02       ` jianchao.wang
  0 siblings, 1 reply; 20+ messages in thread
From: Keith Busch @ 2018-01-19  8:42 UTC (permalink / raw)
  To: jianchao.wang
  Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

On Fri, Jan 19, 2018 at 04:14:02PM +0800, jianchao.wang wrote:
> On 01/19/2018 04:01 PM, Keith Busch wrote:
> > The nvme_dev_disable routine makes forward progress without depending on
> > timeout handling to complete expired commands. Once controller disabling
> > completes, there can't possibly be any started requests that can expire.
> > So we don't need nvme_timeout to do anything for requests above the
> > boundary.
> > 
> Yes, once controller disabling completes, any started requests will be handled and cannot expire.
> But before the _boundary_, there could be a nvme_timeout context runs with nvme_dev_disable in parallel.
> If a timeout path grabs a request, then nvme_dev_disable cannot get and cancel it.
> So even though the nvme_dev_disable completes, there still could be a request in nvme_timeout context.
> 
> The worst case is :
> nvme_timeout                              nvme_reset_work
> if (ctrl->state == RESETTING )              nvme_dev_disable
>     nvme_dev_disable                        initializing procedure
> 
> the nvme_dev_disable run with reinit procedure in nvme_reset_work in parallel.

Okay, I see what you're saying. That's a pretty obscure case, as normally
we enter nvme_reset_work with the queues already disabled, but there
are a few cases where we need nvme_reset_work to handle that.

But if we are in that case, I think we really just want to sync the
queues. What do you think of this?

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fde6fd2e7eef..516383193416 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3520,6 +3520,17 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
 }
 EXPORT_SYMBOL_GPL(nvme_stop_queues);
 
+void nvme_sync_queues(struct nvme_ctrl *ctrl)
+{
+	struct nvme_ns *ns;
+
+	mutex_lock(&ctrl->namespaces_mutex);
+	list_for_each_entry(ns, &ctrl->namespaces, list)
+		blk_sync_queue(ns->queue);
+	mutex_unlock(&ctrl->namespaces_mutex);
+}
+EXPORT_SYMBOL_GPL(nvme_sync_queues);
+
 void nvme_start_queues(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 8e7fc1b041b7..45b1b8ceddb6 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -374,6 +374,7 @@ void nvme_complete_async_event(struct nvme_ctrl *ctrl, __le16 status,
 
 void nvme_stop_queues(struct nvme_ctrl *ctrl);
 void nvme_start_queues(struct nvme_ctrl *ctrl);
+void nvme_sync_queues(struct nvme_ctrl *ctrl);
 void nvme_kill_queues(struct nvme_ctrl *ctrl);
 void nvme_unfreeze(struct nvme_ctrl *ctrl);
 void nvme_wait_freeze(struct nvme_ctrl *ctrl);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index a2ffb557b616..1fe00be22ad1 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2289,8 +2289,10 @@ static void nvme_reset_work(struct work_struct *work)
 	 * If we're called to reset a live controller first shut it down before
 	 * moving on.
 	 */
-	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
+	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) {
 		nvme_dev_disable(dev, false);
+		nvme_sync_queues(&dev->ctrl);
+	}
 
 	result = nvme_pci_enable(dev);
 	if (result)
--


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-19  8:42     ` Keith Busch
@ 2018-01-19  9:02       ` jianchao.wang
  2018-01-19 11:52         ` Keith Busch
  0 siblings, 1 reply; 20+ messages in thread
From: jianchao.wang @ 2018-01-19  9:02 UTC (permalink / raw)
  To: Keith Busch; +Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

Hi Keith

Thanks for your kind and detailed response and for the patch.

On 01/19/2018 04:42 PM, Keith Busch wrote:
> On Fri, Jan 19, 2018 at 04:14:02PM +0800, jianchao.wang wrote:
>> On 01/19/2018 04:01 PM, Keith Busch wrote:
>>> The nvme_dev_disable routine makes forward progress without depending on
>>> timeout handling to complete expired commands. Once controller disabling
>>> completes, there can't possibly be any started requests that can expire.
>>> So we don't need nvme_timeout to do anything for requests above the
>>> boundary.
>>>
>> Yes, once controller disabling completes, any started requests will be handled and cannot expire.
>> But before the _boundary_, there could be a nvme_timeout context runs with nvme_dev_disable in parallel.
>> If a timeout path grabs a request, then nvme_dev_disable cannot get and cancel it.
>> So even though the nvme_dev_disable completes, there still could be a request in nvme_timeout context.
>>
>> The worst case is :
>> nvme_timeout                              nvme_reset_work
>> if (ctrl->state == RESETTING )              nvme_dev_disable
>>     nvme_dev_disable                        initializing procedure
>>
>> the nvme_dev_disable run with reinit procedure in nvme_reset_work in parallel.
> 
> Okay, I see what you're saying. That's a pretty obscure case, as normally
> we enter nvme_reset_work with the queues already disabled, but there
> are a few cases where we need nvme_reset_work to handle that.
> 
> But if we are in that case, I think we really just want to sync the
> queues. What do you think of this?
> 
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index fde6fd2e7eef..516383193416 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3520,6 +3520,17 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
>  }
>  EXPORT_SYMBOL_GPL(nvme_stop_queues);
>  
> +void nvme_sync_queues(struct nvme_ctrl *ctrl)
> +{
> +	struct nvme_ns *ns;
> +
> +	mutex_lock(&ctrl->namespaces_mutex);
> +	list_for_each_entry(ns, &ctrl->namespaces, list)
> +		blk_sync_queue(ns->queue);
> +	mutex_unlock(&ctrl->namespaces_mutex);
> +}
> +EXPORT_SYMBOL_GPL(nvme_sync_queues);
> +
>  void nvme_start_queues(struct nvme_ctrl *ctrl)
>  {
>  	struct nvme_ns *ns;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 8e7fc1b041b7..45b1b8ceddb6 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -374,6 +374,7 @@ void nvme_complete_async_event(struct nvme_ctrl *ctrl, __le16 status,
>  
>  void nvme_stop_queues(struct nvme_ctrl *ctrl);
>  void nvme_start_queues(struct nvme_ctrl *ctrl);
> +void nvme_sync_queues(struct nvme_ctrl *ctrl)
>  void nvme_kill_queues(struct nvme_ctrl *ctrl);
>  void nvme_unfreeze(struct nvme_ctrl *ctrl);
>  void nvme_wait_freeze(struct nvme_ctrl *ctrl);
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index a2ffb557b616..1fe00be22ad1 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2289,8 +2289,10 @@ static void nvme_reset_work(struct work_struct *work)
>  	 * If we're called to reset a live controller first shut it down before
>  	 * moving on.
>  	 */
> -	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
> +	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) {
>  		nvme_dev_disable(dev, false);
> +		nvme_sync_queues(&dev->ctrl);
> +	}
>  
>  	result = nvme_pci_enable(dev);
>  	if (result)
> --
> 

We should not use blk_sync_queue here; the requeue_work and run_work would be canceled.
Just flush_work(&q->timeout_work) should be enough.
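
Something like the following minimal variant of your helper (just a
sketch, reusing the names from your draft):

void nvme_sync_queues(struct nvme_ctrl *ctrl)
{
	struct nvme_ns *ns;

	mutex_lock(&ctrl->namespaces_mutex);
	list_for_each_entry(ns, &ctrl->namespaces, list)
		flush_work(&ns->queue->timeout_work);
	mutex_unlock(&ctrl->namespaces_mutex);
}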

In addition, we could check NVME_CC_ENABLE in nvme_dev_disable to avoid a redundant invocation.
:)

Thanks
Jianchao


* Re: [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure
  2018-01-18 10:17   ` Max Gurtovoy
@ 2018-01-19  9:49     ` jianchao.wang
  0 siblings, 0 replies; 20+ messages in thread
From: jianchao.wang @ 2018-01-19  9:49 UTC (permalink / raw)
  To: Max Gurtovoy, keith.busch, axboe, hch, sagi, james.smart
  Cc: linux-nvme, linux-kernel

Hi Max

Thanks for your kind comments and response.

On 01/18/2018 06:17 PM, Max Gurtovoy wrote:
> 
> On 1/18/2018 12:10 PM, Jianchao Wang wrote:
>> After Sagi's commit (nvme-rdma: fix concurrent reset and reconnect),
>> both nvme-fc/rdma have following pattern:
>> RESETTING    - quiesce blk-mq queues, teardown and delete queues/
>>                 connections, clear out outstanding IO requests...
>> RECONNECTING - establish new queues/connections and some other
>>                 initializing things.
> 
> I guess we can call it NVME_CTRL_CONNECTING in later patch (more suitable name for this state now).

I think we can change the name in a separate patch set that just deals with
choosing more suitable names for the states.
Thanks
Jianchao


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-19  9:02       ` jianchao.wang
@ 2018-01-19 11:52         ` Keith Busch
  2018-01-19 13:56           ` jianchao.wang
  0 siblings, 1 reply; 20+ messages in thread
From: Keith Busch @ 2018-01-19 11:52 UTC (permalink / raw)
  To: jianchao.wang
  Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

On Fri, Jan 19, 2018 at 05:02:06PM +0800, jianchao.wang wrote:
> We should not use blk_sync_queue here, the requeue_work and run_work will be canceled.
> Just flush_work(&q->timeout_work) should be ok.

I agree flushing timeout_work is sufficient. All the other work had better
not be running at this point anyway, so it doesn't hurt to call the
sync API.
 
> In addition, we could check NVME_CC_ENABLE in nvme_dev_disable to avoid redundant invoking.
> :)

That should already be inferred through reading back the CSTS register.
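
Roughly, the relevant check in nvme_dev_disable looks like this
(abridged, so the exact condition may differ slightly in the tree):

	if (pci_is_enabled(pdev)) {
		u32 csts = readl(dev->bar + NVME_REG_CSTS);

		dead = !!((csts & NVME_CSTS_CFS) || !(csts & NVME_CSTS_RDY) ||
			  pdev->error_state != pci_channel_io_normal);
	}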


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-19 11:52         ` Keith Busch
@ 2018-01-19 13:56           ` jianchao.wang
  2018-01-20  2:11             ` Keith Busch
  0 siblings, 1 reply; 20+ messages in thread
From: jianchao.wang @ 2018-01-19 13:56 UTC (permalink / raw)
  To: Keith Busch; +Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

Hi Keith

Thanks for your kind response.

On 01/19/2018 07:52 PM, Keith Busch wrote:
> On Fri, Jan 19, 2018 at 05:02:06PM +0800, jianchao.wang wrote:
>> We should not use blk_sync_queue here, the requeue_work and run_work will be canceled.
>> Just flush_work(&q->timeout_work) should be ok.
> 
> I agree flushing timeout_work is sufficient. All the other work had
> already better not be running either, so it doesn't hurt to call the
> sync API.
In nvme_dev_disable, the outstanding requests will eventually be requeued.
I'm afraid the requests requeued on q->requeue_list will be blocked until another requeue
occurs, if we cancel the requeue work before it gets scheduled.

>  
>> In addition, we could check NVME_CC_ENABLE in nvme_dev_disable to avoid redundant invoking.
>> :)
> 
> That should already be inferred through reading back the CSTS register.
> 
Yes, the "dead"  in nvme_dev_disable looks enough for these uncommon cases.

Thanks
Jianchao


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-19 13:56           ` jianchao.wang
@ 2018-01-20  2:11             ` Keith Busch
  2018-01-20 14:07               ` jianchao.wang
  0 siblings, 1 reply; 20+ messages in thread
From: Keith Busch @ 2018-01-20  2:11 UTC (permalink / raw)
  To: jianchao.wang
  Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

On Fri, Jan 19, 2018 at 09:56:48PM +0800, jianchao.wang wrote:
> In nvme_dev_disable, the outstanding requests will be requeued finally.
> I'm afraid the requests requeued on the q->requeue_list will be blocked until another requeue
> occurs, if we cancel the requeue work before it get scheduled.

We should kick the request list in nvme_start_queues.


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-20  2:11             ` Keith Busch
@ 2018-01-20 14:07               ` jianchao.wang
  2018-01-20 14:14                 ` jianchao.wang
  0 siblings, 1 reply; 20+ messages in thread
From: jianchao.wang @ 2018-01-20 14:07 UTC (permalink / raw)
  To: Keith Busch; +Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel

Hi Keith

Thanks for your kind response.

On 01/20/2018 10:11 AM, Keith Busch wrote:
> On Fri, Jan 19, 2018 at 09:56:48PM +0800, jianchao.wang wrote:
>> In nvme_dev_disable, the outstanding requests will be requeued finally.
>> I'm afraid the requests requeued on the q->requeue_list will be blocked until another requeue
>> occurs, if we cancel the requeue work before it get scheduled.
> 
> We should kick the request list in nvme_start_queues.
> 
Yes

@@ -3513,8 +3513,10 @@ void nvme_start_queues(struct nvme_ctrl *ctrl)
        struct nvme_ns *ns;
 
        mutex_lock(&ctrl->namespaces_mutex);
-       list_for_each_entry(ns, &ctrl->namespaces, list)
+       list_for_each_entry(ns, &ctrl->namespaces, list) {
                blk_mq_unquiesce_queue(ns->queue);
+               blk_mq_kick_requeue_list(ns->queue);
+       }
        mutex_unlock(&ctrl->namespaces_mutex);
 }

Then nvme_sync_queues could be made more generally applicable.

Many thanks for your guidance.

Jianchao


* Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
  2018-01-20 14:07               ` jianchao.wang
@ 2018-01-20 14:14                 ` jianchao.wang
  0 siblings, 0 replies; 20+ messages in thread
From: jianchao.wang @ 2018-01-20 14:14 UTC (permalink / raw)
  To: Keith Busch; +Cc: axboe, hch, sagi, maxg, james.smart, linux-nvme, linux-kernel



On 01/20/2018 10:07 PM, jianchao.wang wrote:
> Hi Keith
> 
> Thanks for you kindly response.
> 
> On 01/20/2018 10:11 AM, Keith Busch wrote:
>> On Fri, Jan 19, 2018 at 09:56:48PM +0800, jianchao.wang wrote:
>>> In nvme_dev_disable, the outstanding requests will be requeued finally.
>>> I'm afraid the requests requeued on the q->requeue_list will be blocked until another requeue
>>> occurs, if we cancel the requeue work before it get scheduled.
>>
>> We should kick the request list in nvme_start_queues.
>>
> Yes
> 
> @@ -3513,8 +3513,10 @@ void nvme_start_queues(struct nvme_ctrl *ctrl)
>         struct nvme_ns *ns;
>  
>         mutex_lock(&ctrl->namespaces_mutex);
> -       list_for_each_entry(ns, &ctrl->namespaces, list)
> +       list_for_each_entry(ns, &ctrl->namespaces, list) {
>                 blk_mq_unquiesce_queue(ns->queue);
> +               blk_mq_kick_requeue_list(ns->queue);
> +       }
>         mutex_unlock(&ctrl->namespaces_mutex);
>  }

We also have to add blk_mq_kick_requeue_list in nvme_kill_queues for the queue_count < 2 case.
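
Roughly like this (just a sketch; the nvme_kill_queues loop body is
abridged here):

	list_for_each_entry(ns, &ctrl->namespaces, list) {
		/* ... existing dead-namespace handling ... */
		blk_mq_unquiesce_queue(ns->queue);
		/* also dispatch anything already parked on the requeue list */
		blk_mq_kick_requeue_list(ns->queue);
	}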

> 
> Then, nvme_sync_queues could be more universal.
> 
> Many thanks for your directive.
> 
> Jianchao
> 

