From: Jianchao Wang <jianchao.w.wang@oracle.com> To: keith.busch@intel.com, axboe@fb.com, hch@lst.de, sagi@grimberg.me, maxg@mellanox.com, james.smart@broadcom.com Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing Date: Thu, 18 Jan 2018 18:10:02 +0800 [thread overview] Message-ID: <1516270202-8051-3-git-send-email-jianchao.w.wang@oracle.com> (raw) In-Reply-To: <1516270202-8051-1-git-send-email-jianchao.w.wang@oracle.com> Look at the following scenario. nvme_reset_ctrl -> set state to RESETTING -> queue reset_work (scheduling) nvme_reset_work -> nvme_dev_disable -> quiesce queues -> nvme_cancel_request on outstanding requests -> set state to RECONNECTING -> nvme initializing (will issue request on adminq) A request could expire after the ctrl state is set to RESETTING and queue the reset_work. - If the timeout occurs before state RECONNECTING, the expired requests are from the previous work. - Otherwise, the expired requests are from the controller initializing procedure, such as sending cq/sq create commands to adminq to setup io queues. In current implementation, nvme_timeout only handles second case above. To fix this, when the state is RESETTING, handle the expired requests as nvme_cancel_request, when the state is RECONNECTING, handle it as the original method and discard the nvme_dev_disable there, the nvme_reset_work will see the error and invoke itself. Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com> --- drivers/nvme/host/pci.c | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 05344be..a9b2359 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1210,18 +1210,24 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) } /* - * Shutdown immediately if controller times out while starting. The - * reset work will see the pci device disabled when it gets the forced - * cancellation error. All outstanding requests are completed on - * shutdown, so we return BLK_EH_HANDLED. + * - When the ctrl.state is NVME_CTRL_RESETTING, the expired + * request should come from the previous work and we handle + * it as nvme_cancel_request. + * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired + * request should come from the initializing procedure such as + * setup io queues, because all the previous outstanding + * requests should have been cancelled. */ - if (dev->ctrl.state == NVME_CTRL_RESETTING) { - dev_warn(dev->ctrl.device, - "I/O %d QID %d timeout, disable controller\n", - req->tag, nvmeq->qid); - nvme_dev_disable(dev, false); + switch (dev->ctrl.state) { + case NVME_CTRL_RESETTING: + nvme_req(req)->status = NVME_SC_ABORT_REQ; + return BLK_EH_HANDLED; + case NVME_CTRL_RECONNECTING: + WARN_ON_ONCE(nvmeq->qid); nvme_req(req)->flags |= NVME_REQ_CANCELLED; return BLK_EH_HANDLED; + default: + break; } /* -- 2.7.4
WARNING: multiple messages have this Message-ID (diff)
From: jianchao.w.wang@oracle.com (Jianchao Wang) Subject: [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing Date: Thu, 18 Jan 2018 18:10:02 +0800 [thread overview] Message-ID: <1516270202-8051-3-git-send-email-jianchao.w.wang@oracle.com> (raw) In-Reply-To: <1516270202-8051-1-git-send-email-jianchao.w.wang@oracle.com> Look at the following scenario. nvme_reset_ctrl -> set state to RESETTING -> queue reset_work (scheduling) nvme_reset_work -> nvme_dev_disable -> quiesce queues -> nvme_cancel_request on outstanding requests -> set state to RECONNECTING -> nvme initializing (will issue request on adminq) A request could expire after the ctrl state is set to RESETTING and queue the reset_work. - If the timeout occurs before state RECONNECTING, the expired requests are from the previous work. - Otherwise, the expired requests are from the controller initializing procedure, such as sending cq/sq create commands to adminq to setup io queues. In current implementation, nvme_timeout only handles second case above. To fix this, when the state is RESETTING, handle the expired requests as nvme_cancel_request, when the state is RECONNECTING, handle it as the original method and discard the nvme_dev_disable there, the nvme_reset_work will see the error and invoke itself. Signed-off-by: Jianchao Wang <jianchao.w.wang at oracle.com> --- drivers/nvme/host/pci.c | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 05344be..a9b2359 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -1210,18 +1210,24 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved) } /* - * Shutdown immediately if controller times out while starting. The - * reset work will see the pci device disabled when it gets the forced - * cancellation error. All outstanding requests are completed on - * shutdown, so we return BLK_EH_HANDLED. + * - When the ctrl.state is NVME_CTRL_RESETTING, the expired + * request should come from the previous work and we handle + * it as nvme_cancel_request. + * - When the ctrl.state is NVME_CTRL_RECONNECTING, the expired + * request should come from the initializing procedure such as + * setup io queues, because all the previous outstanding + * requests should have been cancelled. */ - if (dev->ctrl.state == NVME_CTRL_RESETTING) { - dev_warn(dev->ctrl.device, - "I/O %d QID %d timeout, disable controller\n", - req->tag, nvmeq->qid); - nvme_dev_disable(dev, false); + switch (dev->ctrl.state) { + case NVME_CTRL_RESETTING: + nvme_req(req)->status = NVME_SC_ABORT_REQ; + return BLK_EH_HANDLED; + case NVME_CTRL_RECONNECTING: + WARN_ON_ONCE(nvmeq->qid); nvme_req(req)->flags |= NVME_REQ_CANCELLED; return BLK_EH_HANDLED; + default: + break; } /* -- 2.7.4
next prev parent reply other threads:[~2018-01-18 10:11 UTC|newest] Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-01-18 10:10 [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing Jianchao Wang 2018-01-18 10:10 ` Jianchao Wang 2018-01-18 10:10 ` [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure Jianchao Wang 2018-01-18 10:10 ` Jianchao Wang 2018-01-18 10:17 ` Max Gurtovoy 2018-01-18 10:17 ` Max Gurtovoy 2018-01-19 9:49 ` jianchao.wang 2018-01-19 9:49 ` jianchao.wang 2018-01-18 15:23 ` James Smart 2018-01-18 15:23 ` James Smart 2018-01-18 10:10 ` Jianchao Wang [this message] 2018-01-18 10:10 ` [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing Jianchao Wang 2018-01-19 4:59 ` Keith Busch 2018-01-19 4:59 ` Keith Busch 2018-01-19 5:55 ` jianchao.wang 2018-01-19 5:55 ` jianchao.wang 2018-01-19 6:05 ` Keith Busch 2018-01-19 6:05 ` Keith Busch 2018-01-19 6:53 ` jianchao.wang 2018-01-19 6:53 ` jianchao.wang 2018-01-18 15:34 ` [PATCH V5 0/2] nvme-pci: fix " James Smart 2018-01-18 15:34 ` James Smart 2018-01-19 8:01 ` Keith Busch 2018-01-19 8:01 ` Keith Busch 2018-01-19 8:14 ` jianchao.wang 2018-01-19 8:14 ` jianchao.wang 2018-01-19 8:42 ` Keith Busch 2018-01-19 8:42 ` Keith Busch 2018-01-19 9:02 ` jianchao.wang 2018-01-19 9:02 ` jianchao.wang 2018-01-19 11:52 ` Keith Busch 2018-01-19 11:52 ` Keith Busch 2018-01-19 13:56 ` jianchao.wang 2018-01-19 13:56 ` jianchao.wang 2018-01-20 2:11 ` Keith Busch 2018-01-20 2:11 ` Keith Busch 2018-01-20 14:07 ` jianchao.wang 2018-01-20 14:07 ` jianchao.wang 2018-01-20 14:14 ` jianchao.wang 2018-01-20 14:14 ` jianchao.wang
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=1516270202-8051-3-git-send-email-jianchao.w.wang@oracle.com \ --to=jianchao.w.wang@oracle.com \ --cc=axboe@fb.com \ --cc=hch@lst.de \ --cc=james.smart@broadcom.com \ --cc=keith.busch@intel.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-nvme@lists.infradead.org \ --cc=maxg@mellanox.com \ --cc=sagi@grimberg.me \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.