From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Subject: Re: [PATCH V5 0/9] nvme: pci: fix & improve timeout handling
To: Ming Lei
Cc: Jens Axboe, linux-block@vger.kernel.org, Laurence Oberman,
 Sagi Grimberg, James Smart, linux-nvme@lists.infradead.org,
 Keith Busch, Christoph Hellwig
References: <20180511122933.27155-1-ming.lei@redhat.com>
 <776f21e1-dc19-1b77-9ba4-44f0b8366625@oracle.com>
 <20180514093850.GA807@ming.t460p>
 <008cb38d-aa91-6ab7-64d9-417d6c53a1eb@oracle.com>
 <20180514122211.GB807@ming.t460p>
 <20180515003332.GB21743@ming.t460p>
From: "jianchao.wang"
Message-ID:
Date: Tue, 15 May 2018 17:56:14 +0800
MIME-Version: 1.0
In-Reply-To: <20180515003332.GB21743@ming.t460p>
Content-Type: text/plain; charset=utf-8
List-ID:

Hi Ming,

On 05/15/2018 08:33 AM, Ming Lei wrote:
> We still have to quiesce the admin queue before canceling requests, so the
> following patch looks better. Please ignore the previous patch, try the
> one below, and see whether it addresses your hang:
>
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index f509d37b2fb8..c2adc76472a8 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1741,8 +1741,7 @@ static int nvme_alloc_admin_tags(struct nvme_dev *dev)
>  			dev->ctrl.admin_q = NULL;
>  			return -ENODEV;
>  		}
> -	} else
> -		blk_mq_unquiesce_queue(dev->ctrl.admin_q);
> +	}
>
>  	return 0;
>  }
> @@ -2520,6 +2519,12 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown, bool
>  	 */
>  	if (shutdown)
>  		nvme_start_queues(&dev->ctrl);
> +
> +	/*
> +	 * Avoid a stuck reset: a timeout may happen during reset, and the
> +	 * reset may hang forever if the admin queue is kept quiesced.
> +	 */
> +	blk_mq_unquiesce_queue(dev->ctrl.admin_q);
>  	mutex_unlock(&dev->shutdown_lock);
>  }

With the patch above and the patch below, neither the warning nor the
I/O hang has reproduced so far.
@@ -1450,6 +1648,7 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 {
 	struct nvme_dev *dev = nvmeq->dev;
 	int result;
+	int cq_vector;

 	if (dev->cmb && use_cmb_sqes && (dev->cmbsz & NVME_CMBSZ_SQS)) {
 		unsigned offset = (qid - 1) * roundup(SQ_SIZE(nvmeq->q_depth),
@@ -1462,15 +1661,16 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 	 * A queue's vector matches the queue identifier unless the controller
 	 * has only one vector available.
 	 */
-	nvmeq->cq_vector = dev->num_vecs == 1 ? 0 : qid;
-	result = adapter_alloc_cq(dev, qid, nvmeq);
+	cq_vector = dev->num_vecs == 1 ? 0 : qid;
+	result = adapter_alloc_cq(dev, qid, nvmeq, cq_vector);
 	if (result < 0)
-		goto release_vector;
+		goto out;

 	result = adapter_alloc_sq(dev, qid, nvmeq);
 	if (result < 0)
 		goto release_cq;
-
+
+	nvmeq->cq_vector = cq_vector;
 	nvme_init_queue(nvmeq, qid);
 	result = queue_request_irq(nvmeq);
 	if (result < 0)
@@ -1479,12 +1679,12 @@ static int nvme_create_queue(struct nvme_queue *nvmeq, int qid)
 	return result;

 release_sq:
+	nvmeq->cq_vector = -1;
 	dev->online_queues--;
 	adapter_delete_sq(dev, qid);
 release_cq:
 	adapter_delete_cq(dev, qid);
-release_vector:
-	nvmeq->cq_vector = -1;
+out:
 	return result;
 }

Thanks
Jianchao