linux-nvme.lists.infradead.org archive mirror
* [PATCH 0/2] nvme-pci: Simpler completions
@ 2020-03-02 19:46 Keith Busch
  2020-03-02 19:46 ` [PATCH 1/2] nvme-pci: Remove tag from process cq Keith Busch
  2020-03-02 19:46 ` [PATCH 2/2] nvme-pci: Remove two-pass completions Keith Busch
  0 siblings, 2 replies; 7+ messages in thread
From: Keith Busch @ 2020-03-02 19:46 UTC (permalink / raw)
  To: linux-nvme, hch, sagi; +Cc: Keith Busch, bijan.mottahedeh

This actually demonstrates a modest performance increase, and that's
always nice.

The first patch prepares for ringing the cq doorbell without the start/end
check; the second one is the real work.

Keith Busch (2):
  nvme-pci: Remove tag from process cq
  nvme-pci: Remove two-pass completions

 drivers/nvme/host/pci.c | 48 ++++++++++++-----------------------------
 1 file changed, 14 insertions(+), 34 deletions(-)

-- 
2.24.1



* [PATCH 1/2] nvme-pci: Remove tag from process cq
  2020-03-02 19:46 [PATCH 0/2] nvme-pci: Simpler completions Keith Busch
@ 2020-03-02 19:46 ` Keith Busch
  2020-03-04 16:05   ` Christoph Hellwig
  2020-03-02 19:46 ` [PATCH 2/2] nvme-pci: Remove two-pass completions Keith Busch
  1 sibling, 1 reply; 7+ messages in thread
From: Keith Busch @ 2020-03-02 19:46 UTC (permalink / raw)
  To: linux-nvme, hch, sagi; +Cc: Keith Busch, bijan.mottahedeh

The only user for tagged completion was for timeout handling. That user,
though, really only cares if the timed out command is completed, which
we can safely check within the timeout handler.

Remove the tag check to simplify completion handling.
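
Concretely, the timeout path goes from scanning the CQ for the timed out
tag to polling unconditionally and then asking blk-mq whether the request
has completed. A rough before/after fragment of just that check (the full
hunk is below):

	/* before: the CQ scan counted entries matching req->tag */
	if (nvme_poll_irqdisable(nvmeq, req->tag)) {
		...
	}

	/* after: poll everything, then check the request's own state */
	if (nvme_poll_irqdisable(nvmeq) &&
	    blk_mq_request_completed(req)) {
		...
	}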

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/nvme/host/pci.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index da392b50f73e..db84283f2a5a 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -991,14 +991,13 @@ static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
 }
 
 static inline int nvme_process_cq(struct nvme_queue *nvmeq, u16 *start,
-				  u16 *end, unsigned int tag)
+				  u16 *end)
 {
 	int found = 0;
 
 	*start = nvmeq->cq_head;
 	while (nvme_cqe_pending(nvmeq)) {
-		if (tag == -1U || nvmeq->cqes[nvmeq->cq_head].command_id == tag)
-			found++;
+		++found;
 		nvme_update_cq_head(nvmeq);
 	}
 	*end = nvmeq->cq_head;
@@ -1019,7 +1018,7 @@ static irqreturn_t nvme_irq(int irq, void *data)
 	 * the irq handler, even if that was on another CPU.
 	 */
 	rmb();
-	nvme_process_cq(nvmeq, &start, &end, -1);
+	nvme_process_cq(nvmeq, &start, &end);
 	wmb();
 
 	if (start != end) {
@@ -1042,7 +1041,7 @@ static irqreturn_t nvme_irq_check(int irq, void *data)
  * Poll for completions any queue, including those not dedicated to polling.
  * Can be called from any context.
  */
-static int nvme_poll_irqdisable(struct nvme_queue *nvmeq, unsigned int tag)
+static int nvme_poll_irqdisable(struct nvme_queue *nvmeq)
 {
 	struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
 	u16 start, end;
@@ -1055,11 +1054,11 @@ static int nvme_poll_irqdisable(struct nvme_queue *nvmeq, unsigned int tag)
 	 */
 	if (test_bit(NVMEQ_POLLED, &nvmeq->flags)) {
 		spin_lock(&nvmeq->cq_poll_lock);
-		found = nvme_process_cq(nvmeq, &start, &end, tag);
+		found = nvme_process_cq(nvmeq, &start, &end);
 		spin_unlock(&nvmeq->cq_poll_lock);
 	} else {
 		disable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
-		found = nvme_process_cq(nvmeq, &start, &end, tag);
+		found = nvme_process_cq(nvmeq, &start, &end);
 		enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
 	}
 
@@ -1077,7 +1076,7 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx)
 		return 0;
 
 	spin_lock(&nvmeq->cq_poll_lock);
-	found = nvme_process_cq(nvmeq, &start, &end, -1);
+	found = nvme_process_cq(nvmeq, &start, &end);
 	spin_unlock(&nvmeq->cq_poll_lock);
 
 	nvme_complete_cqes(nvmeq, start, end);
@@ -1255,7 +1254,8 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
 	/*
 	 * Did we miss an interrupt?
 	 */
-	if (nvme_poll_irqdisable(nvmeq, req->tag)) {
+	if (nvme_poll_irqdisable(nvmeq) &&
+	    blk_mq_request_completed(req)) {
 		dev_warn(dev->ctrl.device,
 			 "I/O %d QID %d timeout, completion polled\n",
 			 req->tag, nvmeq->qid);
@@ -1398,7 +1398,7 @@ static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown)
 	else
 		nvme_disable_ctrl(&dev->ctrl);
 
-	nvme_poll_irqdisable(nvmeq, -1);
+	nvme_poll_irqdisable(nvmeq);
 }
 
 static int nvme_cmb_qdepth(struct nvme_dev *dev, int nr_io_queues,
@@ -2238,7 +2238,7 @@ static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode)
 		/* handle any remaining CQEs */
 		if (opcode == nvme_admin_delete_cq &&
 		    !test_bit(NVMEQ_DELETE_ERROR, &nvmeq->flags))
-			nvme_poll_irqdisable(nvmeq, -1);
+			nvme_poll_irqdisable(nvmeq);
 
 		sent--;
 		if (nr_queues)
-- 
2.24.1



* [PATCH 2/2] nvme-pci: Remove two-pass completions
  2020-03-02 19:46 [PATCH 0/2] nvme-pci: Simpler completions Keith Busch
  2020-03-02 19:46 ` [PATCH 1/2] nvme-pci: Remove tag from process cq Keith Busch
@ 2020-03-02 19:46 ` Keith Busch
  2020-03-04 16:08   ` Christoph Hellwig
  2020-03-05 20:52   ` Sagi Grimberg
  1 sibling, 2 replies; 7+ messages in thread
From: Keith Busch @ 2020-03-02 19:46 UTC (permalink / raw)
  To: linux-nvme, hch, sagi; +Cc: Keith Busch, bijan.mottahedeh

Completion handling had been done in two steps: find all new completions
under a lock, then handle those completions outside the lock. This was
done to make the locked section as short as possible so that other
threads using the same lock wait less time.

The driver no longer shares locks during completion, and is in fact
lockless for interrupt-driven queues, so the optimization no longer
serves its original purpose. Replace the two-pass completion queue
handler with a single pass that completes entries immediately.
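
Purely to illustrate the shape of the change, here is a self-contained
userspace sketch over a mock ring (not driver code; the mock_* names and
the printf "doorbell" stand-ins are made up for the example):

#include <stdbool.h>
#include <stdio.h>

#define QD 8

struct mock_cq {
	bool pending[QD];	/* simplified stand-in for the CQE phase-bit check */
	int head;
};

static void handle_cqe(struct mock_cq *cq, int idx)
{
	printf("complete CQE in slot %d\n", idx);
	cq->pending[idx] = false;
}

static void update_head(struct mock_cq *cq)
{
	if (++cq->head == QD)
		cq->head = 0;
}

/* old scheme: first pass records [start, end), ring doorbell, then complete */
static void process_two_pass(struct mock_cq *cq)
{
	int start = cq->head, end;

	while (cq->pending[cq->head])
		update_head(cq);
	end = cq->head;
	if (start != end)
		printf("ring doorbell\n");
	while (start != end) {
		handle_cqe(cq, start);
		if (++start == QD)
			start = 0;
	}
}

/* new scheme: complete each entry as it is consumed, then ring once */
static void process_single_pass(struct mock_cq *cq)
{
	int found = 0;

	while (cq->pending[cq->head]) {
		found++;
		handle_cqe(cq, cq->head);
		update_head(cq);
	}
	if (found)
		printf("ring doorbell\n");
}

int main(void)
{
	struct mock_cq a = { .pending = { true, true, true } };
	struct mock_cq b = { .pending = { true, true, true } };

	process_two_pass(&a);
	process_single_pass(&b);
	return 0;
}

The observable result is the same; the single-pass variant just drops the
start/end bookkeeping and the second walk over the ring, and rings the
doorbell after handling the entries instead of before.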

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/nvme/host/pci.c | 36 ++++++++----------------------------
 1 file changed, 8 insertions(+), 28 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index db84283f2a5a..0e91b8a3b357 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -971,15 +971,6 @@ static inline void nvme_handle_cqe(struct nvme_queue *nvmeq, u16 idx)
 	nvme_end_request(req, cqe->status, cqe->result);
 }
 
-static void nvme_complete_cqes(struct nvme_queue *nvmeq, u16 start, u16 end)
-{
-	while (start != end) {
-		nvme_handle_cqe(nvmeq, start);
-		if (++start == nvmeq->q_depth)
-			start = 0;
-	}
-}
-
 static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
 {
 	if (nvmeq->cq_head == nvmeq->q_depth - 1) {
@@ -990,19 +981,17 @@ static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
 	}
 }
 
-static inline int nvme_process_cq(struct nvme_queue *nvmeq, u16 *start,
-				  u16 *end)
+static inline int nvme_process_cq(struct nvme_queue *nvmeq)
 {
 	int found = 0;
 
-	*start = nvmeq->cq_head;
 	while (nvme_cqe_pending(nvmeq)) {
 		++found;
+		nvme_handle_cqe(nvmeq, nvmeq->cq_head);
 		nvme_update_cq_head(nvmeq);
 	}
-	*end = nvmeq->cq_head;
 
-	if (*start != *end)
+	if (found)
 		nvme_ring_cq_doorbell(nvmeq);
 	return found;
 }
@@ -1011,21 +1000,16 @@ static irqreturn_t nvme_irq(int irq, void *data)
 {
 	struct nvme_queue *nvmeq = data;
 	irqreturn_t ret = IRQ_NONE;
-	u16 start, end;
 
 	/*
 	 * The rmb/wmb pair ensures we see all updates from a previous run of
 	 * the irq handler, even if that was on another CPU.
 	 */
 	rmb();
-	nvme_process_cq(nvmeq, &start, &end);
+	if (nvme_process_cq(nvmeq))
+		ret = IRQ_HANDLED;
 	wmb();
 
-	if (start != end) {
-		nvme_complete_cqes(nvmeq, start, end);
-		return IRQ_HANDLED;
-	}
-
 	return ret;
 }
 
@@ -1044,7 +1028,6 @@ static irqreturn_t nvme_irq_check(int irq, void *data)
 static int nvme_poll_irqdisable(struct nvme_queue *nvmeq)
 {
 	struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
-	u16 start, end;
 	int found;
 
 	/*
@@ -1054,32 +1037,29 @@ static int nvme_poll_irqdisable(struct nvme_queue *nvmeq)
 	 */
 	if (test_bit(NVMEQ_POLLED, &nvmeq->flags)) {
 		spin_lock(&nvmeq->cq_poll_lock);
-		found = nvme_process_cq(nvmeq, &start, &end);
+		found = nvme_process_cq(nvmeq);
 		spin_unlock(&nvmeq->cq_poll_lock);
 	} else {
 		disable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
-		found = nvme_process_cq(nvmeq, &start, &end);
+		found = nvme_process_cq(nvmeq);
 		enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
 	}
 
-	nvme_complete_cqes(nvmeq, start, end);
 	return found;
 }
 
 static int nvme_poll(struct blk_mq_hw_ctx *hctx)
 {
 	struct nvme_queue *nvmeq = hctx->driver_data;
-	u16 start, end;
 	bool found;
 
 	if (!nvme_cqe_pending(nvmeq))
 		return 0;
 
 	spin_lock(&nvmeq->cq_poll_lock);
-	found = nvme_process_cq(nvmeq, &start, &end);
+	found = nvme_process_cq(nvmeq);
 	spin_unlock(&nvmeq->cq_poll_lock);
 
-	nvme_complete_cqes(nvmeq, start, end);
 	return found;
 }
 
-- 
2.24.1



* Re: [PATCH 1/2] nvme-pci: Remove tag from process cq
  2020-03-02 19:46 ` [PATCH 1/2] nvme-pci: Remove tag from process cq Keith Busch
@ 2020-03-04 16:05   ` Christoph Hellwig
  2020-03-04 16:45     ` Keith Busch
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2020-03-04 16:05 UTC (permalink / raw)
  To: Keith Busch; +Cc: bijan.mottahedeh, hch, linux-nvme, sagi

On Mon, Mar 02, 2020 at 11:46:15AM -0800, Keith Busch wrote:
> The only user for tagged completion was for timeout handling. That user,
> though, really only cares if the timed out command is completed, which
> we can safely check within the timeout handler.
> 
> Remove the tag check to simplify completion handling.
> 
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  drivers/nvme/host/pci.c | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index da392b50f73e..db84283f2a5a 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -991,14 +991,13 @@ static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
>  }
>  
>  static inline int nvme_process_cq(struct nvme_queue *nvmeq, u16 *start,
> -				  u16 *end, unsigned int tag)
> +				  u16 *end)
>  {
>  	int found = 0;
>  
>  	*start = nvmeq->cq_head;
>  	while (nvme_cqe_pending(nvmeq)) {
> -		if (tag == -1U || nvmeq->cqes[nvmeq->cq_head].command_id == tag)
> -			found++;
> +		++found;

Any reason for moving from found++ to ++found?  I always find the prefix
notation a little odd when there is no need for it, but that is just a
personal preference.

>  	 */
> -	if (nvme_poll_irqdisable(nvmeq, req->tag)) {
> +	if (nvme_poll_irqdisable(nvmeq) &&
> +	    blk_mq_request_completed(req)) {

Do we need to check the nvme_poll_irqdisable value here?  If it got
completed by other means we should be fine too, shouldn't we?


* Re: [PATCH 2/2] nvme-pci: Remove two-pass completions
  2020-03-02 19:46 ` [PATCH 2/2] nvme-pci: Remove two-pass completions Keith Busch
@ 2020-03-04 16:08   ` Christoph Hellwig
  2020-03-05 20:52   ` Sagi Grimberg
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2020-03-04 16:08 UTC (permalink / raw)
  To: Keith Busch; +Cc: bijan.mottahedeh, hch, linux-nvme, sagi

> +static inline int nvme_process_cq(struct nvme_queue *nvmeq)
>  {
>  	int found = 0;
>  
>  	while (nvme_cqe_pending(nvmeq)) {
>  		++found;
> +		nvme_handle_cqe(nvmeq, nvmeq->cq_head);
>  		nvme_update_cq_head(nvmeq);
>  	}
>  
> +	if (found)
>  		nvme_ring_cq_doorbell(nvmeq);
>  	return found;
>  }
> @@ -1011,21 +1000,16 @@ static irqreturn_t nvme_irq(int irq, void *data)
>  {
>  	struct nvme_queue *nvmeq = data;
>  	irqreturn_t ret = IRQ_NONE;
> -	u16 start, end;
>  
>  	/*
>  	 * The rmb/wmb pair ensures we see all updates from a previous run of
>  	 * the irq handler, even if that was on another CPU.
>  	 */
>  	rmb();
> -	nvme_process_cq(nvmeq, &start, &end);
> +	if (nvme_process_cq(nvmeq))
> +		ret = IRQ_HANDLED;
>  	wmb();
>  
> -	if (start != end) {
> -		nvme_complete_cqes(nvmeq, start, end);
> -		return IRQ_HANDLED;
> -	}
> -
>  	return ret;
>  }
>  
> @@ -1044,7 +1028,6 @@ static irqreturn_t nvme_irq_check(int irq, void *data)
>  static int nvme_poll_irqdisable(struct nvme_queue *nvmeq)
>  {
>  	struct pci_dev *pdev = to_pci_dev(nvmeq->dev->dev);
> -	u16 start, end;
>  	int found;
>  
>  	/*
> @@ -1054,32 +1037,29 @@ static int nvme_poll_irqdisable(struct nvme_queue *nvmeq)
>  	 */
>  	if (test_bit(NVMEQ_POLLED, &nvmeq->flags)) {
>  		spin_lock(&nvmeq->cq_poll_lock);
> -		found = nvme_process_cq(nvmeq, &start, &end);
> +		found = nvme_process_cq(nvmeq);
>  		spin_unlock(&nvmeq->cq_poll_lock);
>  	} else {
>  		disable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
> -		found = nvme_process_cq(nvmeq, &start, &end);
> +		found = nvme_process_cq(nvmeq);
>  		enable_irq(pci_irq_vector(pdev, nvmeq->cq_vector));
>  	}
>  
> -	nvme_complete_cqes(nvmeq, start, end);
>  	return found;

I think we can just make nvme_timeout call nvme_poll for the
polled case, and handle the irq-driven case in this function
(should probably be another patch on top of this one).
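
Roughly (untested, and assuming req->mq_hctx is usable from the timeout
handler), something like:

	/* in nvme_timeout(): pick the poller based on the queue type */
	if (test_bit(NVMEQ_POLLED, &nvmeq->flags))
		nvme_poll(req->mq_hctx);
	else
		nvme_poll_irqdisable(nvmeq);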


* Re: [PATCH 1/2] nvme-pci: Remove tag from process cq
  2020-03-04 16:05   ` Christoph Hellwig
@ 2020-03-04 16:45     ` Keith Busch
  0 siblings, 0 replies; 7+ messages in thread
From: Keith Busch @ 2020-03-04 16:45 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: sagi, linux-nvme, bijan.mottahedeh

On Wed, Mar 04, 2020 at 05:05:18PM +0100, Christoph Hellwig wrote:
> On Mon, Mar 02, 2020 at 11:46:15AM -0800, Keith Busch wrote:
> > The only user for tagged completion was for timeout handling. That user,
> > though, really only cares if the timed out command is completed, which
> > we can safely check within the timeout handler.
> > 
> > Remove the tag check to simplify completion handling.
> > 
> > Signed-off-by: Keith Busch <kbusch@kernel.org>
> > ---
> >  drivers/nvme/host/pci.c | 22 +++++++++++-----------
> >  1 file changed, 11 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> > index da392b50f73e..db84283f2a5a 100644
> > --- a/drivers/nvme/host/pci.c
> > +++ b/drivers/nvme/host/pci.c
> > @@ -991,14 +991,13 @@ static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
> >  }
> >  
> >  static inline int nvme_process_cq(struct nvme_queue *nvmeq, u16 *start,
> > -				  u16 *end, unsigned int tag)
> > +				  u16 *end)
> >  {
> >  	int found = 0;
> >  
> >  	*start = nvmeq->cq_head;
> >  	while (nvme_cqe_pending(nvmeq)) {
> > -		if (tag == -1U || nvmeq->cqes[nvmeq->cq_head].command_id == tag)
> > -			found++;
> > +		++found;
> 
> Any reason for moving from found++ to ++found?  I always find the prefix
> notation a little odd when there is no need for it, but that is just a
> personal preference.

Heh, not intentional. I will keep it the same as before.

> > -	if (nvme_poll_irqdisable(nvmeq, req->tag)) {
> > +	if (nvme_poll_irqdisable(nvmeq) &&
> > +	    blk_mq_request_completed(req)) {
> 
> Do we need to check the nvme_poll_irqdisable value here?  If it got
> completed by other means we should be fine too, shouldn't we?

Right, we really don't care how a request was completed. We just want to
know if it has completed.

So I'll change it to this:

	nvme_poll_irqdisable(nvmeq);
	if (blk_mq_request_completed(req)) {
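		/* the rest of this branch presumably unchanged: */
		dev_warn(dev->ctrl.device,
			 "I/O %d QID %d timeout, completion polled\n",
			 req->tag, nvmeq->qid);
		return BLK_EH_DONE;
	}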


* Re: [PATCH 2/2] nvme-pci: Remove two-pass completions
  2020-03-02 19:46 ` [PATCH 2/2] nvme-pci: Remove two-pass completions Keith Busch
  2020-03-04 16:08   ` Christoph Hellwig
@ 2020-03-05 20:52   ` Sagi Grimberg
  1 sibling, 0 replies; 7+ messages in thread
From: Sagi Grimberg @ 2020-03-05 20:52 UTC (permalink / raw)
  To: Keith Busch, linux-nvme, hch; +Cc: bijan.mottahedeh

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
