linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET v2 0/4] Add support for list issue
@ 2021-12-03 21:45 Jens Axboe
  2021-12-03 21:45 ` [PATCH 1/4] block: add mq_ops->queue_rqs hook Jens Axboe
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: Jens Axboe @ 2021-12-03 21:45 UTC (permalink / raw)
  To: linux-block, linux-nvme

Hi,

With the support in 5.16-rc1 for allocating and completing batches of
IO, the one missing piece is passing down a list of requests for issue.
Drivers can take advantage of this by defining an mq_ops->queue_rqs()
hook.

This implements it for NVMe, allowing copy of multiple commands in one
swoop.

This is good for around a 500K IOPS/core improvement in my testing,
which is around a 5-6% improvement in efficiency.

Changes since v1:

- Addressed review comments
- Rebase on top of Ming's hctx lock change
- Clean ups
- Bypass for shared tags

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/4] block: add mq_ops->queue_rqs hook
  2021-12-03 21:45 [PATCHSET v2 0/4] Add support for list issue Jens Axboe
@ 2021-12-03 21:45 ` Jens Axboe
  2021-12-04 10:43   ` Hannes Reinecke
  2021-12-03 21:45 ` [PATCH 2/4] nvme: split command copy into a helper Jens Axboe
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2021-12-03 21:45 UTC (permalink / raw)
  To: linux-block, linux-nvme; +Cc: Jens Axboe

If we have a list of requests in our plug list, send it to the driver in
one go, if possible. The driver must set mq_ops->queue_rqs() to support
this, if not the usual one-by-one path is used.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 block/blk-mq.c         | 24 +++++++++++++++++++++---
 include/linux/blk-mq.h |  8 ++++++++
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 22ec21aa0c22..9ac9174a2ba4 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2513,6 +2513,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 {
 	struct blk_mq_hw_ctx *this_hctx;
 	struct blk_mq_ctx *this_ctx;
+	struct request *rq;
 	unsigned int depth;
 	LIST_HEAD(list);
 
@@ -2521,7 +2522,26 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	plug->rq_count = 0;
 
 	if (!plug->multiple_queues && !plug->has_elevator && !from_schedule) {
-		blk_mq_run_dispatch_ops(plug->mq_list->q,
+		struct request_queue *q;
+
+		rq = plug->mq_list;
+		q = rq->q;
+
+		/*
+		 * Peek first request and see if we have a ->queue_rqs() hook.
+		 * If we do, we can dispatch the whole plug list in one go. We
+		 * already know at this point that all requests belong to the
+		 * same queue, caller must ensure that's the case.
+		 */
+		if (q->mq_ops->queue_rqs &&
+		    !(rq->mq_hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
+			blk_mq_run_dispatch_ops(q,
+				q->mq_ops->queue_rqs(&plug->mq_list));
+			if (rq_list_empty(plug->mq_list))
+				return;
+		}
+
+		blk_mq_run_dispatch_ops(q,
 				blk_mq_plug_issue_direct(plug, false));
 		if (rq_list_empty(plug->mq_list))
 			return;
@@ -2531,8 +2551,6 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	this_ctx = NULL;
 	depth = 0;
 	do {
-		struct request *rq;
-
 		rq = rq_list_pop(&plug->mq_list);
 
 		if (!this_hctx) {
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index ecdc049b52fa..cdd183757ea0 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -494,6 +494,14 @@ struct blk_mq_ops {
 	 */
 	void (*commit_rqs)(struct blk_mq_hw_ctx *);
 
+	/**
+	 * @queue_rqs: Queue a list of new requests. Driver is guaranteed
+	 * that each request belongs to the same queue. If the driver doesn't
+	 * empty the @rqlist completely, then the rest will be queued
+	 * individually by the block layer upon return.
+	 */
+	void (*queue_rqs)(struct request **rqlist);
+
 	/**
 	 * @get_budget: Reserve budget before queue request, once .queue_rq is
 	 * run, it is driver's responsibility to release the
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/4] nvme: split command copy into a helper
  2021-12-03 21:45 [PATCHSET v2 0/4] Add support for list issue Jens Axboe
  2021-12-03 21:45 ` [PATCH 1/4] block: add mq_ops->queue_rqs hook Jens Axboe
@ 2021-12-03 21:45 ` Jens Axboe
  2021-12-04 10:44   ` Hannes Reinecke
  2021-12-06  7:38   ` Christoph Hellwig
  2021-12-03 21:45 ` [PATCH 3/4] nvme: separate command prep and issue Jens Axboe
  2021-12-03 21:45 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
  3 siblings, 2 replies; 24+ messages in thread
From: Jens Axboe @ 2021-12-03 21:45 UTC (permalink / raw)
  To: linux-block, linux-nvme; +Cc: Jens Axboe, Chaitanya Kulkarni

We'll need it for batched submit as well.

Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 8637538f3fd5..09ea21f75439 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -500,6 +500,15 @@ static inline void nvme_write_sq_db(struct nvme_queue *nvmeq, bool write_sq)
 	nvmeq->last_sq_tail = nvmeq->sq_tail;
 }
 
+static inline void nvme_sq_copy_cmd(struct nvme_queue *nvmeq,
+				    struct nvme_command *cmd)
+{
+	memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes), cmd,
+		sizeof(*cmd));
+	if (++nvmeq->sq_tail == nvmeq->q_depth)
+		nvmeq->sq_tail = 0;
+}
+
 /**
  * nvme_submit_cmd() - Copy a command into a queue and ring the doorbell
  * @nvmeq: The queue to use
@@ -510,10 +519,7 @@ static void nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd,
 			    bool write_sq)
 {
 	spin_lock(&nvmeq->sq_lock);
-	memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes),
-	       cmd, sizeof(*cmd));
-	if (++nvmeq->sq_tail == nvmeq->q_depth)
-		nvmeq->sq_tail = 0;
+	nvme_sq_copy_cmd(nvmeq, cmd);
 	nvme_write_sq_db(nvmeq, write_sq);
 	spin_unlock(&nvmeq->sq_lock);
 }
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/4] nvme: separate command prep and issue
  2021-12-03 21:45 [PATCHSET v2 0/4] Add support for list issue Jens Axboe
  2021-12-03 21:45 ` [PATCH 1/4] block: add mq_ops->queue_rqs hook Jens Axboe
  2021-12-03 21:45 ` [PATCH 2/4] nvme: split command copy into a helper Jens Axboe
@ 2021-12-03 21:45 ` Jens Axboe
  2021-12-04 10:45   ` Hannes Reinecke
  2021-12-03 21:45 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
  3 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2021-12-03 21:45 UTC (permalink / raw)
  To: linux-block, linux-nvme; +Cc: Jens Axboe

Add a nvme_prep_rq() helper to setup a command, and nvme_queue_rq() is
adapted to use this helper.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 57 ++++++++++++++++++++++++-----------------
 1 file changed, 33 insertions(+), 24 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 09ea21f75439..6be6b1ab4285 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -918,52 +918,32 @@ static blk_status_t nvme_map_metadata(struct nvme_dev *dev, struct request *req,
 	return BLK_STS_OK;
 }
 
-/*
- * NOTE: ns is NULL when called on the admin queue.
- */
-static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
-			 const struct blk_mq_queue_data *bd)
+static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
 {
-	struct nvme_ns *ns = hctx->queue->queuedata;
-	struct nvme_queue *nvmeq = hctx->driver_data;
-	struct nvme_dev *dev = nvmeq->dev;
-	struct request *req = bd->rq;
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-	struct nvme_command *cmnd = &iod->cmd;
 	blk_status_t ret;
 
 	iod->aborted = 0;
 	iod->npages = -1;
 	iod->nents = 0;
 
-	/*
-	 * We should not need to do this, but we're still using this to
-	 * ensure we can drain requests on a dying queue.
-	 */
-	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
-		return BLK_STS_IOERR;
-
-	if (!nvme_check_ready(&dev->ctrl, req, true))
-		return nvme_fail_nonready_command(&dev->ctrl, req);
-
-	ret = nvme_setup_cmd(ns, req);
+	ret = nvme_setup_cmd(req->q->queuedata, req);
 	if (ret)
 		return ret;
 
 	if (blk_rq_nr_phys_segments(req)) {
-		ret = nvme_map_data(dev, req, cmnd);
+		ret = nvme_map_data(dev, req, &iod->cmd);
 		if (ret)
 			goto out_free_cmd;
 	}
 
 	if (blk_integrity_rq(req)) {
-		ret = nvme_map_metadata(dev, req, cmnd);
+		ret = nvme_map_metadata(dev, req, &iod->cmd);
 		if (ret)
 			goto out_unmap_data;
 	}
 
 	blk_mq_start_request(req);
-	nvme_submit_cmd(nvmeq, cmnd, bd->last);
 	return BLK_STS_OK;
 out_unmap_data:
 	nvme_unmap_data(dev, req);
@@ -972,6 +952,35 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
+/*
+ * NOTE: ns is NULL when called on the admin queue.
+ */
+static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
+			 const struct blk_mq_queue_data *bd)
+{
+	struct nvme_queue *nvmeq = hctx->driver_data;
+	struct nvme_dev *dev = nvmeq->dev;
+	struct request *req = bd->rq;
+	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+	blk_status_t ret;
+
+	/*
+	 * We should not need to do this, but we're still using this to
+	 * ensure we can drain requests on a dying queue.
+	 */
+	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
+		return BLK_STS_IOERR;
+
+	if (unlikely(!nvme_check_ready(&dev->ctrl, req, true)))
+		return nvme_fail_nonready_command(&dev->ctrl, req);
+
+	ret = nvme_prep_rq(dev, req);
+	if (unlikely(ret))
+		return ret;
+	nvme_submit_cmd(nvmeq, &iod->cmd, bd->last);
+	return BLK_STS_OK;
+}
+
 static __always_inline void nvme_pci_unmap_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-12-03 21:45 [PATCHSET v2 0/4] Add support for list issue Jens Axboe
                   ` (2 preceding siblings ...)
  2021-12-03 21:45 ` [PATCH 3/4] nvme: separate command prep and issue Jens Axboe
@ 2021-12-03 21:45 ` Jens Axboe
  2021-12-04 10:47   ` Hannes Reinecke
  2021-12-06  7:40   ` Christoph Hellwig
  3 siblings, 2 replies; 24+ messages in thread
From: Jens Axboe @ 2021-12-03 21:45 UTC (permalink / raw)
  To: linux-block, linux-nvme; +Cc: Jens Axboe

This enables the block layer to send us a full plug list of requests
that need submitting. The block layer guarantees that they all belong
to the same queue, but we do have to check the hardware queue mapping
for each request.

If errors are encountered, leave them in the passed in list. Then the
block layer will handle them individually.

This is good for about a 4% improvement in peak performance, taking us
from 9.6M to 10M IOPS/core.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 61 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 61 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6be6b1ab4285..197aa45ef7ef 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -981,6 +981,66 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return BLK_STS_OK;
 }
 
+static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
+{
+	spin_lock(&nvmeq->sq_lock);
+	while (!rq_list_empty(*rqlist)) {
+		struct request *req = rq_list_pop(rqlist);
+		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+
+		memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes),
+				absolute_pointer(&iod->cmd), sizeof(iod->cmd));
+		if (++nvmeq->sq_tail == nvmeq->q_depth)
+			nvmeq->sq_tail = 0;
+	}
+	nvme_write_sq_db(nvmeq, true);
+	spin_unlock(&nvmeq->sq_lock);
+}
+
+static bool nvme_prep_rq_batch(struct nvme_queue *nvmeq, struct request *req)
+{
+	/*
+	 * We should not need to do this, but we're still using this to
+	 * ensure we can drain requests on a dying queue.
+	 */
+	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
+		return false;
+	if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true)))
+		return false;
+
+	req->mq_hctx->tags->rqs[req->tag] = req;
+	return nvme_prep_rq(nvmeq->dev, req) == BLK_STS_OK;
+}
+
+static void nvme_queue_rqs(struct request **rqlist)
+{
+	struct request *req = rq_list_peek(rqlist), *prev = NULL;
+	struct request *requeue_list = NULL;
+
+	do {
+		struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
+
+		if (!nvme_prep_rq_batch(nvmeq, req)) {
+			/* detach 'req' and add to remainder list */
+			if (prev)
+				prev->rq_next = req->rq_next;
+			rq_list_add(&requeue_list, req);
+		} else {
+			prev = req;
+		}
+
+		req = rq_list_next(req);
+		if (!req || (prev && req->mq_hctx != prev->mq_hctx)) {
+			/* detach rest of list, and submit */
+			prev->rq_next = NULL;
+			nvme_submit_cmds(nvmeq, rqlist);
+			*rqlist = req;
+		}
+	} while (req);
+
+	*rqlist = requeue_list;
+}
+
 static __always_inline void nvme_pci_unmap_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -1678,6 +1738,7 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
 
 static const struct blk_mq_ops nvme_mq_ops = {
 	.queue_rq	= nvme_queue_rq,
+	.queue_rqs	= nvme_queue_rqs,
 	.complete	= nvme_pci_complete_rq,
 	.commit_rqs	= nvme_commit_rqs,
 	.init_hctx	= nvme_init_hctx,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] block: add mq_ops->queue_rqs hook
  2021-12-03 21:45 ` [PATCH 1/4] block: add mq_ops->queue_rqs hook Jens Axboe
@ 2021-12-04 10:43   ` Hannes Reinecke
  2021-12-04 20:13     ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2021-12-04 10:43 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme

On 12/3/21 10:45 PM, Jens Axboe wrote:
> If we have a list of requests in our plug list, send it to the driver in
> one go, if possible. The driver must set mq_ops->queue_rqs() to support
> this, if not the usual one-by-one path is used.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   block/blk-mq.c         | 24 +++++++++++++++++++++---
>   include/linux/blk-mq.h |  8 ++++++++
>   2 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 22ec21aa0c22..9ac9174a2ba4 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2513,6 +2513,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>   {
>   	struct blk_mq_hw_ctx *this_hctx;
>   	struct blk_mq_ctx *this_ctx;
> +	struct request *rq;
>   	unsigned int depth;
>   	LIST_HEAD(list);
>   
> @@ -2521,7 +2522,26 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>   	plug->rq_count = 0;
>   
>   	if (!plug->multiple_queues && !plug->has_elevator && !from_schedule) {
> -		blk_mq_run_dispatch_ops(plug->mq_list->q,
> +		struct request_queue *q;
> +
> +		rq = plug->mq_list;
> +		q = rq->q;
> +
> +		/*
> +		 * Peek first request and see if we have a ->queue_rqs() hook.
> +		 * If we do, we can dispatch the whole plug list in one go. We
> +		 * already know at this point that all requests belong to the
> +		 * same queue, caller must ensure that's the case.
> +		 */
> +		if (q->mq_ops->queue_rqs &&
> +		    !(rq->mq_hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {

What is the dependency on shared tags here?
 From what I've seen it's just about submitting requests; the only 
difference to shared tags is the way the tags are allocated.
Care to explain?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] nvme: split command copy into a helper
  2021-12-03 21:45 ` [PATCH 2/4] nvme: split command copy into a helper Jens Axboe
@ 2021-12-04 10:44   ` Hannes Reinecke
  2021-12-06  7:38   ` Christoph Hellwig
  1 sibling, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2021-12-04 10:44 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme; +Cc: Chaitanya Kulkarni

On 12/3/21 10:45 PM, Jens Axboe wrote:
> We'll need it for batched submit as well.
> 
> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/nvme/host/pci.c | 14 ++++++++++----
>   1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 8637538f3fd5..09ea21f75439 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -500,6 +500,15 @@ static inline void nvme_write_sq_db(struct nvme_queue *nvmeq, bool write_sq)
>   	nvmeq->last_sq_tail = nvmeq->sq_tail;
>   }
>   
> +static inline void nvme_sq_copy_cmd(struct nvme_queue *nvmeq,
> +				    struct nvme_command *cmd)
> +{
> +	memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes), cmd,
> +		sizeof(*cmd));
> +	if (++nvmeq->sq_tail == nvmeq->q_depth)
> +		nvmeq->sq_tail = 0;
> +}
> +
>   /**
>    * nvme_submit_cmd() - Copy a command into a queue and ring the doorbell
>    * @nvmeq: The queue to use
> @@ -510,10 +519,7 @@ static void nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd,
>   			    bool write_sq)
>   {
>   	spin_lock(&nvmeq->sq_lock);
> -	memcpy(nvmeq->sq_cmds + (nvmeq->sq_tail << nvmeq->sqes),
> -	       cmd, sizeof(*cmd));
> -	if (++nvmeq->sq_tail == nvmeq->q_depth)
> -		nvmeq->sq_tail = 0;
> +	nvme_sq_copy_cmd(nvmeq, cmd);
>   	nvme_write_sq_db(nvmeq, write_sq);
>   	spin_unlock(&nvmeq->sq_lock);
>   }
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] nvme: separate command prep and issue
  2021-12-03 21:45 ` [PATCH 3/4] nvme: separate command prep and issue Jens Axboe
@ 2021-12-04 10:45   ` Hannes Reinecke
  0 siblings, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2021-12-04 10:45 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme

On 12/3/21 10:45 PM, Jens Axboe wrote:
> Add a nvme_prep_rq() helper to setup a command, and nvme_queue_rq() is
> adapted to use this helper.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/nvme/host/pci.c | 57 ++++++++++++++++++++++++-----------------
>   1 file changed, 33 insertions(+), 24 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-12-03 21:45 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
@ 2021-12-04 10:47   ` Hannes Reinecke
  2021-12-06  7:40   ` Christoph Hellwig
  1 sibling, 0 replies; 24+ messages in thread
From: Hannes Reinecke @ 2021-12-04 10:47 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme

On 12/3/21 10:45 PM, Jens Axboe wrote:
> This enables the block layer to send us a full plug list of requests
> that need submitting. The block layer guarantees that they all belong
> to the same queue, but we do have to check the hardware queue mapping
> for each request.
> 
> If errors are encountered, leave them in the passed in list. Then the
> block layer will handle them individually.
> 
> This is good for about a 4% improvement in peak performance, taking us
> from 9.6M to 10M IOPS/core.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>   drivers/nvme/host/pci.c | 61 +++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 61 insertions(+)
> 
Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] block: add mq_ops->queue_rqs hook
  2021-12-04 10:43   ` Hannes Reinecke
@ 2021-12-04 20:13     ` Jens Axboe
  2021-12-05  9:07       ` Hannes Reinecke
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2021-12-04 20:13 UTC (permalink / raw)
  To: Hannes Reinecke, linux-block, linux-nvme

On 12/4/21 3:43 AM, Hannes Reinecke wrote:
> On 12/3/21 10:45 PM, Jens Axboe wrote:
>> If we have a list of requests in our plug list, send it to the driver in
>> one go, if possible. The driver must set mq_ops->queue_rqs() to support
>> this, if not the usual one-by-one path is used.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>   block/blk-mq.c         | 24 +++++++++++++++++++++---
>>   include/linux/blk-mq.h |  8 ++++++++
>>   2 files changed, 29 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 22ec21aa0c22..9ac9174a2ba4 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2513,6 +2513,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>>   {
>>   	struct blk_mq_hw_ctx *this_hctx;
>>   	struct blk_mq_ctx *this_ctx;
>> +	struct request *rq;
>>   	unsigned int depth;
>>   	LIST_HEAD(list);
>>   
>> @@ -2521,7 +2522,26 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>>   	plug->rq_count = 0;
>>   
>>   	if (!plug->multiple_queues && !plug->has_elevator && !from_schedule) {
>> -		blk_mq_run_dispatch_ops(plug->mq_list->q,
>> +		struct request_queue *q;
>> +
>> +		rq = plug->mq_list;
>> +		q = rq->q;
>> +
>> +		/*
>> +		 * Peek first request and see if we have a ->queue_rqs() hook.
>> +		 * If we do, we can dispatch the whole plug list in one go. We
>> +		 * already know at this point that all requests belong to the
>> +		 * same queue, caller must ensure that's the case.
>> +		 */
>> +		if (q->mq_ops->queue_rqs &&
>> +		    !(rq->mq_hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
> 
> What is the dependency on shared tags here?
>  From what I've seen it's just about submitting requests; the only 
> difference to shared tags is the way the tags are allocated.
> Care to explain?

For shared tags, we need to actively increment the use count per
request. This path doesn't do that, so it's disabled for now. It could
be done, but then it'd have to be in the caller, so I'd rather leave it
for a future optimization if anyone cares enough about this for shared
tags.

I can add a comment about it if that helps.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] block: add mq_ops->queue_rqs hook
  2021-12-04 20:13     ` Jens Axboe
@ 2021-12-05  9:07       ` Hannes Reinecke
  2021-12-05 13:09         ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Hannes Reinecke @ 2021-12-05  9:07 UTC (permalink / raw)
  To: Jens Axboe, linux-block, linux-nvme

On 12/4/21 9:13 PM, Jens Axboe wrote:
> On 12/4/21 3:43 AM, Hannes Reinecke wrote:
>> On 12/3/21 10:45 PM, Jens Axboe wrote:
>>> If we have a list of requests in our plug list, send it to the driver in
>>> one go, if possible. The driver must set mq_ops->queue_rqs() to support
>>> this, if not the usual one-by-one path is used.
>>>
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>> ---
>>>    block/blk-mq.c         | 24 +++++++++++++++++++++---
>>>    include/linux/blk-mq.h |  8 ++++++++
>>>    2 files changed, 29 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 22ec21aa0c22..9ac9174a2ba4 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -2513,6 +2513,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>>>    {
>>>    	struct blk_mq_hw_ctx *this_hctx;
>>>    	struct blk_mq_ctx *this_ctx;
>>> +	struct request *rq;
>>>    	unsigned int depth;
>>>    	LIST_HEAD(list);
>>>    
>>> @@ -2521,7 +2522,26 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>>>    	plug->rq_count = 0;
>>>    
>>>    	if (!plug->multiple_queues && !plug->has_elevator && !from_schedule) {
>>> -		blk_mq_run_dispatch_ops(plug->mq_list->q,
>>> +		struct request_queue *q;
>>> +
>>> +		rq = plug->mq_list;
>>> +		q = rq->q;
>>> +
>>> +		/*
>>> +		 * Peek first request and see if we have a ->queue_rqs() hook.
>>> +		 * If we do, we can dispatch the whole plug list in one go. We
>>> +		 * already know at this point that all requests belong to the
>>> +		 * same queue, caller must ensure that's the case.
>>> +		 */
>>> +		if (q->mq_ops->queue_rqs &&
>>> +		    !(rq->mq_hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
>>
>> What is the dependency on shared tags here?
>>   From what I've seen it's just about submitting requests; the only
>> difference to shared tags is the way the tags are allocated.
>> Care to explain?
> 
> For shared tags, we need to actively increment the use count per
> request. This path doesn't do that, so it's disabled for now. It could
> be done, but then it'd have to be in the caller, so I'd rather leave it
> for a future optimization if anyone cares enough about this for shared
> tags.
> 
> I can add a comment about it if that helps.
> 
Please do.
It'll act as a reminder what needs to be done if and when one of the 
drivers requiring shared tags is looking at implementing queue_rqs.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/4] block: add mq_ops->queue_rqs hook
  2021-12-05  9:07       ` Hannes Reinecke
@ 2021-12-05 13:09         ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2021-12-05 13:09 UTC (permalink / raw)
  To: Hannes Reinecke, linux-block, linux-nvme

On 12/5/21 2:07 AM, Hannes Reinecke wrote:
> On 12/4/21 9:13 PM, Jens Axboe wrote:
>> On 12/4/21 3:43 AM, Hannes Reinecke wrote:
>>> On 12/3/21 10:45 PM, Jens Axboe wrote:
>>>> If we have a list of requests in our plug list, send it to the driver in
>>>> one go, if possible. The driver must set mq_ops->queue_rqs() to support
>>>> this, if not the usual one-by-one path is used.
>>>>
>>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>>> ---
>>>>    block/blk-mq.c         | 24 +++++++++++++++++++++---
>>>>    include/linux/blk-mq.h |  8 ++++++++
>>>>    2 files changed, 29 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>>> index 22ec21aa0c22..9ac9174a2ba4 100644
>>>> --- a/block/blk-mq.c
>>>> +++ b/block/blk-mq.c
>>>> @@ -2513,6 +2513,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>>>>    {
>>>>    	struct blk_mq_hw_ctx *this_hctx;
>>>>    	struct blk_mq_ctx *this_ctx;
>>>> +	struct request *rq;
>>>>    	unsigned int depth;
>>>>    	LIST_HEAD(list);
>>>>    
>>>> @@ -2521,7 +2522,26 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>>>>    	plug->rq_count = 0;
>>>>    
>>>>    	if (!plug->multiple_queues && !plug->has_elevator && !from_schedule) {
>>>> -		blk_mq_run_dispatch_ops(plug->mq_list->q,
>>>> +		struct request_queue *q;
>>>> +
>>>> +		rq = plug->mq_list;
>>>> +		q = rq->q;
>>>> +
>>>> +		/*
>>>> +		 * Peek first request and see if we have a ->queue_rqs() hook.
>>>> +		 * If we do, we can dispatch the whole plug list in one go. We
>>>> +		 * already know at this point that all requests belong to the
>>>> +		 * same queue, caller must ensure that's the case.
>>>> +		 */
>>>> +		if (q->mq_ops->queue_rqs &&
>>>> +		    !(rq->mq_hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) {
>>>
>>> What is the dependency on shared tags here?
>>>   From what I've seen it's just about submitting requests; the only
>>> difference to shared tags is the way the tags are allocated.
>>> Care to explain?
>>
>> For shared tags, we need to actively increment the use count per
>> request. This path doesn't do that, so it's disabled for now. It could
>> be done, but then it'd have to be in the caller, so I'd rather leave it
>> for a future optimization if anyone cares enough about this for shared
>> tags.
>>
>> I can add a comment about it if that helps.
>>
> Please do.
> It'll act as a reminder what needs to be done if and when one of the 
> drivers requiring shared tags is looking at implementing queue_rqs.

I added to the comment yesterday:

https://git.kernel.dk/cgit/linux-block/commit/?h=perf-wip&id=f9f526700607bf804fa8541c824ea54253f4241a

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] nvme: split command copy into a helper
  2021-12-03 21:45 ` [PATCH 2/4] nvme: split command copy into a helper Jens Axboe
  2021-12-04 10:44   ` Hannes Reinecke
@ 2021-12-06  7:38   ` Christoph Hellwig
  2021-12-06 16:33     ` Jens Axboe
  1 sibling, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2021-12-06  7:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-nvme, Chaitanya Kulkarni

As mentioned last time I really don't like the proliferation of
tiny helpers with just two callers.  See my patch I responded with
for my alternative.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-12-03 21:45 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
  2021-12-04 10:47   ` Hannes Reinecke
@ 2021-12-06  7:40   ` Christoph Hellwig
  2021-12-06 16:33     ` Jens Axboe
  1 sibling, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2021-12-06  7:40 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, linux-nvme

On Fri, Dec 03, 2021 at 02:45:44PM -0700, Jens Axboe wrote:
> This enables the block layer to send us a full plug list of requests
> that need submitting. The block layer guarantees that they all belong
> to the same queue, but we do have to check the hardware queue mapping
> for each request.
> 
> If errors are encountered, leave them in the passed in list. Then the
> block layer will handle them individually.
> 
> This is good for about a 4% improvement in peak performance, taking us
> from 9.6M to 10M IOPS/core.

This looks pretty similar to my proposed cleanups (which is nice), but
back then you mentioned the cleaner version was much slower.  Do you
know what brought the speed back in this version?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] nvme: split command copy into a helper
  2021-12-06  7:38   ` Christoph Hellwig
@ 2021-12-06 16:33     ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2021-12-06 16:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block, linux-nvme, Chaitanya Kulkarni

On 12/6/21 12:38 AM, Christoph Hellwig wrote:
> As mentioned last time I really don't like the proliferation of
> tiny helpers with just two callers.  See my patch I responded with
> for my alternative.

I tend to agree with you, but for this particular case I really don't
think it's better to not have the helper. I did try that version too,
but you end up with several sites duplicating it. So while I agree in
general, I don't think this one applies.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-12-06  7:40   ` Christoph Hellwig
@ 2021-12-06 16:33     ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2021-12-06 16:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block, linux-nvme

On 12/6/21 12:40 AM, Christoph Hellwig wrote:
> On Fri, Dec 03, 2021 at 02:45:44PM -0700, Jens Axboe wrote:
>> This enables the block layer to send us a full plug list of requests
>> that need submitting. The block layer guarantees that they all belong
>> to the same queue, but we do have to check the hardware queue mapping
>> for each request.
>>
>> If errors are encountered, leave them in the passed in list. Then the
>> block layer will handle them individually.
>>
>> This is good for about a 4% improvement in peak performance, taking us
>> from 9.6M to 10M IOPS/core.
> 
> This looks pretty similar to my proposed cleanups (which is nice), but
> back then you mentioned the cleaner version was much slower.  Do you
> know what brought the speed back in this version?

Yes, it has that folded in and tweaked on top. Current version seems
to be fine.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-12-16 16:39 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
@ 2021-12-16 17:53   ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2021-12-16 17:53 UTC (permalink / raw)
  To: Jens Axboe
  Cc: io-uring, linux-block, linux-nvme, Hannes Reinecke, Keith Busch

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-12-16 16:38 [PATCHSET v5 0/4] Add support for list issue Jens Axboe
@ 2021-12-16 16:39 ` Jens Axboe
  2021-12-16 17:53   ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2021-12-16 16:39 UTC (permalink / raw)
  To: io-uring, linux-block, linux-nvme
  Cc: Jens Axboe, Hannes Reinecke, Keith Busch

This enables the block layer to send us a full plug list of requests
that need submitting. The block layer guarantees that they all belong
to the same queue, but we do have to check the hardware queue mapping
for each request.

If errors are encountered, leave them in the passed in list. Then the
block layer will handle them individually.

This is good for about a 4% improvement in peak performance, taking us
from 9.6M to 10M IOPS/core.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 59 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 59 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 7062128c8204..51a903d91d92 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -969,6 +969,64 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return BLK_STS_OK;
 }
 
+static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
+{
+	spin_lock(&nvmeq->sq_lock);
+	while (!rq_list_empty(*rqlist)) {
+		struct request *req = rq_list_pop(rqlist);
+		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+
+		nvme_sq_copy_cmd(nvmeq, &iod->cmd);
+	}
+	nvme_write_sq_db(nvmeq, true);
+	spin_unlock(&nvmeq->sq_lock);
+}
+
+static bool nvme_prep_rq_batch(struct nvme_queue *nvmeq, struct request *req)
+{
+	/*
+	 * We should not need to do this, but we're still using this to
+	 * ensure we can drain requests on a dying queue.
+	 */
+	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
+		return false;
+	if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true)))
+		return false;
+
+	req->mq_hctx->tags->rqs[req->tag] = req;
+	return nvme_prep_rq(nvmeq->dev, req) == BLK_STS_OK;
+}
+
+static void nvme_queue_rqs(struct request **rqlist)
+{
+	struct request *req = rq_list_peek(rqlist), *prev = NULL;
+	struct request *requeue_list = NULL;
+
+	do {
+		struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
+
+		if (!nvme_prep_rq_batch(nvmeq, req)) {
+			/* detach 'req' and add to remainder list */
+			if (prev)
+				prev->rq_next = req->rq_next;
+			rq_list_add(&requeue_list, req);
+		} else {
+			prev = req;
+		}
+
+		req = rq_list_next(req);
+		if (!req || (prev && req->mq_hctx != prev->mq_hctx)) {
+			/* detach rest of list, and submit */
+			if (prev)
+				prev->rq_next = NULL;
+			nvme_submit_cmds(nvmeq, rqlist);
+			*rqlist = req;
+		}
+	} while (req);
+
+	*rqlist = requeue_list;
+}
+
 static __always_inline void nvme_pci_unmap_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -1670,6 +1728,7 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
 
 static const struct blk_mq_ops nvme_mq_ops = {
 	.queue_rq	= nvme_queue_rq,
+	.queue_rqs	= nvme_queue_rqs,
 	.complete	= nvme_pci_complete_rq,
 	.commit_rqs	= nvme_commit_rqs,
 	.init_hctx	= nvme_init_hctx,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-12-16 16:05 [PATCHSET v4 0/4] Add support for list issue Jens Axboe
@ 2021-12-16 16:05 ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2021-12-16 16:05 UTC (permalink / raw)
  To: io-uring, linux-block, linux-nvme
  Cc: Jens Axboe, Hannes Reinecke, Keith Busch

This enables the block layer to send us a full plug list of requests
that need submitting. The block layer guarantees that they all belong
to the same queue, but we do have to check the hardware queue mapping
for each request.

If errors are encountered, leave them in the passed in list. Then the
block layer will handle them individually.

This is good for about a 4% improvement in peak performance, taking us
from 9.6M to 10M IOPS/core.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 58 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6be6b1ab4285..e34ad67c4c41 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -981,6 +981,63 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return BLK_STS_OK;
 }
 
+static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
+{
+	spin_lock(&nvmeq->sq_lock);
+	while (!rq_list_empty(*rqlist)) {
+		struct request *req = rq_list_pop(rqlist);
+		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+
+		nvme_sq_copy_cmd(nvmeq, &iod->cmd);
+	}
+	nvme_write_sq_db(nvmeq, true);
+	spin_unlock(&nvmeq->sq_lock);
+}
+
+static bool nvme_prep_rq_batch(struct nvme_queue *nvmeq, struct request *req)
+{
+	/*
+	 * We should not need to do this, but we're still using this to
+	 * ensure we can drain requests on a dying queue.
+	 */
+	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
+		return false;
+	if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true)))
+		return false;
+
+	req->mq_hctx->tags->rqs[req->tag] = req;
+	return nvme_prep_rq(nvmeq->dev, req) == BLK_STS_OK;
+}
+
+static void nvme_queue_rqs(struct request **rqlist)
+{
+	struct request *req = rq_list_peek(rqlist), *prev = NULL;
+	struct request *requeue_list = NULL;
+
+	do {
+		struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
+
+		if (!nvme_prep_rq_batch(nvmeq, req)) {
+			/* detach 'req' and add to remainder list */
+			if (prev)
+				prev->rq_next = req->rq_next;
+			rq_list_add(&requeue_list, req);
+		} else {
+			prev = req;
+		}
+
+		req = rq_list_next(req);
+		if (!req || (prev && req->mq_hctx != prev->mq_hctx)) {
+			/* detach rest of list, and submit */
+			prev->rq_next = NULL;
+			nvme_submit_cmds(nvmeq, rqlist);
+			*rqlist = req;
+		}
+	} while (req);
+
+	*rqlist = requeue_list;
+}
+
 static __always_inline void nvme_pci_unmap_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -1678,6 +1735,7 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
 
 static const struct blk_mq_ops nvme_mq_ops = {
 	.queue_rq	= nvme_queue_rq,
+	.queue_rqs	= nvme_queue_rqs,
 	.complete	= nvme_pci_complete_rq,
 	.commit_rqs	= nvme_commit_rqs,
 	.init_hctx	= nvme_init_hctx,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-11-17  3:38 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
  2021-11-17  8:39   ` Christoph Hellwig
@ 2021-11-17 19:41   ` Keith Busch
  1 sibling, 0 replies; 24+ messages in thread
From: Keith Busch @ 2021-11-17 19:41 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, hch

[adding linux-nvme]

On Tue, Nov 16, 2021 at 08:38:07PM -0700, Jens Axboe wrote:
> This enables the block layer to send us a full plug list of requests
> that need submitting. The block layer guarantees that they all belong
> to the same queue, but we do have to check the hardware queue mapping
> for each request.

So this means that the nvme namespace will always be the same for all
the requests in the list, but the rqlist may contain requests allocated
from different CPUs?
 
> If errors are encountered, leave them in the passed in list. Then the
> block layer will handle them individually.
> 
> This is good for about a 4% improvement in peak performance, taking us
> from 9.6M to 10M IOPS/core.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  drivers/nvme/host/pci.c | 67 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 67 insertions(+)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index d2b654fc3603..2eedd04b1f90 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -1004,6 +1004,72 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	return ret;
>  }
>  
> +static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
> +{
> +	spin_lock(&nvmeq->sq_lock);
> +	while (!rq_list_empty(*rqlist)) {
> +		struct request *req = rq_list_pop(rqlist);
> +		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
> +
> +		nvme_copy_cmd(nvmeq, absolute_pointer(&iod->cmd));
> +	}
> +	nvme_write_sq_db(nvmeq, true);
> +	spin_unlock(&nvmeq->sq_lock);
> +}
> +
> +static void nvme_queue_rqs(struct request **rqlist)
> +{
> +	struct request *requeue_list = NULL, *req, *prev = NULL;
> +	struct blk_mq_hw_ctx *hctx;
> +	struct nvme_queue *nvmeq;
> +	struct nvme_ns *ns;
> +
> +restart:
> +	req = rq_list_peek(rqlist);
> +	hctx = req->mq_hctx;
> +	nvmeq = hctx->driver_data;
> +	ns = hctx->queue->queuedata;
> +
> +	/*
> +	 * We should not need to do this, but we're still using this to
> +	 * ensure we can drain requests on a dying queue.
> +	 */
> +	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
> +		return;
> +
> +	rq_list_for_each(rqlist, req) {
> +		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
> +		blk_status_t ret;
> +
> +		if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true)))
> +			goto requeue;
> +
> +		if (req->mq_hctx != hctx) {
> +			/* detach rest of list, and submit */
> +			prev->rq_next = NULL;
> +			nvme_submit_cmds(nvmeq, rqlist);
> +			/* req now start of new list for this hw queue */
> +			*rqlist = req;
> +			goto restart;
> +		}
> +
> +		hctx->tags->rqs[req->tag] = req;

After checking how this is handled previously, it appears this new
.queue_rqs() skips incrementing active requests, and bypasses the
hctx_lock(). Won't that break quiesce?

> +		ret = nvme_prep_rq(nvmeq->dev, ns, req, &iod->cmd);
> +		if (ret == BLK_STS_OK) {
> +			prev = req;
> +			continue;
> +		}
> +requeue:
> +		/* detach 'req' and add to remainder list */
> +		if (prev)
> +			prev->rq_next = req->rq_next;
> +		rq_list_add(&requeue_list, req);
> +	}
> +
> +	nvme_submit_cmds(nvmeq, rqlist);
> +	*rqlist = requeue_list;
> +}
> +
>  static __always_inline void nvme_pci_unmap_rq(struct request *req)
>  {
>  	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
> @@ -1741,6 +1807,7 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
>  
>  static const struct blk_mq_ops nvme_mq_ops = {
>  	.queue_rq	= nvme_queue_rq,
> +	.queue_rqs	= nvme_queue_rqs,
>  	.complete	= nvme_pci_complete_rq,
>  	.commit_rqs	= nvme_commit_rqs,
>  	.init_hctx	= nvme_init_hctx,
> -- 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-11-17 15:55     ` Jens Axboe
@ 2021-11-17 15:58       ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2021-11-17 15:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

On 11/17/21 8:55 AM, Jens Axboe wrote:
> On 11/17/21 1:39 AM, Christoph Hellwig wrote:
>> On Tue, Nov 16, 2021 at 08:38:07PM -0700, Jens Axboe wrote:
>>> This enables the block layer to send us a full plug list of requests
>>> that need submitting. The block layer guarantees that they all belong
>>> to the same queue, but we do have to check the hardware queue mapping
>>> for each request.
>>>
>>> If errors are encountered, leave them in the passed in list. Then the
>>> block layer will handle them individually.
>>>
>>> This is good for about a 4% improvement in peak performance, taking us
>>> from 9.6M to 10M IOPS/core.
>>
>> The concept looks sensible, but the loop in nvme_queue_rqs is a complete
>> mess to follow. What about something like this (untested) on top?
> 
> Let me take a closer look.

Something changed, efficiency is way down:

     2.26%     +4.34%  [nvme]            [k] nvme_queue_rqs


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-11-17  8:39   ` Christoph Hellwig
@ 2021-11-17 15:55     ` Jens Axboe
  2021-11-17 15:58       ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2021-11-17 15:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block

On 11/17/21 1:39 AM, Christoph Hellwig wrote:
> On Tue, Nov 16, 2021 at 08:38:07PM -0700, Jens Axboe wrote:
>> This enables the block layer to send us a full plug list of requests
>> that need submitting. The block layer guarantees that they all belong
>> to the same queue, but we do have to check the hardware queue mapping
>> for each request.
>>
>> If errors are encountered, leave them in the passed in list. Then the
>> block layer will handle them individually.
>>
>> This is good for about a 4% improvement in peak performance, taking us
>> from 9.6M to 10M IOPS/core.
> 
> The concept looks sensible, but the loop in nvme_queue_rqs is a complete
> mess to follow. What about something like this (untested) on top?

Let me take a closer look.

> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 13722cc400c2c..555a7609580c7 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -509,21 +509,6 @@ static inline void nvme_copy_cmd(struct nvme_queue *nvmeq,
>  		nvmeq->sq_tail = 0;
>  }
>  
> -/**
> - * nvme_submit_cmd() - Copy a command into a queue and ring the doorbell
> - * @nvmeq: The queue to use
> - * @cmd: The command to send
> - * @write_sq: whether to write to the SQ doorbell
> - */
> -static void nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd,
> -			    bool write_sq)
> -{
> -	spin_lock(&nvmeq->sq_lock);
> -	nvme_copy_cmd(nvmeq, cmd);
> -	nvme_write_sq_db(nvmeq, write_sq);
> -	spin_unlock(&nvmeq->sq_lock);
> -}

You really don't like helpers? Code generation wise it doesn't matter,
but without this and the copy helper we do end up having some trivial
duplicated code...

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-11-17  3:38 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
@ 2021-11-17  8:39   ` Christoph Hellwig
  2021-11-17 15:55     ` Jens Axboe
  2021-11-17 19:41   ` Keith Busch
  1 sibling, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2021-11-17  8:39 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, hch

On Tue, Nov 16, 2021 at 08:38:07PM -0700, Jens Axboe wrote:
> This enables the block layer to send us a full plug list of requests
> that need submitting. The block layer guarantees that they all belong
> to the same queue, but we do have to check the hardware queue mapping
> for each request.
> 
> If errors are encountered, leave them in the passed in list. Then the
> block layer will handle them individually.
> 
> This is good for about a 4% improvement in peak performance, taking us
> from 9.6M to 10M IOPS/core.

The concept looks sensible, but the loop in nvme_queue_rqs is a complete
mess to follow. What about something like this (untested) on top?

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 13722cc400c2c..555a7609580c7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -509,21 +509,6 @@ static inline void nvme_copy_cmd(struct nvme_queue *nvmeq,
 		nvmeq->sq_tail = 0;
 }
 
-/**
- * nvme_submit_cmd() - Copy a command into a queue and ring the doorbell
- * @nvmeq: The queue to use
- * @cmd: The command to send
- * @write_sq: whether to write to the SQ doorbell
- */
-static void nvme_submit_cmd(struct nvme_queue *nvmeq, struct nvme_command *cmd,
-			    bool write_sq)
-{
-	spin_lock(&nvmeq->sq_lock);
-	nvme_copy_cmd(nvmeq, cmd);
-	nvme_write_sq_db(nvmeq, write_sq);
-	spin_unlock(&nvmeq->sq_lock);
-}
-
 static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx)
 {
 	struct nvme_queue *nvmeq = hctx->driver_data;
@@ -918,8 +903,7 @@ static blk_status_t nvme_map_metadata(struct nvme_dev *dev, struct request *req,
 	return BLK_STS_OK;
 }
 
-static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct nvme_ns *ns,
-				 struct request *req, struct nvme_command *cmnd)
+static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	blk_status_t ret;
@@ -928,18 +912,18 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct nvme_ns *ns,
 	iod->npages = -1;
 	iod->nents = 0;
 
-	ret = nvme_setup_cmd(ns, req);
+	ret = nvme_setup_cmd(req->q->queuedata, req);
 	if (ret)
 		return ret;
 
 	if (blk_rq_nr_phys_segments(req)) {
-		ret = nvme_map_data(dev, req, cmnd);
+		ret = nvme_map_data(dev, req, &iod->cmd);
 		if (ret)
 			goto out_free_cmd;
 	}
 
 	if (blk_integrity_rq(req)) {
-		ret = nvme_map_metadata(dev, req, cmnd);
+		ret = nvme_map_metadata(dev, req, &iod->cmd);
 		if (ret)
 			goto out_unmap_data;
 	}
@@ -959,7 +943,6 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct nvme_ns *ns,
 static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 			 const struct blk_mq_queue_data *bd)
 {
-	struct nvme_ns *ns = hctx->queue->queuedata;
 	struct nvme_queue *nvmeq = hctx->driver_data;
 	struct nvme_dev *dev = nvmeq->dev;
 	struct request *req = bd->rq;
@@ -976,12 +959,15 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (!nvme_check_ready(&dev->ctrl, req, true))
 		return nvme_fail_nonready_command(&dev->ctrl, req);
 
-	ret = nvme_prep_rq(dev, ns, req, &iod->cmd);
-	if (ret == BLK_STS_OK) {
-		nvme_submit_cmd(nvmeq, &iod->cmd, bd->last);
-		return BLK_STS_OK;
-	}
-	return ret;
+	ret = nvme_prep_rq(dev, req);
+	if (ret)
+		return ret;
+
+	spin_lock(&nvmeq->sq_lock);
+	nvme_copy_cmd(nvmeq, &iod->cmd);
+	nvme_write_sq_db(nvmeq, bd->last);
+	spin_unlock(&nvmeq->sq_lock);
+	return BLK_STS_OK;
 }
 
 static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
@@ -997,56 +983,47 @@ static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
 	spin_unlock(&nvmeq->sq_lock);
 }
 
-static void nvme_queue_rqs(struct request **rqlist)
+static bool nvme_prep_rq_batch(struct nvme_queue *nvmeq, struct request *req)
 {
-	struct request *requeue_list = NULL, *req, *prev = NULL;
-	struct blk_mq_hw_ctx *hctx;
-	struct nvme_queue *nvmeq;
-	struct nvme_ns *ns;
-
-restart:
-	req = rq_list_peek(rqlist);
-	hctx = req->mq_hctx;
-	nvmeq = hctx->driver_data;
-	ns = hctx->queue->queuedata;
-
 	/*
 	 * We should not need to do this, but we're still using this to
 	 * ensure we can drain requests on a dying queue.
 	 */
 	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
-		return;
+		return false;
+	if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true)))
+		return false;
 
-	rq_list_for_each(rqlist, req) {
-		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-		blk_status_t ret;
+	req->mq_hctx->tags->rqs[req->tag] = req;
+	return nvme_prep_rq(nvmeq->dev, req) == BLK_STS_OK;
+}
 
-		if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true)))
-			goto requeue;
+static void nvme_queue_rqs(struct request **rqlist)
+{
+	struct request *req = rq_list_peek(rqlist), *prev = NULL;
+	struct request *requeue_list = NULL;
+
+	do {
+		struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
+
+		if (!nvme_prep_rq_batch(nvmeq, req)) {
+			/* detach 'req' and add to remainder list */
+			if (prev)
+				prev->rq_next = req->rq_next;
+			rq_list_add(&requeue_list, req);
+		} else {
+			prev = req;
+		}
 
-		if (req->mq_hctx != hctx) {
+		req = rq_list_next(req);
+		if (!req || (prev && req->mq_hctx != prev->mq_hctx)) {
 			/* detach rest of list, and submit */
 			prev->rq_next = NULL;
 			nvme_submit_cmds(nvmeq, rqlist);
-			/* req now start of new list for this hw queue */
 			*rqlist = req;
-			goto restart;
-		}
-
-		hctx->tags->rqs[req->tag] = req;
-		ret = nvme_prep_rq(nvmeq->dev, ns, req, &iod->cmd);
-		if (ret == BLK_STS_OK) {
-			prev = req;
-			continue;
 		}
-requeue:
-		/* detach 'req' and add to remainder list */
-		if (prev)
-			prev->rq_next = req->rq_next;
-		rq_list_add(&requeue_list, req);
-	}
+	} while (req);
 
-	nvme_submit_cmds(nvmeq, rqlist);
 	*rqlist = requeue_list;
 }
 
@@ -1224,7 +1201,11 @@ static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
 
 	c.common.opcode = nvme_admin_async_event;
 	c.common.command_id = NVME_AQ_BLK_MQ_DEPTH;
-	nvme_submit_cmd(nvmeq, &c, true);
+
+	spin_lock(&nvmeq->sq_lock);
+	nvme_copy_cmd(nvmeq, &c);
+	nvme_write_sq_db(nvmeq, true);
+	spin_unlock(&nvmeq->sq_lock);
 }
 
 static int adapter_delete_queue(struct nvme_dev *dev, u8 opcode, u16 id)

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/4] nvme: add support for mq_ops->queue_rqs()
  2021-11-17  3:38 [PATCHSET 0/4] Add support for list issue Jens Axboe
@ 2021-11-17  3:38 ` Jens Axboe
  2021-11-17  8:39   ` Christoph Hellwig
  2021-11-17 19:41   ` Keith Busch
  0 siblings, 2 replies; 24+ messages in thread
From: Jens Axboe @ 2021-11-17  3:38 UTC (permalink / raw)
  To: linux-block; +Cc: hch, Jens Axboe

This enables the block layer to send us a full plug list of requests
that need submitting. The block layer guarantees that they all belong
to the same queue, but we do have to check the hardware queue mapping
for each request.

If errors are encountered, leave them in the passed in list. Then the
block layer will handle them individually.

This is good for about a 4% improvement in peak performance, taking us
from 9.6M to 10M IOPS/core.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/nvme/host/pci.c | 67 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index d2b654fc3603..2eedd04b1f90 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1004,6 +1004,72 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return ret;
 }
 
+static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
+{
+	spin_lock(&nvmeq->sq_lock);
+	while (!rq_list_empty(*rqlist)) {
+		struct request *req = rq_list_pop(rqlist);
+		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+
+		nvme_copy_cmd(nvmeq, absolute_pointer(&iod->cmd));
+	}
+	nvme_write_sq_db(nvmeq, true);
+	spin_unlock(&nvmeq->sq_lock);
+}
+
+static void nvme_queue_rqs(struct request **rqlist)
+{
+	struct request *requeue_list = NULL, *req, *prev = NULL;
+	struct blk_mq_hw_ctx *hctx;
+	struct nvme_queue *nvmeq;
+	struct nvme_ns *ns;
+
+restart:
+	req = rq_list_peek(rqlist);
+	hctx = req->mq_hctx;
+	nvmeq = hctx->driver_data;
+	ns = hctx->queue->queuedata;
+
+	/*
+	 * We should not need to do this, but we're still using this to
+	 * ensure we can drain requests on a dying queue.
+	 */
+	if (unlikely(!test_bit(NVMEQ_ENABLED, &nvmeq->flags)))
+		return;
+
+	rq_list_for_each(rqlist, req) {
+		struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+		blk_status_t ret;
+
+		if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true)))
+			goto requeue;
+
+		if (req->mq_hctx != hctx) {
+			/* detach rest of list, and submit */
+			prev->rq_next = NULL;
+			nvme_submit_cmds(nvmeq, rqlist);
+			/* req now start of new list for this hw queue */
+			*rqlist = req;
+			goto restart;
+		}
+
+		hctx->tags->rqs[req->tag] = req;
+		ret = nvme_prep_rq(nvmeq->dev, ns, req, &iod->cmd);
+		if (ret == BLK_STS_OK) {
+			prev = req;
+			continue;
+		}
+requeue:
+		/* detach 'req' and add to remainder list */
+		if (prev)
+			prev->rq_next = req->rq_next;
+		rq_list_add(&requeue_list, req);
+	}
+
+	nvme_submit_cmds(nvmeq, rqlist);
+	*rqlist = requeue_list;
+}
+
 static __always_inline void nvme_pci_unmap_rq(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -1741,6 +1807,7 @@ static const struct blk_mq_ops nvme_mq_admin_ops = {
 
 static const struct blk_mq_ops nvme_mq_ops = {
 	.queue_rq	= nvme_queue_rq,
+	.queue_rqs	= nvme_queue_rqs,
 	.complete	= nvme_pci_complete_rq,
 	.commit_rqs	= nvme_commit_rqs,
 	.init_hctx	= nvme_init_hctx,
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-12-16 17:53 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-03 21:45 [PATCHSET v2 0/4] Add support for list issue Jens Axboe
2021-12-03 21:45 ` [PATCH 1/4] block: add mq_ops->queue_rqs hook Jens Axboe
2021-12-04 10:43   ` Hannes Reinecke
2021-12-04 20:13     ` Jens Axboe
2021-12-05  9:07       ` Hannes Reinecke
2021-12-05 13:09         ` Jens Axboe
2021-12-03 21:45 ` [PATCH 2/4] nvme: split command copy into a helper Jens Axboe
2021-12-04 10:44   ` Hannes Reinecke
2021-12-06  7:38   ` Christoph Hellwig
2021-12-06 16:33     ` Jens Axboe
2021-12-03 21:45 ` [PATCH 3/4] nvme: separate command prep and issue Jens Axboe
2021-12-04 10:45   ` Hannes Reinecke
2021-12-03 21:45 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
2021-12-04 10:47   ` Hannes Reinecke
2021-12-06  7:40   ` Christoph Hellwig
2021-12-06 16:33     ` Jens Axboe
  -- strict thread matches above, loose matches on Subject: below --
2021-12-16 16:38 [PATCHSET v5 0/4] Add support for list issue Jens Axboe
2021-12-16 16:39 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
2021-12-16 17:53   ` Christoph Hellwig
2021-12-16 16:05 [PATCHSET v4 0/4] Add support for list issue Jens Axboe
2021-12-16 16:05 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
2021-11-17  3:38 [PATCHSET 0/4] Add support for list issue Jens Axboe
2021-11-17  3:38 ` [PATCH 4/4] nvme: add support for mq_ops->queue_rqs() Jens Axboe
2021-11-17  8:39   ` Christoph Hellwig
2021-11-17 15:55     ` Jens Axboe
2021-11-17 15:58       ` Jens Axboe
2021-11-17 19:41   ` Keith Busch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).