* [PATCH 1/1] nvme-rdma: Fix max_hw_sectors calculation
@ 2019-09-18 14:12 Max Gurtovoy
  2019-09-20 17:16 ` Sagi Grimberg
  2019-09-20 17:19 ` Sagi Grimberg
  0 siblings, 2 replies; 4+ messages in thread
From: Max Gurtovoy @ 2019-09-18 14:12 UTC (permalink / raw)
  To: linux-nvme, hch, sagi; +Cc: keith.busch, shlomin, Max Gurtovoy, israelr

By default, the NVMe/RDMA driver should support a max io_size of 1MiB (or
up to the maximum size supported by the HCA). Currently, one will see that
/sys/class/block/<bdev>/queue/max_hw_sectors_kb is 1020 instead of 1024.

A non-power-of-2 value can cause performance degradation due to
unnecessary splitting of IO requests and unoptimized allocation units.

The extra page per MR is now accounted for when sizing the MR pool, so
there is no longer any need to reduce max_sectors by 1.
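
For context (illustration only, not part of the patch): the 1020 vs. 1024
values fall out of the shift arithmetic used for max_hw_sectors. A minimal
standalone sketch, assuming the HCA is not the limiting factor and that
NVME_RDMA_MAX_SEGMENTS is 256 as in this driver (with 512-byte sectors,
KiB = sectors / 2):

/* Illustrative only: reproduces the old vs. new max_hw_sectors math. */
#include <stdio.h>

#define NVME_RDMA_MAX_SEGMENTS	256
#define ILOG2_SZ_4K		12	/* ilog2(SZ_4K) */

int main(void)
{
	unsigned int max_fr_pages = NVME_RDMA_MAX_SEGMENTS;

	/* before: one page subtracted at this point, so not a power of 2 */
	unsigned int old_sectors = (max_fr_pages - 1) << (ILOG2_SZ_4K - 9);
	/* after: the spare entry is reserved in the MR pool instead */
	unsigned int new_sectors = max_fr_pages << (ILOG2_SZ_4K - 9);

	printf("old: %u sectors = %u KiB\n", old_sectors, old_sectors / 2);
	printf("new: %u sectors = %u KiB\n", new_sectors, new_sectors / 2);
	/* prints: old: 2040 sectors = 1020 KiB, new: 2048 sectors = 1024 KiB */
	return 0;
}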

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/nvme/host/rdma.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 1a6449b..cc19563 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -427,7 +427,7 @@ static void nvme_rdma_destroy_queue_ib(struct nvme_rdma_queue *queue)
 static int nvme_rdma_get_max_fr_pages(struct ib_device *ibdev)
 {
 	return min_t(u32, NVME_RDMA_MAX_SEGMENTS,
-		     ibdev->attrs.max_fast_reg_page_list_len);
+		     ibdev->attrs.max_fast_reg_page_list_len - 1);
 }
 
 static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
@@ -437,7 +437,7 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
 	const int cq_factor = send_wr_factor + 1;	/* + RECV */
 	int comp_vector, idx = nvme_rdma_queue_idx(queue);
 	enum ib_poll_context poll_ctx;
-	int ret;
+	int ret, pages_per_mr;
 
 	queue->device = nvme_rdma_find_get_device(queue->cm_id);
 	if (!queue->device) {
@@ -479,10 +479,16 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
 		goto out_destroy_qp;
 	}
 
+	/*
+	 * Currently we don't use SG_GAPS MR's so if the first entry is
+	 * misaligned we'll end up using two entries for a single data page,
+	 * so one additional entry is required.
+	 */
+	pages_per_mr = nvme_rdma_get_max_fr_pages(ibdev) + 1;
 	ret = ib_mr_pool_init(queue->qp, &queue->qp->rdma_mrs,
 			      queue->queue_size,
 			      IB_MR_TYPE_MEM_REG,
-			      nvme_rdma_get_max_fr_pages(ibdev), 0);
+			      pages_per_mr, 0);
 	if (ret) {
 		dev_err(queue->ctrl->ctrl.device,
 			"failed to initialize MR pool sized %d for QID %d\n",
@@ -824,8 +830,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
 	if (error)
 		goto out_stop_queue;
 
-	ctrl->ctrl.max_hw_sectors =
-		(ctrl->max_fr_pages - 1) << (ilog2(SZ_4K) - 9);
+	ctrl->ctrl.max_segments = ctrl->max_fr_pages;
+	ctrl->ctrl.max_hw_sectors = ctrl->max_fr_pages << (ilog2(SZ_4K) - 9);
 
 	error = nvme_init_identify(&ctrl->ctrl);
 	if (error)
-- 
1.8.3.1
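
The -1 in nvme_rdma_get_max_fr_pages() and the +1 used when sizing the MR
pool are two halves of the same accounting: the advertised limit stays a
power-of-2 number of data pages, while each MR keeps a spare entry for a
misaligned first page. A hedged, self-contained sketch of that pairing
(the HCA limit of 512 below is hypothetical, not taken from the patch):

/* Illustrative only: mirrors the -1/+1 pairing introduced by the patch. */
#include <stdio.h>

#define NVME_RDMA_MAX_SEGMENTS	256

static unsigned int max_fr_pages(unsigned int max_fast_reg_page_list_len)
{
	/* reserve one entry so the I/O size limit stays a power of 2 */
	unsigned int limit = max_fast_reg_page_list_len - 1;

	return limit < NVME_RDMA_MAX_SEGMENTS ? limit : NVME_RDMA_MAX_SEGMENTS;
}

int main(void)
{
	unsigned int fr_pages = max_fr_pages(512);	/* hypothetical HCA limit */

	/*
	 * Without SG_GAPS MRs, a misaligned first entry can consume two MR
	 * entries for a single data page, so each MR in the pool is sized
	 * for one extra page.
	 */
	unsigned int pages_per_mr = fr_pages + 1;

	printf("max_fr_pages = %u, pages_per_mr = %u\n", fr_pages, pages_per_mr);
	return 0;
}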



* Re: [PATCH 1/1] nvme-rdma: Fix max_hw_sectors calculation
  2019-09-18 14:12 [PATCH 1/1] nvme-rdma: Fix max_hw_sectors calculation Max Gurtovoy
@ 2019-09-20 17:16 ` Sagi Grimberg
  2019-09-20 17:19 ` Sagi Grimberg
  1 sibling, 0 replies; 4+ messages in thread
From: Sagi Grimberg @ 2019-09-20 17:16 UTC (permalink / raw)
  To: Max Gurtovoy, linux-nvme, hch; +Cc: keith.busch, shlomin, israelr

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>


* Re: [PATCH 1/1] nvme-rdma: Fix max_hw_sectors calculation
  2019-09-18 14:12 [PATCH 1/1] nvme-rdma: Fix max_hw_sectors calculation Max Gurtovoy
  2019-09-20 17:16 ` Sagi Grimberg
@ 2019-09-20 17:19 ` Sagi Grimberg
  2019-09-21 20:39   ` Max Gurtovoy
  1 sibling, 1 reply; 4+ messages in thread
From: Sagi Grimberg @ 2019-09-20 17:19 UTC (permalink / raw)
  To: Max Gurtovoy, linux-nvme, hch; +Cc: keith.busch, shlomin, israelr

This doesn't apply on nvme-5.4,

can you please respin a patch that cleanly applies?

Thanks


* Re: [PATCH 1/1] nvme-rdma: Fix max_hw_sectors calculation
  2019-09-20 17:19 ` Sagi Grimberg
@ 2019-09-21 20:39   ` Max Gurtovoy
  0 siblings, 0 replies; 4+ messages in thread
From: Max Gurtovoy @ 2019-09-21 20:39 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, hch; +Cc: keith.busch, shlomin, israelr


On 9/20/2019 8:19 PM, Sagi Grimberg wrote:
> This doesn't apply on nvme-5.4,
>
> can you please respin a patch that cleanly applies?

Sure.

What do you think about adding an option to configure max_io_size up to
16MiB, as we do in iSER?

I was thinking of doing it via an nvme-cli connect command parameter.

>
> Thanks

