* [PATCH for-next 0/5] RTRS enable write path fast memory registration
@ 2021-06-08 11:35 Jack Wang
  2021-06-08 11:35 ` [PATCH for-next 1/5] RDMA/rtrs: Introduce head/tail wr Jack Wang
                   ` (5 more replies)
  0 siblings, 6 replies; 8+ messages in thread
From: Jack Wang @ 2021-06-08 11:35 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang, axboe

Hi Jason, hi Doug, hi Jens

Please consider including the following changes in the next merge window.

This patchset enables fast memory registration on the write IO path, so
rtrs can support IO bigger than 116 KB (the previous limit of 29
segments * 4 KB) without splitting. With this in place, the read and
write paths are more symmetric, and we can also reduce memory usage.

The patchset is organized as follows:
- patch1: preparation.
- patch2: implement fast memory registration for the write path.
- patch3: reduce memory usage.
- patch4: raise MAX_SEGMENTS.
- patch5: rnbd-clt queries and uses the max_segments setting.

As the main change is in RTRS, it is easier for the series to go through
the RDMA tree, hence this patchset is sent to linux-rdma.
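
For illustration only, a minimal condensed sketch of the write path the
series builds up (a fragment, not compilable as-is; the names are taken
from the patches below):

	/* 1. collapse the data scatterlist into one MR (patch 2) */
	nr = ib_map_mr_sg(req->mr, req->sglist, count, NULL, SZ_4K);
	ib_update_fast_reg_key(req->mr, ib_inc_rkey(req->mr->rkey));

	/* 2. post REG_MR -> RDMA_WRITE_WITH_IMM -> LOCAL_INV as one
	 * chain (patches 1 and 2) */
	ret = rtrs_post_rdma_write_sg(req->con, req, rbuf, true /* fr_en */,
				      req->usr_len + sizeof(*msg), imm,
				      &rwr.wr /* head */, &inv_wr /* tail */);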

This patchset depends on: https://lore.kernel.org/linux-rdma/20210608103039.39080-1-jinpu.wang@ionos.com/T/#t

Jack Wang (5):
  RDMA/rtrs: Introduce head/tail wr
  RDMA/rtrs-clt: Write path fast memory registration
  RDMA/rtrs_clt: Alloc less memory with write path fast memory
    registration
  RDMA/rtrs-clt: Raise MAX_SEGMENTS
  rnbd/rtrs-clt: Query and use max_segments from rtrs-clt.

 drivers/block/rnbd/rnbd-clt.c          |   5 +-
 drivers/block/rnbd/rnbd-clt.h          |   5 +-
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 143 ++++++++++++++++---------
 drivers/infiniband/ulp/rtrs/rtrs-clt.h |   1 +
 drivers/infiniband/ulp/rtrs/rtrs-pri.h |   3 +-
 drivers/infiniband/ulp/rtrs/rtrs.c     |  28 ++---
 drivers/infiniband/ulp/rtrs/rtrs.h     |   2 +-
 7 files changed, 119 insertions(+), 68 deletions(-)

-- 
2.25.1



* [PATCH for-next 1/5] RDMA/rtrs: Introduce head/tail wr
  2021-06-08 11:35 [PATCH for-next 0/5] RTRS enable write path fast memory registration Jack Wang
@ 2021-06-08 11:35 ` Jack Wang
  2021-06-08 11:35 ` [PATCH for-next 2/5] RDMA/rtrs-clt: Write path fast memory registration Jack Wang
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Jack Wang @ 2021-06-08 11:35 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang, axboe,
	Jack Wang, Md Haris Iqbal

From: Jack Wang <jinpu.wang@cloud.ionos.com>

Introduce a tail wr that is sent as the last wr in the chain: in a later
patch we want to send the local invalidate wr after the rdma wr.

While at it, also fix a coding style issue.
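
For illustration only, a minimal sketch of the wr chain this enables;
reg_wr, write_wr and inv_wr are hypothetical stand-ins for the
registration, RDMA-write and invalidation wrs of the next patch:

	/* Sketch: what rtrs_post_send(qp, head, wr, tail) ends up
	 * posting.  The head list is walked to its end, wr is appended,
	 * and tail is chained after wr. */
	struct ib_send_wr *head = &reg_wr.wr;	/* IB_WR_REG_MR */
	struct ib_send_wr *mid  = &write_wr.wr;	/* IB_WR_RDMA_WRITE_WITH_IMM */
	struct ib_send_wr *tail = &inv_wr;	/* IB_WR_LOCAL_INV */

	head->next = mid;
	mid->next  = tail;
	ib_post_send(qp, head, NULL);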

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 16 ++++++++-------
 drivers/infiniband/ulp/rtrs/rtrs-pri.h |  3 ++-
 drivers/infiniband/ulp/rtrs/rtrs.c     | 28 +++++++++++++++-----------
 3 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 67ff5bf9bfa8..5ec02f78be3f 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -480,7 +480,7 @@ static int rtrs_post_send_rdma(struct rtrs_clt_con *con,
 
 	return rtrs_iu_post_rdma_write_imm(&con->c, req->iu, &sge, 1,
 					    rbuf->rkey, rbuf->addr + off,
-					    imm, flags, wr);
+					    imm, flags, wr, NULL);
 }
 
 static void process_io_rsp(struct rtrs_clt_sess *sess, u32 msg_id,
@@ -999,9 +999,10 @@ rtrs_clt_get_copy_req(struct rtrs_clt_sess *alive_sess,
 }
 
 static int rtrs_post_rdma_write_sg(struct rtrs_clt_con *con,
-				    struct rtrs_clt_io_req *req,
-				    struct rtrs_rbuf *rbuf,
-				    u32 size, u32 imm)
+				   struct rtrs_clt_io_req *req,
+				   struct rtrs_rbuf *rbuf,
+				   u32 size, u32 imm, struct ib_send_wr *wr,
+				   struct ib_send_wr *tail)
 {
 	struct rtrs_clt_sess *sess = to_clt_sess(con->c.sess);
 	struct ib_sge *sge = req->sge;
@@ -1009,6 +1010,7 @@ static int rtrs_post_rdma_write_sg(struct rtrs_clt_con *con,
 	struct scatterlist *sg;
 	size_t num_sge;
 	int i;
+	struct ib_send_wr *ptail = NULL;
 
 	for_each_sg(req->sglist, sg, req->sg_cnt, i) {
 		sge[i].addr   = sg_dma_address(sg);
@@ -1033,7 +1035,7 @@ static int rtrs_post_rdma_write_sg(struct rtrs_clt_con *con,
 
 	return rtrs_iu_post_rdma_write_imm(&con->c, req->iu, sge, num_sge,
 					    rbuf->rkey, rbuf->addr, imm,
-					    flags, NULL);
+					    flags, wr, ptail);
 }
 
 static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
@@ -1081,8 +1083,8 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 	rtrs_clt_update_all_stats(req, WRITE);
 
 	ret = rtrs_post_rdma_write_sg(req->con, req, rbuf,
-				       req->usr_len + sizeof(*msg),
-				       imm);
+				      req->usr_len + sizeof(*msg),
+				      imm, NULL, NULL);
 	if (unlikely(ret)) {
 		rtrs_err_rl(s,
 			    "Write request failed: error=%d path=%s [%s:%u]\n",
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-pri.h b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
index 76cca2058f6f..36f184a3b676 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-pri.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-pri.h
@@ -305,7 +305,8 @@ int rtrs_iu_post_rdma_write_imm(struct rtrs_con *con, struct rtrs_iu *iu,
 				struct ib_sge *sge, unsigned int num_sge,
 				u32 rkey, u64 rdma_addr, u32 imm_data,
 				enum ib_send_flags flags,
-				struct ib_send_wr *head);
+				struct ib_send_wr *head,
+				struct ib_send_wr *tail);
 
 int rtrs_post_recv_empty(struct rtrs_con *con, struct ib_cqe *cqe);
 int rtrs_post_rdma_write_imm_empty(struct rtrs_con *con, struct ib_cqe *cqe,
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.c b/drivers/infiniband/ulp/rtrs/rtrs.c
index 08e1f7d82c95..61919ebd92b2 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs.c
@@ -105,18 +105,21 @@ int rtrs_post_recv_empty(struct rtrs_con *con, struct ib_cqe *cqe)
 EXPORT_SYMBOL_GPL(rtrs_post_recv_empty);
 
 static int rtrs_post_send(struct ib_qp *qp, struct ib_send_wr *head,
-			     struct ib_send_wr *wr)
+			  struct ib_send_wr *wr, struct ib_send_wr *tail)
 {
 	if (head) {
-		struct ib_send_wr *tail = head;
+		struct ib_send_wr *next = head;
 
-		while (tail->next)
-			tail = tail->next;
-		tail->next = wr;
+		while (next->next)
+			next = next->next;
+		next->next = wr;
 	} else {
 		head = wr;
 	}
 
+	if (tail)
+		wr->next = tail;
+
 	return ib_post_send(qp, head, NULL);
 }
 
@@ -142,15 +145,16 @@ int rtrs_iu_post_send(struct rtrs_con *con, struct rtrs_iu *iu, size_t size,
 		.send_flags = IB_SEND_SIGNALED,
 	};
 
-	return rtrs_post_send(con->qp, head, &wr);
+	return rtrs_post_send(con->qp, head, &wr, NULL);
 }
 EXPORT_SYMBOL_GPL(rtrs_iu_post_send);
 
 int rtrs_iu_post_rdma_write_imm(struct rtrs_con *con, struct rtrs_iu *iu,
-				 struct ib_sge *sge, unsigned int num_sge,
-				 u32 rkey, u64 rdma_addr, u32 imm_data,
-				 enum ib_send_flags flags,
-				 struct ib_send_wr *head)
+				struct ib_sge *sge, unsigned int num_sge,
+				u32 rkey, u64 rdma_addr, u32 imm_data,
+				enum ib_send_flags flags,
+				struct ib_send_wr *head,
+				struct ib_send_wr *tail)
 {
 	struct ib_rdma_wr wr;
 	int i;
@@ -174,7 +178,7 @@ int rtrs_iu_post_rdma_write_imm(struct rtrs_con *con, struct rtrs_iu *iu,
 		if (WARN_ON(sge[i].length == 0))
 			return -EINVAL;
 
-	return rtrs_post_send(con->qp, head, &wr.wr);
+	return rtrs_post_send(con->qp, head, &wr.wr, tail);
 }
 EXPORT_SYMBOL_GPL(rtrs_iu_post_rdma_write_imm);
 
@@ -191,7 +195,7 @@ int rtrs_post_rdma_write_imm_empty(struct rtrs_con *con, struct ib_cqe *cqe,
 		.wr.ex.imm_data	= cpu_to_be32(imm_data),
 	};
 
-	return rtrs_post_send(con->qp, head, &wr.wr);
+	return rtrs_post_send(con->qp, head, &wr.wr, NULL);
 }
 EXPORT_SYMBOL_GPL(rtrs_post_rdma_write_imm_empty);
 
-- 
2.25.1



* [PATCH for-next 2/5] RDMA/rtrs-clt: Write path fast memory registration
  2021-06-08 11:35 [PATCH for-next 0/5] RTRS enable write path fast memory registration Jack Wang
  2021-06-08 11:35 ` [PATCH for-next 1/5] RDMA/rtrs: Introduce head/tail wr Jack Wang
@ 2021-06-08 11:35 ` Jack Wang
  2021-06-08 11:35 ` [PATCH for-next 3/5] RDMA/rtrs_clt: Alloc less memory with write " Jack Wang
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Jack Wang @ 2021-06-08 11:35 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang, axboe,
	Jack Wang, Dima Stepanov

From: Jack Wang <jinpu.wang@cloud.ionos.com>

With fast memory registration in the write path, we can reduce memory
consumption by using a smaller max_send_sge, support IO bigger than
116 KB (29 segments * 4 KB) without splitting, and also make the IO
path more symmetric.

MR registration can sometimes fail if the previous rkey has not yet
been invalidated, so wait for the invalidation to finish before the
new MR registration. Introduce a refcount and only finish the request
when both the local invalidation and the IO reply have completed.
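
For illustration only, a minimal sketch of the refcount lifecycle,
simplified from the diff below; complete_request() is a hypothetical
stand-in for the final completion work:

	refcount_set(&req->ref, 1);	/* at request init */

	/* before posting a wr whose completion must also be waited
	 * for (the LOCAL_INV, or the REG_MR+write+inv chain) */
	refcount_inc(&req->ref);

	/* in each completion path; only the last one to drop the
	 * reference finishes the request */
	if (refcount_dec_and_test(&req->ref))
		complete_request(req);	/* hypothetical helper */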

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Dima Stepanov <dmitrii.stepanov@ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 100 ++++++++++++++++++-------
 drivers/infiniband/ulp/rtrs/rtrs-clt.h |   1 +
 2 files changed, 74 insertions(+), 27 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 5ec02f78be3f..b7c9684d7f62 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -412,6 +412,7 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 				req->inv_errno = errno;
 			}
 
+			refcount_inc(&req->ref);
 			err = rtrs_inv_rkey(req);
 			if (unlikely(err)) {
 				rtrs_err(con->c.sess, "Send INV WR key=%#x: %d\n",
@@ -427,10 +428,14 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 
 				return;
 			}
+			if (!refcount_dec_and_test(&req->ref))
+				return;
 		}
 		ib_dma_unmap_sg(sess->s.dev->ib_dev, req->sglist,
 				req->sg_cnt, req->dir);
 	}
+	if (!refcount_dec_and_test(&req->ref))
+		return;
 	if (sess->clt->mp_policy == MP_POLICY_MIN_INFLIGHT)
 		atomic_dec(&sess->stats->inflight);
 
@@ -438,10 +443,9 @@ static void complete_rdma_req(struct rtrs_clt_io_req *req, int errno,
 	req->con = NULL;
 
 	if (errno) {
-		rtrs_err_rl(con->c.sess,
-			    "IO request failed: error=%d path=%s [%s:%u]\n",
+		rtrs_err_rl(con->c.sess, "IO request failed: error=%d path=%s [%s:%u] notify=%d\n",
 			    errno, kobject_name(&sess->kobj), sess->hca_name,
-			    sess->hca_port);
+			    sess->hca_port, notify);
 	}
 
 	if (notify)
@@ -956,6 +960,7 @@ static void rtrs_clt_init_req(struct rtrs_clt_io_req *req,
 	req->need_inv = false;
 	req->need_inv_comp = false;
 	req->inv_errno = 0;
+	refcount_set(&req->ref, 1);
 
 	iov_iter_kvec(&iter, READ, vec, 1, usr_len);
 	len = _copy_from_iter(req->iu->buf, usr_len, &iter);
@@ -1000,7 +1005,7 @@ rtrs_clt_get_copy_req(struct rtrs_clt_sess *alive_sess,
 
 static int rtrs_post_rdma_write_sg(struct rtrs_clt_con *con,
 				   struct rtrs_clt_io_req *req,
-				   struct rtrs_rbuf *rbuf,
+				   struct rtrs_rbuf *rbuf, bool fr_en,
 				   u32 size, u32 imm, struct ib_send_wr *wr,
 				   struct ib_send_wr *tail)
 {
@@ -1012,17 +1017,26 @@ static int rtrs_post_rdma_write_sg(struct rtrs_clt_con *con,
 	int i;
 	struct ib_send_wr *ptail = NULL;
 
-	for_each_sg(req->sglist, sg, req->sg_cnt, i) {
-		sge[i].addr   = sg_dma_address(sg);
-		sge[i].length = sg_dma_len(sg);
-		sge[i].lkey   = sess->s.dev->ib_pd->local_dma_lkey;
+	if (fr_en) {
+		i = 0;
+		sge[i].addr   = req->mr->iova;
+		sge[i].length = req->mr->length;
+		sge[i].lkey   = req->mr->lkey;
+		i++;
+		num_sge = 2;
+		ptail = tail;
+	} else {
+		for_each_sg(req->sglist, sg, req->sg_cnt, i) {
+			sge[i].addr   = sg_dma_address(sg);
+			sge[i].length = sg_dma_len(sg);
+			sge[i].lkey   = sess->s.dev->ib_pd->local_dma_lkey;
+		}
+		num_sge = 1 + req->sg_cnt;
 	}
 	sge[i].addr   = req->iu->dma_addr;
 	sge[i].length = size;
 	sge[i].lkey   = sess->s.dev->ib_pd->local_dma_lkey;
 
-	num_sge = 1 + req->sg_cnt;
-
 	/*
 	 * From time to time we have to post signalled sends,
 	 * or send queue will fill up and only QP reset can help.
@@ -1038,6 +1052,21 @@ static int rtrs_post_rdma_write_sg(struct rtrs_clt_con *con,
 					    flags, wr, ptail);
 }
 
+static int rtrs_map_sg_fr(struct rtrs_clt_io_req *req, size_t count)
+{
+	int nr;
+
+	/* Align the MR to a 4K page size to match the block virt boundary */
+	nr = ib_map_mr_sg(req->mr, req->sglist, count, NULL, SZ_4K);
+	if (nr < 0)
+		return nr;
+	if (unlikely(nr < req->sg_cnt))
+		return -EINVAL;
+	ib_update_fast_reg_key(req->mr, ib_inc_rkey(req->mr->rkey));
+
+	return nr;
+}
+
 static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 {
 	struct rtrs_clt_con *con = req->con;
@@ -1048,6 +1077,10 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 	struct rtrs_rbuf *rbuf;
 	int ret, count = 0;
 	u32 imm, buf_id;
+	struct ib_reg_wr rwr;
+	struct ib_send_wr inv_wr;
+	struct ib_send_wr *wr = NULL;
+	bool fr_en = false;
 
 	const size_t tsize = sizeof(*msg) + req->data_len + req->usr_len;
 
@@ -1076,15 +1109,43 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 	req->sg_size = tsize;
 	rbuf = &sess->rbufs[buf_id];
 
+	if (count) {
+		ret = rtrs_map_sg_fr(req, count);
+		if (ret < 0) {
+			rtrs_err_rl(s,
+				    "Write request failed, failed to map fast reg. data, err: %d\n",
+				    ret);
+			ib_dma_unmap_sg(sess->s.dev->ib_dev, req->sglist,
+					req->sg_cnt, req->dir);
+			return ret;
+		}
+		inv_wr = (struct ib_send_wr) {
+			.opcode		    = IB_WR_LOCAL_INV,
+			.wr_cqe		    = &req->inv_cqe,
+			.send_flags	    = IB_SEND_SIGNALED,
+			.ex.invalidate_rkey = req->mr->rkey,
+		};
+		req->inv_cqe.done = rtrs_clt_inv_rkey_done;
+		rwr = (struct ib_reg_wr) {
+			.wr.opcode = IB_WR_REG_MR,
+			.wr.wr_cqe = &fast_reg_cqe,
+			.mr = req->mr,
+			.key = req->mr->rkey,
+			.access = (IB_ACCESS_LOCAL_WRITE),
+		};
+		wr = &rwr.wr;
+		fr_en = true;
+		refcount_inc(&req->ref);
+	}
 	/*
 	 * Update stats now, after request is successfully sent it is not
 	 * safe anymore to touch it.
 	 */
 	rtrs_clt_update_all_stats(req, WRITE);
 
-	ret = rtrs_post_rdma_write_sg(req->con, req, rbuf,
+	ret = rtrs_post_rdma_write_sg(req->con, req, rbuf, fr_en,
 				      req->usr_len + sizeof(*msg),
-				      imm, NULL, NULL);
+				      imm, wr, &inv_wr);
 	if (unlikely(ret)) {
 		rtrs_err_rl(s,
 			    "Write request failed: error=%d path=%s [%s:%u]\n",
@@ -1100,21 +1161,6 @@ static int rtrs_clt_write_req(struct rtrs_clt_io_req *req)
 	return ret;
 }
 
-static int rtrs_map_sg_fr(struct rtrs_clt_io_req *req, size_t count)
-{
-	int nr;
-
-	/* Align the MR to a 4K page size to match the block virt boundary */
-	nr = ib_map_mr_sg(req->mr, req->sglist, count, NULL, SZ_4K);
-	if (nr < 0)
-		return nr;
-	if (unlikely(nr < req->sg_cnt))
-		return -EINVAL;
-	ib_update_fast_reg_key(req->mr, ib_inc_rkey(req->mr->rkey));
-
-	return nr;
-}
-
 static int rtrs_clt_read_req(struct rtrs_clt_io_req *req)
 {
 	struct rtrs_clt_con *con = req->con;
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.h b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
index eed2a20ee9be..e276a2dfcf7c 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.h
@@ -116,6 +116,7 @@ struct rtrs_clt_io_req {
 	int			inv_errno;
 	bool			need_inv_comp;
 	bool			need_inv;
+	refcount_t		ref;
 };
 
 struct rtrs_rbuf {
-- 
2.25.1



* [PATCH for-next 3/5] RDMA/rtrs_clt: Alloc less memory with write path fast memory registration
  2021-06-08 11:35 [PATCH for-next 0/5] RTRS enable write path fast memory registration Jack Wang
  2021-06-08 11:35 ` [PATCH for-next 1/5] RDMA/rtrs: Introduce head/tail wr Jack Wang
  2021-06-08 11:35 ` [PATCH for-next 2/5] RDMA/rtrs-clt: Write path fast memory registration Jack Wang
@ 2021-06-08 11:35 ` Jack Wang
  2021-06-08 11:35 ` [PATCH for-next 4/5] RDMA/rtrs-clt: Raise MAX_SEGMENTS Jack Wang
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 8+ messages in thread
From: Jack Wang @ 2021-06-08 11:35 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang, axboe,
	Jack Wang, Md Haris Iqbal

From: Jack Wang <jinpu.wang@cloud.ionos.com>

With write path fast memory registration, we need less memory for each
request: the data scatterlist collapses into a single sge backed by the
MR, plus one sge for the protocol header, so max_send_sge can be
reduced to 2.

Also convert the kmalloc_array to kcalloc.
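
For illustration only, the resulting two-sge layout, adapted from
rtrs_post_rdma_write_sg() in the previous patch:

	/* sge[0]: the fast-registered MR covering the whole data
	 * scatterlist */
	sge[0].addr   = req->mr->iova;
	sge[0].length = req->mr->length;
	sge[0].lkey   = req->mr->lkey;

	/* sge[1]: the protocol header IU */
	sge[1].addr   = req->iu->dma_addr;
	sge[1].length = size;
	sge[1].lkey   = sess->s.dev->ib_pd->local_dma_lkey;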

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index b7c9684d7f62..af738e7e1396 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1372,8 +1372,7 @@ static int alloc_sess_reqs(struct rtrs_clt_sess *sess)
 		if (!req->iu)
 			goto out;
 
-		req->sge = kmalloc_array(clt->max_segments + 1,
-					 sizeof(*req->sge), GFP_KERNEL);
+		req->sge = kcalloc(2, sizeof(*req->sge), GFP_KERNEL);
 		if (!req->sge)
 			goto out;
 
@@ -1674,7 +1673,7 @@ static int create_con_cq_qp(struct rtrs_clt_con *con)
 		max_recv_wr =
 			min_t(int, sess->s.dev->ib_dev->attrs.max_qp_wr,
 			      sess->queue_depth * 3 + 1);
-		max_send_sge = sess->clt->max_segments + 1;
+		max_send_sge = 2;
 	}
 	cq_num = max_send_wr + max_recv_wr;
 	/* alloc iu to recv new rkey reply when server reports flags set */
-- 
2.25.1



* [PATCH for-next 4/5] RDMA/rtrs-clt: Raise MAX_SEGMENTS
  2021-06-08 11:35 [PATCH for-next 0/5] RTRS enable write path fast memory registration Jack Wang
                   ` (2 preceding siblings ...)
  2021-06-08 11:35 ` [PATCH for-next 3/5] RDMA/rtrs_clt: Alloc less memory with write " Jack Wang
@ 2021-06-08 11:35 ` Jack Wang
  2021-06-08 11:35 ` [PATCH for-next 5/5] rnbd/rtrs-clt: Query and use max_segments from rtrs-clt Jack Wang
  2021-06-18 16:54 ` [PATCH for-next 0/5] RTRS enable write path fast memory registration Jason Gunthorpe
  5 siblings, 0 replies; 8+ messages in thread
From: Jack Wang @ 2021-06-08 11:35 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang, axboe,
	Jack Wang, Md Haris Iqbal

From: Jack Wang <jinpu.wang@cloud.ionos.com>

As we can now do fast memory registration on the write path, we can
increase max_segments to 128, which allows up to 128 * 4 KB = 512 KB
of IO per request by default.

Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index af738e7e1396..721ed0b5ae70 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -32,6 +32,8 @@
 #define RTRS_RECONNECT_SEED 8
 
 #define FIRST_CONN 0x01
+/* limit to 128 * 4k = 512k max IO */
+#define RTRS_MAX_SEGMENTS          128
 
 MODULE_DESCRIPTION("RDMA Transport Client");
 MODULE_LICENSE("GPL");
@@ -1545,7 +1547,7 @@ static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
 		       rdma_addr_size((struct sockaddr *)path->src));
 	strscpy(sess->s.sessname, clt->sessname, sizeof(sess->s.sessname));
 	sess->clt = clt;
-	sess->max_pages_per_mr = max_segments;
+	sess->max_pages_per_mr = RTRS_MAX_SEGMENTS;
 	init_waitqueue_head(&sess->state_wq);
 	sess->state = RTRS_CLT_CONNECTING;
 	atomic_set(&sess->connected_cnt, 0);
@@ -2694,7 +2696,7 @@ static struct rtrs_clt *alloc_clt(const char *sessname, size_t paths_num,
 	clt->paths_up = MAX_PATHS_NUM;
 	clt->port = port;
 	clt->pdu_sz = pdu_sz;
-	clt->max_segments = max_segments;
+	clt->max_segments = RTRS_MAX_SEGMENTS;
 	clt->reconnect_delay_sec = reconnect_delay_sec;
 	clt->max_reconnect_attempts = max_reconnect_attempts;
 	clt->priv = priv;
-- 
2.25.1



* [PATCH for-next 5/5] rnbd/rtrs-clt: Query and use max_segments from rtrs-clt.
  2021-06-08 11:35 [PATCH for-next 0/5] RTRS enable write path fast memory registration Jack Wang
                   ` (3 preceding siblings ...)
  2021-06-08 11:35 ` [PATCH for-next 4/5] RDMA/rtrs-clt: Raise MAX_SEGMENTS Jack Wang
@ 2021-06-08 11:35 ` Jack Wang
  2021-06-18 16:54 ` [PATCH for-next 0/5] RTRS enable write path fast memory registration Jason Gunthorpe
  5 siblings, 0 replies; 8+ messages in thread
From: Jack Wang @ 2021-06-08 11:35 UTC (permalink / raw)
  To: linux-rdma
  Cc: bvanassche, leon, dledford, jgg, haris.iqbal, jinpu.wang, axboe,
	Jack Wang, Md Haris Iqbal

From: Jack Wang <jinpu.wang@cloud.ionos.com>

With fast memory registration on write requests, rnbd-clt can do
bigger IO without splitting. rnbd-clt can now query rtrs-clt for
max_segments instead of using the hardcoded BMAX_SEGMENTS.

BMAX_SEGMENTS is no longer needed, so remove it.
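
For illustration only, a minimal sketch of how a ULP picks up the new
attribute (error handling omitted):

	struct rtrs_attrs attrs;

	/* after rtrs_clt_open() has succeeded */
	rtrs_clt_query(sess->rtrs, &attrs);
	sess->max_io_size  = attrs.max_io_size;
	sess->queue_depth  = attrs.queue_depth;
	sess->max_segments = attrs.max_segments;	/* replaces BMAX_SEGMENTS */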

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/block/rnbd/rnbd-clt.c          |  5 +++--
 drivers/block/rnbd/rnbd-clt.h          |  5 +----
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 18 ++++++++----------
 drivers/infiniband/ulp/rtrs/rtrs.h     |  2 +-
 4 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/block/rnbd/rnbd-clt.c b/drivers/block/rnbd/rnbd-clt.c
index c604a402cd5c..d6f12e6c91f7 100644
--- a/drivers/block/rnbd/rnbd-clt.c
+++ b/drivers/block/rnbd/rnbd-clt.c
@@ -92,7 +92,7 @@ static int rnbd_clt_set_dev_attr(struct rnbd_clt_dev *dev,
 	dev->fua		    = !!(rsp->cache_policy & RNBD_FUA);
 
 	dev->max_hw_sectors = sess->max_io_size / SECTOR_SIZE;
-	dev->max_segments = BMAX_SEGMENTS;
+	dev->max_segments = sess->max_segments;
 
 	return 0;
 }
@@ -1292,7 +1292,7 @@ find_and_get_or_create_sess(const char *sessname,
 	sess->rtrs = rtrs_clt_open(&rtrs_ops, sessname,
 				   paths, path_cnt, port_nr,
 				   0, /* Do not use pdu of rtrs */
-				   RECONNECT_DELAY, BMAX_SEGMENTS,
+				   RECONNECT_DELAY,
 				   MAX_RECONNECTS, nr_poll_queues);
 	if (IS_ERR(sess->rtrs)) {
 		err = PTR_ERR(sess->rtrs);
@@ -1306,6 +1306,7 @@ find_and_get_or_create_sess(const char *sessname,
 	sess->max_io_size = attrs.max_io_size;
 	sess->queue_depth = attrs.queue_depth;
 	sess->nr_poll_queues = nr_poll_queues;
+	sess->max_segments = attrs.max_segments;
 
 	err = setup_mq_tags(sess);
 	if (err)
diff --git a/drivers/block/rnbd/rnbd-clt.h b/drivers/block/rnbd/rnbd-clt.h
index b5322c5aaac0..9ef8c4f306f2 100644
--- a/drivers/block/rnbd/rnbd-clt.h
+++ b/drivers/block/rnbd/rnbd-clt.h
@@ -20,10 +20,6 @@
 #include "rnbd-proto.h"
 #include "rnbd-log.h"
 
-/* Max. number of segments per IO request, Mellanox Connect X ~ Connect X5,
- * choose minimial 30 for all, minus 1 for internal protocol, so 29.
- */
-#define BMAX_SEGMENTS 29
 /*  time in seconds between reconnect tries, default to 30 s */
 #define RECONNECT_DELAY 30
 /*
@@ -89,6 +85,7 @@ struct rnbd_clt_session {
 	atomic_t		busy;
 	size_t			queue_depth;
 	u32			max_io_size;
+	u32			max_segments;
 	struct blk_mq_tag_set	tag_set;
 	u32			nr_poll_queues;
 	struct mutex		lock; /* protects state and devs_list */
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 721ed0b5ae70..40dd524b5101 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -1357,7 +1357,6 @@ static void free_sess_reqs(struct rtrs_clt_sess *sess)
 static int alloc_sess_reqs(struct rtrs_clt_sess *sess)
 {
 	struct rtrs_clt_io_req *req;
-	struct rtrs_clt *clt = sess->clt;
 	int i, err = -ENOMEM;
 
 	sess->reqs = kcalloc(sess->queue_depth, sizeof(*sess->reqs),
@@ -1466,6 +1465,8 @@ static void query_fast_reg_mode(struct rtrs_clt_sess *sess)
 	sess->max_pages_per_mr =
 		min3(sess->max_pages_per_mr, (u32)max_pages_per_mr,
 		     ib_dev->attrs.max_fast_reg_page_list_len);
+	sess->clt->max_segments =
+		min(sess->max_pages_per_mr, sess->clt->max_segments);
 }
 
 static bool rtrs_clt_change_state_get_old(struct rtrs_clt_sess *sess,
@@ -1503,9 +1504,8 @@ static void rtrs_clt_reconnect_work(struct work_struct *work);
 static void rtrs_clt_close_work(struct work_struct *work);
 
 static struct rtrs_clt_sess *alloc_sess(struct rtrs_clt *clt,
-					 const struct rtrs_addr *path,
-					 size_t con_num, u16 max_segments,
-					 u32 nr_poll_queues)
+					const struct rtrs_addr *path,
+					size_t con_num, u32 nr_poll_queues)
 {
 	struct rtrs_clt_sess *sess;
 	int err = -ENOMEM;
@@ -2667,7 +2667,6 @@ static struct rtrs_clt *alloc_clt(const char *sessname, size_t paths_num,
 				  u16 port, size_t pdu_sz, void *priv,
 				  void	(*link_ev)(void *priv,
 						   enum rtrs_clt_link_ev ev),
-				  unsigned int max_segments,
 				  unsigned int reconnect_delay_sec,
 				  unsigned int max_reconnect_attempts)
 {
@@ -2765,7 +2764,6 @@ static void free_clt(struct rtrs_clt *clt)
  * @port: port to be used by the RTRS session
  * @pdu_sz: Size of extra payload which can be accessed after permit allocation.
  * @reconnect_delay_sec: time between reconnect tries
- * @max_segments: Max. number of segments per IO request
  * @max_reconnect_attempts: Number of times to reconnect on error before giving
  *			    up, 0 for * disabled, -1 for forever
  * @nr_poll_queues: number of polling mode connection using IB_POLL_DIRECT flag
@@ -2780,7 +2778,6 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 				 const struct rtrs_addr *paths,
 				 size_t paths_num, u16 port,
 				 size_t pdu_sz, u8 reconnect_delay_sec,
-				 u16 max_segments,
 				 s16 max_reconnect_attempts, u32 nr_poll_queues)
 {
 	struct rtrs_clt_sess *sess, *tmp;
@@ -2789,7 +2786,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 
 	clt = alloc_clt(sessname, paths_num, port, pdu_sz, ops->priv,
 			ops->link_ev,
-			max_segments, reconnect_delay_sec,
+			reconnect_delay_sec,
 			max_reconnect_attempts);
 	if (IS_ERR(clt)) {
 		err = PTR_ERR(clt);
@@ -2799,7 +2796,7 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 		struct rtrs_clt_sess *sess;
 
 		sess = alloc_sess(clt, &paths[i], nr_cpu_ids,
-				  max_segments, nr_poll_queues);
+				  nr_poll_queues);
 		if (IS_ERR(sess)) {
 			err = PTR_ERR(sess);
 			goto close_all_sess;
@@ -3061,6 +3058,7 @@ int rtrs_clt_query(struct rtrs_clt *clt, struct rtrs_attrs *attr)
 		return -ECOMM;
 
 	attr->queue_depth      = clt->queue_depth;
+	attr->max_segments     = clt->max_segments;
 	/* Cap max_io_size to min of remote buffer size and the fr pages */
 	attr->max_io_size = min_t(int, clt->max_io_size,
 				  clt->max_segments * SZ_4K);
@@ -3075,7 +3073,7 @@ int rtrs_clt_create_path_from_sysfs(struct rtrs_clt *clt,
 	struct rtrs_clt_sess *sess;
 	int err;
 
-	sess = alloc_sess(clt, addr, nr_cpu_ids, clt->max_segments, 0);
+	sess = alloc_sess(clt, addr, nr_cpu_ids, 0);
 	if (IS_ERR(sess))
 		return PTR_ERR(sess);
 
diff --git a/drivers/infiniband/ulp/rtrs/rtrs.h b/drivers/infiniband/ulp/rtrs/rtrs.h
index dc3e1af1a85b..859c79685daf 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs.h
+++ b/drivers/infiniband/ulp/rtrs/rtrs.h
@@ -57,7 +57,6 @@ struct rtrs_clt *rtrs_clt_open(struct rtrs_clt_ops *ops,
 				 const struct rtrs_addr *paths,
 				 size_t path_cnt, u16 port,
 				 size_t pdu_sz, u8 reconnect_delay_sec,
-				 u16 max_segments,
 				 s16 max_reconnect_attempts, u32 nr_poll_queues);
 
 void rtrs_clt_close(struct rtrs_clt *sess);
@@ -110,6 +109,7 @@ int rtrs_clt_rdma_cq_direct(struct rtrs_clt *clt, unsigned int index);
 struct rtrs_attrs {
 	u32		queue_depth;
 	u32		max_io_size;
+	u32		max_segments;
 };
 
 int rtrs_clt_query(struct rtrs_clt *sess, struct rtrs_attrs *attr);
-- 
2.25.1



* Re: [PATCH for-next 0/5] RTRS enable write path fast memory registration
  2021-06-08 11:35 [PATCH for-next 0/5] RTRS enable write path fast memory registration Jack Wang
                   ` (4 preceding siblings ...)
  2021-06-08 11:35 ` [PATCH for-next 5/5] rnbd/rtrs-clt: Query and use max_segments from rtrs-clt Jack Wang
@ 2021-06-18 16:54 ` Jason Gunthorpe
  2021-06-21  5:55   ` Jinpu Wang
  5 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2021-06-18 16:54 UTC (permalink / raw)
  To: Jack Wang; +Cc: linux-rdma, bvanassche, leon, dledford, haris.iqbal, axboe

On Tue, Jun 08, 2021 at 01:35:31PM +0200, Jack Wang wrote:
> Hi Jason, hi Doug, hi Jens
> 
> Please consider including the following changes in the next merge window.
> 
> This patchset enables fast memory registration on the write IO path, so
> rtrs can support IO bigger than 116 KB (the previous limit of 29
> segments * 4 KB) without splitting. With this in place, the read and
> write paths are more symmetric, and we can also reduce memory usage.
> 
> The patchset is organized as follows:
> - patch1: preparation.
> - patch2: implement fast memory registration for the write path.
> - patch3: reduce memory usage.
> - patch4: raise MAX_SEGMENTS.
> - patch5: rnbd-clt queries and uses the max_segments setting.
> 
> As the main change is in RTRS, it is easier for the series to go through
> the RDMA tree, hence this patchset is sent to linux-rdma.
> 
> This patchset depends on: https://lore.kernel.org/linux-rdma/20210608103039.39080-1-jinpu.wang@ionos.com/T/#t

It doesn't apply - please rebase and resend it

Jason


* Re: [PATCH for-next 0/5] RTRS enable write path fast memory registration
  2021-06-18 16:54 ` [PATCH for-next 0/5] RTRS enable write path fast memory registration Jason Gunthorpe
@ 2021-06-21  5:55   ` Jinpu Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Jinpu Wang @ 2021-06-21  5:55 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: RDMA mailing list, Bart Van Assche, Leon Romanovsky,
	Doug Ledford, Haris Iqbal, Jens Axboe

On Fri, Jun 18, 2021 at 6:54 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Jun 08, 2021 at 01:35:31PM +0200, Jack Wang wrote:
> > Hi Jason, hi Doug, hi Jens
> >
> > Please consider including the following changes in the next merge window.
> >
> > This patchset enables fast memory registration on the write IO path, so
> > rtrs can support IO bigger than 116 KB (the previous limit of 29
> > segments * 4 KB) without splitting. With this in place, the read and
> > write paths are more symmetric, and we can also reduce memory usage.
> >
> > The patchset is organized as follows:
> > - patch1: preparation.
> > - patch2: implement fast memory registration for the write path.
> > - patch3: reduce memory usage.
> > - patch4: raise MAX_SEGMENTS.
> > - patch5: rnbd-clt queries and uses the max_segments setting.
> >
> > As the main change is in RTRS, it is easier for the series to go through
> > the RDMA tree, hence this patchset is sent to linux-rdma.
> >
> > This patchset depends on: https://lore.kernel.org/linux-rdma/20210608103039.39080-1-jinpu.wang@ionos.com/T/#t
>
> It doesn't apply - please rebase and resend it
>
> Jason
Sorry for the inconvenience, just rebased and resent.
https://lore.kernel.org/linux-rdma/20210621055340.11789-1-jinpu.wang@ionos.com/T/#t

Thanks!

