From: Chuck Lever <chuck.lever@oracle.com>
To: anna.schumaker@netapp.com
Cc: linux-nfs@vger.kernel.org, linux-rdma@vger.kernel.org
Subject: [PATCH v2 06/21] xprtrdma: Boost maximum transport header size
Date: Mon, 19 Aug 2019 18:40:11 -0400
Message-ID: <156625439150.8161.9923129489297297655.stgit@seurat29.1015granger.net>
In-Reply-To: <156625401091.8161.14744201497689200191.stgit@seurat29.1015granger.net>

Although I haven't seen any performance results that justify it,
I've received several complaints that NFS/RDMA no longer supports
a maximum rsize and wsize of 1MB. These days it is somewhat smaller.

To simplify the logic that determines whether a chunk list is
necessary, the implementation uses a fixed maximum size of the
transport header. Currently that maximum size is 256 bytes, one
quarter of the default inline threshold size for RPC/RDMA v1.

Since commit a78868497c2e ("xprtrdma: Reduce max_frwr_depth"), the
size of chunks is also smaller to take advantage of inline page
lists in device internal MR data structures.

The combination of these two design choices has reduced the maximum
NFS rsize and wsize that can be used for most RNIC/HCAs. Increasing
the maximum transport header size and the maximum number of RDMA
segments it can contain increases the negotiated maximum rsize/wsize
on common RNIC/HCAs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
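
As a rough illustration of the sizing arithmetic in this patch, here is
a minimal user-space sketch. It is not the kernel code. The constants
are assumptions: FIXED_MAXSZ (4 words) and READCHUNK_MAXSZ (6 words)
are stand-ins for rpcrdma_fixed_maxsz and rpcrdma_readchunk_maxsz, and
the 6-word figure is inferred from the old comment's "244 bytes" for
8 segments. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes, in 32-bit XDR words. These mirror what the patch's
 * rpcrdma_fixed_maxsz and rpcrdma_readchunk_maxsz are believed to be;
 * they are not copied from the kernel headers.
 */
enum {
	FIXED_MAXSZ = 4,	/* assumed fixed header fields */
	LIST_DISCRIMINATORS = 3,	/* the "+ 3" in the patch */
	READCHUNK_MAXSZ = 6,	/* assumed words per Read list entry */
};

/* User-space stand-in for __roundup_pow_of_two(); n must be nonzero. */
static size_t roundup_pow_of_two(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}

/* Header buffer size in bytes before rounding, following the
 * computation added to rpcrdma_req_create().
 */
static size_t max_hdr_bytes(unsigned int max_segs)
{
	size_t words = FIXED_MAXSZ + LIST_DISCRIMINATORS +
		       (size_t)max_segs * READCHUNK_MAXSZ;

	return words * sizeof(uint32_t);
}

/* Chunk list bound in bytes, following the formula in the
 * xprt_rdma.h comment: ((segs + 2) * read segment size) + 1 words.
 */
static size_t chunk_list_bound_bytes(unsigned int max_segs)
{
	size_t words = ((size_t)max_segs + 2) * READCHUNK_MAXSZ + 1;

	return words * sizeof(uint32_t);
}
```

Under these assumptions, max_hdr_bytes(8) is 220 bytes, which rounds up
to the old fixed size of 256, and max_hdr_bytes(16) is 412 bytes, which
rounds up to 512. The comment's formula gives 244 bytes for 8 segments
(matching the old comment) and 436 bytes for 16, so with a 1024-byte
inline threshold and a 24-byte fixed part, roughly 564 bytes would
remain for the RPC message body.
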
 net/sunrpc/xprtrdma/verbs.c     |    9 ++++++++-
 net/sunrpc/xprtrdma/xprt_rdma.h |   23 ++++++++++-------------
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 805b1f35e1ca..e639ea0faf19 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -53,6 +53,7 @@
 #include <linux/slab.h>
 #include <linux/sunrpc/addr.h>
 #include <linux/sunrpc/svc_rdma.h>
+#include <linux/log2.h>
 
 #include <asm-generic/barrier.h>
 #include <asm/bitops.h>
@@ -1000,12 +1001,18 @@ struct rpcrdma_req *rpcrdma_req_create(struct rpcrdma_xprt *r_xprt, size_t size,
 	struct rpcrdma_buffer *buffer = &r_xprt->rx_buf;
 	struct rpcrdma_regbuf *rb;
 	struct rpcrdma_req *req;
+	size_t maxhdrsize;
 
 	req = kzalloc(sizeof(*req), flags);
 	if (req == NULL)
 		goto out1;
 
-	rb = rpcrdma_regbuf_alloc(RPCRDMA_HDRBUF_SIZE, DMA_TO_DEVICE, flags);
+	/* Compute maximum header buffer size in bytes */
+	maxhdrsize = rpcrdma_fixed_maxsz + 3 +
+		     r_xprt->rx_ia.ri_max_segs * rpcrdma_readchunk_maxsz;
+	maxhdrsize *= sizeof(__be32);
+	rb = rpcrdma_regbuf_alloc(__roundup_pow_of_two(maxhdrsize),
+				  DMA_TO_DEVICE, flags);
 	if (!rb)
 		goto out2;
 	req->rl_rdmabuf = rb;
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 3b2f2041e889..eaf6b907a76e 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -155,25 +155,22 @@ static inline void *rdmab_data(const struct rpcrdma_regbuf *rb)
 
 /* To ensure a transport can always make forward progress,
  * the number of RDMA segments allowed in header chunk lists
- * is capped at 8. This prevents less-capable devices and
- * memory registrations from overrunning the Send buffer
- * while building chunk lists.
+ * is capped at 16. This prevents less-capable devices from
+ * overrunning the Send buffer while building chunk lists.
  *
  * Elements of the Read list take up more room than the
- * Write list or Reply chunk. 8 read segments means the Read
- * list (or Write list or Reply chunk) cannot consume more
- * than
+ * Write list or Reply chunk. 16 read segments means the
+ * chunk lists cannot consume more than
  *
- * ((8 + 2) * read segment size) + 1 XDR words, or 244 bytes.
+ * ((16 + 2) * read segment size) + 1 XDR words,
  *
- * And the fixed part of the header is another 24 bytes.
- *
- * The smallest inline threshold is 1024 bytes, ensuring that
- * at least 750 bytes are available for RPC messages.
+ * or about 400 bytes. The fixed part of the header is
+ * another 24 bytes. Thus when the inline threshold is
+ * 1024 bytes, at least 600 bytes are available for RPC
+ * message bodies.
  */
 enum {
-	RPCRDMA_MAX_HDR_SEGS = 8,
-	RPCRDMA_HDRBUF_SIZE = 256,
+	RPCRDMA_MAX_HDR_SEGS = 16,
 };
 
 /*

