From: Chuck Lever <chuck.lever@oracle.com>
To: linux-rdma@vger.kernel.org, linux-nfs@vger.kernel.org
Subject: [PATCH v1 03/19] xprtrdma: Defer completion only when local invalidation is needed
Date: Wed, 10 Apr 2019 16:06:47 -0400
Message-ID: <20190410200647.11522.29484.stgit@manet.1015granger.net>
In-Reply-To: <20190410200446.11522.21145.stgit@manet.1015granger.net>

While looking at another issue, I noticed that deferred completion
happens to run on the same CPU as Receive completion, thanks to the
fact that the deferred completion workqueue is BOUND. That suggests
there's really no benefit to deferring completion unless it will
have to context switch while waiting for LocalInv to complete.

A somewhat non-intuitive side benefit of this change is that there
are fewer waits for Send completions. Now that this wait is always
done in the Reply handler (a single process), it serializes
subsequent replies. Send completions are batched, so waiting for one
Send completion means waiting for all outstanding Send completions
at once. When the Reply handler gets to subsequent replies, waiting
(and the context switch that goes with it) is less likely to be
needed.

Measurements of IOPS throughput without deferred completion show
improvement of several percent, and latency is just as good or
slightly better for 4KB 100% read and 8KB 70% read / 30% write.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
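The decision described in the commit message can be sketched in
userspace C. This is only an illustrative model, not the kernel code:
every sketch_* name below is hypothetical, the list of registered MRs
is reduced to a counter, and remote invalidation is assumed to take at
most one MR off that list. The point it shows is that the reply is
handed to the workqueue only when something is still registered after
remote invalidation has been accounted for; otherwise it is completed
directly in the reply handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real structs. */
struct sketch_req {
	size_t nr_registered;	/* models the length of req->rl_registered */
};

struct sketch_rep {
	bool remote_invalidated; /* models IB_WC_WITH_INVALIDATE being set */
};

enum sketch_path {
	SKETCH_COMPLETE_INLINE,		/* release and complete right away */
	SKETCH_DEFER_TO_WORKQUEUE,	/* LocalInv needed, so it may wait */
};

/*
 * Models the effect of frwr_reminv() on the list. Assumption: a remote
 * invalidation removes at most one MR from the registered list.
 */
static void sketch_reminv(const struct sketch_rep *rep,
			  struct sketch_req *req)
{
	if (rep->remote_invalidated && req->nr_registered > 0)
		req->nr_registered--;
}

/*
 * Models the new tail of the reply handler: account for remote
 * invalidation first, then defer only if MRs remain registered.
 */
static enum sketch_path sketch_reply_path(const struct sketch_rep *rep,
					  struct sketch_req *req)
{
	assert(rep != NULL && req != NULL);

	sketch_reminv(rep, req);
	if (req->nr_registered > 0)
		return SKETCH_DEFER_TO_WORKQUEUE;
	return SKETCH_COMPLETE_INLINE;
}
```

Under these assumptions, a reply whose only MR was remotely
invalidated takes the inline path, while a reply that leaves one or
more MRs registered is still deferred.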
 net/sunrpc/xprtrdma/rpc_rdma.c  |   31 +++++++++++++++++++++++++------
 net/sunrpc/xprtrdma/verbs.c     |    8 ++++----
 net/sunrpc/xprtrdma/xprt_rdma.h |    1 -
 3 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index b759b16..c3bd18a 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -1226,7 +1226,7 @@ static int decode_reply_chunk(struct xdr_stream *xdr, u32 *length)
  * RPC completion while holding the transport lock to ensure
  * the rep, rqst, and rq_task pointers remain stable.
  */
-void rpcrdma_complete_rqst(struct rpcrdma_rep *rep)
+static void rpcrdma_complete_rqst(struct rpcrdma_rep *rep)
 {
 	struct rpcrdma_xprt *r_xprt = rep->rr_rxprt;
 	struct rpc_xprt *xprt = &r_xprt->rx_xprt;
@@ -1268,6 +1268,12 @@ void rpcrdma_complete_rqst(struct rpcrdma_rep *rep)
 	goto out;
 }
 
+/**
+ * rpcrdma_release_rqst - Release hardware resources
+ * @r_xprt: controlling transport
+ * @req: request with resources to release
+ *
+ */
 void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
 {
 	/* Invalidate and unmap the data payloads before waking
@@ -1295,7 +1301,11 @@ void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req)
 	}
 }
 
-/* Reply handling runs in the poll worker thread. Anything that
+/**
+ * rpcrdma_deferred_completion
+ * @work: work struct embedded in an rpcrdma_rep
+ *
+ * Reply handling runs in the poll worker thread. Anything that
  * might wait is deferred to a separate workqueue.
  */
 void rpcrdma_deferred_completion(struct work_struct *work)
@@ -1306,13 +1316,14 @@ void rpcrdma_deferred_completion(struct work_struct *work)
 	struct rpcrdma_xprt *r_xprt = rep->rr_rxprt;
 
 	trace_xprtrdma_defer_cmp(rep);
-	if (rep->rr_wc_flags & IB_WC_WITH_INVALIDATE)
-		frwr_reminv(rep, &req->rl_registered);
+
 	rpcrdma_release_rqst(r_xprt, req);
 	rpcrdma_complete_rqst(rep);
 }
 
-/* Process received RPC/RDMA messages.
+/**
+ * rpcrdma_reply_handler - Process received RPC/RDMA messages
+ * @rep: Incoming rpcrdma_rep object to process
  *
  * Errors must result in the RPC task either being awakened, or
  * allowed to timeout, to discover the errors at that time.
@@ -1375,7 +1386,15 @@ void rpcrdma_reply_handler(struct rpcrdma_rep *rep)
 	clear_bit(RPCRDMA_REQ_F_PENDING, &req->rl_flags);
 
 	trace_xprtrdma_reply(rqst->rq_task, rep, req, credits);
-	queue_work(buf->rb_completion_wq, &rep->rr_work);
+
+	if (rep->rr_wc_flags & IB_WC_WITH_INVALIDATE)
+		frwr_reminv(rep, &req->rl_registered);
+	if (!list_empty(&req->rl_registered)) {
+		queue_work(buf->rb_completion_wq, &rep->rr_work);
+	} else {
+		rpcrdma_release_rqst(r_xprt, req);
+		rpcrdma_complete_rqst(rep);
+	}
 	return;
 
 out_badversion:
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 30cfc0e..fe005c6 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1106,10 +1106,10 @@ struct rpcrdma_req *
 	if (rc)
 		goto out;
 
-	buf->rb_completion_wq = alloc_workqueue("rpcrdma-%s",
-						WQ_MEM_RECLAIM | WQ_HIGHPRI,
-						0,
-			r_xprt->rx_xprt.address_strings[RPC_DISPLAY_ADDR]);
+	buf->rb_completion_wq =
+		alloc_workqueue("rpcrdma-%s",
+				WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_UNBOUND, 0,
+				r_xprt->rx_xprt.address_strings[RPC_DISPLAY_ADDR]);
 	if (!buf->rb_completion_wq) {
 		rc = -ENOMEM;
 		goto out;
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 10f6593..6a49597 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -613,7 +613,6 @@ int rpcrdma_prepare_send_sges(struct rpcrdma_xprt *r_xprt,
 void rpcrdma_unmap_sendctx(struct rpcrdma_sendctx *sc);
 int rpcrdma_marshal_req(struct rpcrdma_xprt *r_xprt, struct rpc_rqst *rqst);
 void rpcrdma_set_max_header_sizes(struct rpcrdma_xprt *);
-void rpcrdma_complete_rqst(struct rpcrdma_rep *rep);
 void rpcrdma_reply_handler(struct rpcrdma_rep *rep);
 void rpcrdma_release_rqst(struct rpcrdma_xprt *r_xprt,
 			  struct rpcrdma_req *req);

