* [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches
@ 2018-03-05 20:12 Chuck Lever
  2018-03-05 20:12 ` [PATCH 1/9] SUNRPC: Move xprt_update_rtt callsite Chuck Lever
                   ` (8 more replies)
  0 siblings, 9 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:12 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

Hi Anna-

This is the second of three rounds. This set could be more
controversial: in it, I adjust some code paths so that I can create
rpcrdma-specific alloc_slot and free_slot methods. These clean-ups
make it straightforward to do the changes coming in the third round,
and they hopefully add a small scalability boost as well.
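
[Editor's sketch] The series hinges on making slot management a
per-transport callout rather than one shared implementation. A rough
userspace model of that shape (hypothetical names that only loosely
echo struct rpc_xprt_ops; this is not the kernel API):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: each transport supplies its own slot-management
 * callouts via function pointers, so rpcrdma can later plug in its
 * own allocation policy without touching generic code. */
enum { POOL_SIZE = 4 };

struct slot { int in_use; };

struct xprt_model {
	struct slot pool[POOL_SIZE];
	struct slot *(*alloc_slot)(struct xprt_model *xprt);
	void (*free_slot)(struct xprt_model *xprt, struct slot *slot);
};

/* Default policy, loosely analogous to the generic xprt_alloc_slot */
static struct slot *generic_alloc_slot(struct xprt_model *xprt)
{
	for (size_t i = 0; i < POOL_SIZE; i++) {
		if (!xprt->pool[i].in_use) {
			xprt->pool[i].in_use = 1;
			return &xprt->pool[i];
		}
	}
	return NULL;	/* the real code would queue the waiting task */
}

static void generic_free_slot(struct xprt_model *xprt, struct slot *slot)
{
	(void)xprt;
	slot->in_use = 0;	/* a transport could also wake a waiter here */
}
```

A transport that needs different behavior (as xprtrdma does in patches
7-9) simply points alloc_slot/free_slot at its own functions.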


---

Chuck Lever (9):
      SUNRPC: Move xprt_update_rtt callsite
      SUNRPC: Make RTT measurement more precise (Receive)
      SUNRPC: Make RTT measurement more precise (Send)
      SUNRPC: Make num_reqs a non-atomic integer
      SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
      SUNRPC: Add a ->free_slot transport callout
      xprtrdma: Introduce ->alloc_slot call-out for xprtrdma
      xprtrdma: Make rpc_rqst part of rpcrdma_req
      xprtrdma: Allocate rpcrdma_reps during Receive completion


 include/linux/sunrpc/xprt.h                |    9 ++-
 net/sunrpc/clnt.c                          |    1 
 net/sunrpc/xprt.c                          |   51 +++++++++------
 net/sunrpc/xprtrdma/backchannel.c          |   94 ++++++++++------------------
 net/sunrpc/xprtrdma/rpc_rdma.c             |    8 ++
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c |    1 
 net/sunrpc/xprtrdma/transport.c            |   61 ++++++++++++++----
 net/sunrpc/xprtrdma/verbs.c                |   35 ++++++++--
 net/sunrpc/xprtrdma/xprt_rdma.h            |   13 +---
 net/sunrpc/xprtsock.c                      |    8 ++
 10 files changed, 166 insertions(+), 115 deletions(-)

--
Chuck Lever

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/9] SUNRPC: Move xprt_update_rtt callsite
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
@ 2018-03-05 20:12 ` Chuck Lever
  2018-03-05 20:13 ` [PATCH 2/9] SUNRPC: Make RTT measurement more precise (Receive) Chuck Lever
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:12 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

Since commit 33849792cbcd ("xprtrdma: Detect unreachable NFS/RDMA
servers more reliably"), the xprtrdma transport has had a ->timer
callout. But xprtrdma does not need to compute RTT data; only the
UDP transport does. Move the xprt_update_rtt call into the UDP
transport implementation.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/xprt.h |    1 +
 net/sunrpc/xprt.c           |   11 ++++++++---
 net/sunrpc/xprtsock.c       |    1 +
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 7fad838..ad322ce 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -373,6 +373,7 @@ static inline __be32 *xprt_skip_transport_header(struct rpc_xprt *xprt, __be32 *
 void			xprt_write_space(struct rpc_xprt *xprt);
 void			xprt_adjust_cwnd(struct rpc_xprt *xprt, struct rpc_task *task, int result);
 struct rpc_rqst *	xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid);
+void			xprt_update_rtt(struct rpc_task *task);
 void			xprt_complete_rqst(struct rpc_task *task, int copied);
 void			xprt_pin_rqst(struct rpc_rqst *req);
 void			xprt_unpin_rqst(struct rpc_rqst *req);
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 8f0ad4f..13fbb48 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -889,7 +889,13 @@ static void xprt_wait_on_pinned_rqst(struct rpc_rqst *req)
 	}
 }
 
-static void xprt_update_rtt(struct rpc_task *task)
+/**
+ * xprt_update_rtt - Update RPC RTT statistics
+ * @task: RPC request that recently completed
+ *
+ * Caller holds xprt->recv_lock.
+ */
+void xprt_update_rtt(struct rpc_task *task)
 {
 	struct rpc_rqst *req = task->tk_rqstp;
 	struct rpc_rtt *rtt = task->tk_client->cl_rtt;
@@ -902,6 +908,7 @@ static void xprt_update_rtt(struct rpc_task *task)
 		rpc_set_timeo(rtt, timer, req->rq_ntrans - 1);
 	}
 }
+EXPORT_SYMBOL_GPL(xprt_update_rtt);
 
 /**
  * xprt_complete_rqst - called when reply processing is complete
@@ -921,8 +928,6 @@ void xprt_complete_rqst(struct rpc_task *task, int copied)
 
 	xprt->stat.recvs++;
 	req->rq_rtt = ktime_sub(ktime_get(), req->rq_xtime);
-	if (xprt->ops->timer != NULL)
-		xprt_update_rtt(task);
 
 	list_del_init(&req->rq_list);
 	req->rq_private_buf.len = copied;
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index a6b8c1f..61df77f 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1060,6 +1060,7 @@ static void xs_udp_data_read_skb(struct rpc_xprt *xprt,
 	if (!rovr)
 		goto out_unlock;
 	xprt_pin_rqst(rovr);
+	xprt_update_rtt(rovr->rq_task);
 	spin_unlock(&xprt->recv_lock);
 	task = rovr->rq_task;
 



* [PATCH 2/9] SUNRPC: Make RTT measurement more precise (Receive)
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
  2018-03-05 20:12 ` [PATCH 1/9] SUNRPC: Move xprt_update_rtt callsite Chuck Lever
@ 2018-03-05 20:13 ` Chuck Lever
  2018-03-05 20:13 ` [PATCH 3/9] SUNRPC: Make RTT measurement more precise (Send) Chuck Lever
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:13 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

Some RPC transports have more overhead in their reply handlers
than others. For example, for RPC-over-RDMA:

- RPC completion has to wait for memory invalidation, which is
  not a part of the server/network round trip

- A context switch was recently introduced into the reply handler,
  which further artificially inflates the measured RPC RTT

To capture just server and network latencies more precisely: when
receiving a reply, compute the RTT as soon as the XID is recognized
rather than at RPC completion time.
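
[Editor's sketch] A toy illustration of the timing change (hypothetical
types, not kernel code): the RTT sample moves from RPC completion to
the moment the XID is matched, so post-receive client-side work such as
memory invalidation no longer counts toward the measurement.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of an rpc_rqst's two timestamps. The send path
 * records xtime; with this patch the receive path fills in rtt as
 * soon as the reply's XID is matched to the request. */
struct rqst_model {
	uint64_t xtime;	/* taken when the Call was sent */
	uint64_t rtt;	/* filled in at XID match */
};

/* Models the xprt_lookup_rqst hunk: sample RTT at lookup time,
 * before any completion-side overhead runs. */
static void lookup_reply(struct rqst_model *req, uint64_t now)
{
	req->rtt = now - req->xtime;
}
```

Everything that runs after lookup_reply (invalidation, context
switches, completion) is excluded from the sample by construction.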

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprt.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 13fbb48..cb7784c 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -826,6 +826,7 @@ static void xprt_connect_status(struct rpc_task *task)
  * @xprt: transport on which the original request was transmitted
  * @xid: RPC XID of incoming reply
  *
+ * Caller holds xprt->recv_lock.
  */
 struct rpc_rqst *xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid)
 {
@@ -834,6 +835,7 @@ struct rpc_rqst *xprt_lookup_rqst(struct rpc_xprt *xprt, __be32 xid)
 	list_for_each_entry(entry, &xprt->recv, rq_list)
 		if (entry->rq_xid == xid) {
 			trace_xprt_lookup_rqst(xprt, xid, 0);
+			entry->rq_rtt = ktime_sub(ktime_get(), entry->rq_xtime);
 			return entry;
 		}
 
@@ -915,7 +917,7 @@ void xprt_update_rtt(struct rpc_task *task)
  * @task: RPC request that recently completed
  * @copied: actual number of bytes received from the transport
  *
- * Caller holds transport lock.
+ * Caller holds xprt->recv_lock.
  */
 void xprt_complete_rqst(struct rpc_task *task, int copied)
 {
@@ -927,7 +929,6 @@ void xprt_complete_rqst(struct rpc_task *task, int copied)
 	trace_xprt_complete_rqst(xprt, req->rq_xid, copied);
 
 	xprt->stat.recvs++;
-	req->rq_rtt = ktime_sub(ktime_get(), req->rq_xtime);
 
 	list_del_init(&req->rq_list);
 	req->rq_private_buf.len = copied;



* [PATCH 3/9] SUNRPC: Make RTT measurement more precise (Send)
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
  2018-03-05 20:12 ` [PATCH 1/9] SUNRPC: Move xprt_update_rtt callsite Chuck Lever
  2018-03-05 20:13 ` [PATCH 2/9] SUNRPC: Make RTT measurement more precise (Receive) Chuck Lever
@ 2018-03-05 20:13 ` Chuck Lever
  2018-03-05 20:13 ` [PATCH 4/9] SUNRPC: Make num_reqs a non-atomic integer Chuck Lever
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:13 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

Some RPC transports have more overhead in their send_request
callouts than others. For example, for RPC-over-RDMA:

- Marshaling an RPC often has to DMA map the RPC arguments

- Registration methods perform memory registration as part of
  marshaling

To capture just server and network latencies more precisely: when
sending a Call, capture the rq_xtime timestamp _after_ the transport
header has been marshaled.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprt.c               |    1 -
 net/sunrpc/xprtrdma/transport.c |    1 +
 net/sunrpc/xprtsock.c           |    3 +++
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index cb7784c..73f05a1 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1033,7 +1033,6 @@ void xprt_transmit(struct rpc_task *task)
 		return;
 
 	connect_cookie = xprt->connect_cookie;
-	req->rq_xtime = ktime_get();
 	status = xprt->ops->send_request(task);
 	trace_xprt_transmit(xprt, req->rq_xid, status);
 	if (status != 0) {
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 67e4386..cc1aad3 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -696,6 +696,7 @@
 	/* Must suppress retransmit to maintain credits */
 	if (rqst->rq_connect_cookie == xprt->connect_cookie)
 		goto drop_connection;
+	rqst->rq_xtime = ktime_get();
 
 	__set_bit(RPCRDMA_REQ_F_PENDING, &req->rl_flags);
 	if (rpcrdma_ep_post(&r_xprt->rx_ia, &r_xprt->rx_ep, req))
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 61df77f..e3c6a3d 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -527,6 +527,7 @@ static int xs_local_send_request(struct rpc_task *task)
 	xs_pktdump("packet data:",
 			req->rq_svec->iov_base, req->rq_svec->iov_len);
 
+	req->rq_xtime = ktime_get();
 	status = xs_sendpages(transport->sock, NULL, 0, xdr, req->rq_bytes_sent,
 			      true, &sent);
 	dprintk("RPC:       %s(%u) = %d\n",
@@ -589,6 +590,7 @@ static int xs_udp_send_request(struct rpc_task *task)
 
 	if (!xprt_bound(xprt))
 		return -ENOTCONN;
+	req->rq_xtime = ktime_get();
 	status = xs_sendpages(transport->sock, xs_addr(xprt), xprt->addrlen,
 			      xdr, req->rq_bytes_sent, true, &sent);
 
@@ -678,6 +680,7 @@ static int xs_tcp_send_request(struct rpc_task *task)
 	/* Continue transmitting the packet/record. We must be careful
 	 * to cope with writespace callbacks arriving _after_ we have
 	 * called sendmsg(). */
+	req->rq_xtime = ktime_get();
 	while (1) {
 		sent = 0;
 		status = xs_sendpages(transport->sock, NULL, 0, xdr,



* [PATCH 4/9] SUNRPC: Make num_reqs a non-atomic integer
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
                   ` (2 preceding siblings ...)
  2018-03-05 20:13 ` [PATCH 3/9] SUNRPC: Make RTT measurement more precise (Send) Chuck Lever
@ 2018-03-05 20:13 ` Chuck Lever
  2018-03-05 20:13 ` [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock Chuck Lever
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:13 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

If recording xprt->stat.max_slots is moved into xprt_alloc_slot,
then xprt->num_reqs is never manipulated outside
xprt->reserve_lock. There's no longer a need for xprt->num_reqs to
be atomic.
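
[Editor's sketch] The rationale in miniature: once every update of a
counter happens under one lock, a plain unsigned int suffices; the
atomic only paid for lock-free access that no longer occurs. A
userspace model with illustrative names (not the kernel's):

```c
#include <assert.h>
#include <pthread.h>

/* num_reqs is now only touched with reserve_lock held, so it can be
 * a plain integer instead of an atomic_t. */
struct pool_model {
	pthread_mutex_t reserve_lock;
	unsigned int num_reqs;
	unsigned int max_reqs;
};

/* Models xprt_dynamic_alloc_slot's bounded increment: returns 1 on
 * success, 0 when the pool is already at max_reqs. */
static int pool_grow(struct pool_model *p)
{
	int ok = 0;

	pthread_mutex_lock(&p->reserve_lock);
	if (p->num_reqs < p->max_reqs) {
		++p->num_reqs;	/* no atomic_add_unless needed */
		ok = 1;
	}
	pthread_mutex_unlock(&p->reserve_lock);
	return ok;
}
```

The check-then-increment is race-free because both steps sit inside
the same critical section, which is exactly what the atomic variant
was emulating.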

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/xprt.h |    2 +-
 net/sunrpc/xprt.c           |   17 +++++++++--------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index ad322ce..5fea0fb 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -197,7 +197,7 @@ struct rpc_xprt {
 	struct list_head	free;		/* free slots */
 	unsigned int		max_reqs;	/* max number of slots */
 	unsigned int		min_reqs;	/* min number of slots */
-	atomic_t		num_reqs;	/* total slots */
+	unsigned int		num_reqs;	/* total slots */
 	unsigned long		state;		/* transport state */
 	unsigned char		resvport   : 1; /* use a reserved port */
 	atomic_t		swapper;	/* we're swapping over this
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 73f05a1..70f0050 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1009,7 +1009,7 @@ void xprt_transmit(struct rpc_task *task)
 	struct rpc_rqst	*req = task->tk_rqstp;
 	struct rpc_xprt	*xprt = req->rq_xprt;
 	unsigned int connect_cookie;
-	int status, numreqs;
+	int status;
 
 	dprintk("RPC: %5u xprt_transmit(%u)\n", task->tk_pid, req->rq_slen);
 
@@ -1047,9 +1047,6 @@ void xprt_transmit(struct rpc_task *task)
 
 	xprt->ops->set_retrans_timeout(task);
 
-	numreqs = atomic_read(&xprt->num_reqs);
-	if (numreqs > xprt->stat.max_slots)
-		xprt->stat.max_slots = numreqs;
 	xprt->stat.sends++;
 	xprt->stat.req_u += xprt->stat.sends - xprt->stat.recvs;
 	xprt->stat.bklog_u += xprt->backlog.qlen;
@@ -1111,14 +1108,15 @@ static struct rpc_rqst *xprt_dynamic_alloc_slot(struct rpc_xprt *xprt)
 {
 	struct rpc_rqst *req = ERR_PTR(-EAGAIN);
 
-	if (!atomic_add_unless(&xprt->num_reqs, 1, xprt->max_reqs))
+	if (xprt->num_reqs >= xprt->max_reqs)
 		goto out;
+	++xprt->num_reqs;
 	spin_unlock(&xprt->reserve_lock);
 	req = kzalloc(sizeof(struct rpc_rqst), GFP_NOFS);
 	spin_lock(&xprt->reserve_lock);
 	if (req != NULL)
 		goto out;
-	atomic_dec(&xprt->num_reqs);
+	--xprt->num_reqs;
 	req = ERR_PTR(-ENOMEM);
 out:
 	return req;
@@ -1126,7 +1124,8 @@ static struct rpc_rqst *xprt_dynamic_alloc_slot(struct rpc_xprt *xprt)
 
 static bool xprt_dynamic_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *req)
 {
-	if (atomic_add_unless(&xprt->num_reqs, -1, xprt->min_reqs)) {
+	if (xprt->num_reqs > xprt->min_reqs) {
+		--xprt->num_reqs;
 		kfree(req);
 		return true;
 	}
@@ -1162,6 +1161,8 @@ void xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
 	spin_unlock(&xprt->reserve_lock);
 	return;
 out_init_req:
+	xprt->stat.max_slots = max_t(unsigned int, xprt->stat.max_slots,
+				     xprt->num_reqs);
 	task->tk_status = 0;
 	task->tk_rqstp = req;
 	xprt_request_init(task, xprt);
@@ -1229,7 +1230,7 @@ struct rpc_xprt *xprt_alloc(struct net *net, size_t size,
 	else
 		xprt->max_reqs = num_prealloc;
 	xprt->min_reqs = num_prealloc;
-	atomic_set(&xprt->num_reqs, num_prealloc);
+	xprt->num_reqs = num_prealloc;
 
 	return xprt;
 



* [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
                   ` (3 preceding siblings ...)
  2018-03-05 20:13 ` [PATCH 4/9] SUNRPC: Make num_reqs a non-atomic integer Chuck Lever
@ 2018-03-05 20:13 ` Chuck Lever
  2018-03-06 22:02   ` Anna Schumaker
  2018-03-05 20:13 ` [PATCH 6/9] SUNRPC: Add a ->free_slot transport callout Chuck Lever
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:13 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

alloc_slot is a transport-specific op, but initializing an rpc_rqst
is common to all transports. Move initialization to common code in
preparation for adding a transport-specific alloc_slot to xprtrdma.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/xprt.h |    1 +
 net/sunrpc/clnt.c           |    1 +
 net/sunrpc/xprt.c           |   12 +++++++-----
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 5fea0fb..9784e28 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -324,6 +324,7 @@ struct xprt_class {
 struct rpc_xprt		*xprt_create_transport(struct xprt_create *args);
 void			xprt_connect(struct rpc_task *task);
 void			xprt_reserve(struct rpc_task *task);
+void			xprt_request_init(struct rpc_task *task);
 void			xprt_retry_reserve(struct rpc_task *task);
 int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
 int			xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task);
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 6e432ec..226f558 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1546,6 +1546,7 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
 	task->tk_status = 0;
 	if (status >= 0) {
 		if (task->tk_rqstp) {
+			xprt_request_init(task);
 			task->tk_action = call_refresh;
 			return;
 		}
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 70f0050..a394b46 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -66,7 +66,7 @@
  * Local functions
  */
 static void	 xprt_init(struct rpc_xprt *xprt, struct net *net);
-static void	xprt_request_init(struct rpc_task *, struct rpc_xprt *);
+static __be32	xprt_alloc_xid(struct rpc_xprt *xprt);
 static void	xprt_connect_status(struct rpc_task *task);
 static int      __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
 static void     __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
@@ -987,6 +987,8 @@ bool xprt_prepare_transmit(struct rpc_task *task)
 		task->tk_status = -EAGAIN;
 		goto out_unlock;
 	}
+	if (likely(!bc_prealloc(req)))
+		req->rq_xid = xprt_alloc_xid(xprt);
 	ret = true;
 out_unlock:
 	spin_unlock_bh(&xprt->transport_lock);
@@ -1163,10 +1165,10 @@ void xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
 out_init_req:
 	xprt->stat.max_slots = max_t(unsigned int, xprt->stat.max_slots,
 				     xprt->num_reqs);
+	spin_unlock(&xprt->reserve_lock);
+
 	task->tk_status = 0;
 	task->tk_rqstp = req;
-	xprt_request_init(task, xprt);
-	spin_unlock(&xprt->reserve_lock);
 }
 EXPORT_SYMBOL_GPL(xprt_alloc_slot);
 
@@ -1303,8 +1305,9 @@ static inline void xprt_init_xid(struct rpc_xprt *xprt)
 	xprt->xid = prandom_u32();
 }
 
-static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
+void xprt_request_init(struct rpc_task *task)
 {
+	struct rpc_xprt *xprt = task->tk_xprt;
 	struct rpc_rqst	*req = task->tk_rqstp;
 
 	INIT_LIST_HEAD(&req->rq_list);
@@ -1312,7 +1315,6 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
 	req->rq_task	= task;
 	req->rq_xprt    = xprt;
 	req->rq_buffer  = NULL;
-	req->rq_xid     = xprt_alloc_xid(xprt);
 	req->rq_connect_cookie = xprt->connect_cookie - 1;
 	req->rq_bytes_sent = 0;
 	req->rq_snd_buf.len = 0;



* [PATCH 6/9] SUNRPC: Add a ->free_slot transport callout
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
                   ` (4 preceding siblings ...)
  2018-03-05 20:13 ` [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock Chuck Lever
@ 2018-03-05 20:13 ` Chuck Lever
  2018-03-05 20:13 ` [PATCH 7/9] xprtrdma: Introduce ->alloc_slot call-out for xprtrdma Chuck Lever
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:13 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

Refactor: xprtrdma needs to have better control over when RPCs are
awoken from the backlog queue, so replace xprt_free_slot with a
transport op callout.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/xprt.h                |    4 ++++
 net/sunrpc/xprt.c                          |    5 +++--
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c |    1 +
 net/sunrpc/xprtrdma/transport.c            |    1 +
 net/sunrpc/xprtsock.c                      |    4 ++++
 5 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 9784e28..706eef1 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -127,6 +127,8 @@ struct rpc_xprt_ops {
 	int		(*reserve_xprt)(struct rpc_xprt *xprt, struct rpc_task *task);
 	void		(*release_xprt)(struct rpc_xprt *xprt, struct rpc_task *task);
 	void		(*alloc_slot)(struct rpc_xprt *xprt, struct rpc_task *task);
+	void		(*free_slot)(struct rpc_xprt *xprt,
+				     struct rpc_rqst *req);
 	void		(*rpcbind)(struct rpc_task *task);
 	void		(*set_port)(struct rpc_xprt *xprt, unsigned short port);
 	void		(*connect)(struct rpc_xprt *xprt, struct rpc_task *task);
@@ -329,6 +331,8 @@ struct xprt_class {
 int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
 int			xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task);
 void			xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task);
+void			xprt_free_slot(struct rpc_xprt *xprt,
+				       struct rpc_rqst *req);
 void			xprt_lock_and_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task);
 bool			xprt_prepare_transmit(struct rpc_task *task);
 void			xprt_transmit(struct rpc_task *task);
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index a394b46..fb3093d 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1186,7 +1186,7 @@ void xprt_lock_and_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
 }
 EXPORT_SYMBOL_GPL(xprt_lock_and_alloc_slot);
 
-static void xprt_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *req)
+void xprt_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *req)
 {
 	spin_lock(&xprt->reserve_lock);
 	if (!xprt_dynamic_free_slot(xprt, req)) {
@@ -1196,6 +1196,7 @@ static void xprt_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *req)
 	xprt_wake_up_backlog(xprt);
 	spin_unlock(&xprt->reserve_lock);
 }
+EXPORT_SYMBOL_GPL(xprt_free_slot);
 
 static void xprt_free_all_slots(struct rpc_xprt *xprt)
 {
@@ -1375,7 +1376,7 @@ void xprt_release(struct rpc_task *task)
 
 	dprintk("RPC: %5u release request %p\n", task->tk_pid, req);
 	if (likely(!bc_prealloc(req)))
-		xprt_free_slot(xprt, req);
+		xprt->ops->free_slot(xprt, req);
 	else
 		xprt_free_bc_request(req);
 }
diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
index a73632c..1035516 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
@@ -273,6 +273,7 @@ static int svc_rdma_bc_sendto(struct svcxprt_rdma *rdma,
 	.reserve_xprt		= xprt_reserve_xprt_cong,
 	.release_xprt		= xprt_release_xprt_cong,
 	.alloc_slot		= xprt_alloc_slot,
+	.free_slot		= xprt_free_slot,
 	.release_request	= xprt_release_rqst_cong,
 	.buf_alloc		= xprt_rdma_bc_allocate,
 	.buf_free		= xprt_rdma_bc_free,
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index cc1aad3..40ff91d 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -780,6 +780,7 @@ void xprt_rdma_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
 	.reserve_xprt		= xprt_reserve_xprt_cong,
 	.release_xprt		= xprt_release_xprt_cong, /* sunrpc/xprt.c */
 	.alloc_slot		= xprt_alloc_slot,
+	.free_slot		= xprt_free_slot,
 	.release_request	= xprt_release_rqst_cong,       /* ditto */
 	.set_retrans_timeout	= xprt_set_retrans_timeout_def, /* ditto */
 	.timer			= xprt_rdma_timer,
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index e3c6a3d..2511c21 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2764,6 +2764,7 @@ static void bc_destroy(struct rpc_xprt *xprt)
 	.reserve_xprt		= xprt_reserve_xprt,
 	.release_xprt		= xs_tcp_release_xprt,
 	.alloc_slot		= xprt_alloc_slot,
+	.free_slot		= xprt_free_slot,
 	.rpcbind		= xs_local_rpcbind,
 	.set_port		= xs_local_set_port,
 	.connect		= xs_local_connect,
@@ -2783,6 +2784,7 @@ static void bc_destroy(struct rpc_xprt *xprt)
 	.reserve_xprt		= xprt_reserve_xprt_cong,
 	.release_xprt		= xprt_release_xprt_cong,
 	.alloc_slot		= xprt_alloc_slot,
+	.free_slot		= xprt_free_slot,
 	.rpcbind		= rpcb_getport_async,
 	.set_port		= xs_set_port,
 	.connect		= xs_connect,
@@ -2804,6 +2806,7 @@ static void bc_destroy(struct rpc_xprt *xprt)
 	.reserve_xprt		= xprt_reserve_xprt,
 	.release_xprt		= xs_tcp_release_xprt,
 	.alloc_slot		= xprt_lock_and_alloc_slot,
+	.free_slot		= xprt_free_slot,
 	.rpcbind		= rpcb_getport_async,
 	.set_port		= xs_set_port,
 	.connect		= xs_connect,
@@ -2835,6 +2838,7 @@ static void bc_destroy(struct rpc_xprt *xprt)
 	.reserve_xprt		= xprt_reserve_xprt,
 	.release_xprt		= xprt_release_xprt,
 	.alloc_slot		= xprt_alloc_slot,
+	.free_slot		= xprt_free_slot,
 	.buf_alloc		= bc_malloc,
 	.buf_free		= bc_free,
 	.send_request		= bc_send_request,



* [PATCH 7/9] xprtrdma: Introduce ->alloc_slot call-out for xprtrdma
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
                   ` (5 preceding siblings ...)
  2018-03-05 20:13 ` [PATCH 6/9] SUNRPC: Add a ->free_slot transport callout Chuck Lever
@ 2018-03-05 20:13 ` Chuck Lever
  2018-03-05 20:13 ` [PATCH 8/9] xprtrdma: Make rpc_rqst part of rpcrdma_req Chuck Lever
  2018-03-05 20:13 ` [PATCH 9/9] xprtrdma: Allocate rpcrdma_reps during Receive completion Chuck Lever
  8 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:13 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

rpcrdma_buffer_get acquires an rpcrdma_req and rep for each RPC.
Currently this is done in the call_allocate action, and sometimes it
can fail if there are many outstanding RPCs.

When call_allocate fails, the RPC task is put on the delayq. It is
awoken a few milliseconds later, but there's no guarantee it will
get a buffer at that time. The RPC task can be repeatedly put back
to sleep or even starved.

The call_allocate action should rarely fail. The delayq mechanism is
not meant to deal with transport congestion.

In the current sunrpc stack, there is a friendlier way to deal with
this situation. These objects are actually tantamount to an RPC
slot (rpc_rqst) and there is a separate FSM action, distinct from
call_allocate, for allocating slot resources. This is the
call_reserve action.

When allocation fails during this action, the RPC is placed on the
transport's backlog queue. The backlog mechanism provides a stronger
guarantee that when the RPC is awoken, a buffer will be available
for it; and backlogged RPCs are awoken one-at-a-time.

To make slot resource allocation occur in the call_reserve action,
create special ->alloc_slot and ->free_slot call-outs for xprtrdma.
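
[Editor's sketch] The backlog guarantee described above, as an
illustrative userspace model (hypothetical names, not the kernel
implementation): freeing a slot wakes exactly one waiter, so
backlogged callers are resumed one at a time instead of racing on the
delayq.

```c
#include <assert.h>

/* Counters stand in for the free list and the backlog wait queue. */
struct bl_xprt {
	int free_slots;
	int waiters;	/* tasks parked on the backlog */
	int woken;	/* backlogged tasks resumed so far */
};

/* Models xprt_rdma_alloc_slot: take a free slot, or park on the
 * backlog and report -EAGAIN (here, -1). */
static int bl_alloc(struct bl_xprt *x)
{
	if (x->free_slots == 0) {
		x->waiters++;	/* rpc_sleep_on(&xprt->backlog, ...) */
		return -1;
	}
	x->free_slots--;
	return 0;
}

/* Models xprt_rdma_free_slot: return the slot, then wake exactly
 * one waiter, as rpc_wake_up_next(&xprt->backlog) does. */
static void bl_free(struct bl_xprt *x)
{
	x->free_slots++;
	if (x->waiters) {
		x->waiters--;
		x->woken++;
	}
}
```

Because each free wakes at most one waiter, a woken task is guaranteed
a slot exists the moment it runs, unlike the delayq retry loop.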

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/transport.c |   52 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 40ff91d..1dac949 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -537,6 +537,54 @@
 	}
 }
 
+/**
+ * xprt_rdma_alloc_slot - allocate an rpc_rqst
+ * @xprt: controlling RPC transport
+ * @task: RPC task requesting a fresh rpc_rqst
+ *
+ * tk_status values:
+ *	%0 if task->tk_rqstp points to a fresh rpc_rqst
+ *	%-EAGAIN if no rpc_rqst is available; queued on backlog
+ */
+static void
+xprt_rdma_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
+{
+	struct rpc_rqst *rqst;
+
+	spin_lock(&xprt->reserve_lock);
+	if (list_empty(&xprt->free))
+		goto out_sleep;
+	rqst = list_first_entry(&xprt->free, struct rpc_rqst, rq_list);
+	list_del(&rqst->rq_list);
+	spin_unlock(&xprt->reserve_lock);
+
+	task->tk_rqstp = rqst;
+	task->tk_status = 0;
+	return;
+
+out_sleep:
+	rpc_sleep_on(&xprt->backlog, task, NULL);
+	spin_unlock(&xprt->reserve_lock);
+	task->tk_status = -EAGAIN;
+}
+
+/**
+ * xprt_rdma_free_slot - release an rpc_rqst
+ * @xprt: controlling RPC transport
+ * @rqst: rpc_rqst to release
+ *
+ */
+static void
+xprt_rdma_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *rqst)
+{
+	memset(rqst, 0, sizeof(*rqst));
+
+	spin_lock(&xprt->reserve_lock);
+	list_add(&rqst->rq_list, &xprt->free);
+	rpc_wake_up_next(&xprt->backlog);
+	spin_unlock(&xprt->reserve_lock);
+}
+
 static bool
 rpcrdma_get_sendbuf(struct rpcrdma_xprt *r_xprt, struct rpcrdma_req *req,
 		    size_t size, gfp_t flags)
@@ -779,8 +827,8 @@ void xprt_rdma_print_stats(struct rpc_xprt *xprt, struct seq_file *seq)
 static const struct rpc_xprt_ops xprt_rdma_procs = {
 	.reserve_xprt		= xprt_reserve_xprt_cong,
 	.release_xprt		= xprt_release_xprt_cong, /* sunrpc/xprt.c */
-	.alloc_slot		= xprt_alloc_slot,
-	.free_slot		= xprt_free_slot,
+	.alloc_slot		= xprt_rdma_alloc_slot,
+	.free_slot		= xprt_rdma_free_slot,
 	.release_request	= xprt_release_rqst_cong,       /* ditto */
 	.set_retrans_timeout	= xprt_set_retrans_timeout_def, /* ditto */
 	.timer			= xprt_rdma_timer,



* [PATCH 8/9] xprtrdma: Make rpc_rqst part of rpcrdma_req
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
                   ` (6 preceding siblings ...)
  2018-03-05 20:13 ` [PATCH 7/9] xprtrdma: Introduce ->alloc_slot call-out for xprtrdma Chuck Lever
@ 2018-03-05 20:13 ` Chuck Lever
  2018-03-05 20:13 ` [PATCH 9/9] xprtrdma: Allocate rpcrdma_reps during Receive completion Chuck Lever
  8 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:13 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

This simplifies allocation of the generic RPC slot and xprtrdma
specific per-RPC resources.

It also makes xprtrdma more like the socket-based transports:
->buf_alloc and ->buf_free are now responsible only for send and
receive buffers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/xprt.h       |    1 
 net/sunrpc/xprtrdma/backchannel.c |   77 +++++++++++++++++--------------------
 net/sunrpc/xprtrdma/transport.c   |   35 ++++-------------
 net/sunrpc/xprtrdma/xprt_rdma.h   |    9 +---
 4 files changed, 46 insertions(+), 76 deletions(-)

diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 706eef1..336fd1a 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -84,7 +84,6 @@ struct rpc_rqst {
 	void (*rq_release_snd_buf)(struct rpc_rqst *); /* release rq_enc_pages */
 	struct list_head	rq_list;
 
-	void			*rq_xprtdata;	/* Per-xprt private data */
 	void			*rq_buffer;	/* Call XDR encode buffer */
 	size_t			rq_callsize;
 	void			*rq_rbuffer;	/* Reply XDR decode buffer */
diff --git a/net/sunrpc/xprtrdma/backchannel.c b/net/sunrpc/xprtrdma/backchannel.c
index 47ebac9..4034788 100644
--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -29,29 +29,41 @@ static void rpcrdma_bc_free_rqst(struct rpcrdma_xprt *r_xprt,
 	spin_unlock(&buf->rb_reqslock);
 
 	rpcrdma_destroy_req(req);
-
-	kfree(rqst);
 }
 
-static int rpcrdma_bc_setup_rqst(struct rpcrdma_xprt *r_xprt,
-				 struct rpc_rqst *rqst)
+static int rpcrdma_bc_setup_reqs(struct rpcrdma_xprt *r_xprt,
+				 unsigned int count)
 {
-	struct rpcrdma_regbuf *rb;
-	struct rpcrdma_req *req;
-	size_t size;
+	struct rpc_xprt *xprt = &r_xprt->rx_xprt;
+	struct rpc_rqst *rqst;
+	unsigned int i;
+
+	for (i = 0; i < (count << 1); i++) {
+		struct rpcrdma_regbuf *rb;
+		struct rpcrdma_req *req;
+		size_t size;
+
+		req = rpcrdma_create_req(r_xprt);
+		if (IS_ERR(req))
+			return PTR_ERR(req);
+		rqst = &req->rl_slot;
+
+		rqst->rq_xprt = xprt;
+		INIT_LIST_HEAD(&rqst->rq_list);
+		INIT_LIST_HEAD(&rqst->rq_bc_list);
+		__set_bit(RPC_BC_PA_IN_USE, &rqst->rq_bc_pa_state);
+		spin_lock_bh(&xprt->bc_pa_lock);
+		list_add(&rqst->rq_bc_pa_list, &xprt->bc_pa_list);
+		spin_unlock_bh(&xprt->bc_pa_lock);
 
-	req = rpcrdma_create_req(r_xprt);
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-
-	size = r_xprt->rx_data.inline_rsize;
-	rb = rpcrdma_alloc_regbuf(size, DMA_TO_DEVICE, GFP_KERNEL);
-	if (IS_ERR(rb))
-		goto out_fail;
-	req->rl_sendbuf = rb;
-	xdr_buf_init(&rqst->rq_snd_buf, rb->rg_base,
-		     min_t(size_t, size, PAGE_SIZE));
-	rpcrdma_set_xprtdata(rqst, req);
+		size = r_xprt->rx_data.inline_rsize;
+		rb = rpcrdma_alloc_regbuf(size, DMA_TO_DEVICE, GFP_KERNEL);
+		if (IS_ERR(rb))
+			goto out_fail;
+		req->rl_sendbuf = rb;
+		xdr_buf_init(&rqst->rq_snd_buf, rb->rg_base,
+			     min_t(size_t, size, PAGE_SIZE));
+	}
 	return 0;
 
 out_fail:
@@ -86,9 +98,6 @@ static int rpcrdma_bc_setup_reps(struct rpcrdma_xprt *r_xprt,
 int xprt_rdma_bc_setup(struct rpc_xprt *xprt, unsigned int reqs)
 {
 	struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);
-	struct rpcrdma_buffer *buffer = &r_xprt->rx_buf;
-	struct rpc_rqst *rqst;
-	unsigned int i;
 	int rc;
 
 	/* The backchannel reply path returns each rpc_rqst to the
@@ -103,25 +112,9 @@ int xprt_rdma_bc_setup(struct rpc_xprt *xprt, unsigned int reqs)
 	if (reqs > RPCRDMA_BACKWARD_WRS >> 1)
 		goto out_err;
 
-	for (i = 0; i < (reqs << 1); i++) {
-		rqst = kzalloc(sizeof(*rqst), GFP_KERNEL);
-		if (!rqst)
-			goto out_free;
-
-		dprintk("RPC:       %s: new rqst %p\n", __func__, rqst);
-
-		rqst->rq_xprt = &r_xprt->rx_xprt;
-		INIT_LIST_HEAD(&rqst->rq_list);
-		INIT_LIST_HEAD(&rqst->rq_bc_list);
-		__set_bit(RPC_BC_PA_IN_USE, &rqst->rq_bc_pa_state);
-
-		if (rpcrdma_bc_setup_rqst(r_xprt, rqst))
-			goto out_free;
-
-		spin_lock_bh(&xprt->bc_pa_lock);
-		list_add(&rqst->rq_bc_pa_list, &xprt->bc_pa_list);
-		spin_unlock_bh(&xprt->bc_pa_lock);
-	}
+	rc = rpcrdma_bc_setup_reqs(r_xprt, reqs);
+	if (rc)
+		goto out_free;
 
 	rc = rpcrdma_bc_setup_reps(r_xprt, reqs);
 	if (rc)
@@ -131,7 +124,7 @@ int xprt_rdma_bc_setup(struct rpc_xprt *xprt, unsigned int reqs)
 	if (rc)
 		goto out_free;
 
-	buffer->rb_bc_srv_max_requests = reqs;
+	r_xprt->rx_buf.rb_bc_srv_max_requests = reqs;
 	request_module("svcrdma");
 	trace_xprtrdma_cb_setup(r_xprt, reqs);
 	return 0;
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 1dac949..e0ab2b2 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -330,9 +330,7 @@
 		return ERR_PTR(-EBADF);
 	}
 
-	xprt = xprt_alloc(args->net, sizeof(struct rpcrdma_xprt),
-			xprt_rdma_slot_table_entries,
-			xprt_rdma_slot_table_entries);
+	xprt = xprt_alloc(args->net, sizeof(struct rpcrdma_xprt), 0, 0);
 	if (xprt == NULL) {
 		dprintk("RPC:       %s: couldn't allocate rpcrdma_xprt\n",
 			__func__);
@@ -364,7 +362,7 @@
 		xprt_set_bound(xprt);
 	xprt_rdma_format_addresses(xprt, sap);
 
-	cdata.max_requests = xprt->max_reqs;
+	cdata.max_requests = xprt_rdma_slot_table_entries;
 
 	cdata.rsize = RPCRDMA_MAX_SEGS * PAGE_SIZE; /* RDMA write max */
 	cdata.wsize = RPCRDMA_MAX_SEGS * PAGE_SIZE; /* RDMA read max */
@@ -549,22 +547,18 @@
 static void
 xprt_rdma_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
 {
-	struct rpc_rqst *rqst;
+	struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(xprt);
+	struct rpcrdma_req *req;
 
-	spin_lock(&xprt->reserve_lock);
-	if (list_empty(&xprt->free))
+	req = rpcrdma_buffer_get(&r_xprt->rx_buf);
+	if (!req)
 		goto out_sleep;
-	rqst = list_first_entry(&xprt->free, struct rpc_rqst, rq_list);
-	list_del(&rqst->rq_list);
-	spin_unlock(&xprt->reserve_lock);
-
-	task->tk_rqstp = rqst;
+	task->tk_rqstp = &req->rl_slot;
 	task->tk_status = 0;
 	return;
 
 out_sleep:
 	rpc_sleep_on(&xprt->backlog, task, NULL);
-	spin_unlock(&xprt->reserve_lock);
 	task->tk_status = -EAGAIN;
 }
 
@@ -578,11 +572,8 @@
 xprt_rdma_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *rqst)
 {
 	memset(rqst, 0, sizeof(*rqst));
-
-	spin_lock(&xprt->reserve_lock);
-	list_add(&rqst->rq_list, &xprt->free);
+	rpcrdma_buffer_put(rpcr_to_rdmar(rqst));
 	rpc_wake_up_next(&xprt->backlog);
-	spin_unlock(&xprt->reserve_lock);
 }
 
 static bool
@@ -655,13 +646,9 @@
 {
 	struct rpc_rqst *rqst = task->tk_rqstp;
 	struct rpcrdma_xprt *r_xprt = rpcx_to_rdmax(rqst->rq_xprt);
-	struct rpcrdma_req *req;
+	struct rpcrdma_req *req = rpcr_to_rdmar(rqst);
 	gfp_t flags;
 
-	req = rpcrdma_buffer_get(&r_xprt->rx_buf);
-	if (req == NULL)
-		goto out_get;
-
 	flags = RPCRDMA_DEF_GFP;
 	if (RPC_IS_SWAPPER(task))
 		flags = __GFP_MEMALLOC | GFP_NOWAIT | __GFP_NOWARN;
@@ -671,15 +658,12 @@
 	if (!rpcrdma_get_recvbuf(r_xprt, req, rqst->rq_rcvsize, flags))
 		goto out_fail;
 
-	rpcrdma_set_xprtdata(rqst, req);
 	rqst->rq_buffer = req->rl_sendbuf->rg_base;
 	rqst->rq_rbuffer = req->rl_recvbuf->rg_base;
 	trace_xprtrdma_allocate(task, req);
 	return 0;
 
 out_fail:
-	rpcrdma_buffer_put(req);
-out_get:
 	trace_xprtrdma_allocate(task, NULL);
 	return -ENOMEM;
 }
@@ -700,7 +684,6 @@
 	if (test_bit(RPCRDMA_REQ_F_PENDING, &req->rl_flags))
 		rpcrdma_release_rqst(r_xprt, req);
 	trace_xprtrdma_rpc_done(task, req);
-	rpcrdma_buffer_put(req);
 }
 
 /**
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index 3d3b423..b35d80b 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -334,6 +334,7 @@ enum {
 struct rpcrdma_buffer;
 struct rpcrdma_req {
 	struct list_head	rl_list;
+	struct rpc_rqst		rl_slot;
 	struct rpcrdma_buffer	*rl_buffer;
 	struct rpcrdma_rep	*rl_reply;
 	struct xdr_stream	rl_stream;
@@ -356,16 +357,10 @@ enum {
 	RPCRDMA_REQ_F_TX_RESOURCES,
 };
 
-static inline void
-rpcrdma_set_xprtdata(struct rpc_rqst *rqst, struct rpcrdma_req *req)
-{
-	rqst->rq_xprtdata = req;
-}
-
 static inline struct rpcrdma_req *
 rpcr_to_rdmar(const struct rpc_rqst *rqst)
 {
-	return rqst->rq_xprtdata;
+	return container_of(rqst, struct rpcrdma_req, rl_slot);
 }
 
 static inline void


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 9/9] xprtrdma: Allocate rpcrdma_reps during Receive completion
  2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
                   ` (7 preceding siblings ...)
  2018-03-05 20:13 ` [PATCH 8/9] xprtrdma: Make rpc_rqst part of rpcrdma_req Chuck Lever
@ 2018-03-05 20:13 ` Chuck Lever
  8 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-05 20:13 UTC (permalink / raw)
  To: anna.schumaker; +Cc: linux-rdma, linux-nfs

Receive completion for a CQ runs on one CPU core only. Ensure that
Receive buffers are allocated on the same CPU core where Receive
completions are handled. This guarantees that a transport's Receive
buffers are on the NUMA node that is local to the device no matter
where the transport was created.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/backchannel.c |   21 ---------------------
 net/sunrpc/xprtrdma/rpc_rdma.c    |    8 ++++++++
 net/sunrpc/xprtrdma/verbs.c       |   35 ++++++++++++++++++++++++++---------
 net/sunrpc/xprtrdma/xprt_rdma.h   |    4 +++-
 4 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/net/sunrpc/xprtrdma/backchannel.c b/net/sunrpc/xprtrdma/backchannel.c
index 4034788..6b21fb8 100644
--- a/net/sunrpc/xprtrdma/backchannel.c
+++ b/net/sunrpc/xprtrdma/backchannel.c
@@ -71,23 +71,6 @@ static int rpcrdma_bc_setup_reqs(struct rpcrdma_xprt *r_xprt,
 	return -ENOMEM;
 }
 
-/* Allocate and add receive buffers to the rpcrdma_buffer's
- * existing list of rep's. These are released when the
- * transport is destroyed.
- */
-static int rpcrdma_bc_setup_reps(struct rpcrdma_xprt *r_xprt,
-				 unsigned int count)
-{
-	int rc = 0;
-
-	while (count--) {
-		rc = rpcrdma_create_rep(r_xprt);
-		if (rc)
-			break;
-	}
-	return rc;
-}
-
 /**
  * xprt_rdma_bc_setup - Pre-allocate resources for handling backchannel requests
  * @xprt: transport associated with these backchannel resources
@@ -116,10 +99,6 @@ int xprt_rdma_bc_setup(struct rpc_xprt *xprt, unsigned int reqs)
 	if (rc)
 		goto out_free;
 
-	rc = rpcrdma_bc_setup_reps(r_xprt, reqs);
-	if (rc)
-		goto out_free;
-
 	rc = rpcrdma_ep_post_extra_recv(r_xprt, reqs);
 	if (rc)
 		goto out_free;
diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
index e8adad3..d15aa27 100644
--- a/net/sunrpc/xprtrdma/rpc_rdma.c
+++ b/net/sunrpc/xprtrdma/rpc_rdma.c
@@ -1331,8 +1331,16 @@ void rpcrdma_reply_handler(struct rpcrdma_rep *rep)
 	struct rpcrdma_req *req;
 	struct rpc_rqst *rqst;
 	u32 credits;
+	int total;
 	__be32 *p;
 
+	total = buf->rb_max_requests + (buf->rb_bc_srv_max_requests << 1);
+	total -= buf->rb_reps;
+	if (total > 0)
+		while (total--)
+			if (!rpcrdma_create_rep(r_xprt, false))
+				break;
+
 	if (rep->rr_hdrbuf.head[0].iov_len == 0)
 		goto out_badstatus;
 
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 6a7a5a2..af74953 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -1095,11 +1095,12 @@ struct rpcrdma_req *
 /**
  * rpcrdma_create_rep - Allocate an rpcrdma_rep object
  * @r_xprt: controlling transport
+ * @temp: destroy rep upon release
  *
  * Returns 0 on success or a negative errno on failure.
  */
 int
-rpcrdma_create_rep(struct rpcrdma_xprt *r_xprt)
+rpcrdma_create_rep(struct rpcrdma_xprt *r_xprt, bool temp)
 {
 	struct rpcrdma_create_data_internal *cdata = &r_xprt->rx_data;
 	struct rpcrdma_buffer *buf = &r_xprt->rx_buf;
@@ -1127,9 +1128,11 @@ struct rpcrdma_req *
 	rep->rr_recv_wr.wr_cqe = &rep->rr_cqe;
 	rep->rr_recv_wr.sg_list = &rep->rr_rdmabuf->rg_iov;
 	rep->rr_recv_wr.num_sge = 1;
+	rep->rr_temp = temp;
 
 	spin_lock(&buf->rb_lock);
 	list_add(&rep->rr_list, &buf->rb_recv_bufs);
+	++buf->rb_reps;
 	spin_unlock(&buf->rb_lock);
 	return 0;
 
@@ -1179,11 +1182,9 @@ struct rpcrdma_req *
 	}
 
 	INIT_LIST_HEAD(&buf->rb_recv_bufs);
-	for (i = 0; i <= buf->rb_max_requests; i++) {
-		rc = rpcrdma_create_rep(r_xprt);
-		if (rc)
-			goto out;
-	}
+	rc = rpcrdma_create_rep(r_xprt, true);
+	if (rc)
+		goto out;
 
 	rc = rpcrdma_sendctxs_create(r_xprt);
 	if (rc)
@@ -1220,8 +1221,14 @@ struct rpcrdma_req *
 static void
 rpcrdma_destroy_rep(struct rpcrdma_rep *rep)
 {
+	struct rpcrdma_buffer *buf = &rep->rr_rxprt->rx_buf;
+
 	rpcrdma_free_regbuf(rep->rr_rdmabuf);
 	kfree(rep);
+
+	spin_lock(&buf->rb_lock);
+	--buf->rb_reps;
+	spin_unlock(&buf->rb_lock);
 }
 
 void
@@ -1417,12 +1424,17 @@ struct rpcrdma_req *
 
 	spin_lock(&buffers->rb_lock);
 	buffers->rb_send_count--;
-	list_add_tail(&req->rl_list, &buffers->rb_send_bufs);
+	list_add(&req->rl_list, &buffers->rb_send_bufs);
 	if (rep) {
 		buffers->rb_recv_count--;
-		list_add_tail(&rep->rr_list, &buffers->rb_recv_bufs);
+		if (!rep->rr_temp) {
+			list_add(&rep->rr_list, &buffers->rb_recv_bufs);
+			rep = NULL;
+		}
 	}
 	spin_unlock(&buffers->rb_lock);
+	if (rep)
+		rpcrdma_destroy_rep(rep);
 }
 
 /*
@@ -1450,8 +1462,13 @@ struct rpcrdma_req *
 
 	spin_lock(&buffers->rb_lock);
 	buffers->rb_recv_count--;
-	list_add_tail(&rep->rr_list, &buffers->rb_recv_bufs);
+	if (!rep->rr_temp) {
+		list_add(&rep->rr_list, &buffers->rb_recv_bufs);
+		rep = NULL;
+	}
 	spin_unlock(&buffers->rb_lock);
+	if (rep)
+		rpcrdma_destroy_rep(rep);
 }
 
 /**
diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
index b35d80b..5f069c7 100644
--- a/net/sunrpc/xprtrdma/xprt_rdma.h
+++ b/net/sunrpc/xprtrdma/xprt_rdma.h
@@ -196,6 +196,7 @@ struct rpcrdma_rep {
 	__be32			rr_proc;
 	int			rr_wc_flags;
 	u32			rr_inv_rkey;
+	bool			rr_temp;
 	struct rpcrdma_regbuf	*rr_rdmabuf;
 	struct rpcrdma_xprt	*rr_rxprt;
 	struct work_struct	rr_work;
@@ -401,6 +402,7 @@ struct rpcrdma_buffer {
 	struct list_head	rb_recv_bufs;
 	u32			rb_max_requests;
 	u32			rb_credits;	/* most recent credit grant */
+	unsigned int		rb_reps;
 
 	u32			rb_bc_srv_max_requests;
 	spinlock_t		rb_reqslock;	/* protect rb_allreqs */
@@ -563,7 +565,7 @@ int rpcrdma_ep_post(struct rpcrdma_ia *, struct rpcrdma_ep *,
  */
 struct rpcrdma_req *rpcrdma_create_req(struct rpcrdma_xprt *);
 void rpcrdma_destroy_req(struct rpcrdma_req *);
-int rpcrdma_create_rep(struct rpcrdma_xprt *r_xprt);
+int rpcrdma_create_rep(struct rpcrdma_xprt *r_xprt, bool temp);
 int rpcrdma_buffer_create(struct rpcrdma_xprt *);
 void rpcrdma_buffer_destroy(struct rpcrdma_buffer *);
 struct rpcrdma_sendctx *rpcrdma_sendctx_get_locked(struct rpcrdma_buffer *buf);


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  2018-03-05 20:13 ` [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock Chuck Lever
@ 2018-03-06 22:02   ` Anna Schumaker
  2018-03-06 22:07     ` Chuck Lever
  0 siblings, 1 reply; 17+ messages in thread
From: Anna Schumaker @ 2018-03-06 22:02 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma, linux-nfs

Hi Chuck,

I'm seeing a huge performance hit with this patch.  I'm just running cthon over TCP, and it goes from finishing in 22 seconds to taking well over 5 minutes.  I seem to only see this on the read and write tests, such as basic test5 taking a minute to finish:

    ./test5: read and write                                                                                       
            wrote 1048576 byte file 10 times in 60.35 seconds (173737 bytes/sec)                                  
            read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)                               
            ./test5 ok. 

I haven't dug into this too deeply, but my best guess is that maybe it's due to adding a call to xprt_request_init() in net/sunrpc/clnt.c:call_reserveresult()

Thoughts?
Anna

On 03/05/2018 03:13 PM, Chuck Lever wrote:
> alloc_slot is a transport-specific op, but initializing an rpc_rqst
> is common to all transports. Move initialization to common code in
> preparation for adding a transport-specific alloc_slot to xprtrdma.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  include/linux/sunrpc/xprt.h |    1 +
>  net/sunrpc/clnt.c           |    1 +
>  net/sunrpc/xprt.c           |   12 +++++++-----
>  3 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
> index 5fea0fb..9784e28 100644
> --- a/include/linux/sunrpc/xprt.h
> +++ b/include/linux/sunrpc/xprt.h
> @@ -324,6 +324,7 @@ struct xprt_class {
>  struct rpc_xprt		*xprt_create_transport(struct xprt_create *args);
>  void			xprt_connect(struct rpc_task *task);
>  void			xprt_reserve(struct rpc_task *task);
> +void			xprt_request_init(struct rpc_task *task);
>  void			xprt_retry_reserve(struct rpc_task *task);
>  int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
>  int			xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task);
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 6e432ec..226f558 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -1546,6 +1546,7 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
>  	task->tk_status = 0;
>  	if (status >= 0) {
>  		if (task->tk_rqstp) {
> +			xprt_request_init(task);
>  			task->tk_action = call_refresh;
>  			return;
>  		}
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 70f0050..a394b46 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -66,7 +66,7 @@
>   * Local functions
>   */
>  static void	 xprt_init(struct rpc_xprt *xprt, struct net *net);
> -static void	xprt_request_init(struct rpc_task *, struct rpc_xprt *);
> +static __be32	xprt_alloc_xid(struct rpc_xprt *xprt);
>  static void	xprt_connect_status(struct rpc_task *task);
>  static int      __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
>  static void     __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
> @@ -987,6 +987,8 @@ bool xprt_prepare_transmit(struct rpc_task *task)
>  		task->tk_status = -EAGAIN;
>  		goto out_unlock;
>  	}
> +	if (likely(!bc_prealloc(req)))
> +		req->rq_xid = xprt_alloc_xid(xprt);
>  	ret = true;
>  out_unlock:
>  	spin_unlock_bh(&xprt->transport_lock);
> @@ -1163,10 +1165,10 @@ void xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
>  out_init_req:
>  	xprt->stat.max_slots = max_t(unsigned int, xprt->stat.max_slots,
>  				     xprt->num_reqs);
> +	spin_unlock(&xprt->reserve_lock);
> +
>  	task->tk_status = 0;
>  	task->tk_rqstp = req;
> -	xprt_request_init(task, xprt);
> -	spin_unlock(&xprt->reserve_lock);
>  }
>  EXPORT_SYMBOL_GPL(xprt_alloc_slot);
>  
> @@ -1303,8 +1305,9 @@ static inline void xprt_init_xid(struct rpc_xprt *xprt)
>  	xprt->xid = prandom_u32();
>  }
>  
> -static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
> +void xprt_request_init(struct rpc_task *task)
>  {
> +	struct rpc_xprt *xprt = task->tk_xprt;
>  	struct rpc_rqst	*req = task->tk_rqstp;
>  
>  	INIT_LIST_HEAD(&req->rq_list);
> @@ -1312,7 +1315,6 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>  	req->rq_task	= task;
>  	req->rq_xprt    = xprt;
>  	req->rq_buffer  = NULL;
> -	req->rq_xid     = xprt_alloc_xid(xprt);
>  	req->rq_connect_cookie = xprt->connect_cookie - 1;
>  	req->rq_bytes_sent = 0;
>  	req->rq_snd_buf.len = 0;
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  2018-03-06 22:02   ` Anna Schumaker
@ 2018-03-06 22:07     ` Chuck Lever
  2018-03-06 22:30       ` Chuck Lever
  0 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever @ 2018-03-06 22:07 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-rdma, Linux NFS Mailing List



> On Mar 6, 2018, at 5:02 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
> 
> Hi Chuck,
> 
> I'm seeing a huge performance hit with this patch.  I'm just running cthon over TCP, and it goes from finishing in 22 seconds to taking well over 5 minutes.  I seem to only see this on the read and write tests, such as basic test5 taking a minute to finish:
> 
>    ./test5: read and write
>            wrote 1048576 byte file 10 times in 60.35 seconds (173737 bytes/sec)
>            read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>            ./test5 ok. 

OK. This looks like write is impacted, but this test doesn't
actually perform any reads on the wire. Try iozone with -I,
maybe? That would show results for both read and write.


> I haven't dug into this too deeply, but my best guess is that maybe it's due to adding a call to xprt_request_init() in net/sunrpc/clnt.c:call_reserveresult()

It wasn't added there, it was moved from xprt_alloc_slot. So,
it's not new work per-RPC.

Any additional information would be appreciated!


> Thoughts?
> Anna
> 
> On 03/05/2018 03:13 PM, Chuck Lever wrote:
>> alloc_slot is a transport-specific op, but initializing an rpc_rqst
>> is common to all transports. Move initialization to common code in
>> preparation for adding a transport-specific alloc_slot to xprtrdma.
>> 
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> include/linux/sunrpc/xprt.h |    1 +
>> net/sunrpc/clnt.c           |    1 +
>> net/sunrpc/xprt.c           |   12 +++++++-----
>> 3 files changed, 9 insertions(+), 5 deletions(-)
>> 
>> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
>> index 5fea0fb..9784e28 100644
>> --- a/include/linux/sunrpc/xprt.h
>> +++ b/include/linux/sunrpc/xprt.h
>> @@ -324,6 +324,7 @@ struct xprt_class {
>> struct rpc_xprt		*xprt_create_transport(struct xprt_create *args);
>> void			xprt_connect(struct rpc_task *task);
>> void			xprt_reserve(struct rpc_task *task);
>> +void			xprt_request_init(struct rpc_task *task);
>> void			xprt_retry_reserve(struct rpc_task *task);
>> int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
>> int			xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task);
>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>> index 6e432ec..226f558 100644
>> --- a/net/sunrpc/clnt.c
>> +++ b/net/sunrpc/clnt.c
>> @@ -1546,6 +1546,7 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
>> 	task->tk_status = 0;
>> 	if (status >= 0) {
>> 		if (task->tk_rqstp) {
>> +			xprt_request_init(task);
>> 			task->tk_action = call_refresh;
>> 			return;
>> 		}
>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>> index 70f0050..a394b46 100644
>> --- a/net/sunrpc/xprt.c
>> +++ b/net/sunrpc/xprt.c
>> @@ -66,7 +66,7 @@
>>  * Local functions
>>  */
>> static void	 xprt_init(struct rpc_xprt *xprt, struct net *net);
>> -static void	xprt_request_init(struct rpc_task *, struct rpc_xprt *);
>> +static __be32	xprt_alloc_xid(struct rpc_xprt *xprt);
>> static void	xprt_connect_status(struct rpc_task *task);
>> static int      __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
>> static void     __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
>> @@ -987,6 +987,8 @@ bool xprt_prepare_transmit(struct rpc_task *task)
>> 		task->tk_status = -EAGAIN;
>> 		goto out_unlock;
>> 	}
>> +	if (likely(!bc_prealloc(req)))
>> +		req->rq_xid = xprt_alloc_xid(xprt);
>> 	ret = true;
>> out_unlock:
>> 	spin_unlock_bh(&xprt->transport_lock);
>> @@ -1163,10 +1165,10 @@ void xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
>> out_init_req:
>> 	xprt->stat.max_slots = max_t(unsigned int, xprt->stat.max_slots,
>> 				     xprt->num_reqs);
>> +	spin_unlock(&xprt->reserve_lock);
>> +
>> 	task->tk_status = 0;
>> 	task->tk_rqstp = req;
>> -	xprt_request_init(task, xprt);
>> -	spin_unlock(&xprt->reserve_lock);
>> }
>> EXPORT_SYMBOL_GPL(xprt_alloc_slot);
>> 
>> @@ -1303,8 +1305,9 @@ static inline void xprt_init_xid(struct rpc_xprt *xprt)
>> 	xprt->xid = prandom_u32();
>> }
>> 
>> -static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>> +void xprt_request_init(struct rpc_task *task)
>> {
>> +	struct rpc_xprt *xprt = task->tk_xprt;
>> 	struct rpc_rqst	*req = task->tk_rqstp;
>> 
>> 	INIT_LIST_HEAD(&req->rq_list);
>> @@ -1312,7 +1315,6 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>> 	req->rq_task	= task;
>> 	req->rq_xprt    = xprt;
>> 	req->rq_buffer  = NULL;
>> -	req->rq_xid     = xprt_alloc_xid(xprt);
>> 	req->rq_connect_cookie = xprt->connect_cookie - 1;
>> 	req->rq_bytes_sent = 0;
>> 	req->rq_snd_buf.len = 0;
>> 

--
Chuck Lever




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  2018-03-06 22:07     ` Chuck Lever
@ 2018-03-06 22:30       ` Chuck Lever
  2018-03-07 20:00         ` Anna Schumaker
  0 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever @ 2018-03-06 22:30 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-rdma, Linux NFS Mailing List



> On Mar 6, 2018, at 5:07 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> 
> 
>> On Mar 6, 2018, at 5:02 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
>> 
>> Hi Chuck,
>> 
>> I'm seeing a huge performance hit with this patch.  I'm just running cthon over TCP, and it goes from finishing in 22 seconds to taking well over 5 minutes.  I seem to only see this on the read and write tests, such as basic test5 taking a minute to finish:
>> 
>>   ./test5: read and write
>>           wrote 1048576 byte file 10 times in 60.35 seconds (173737 bytes/sec)
>>           read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>>           ./test5 ok. 
> 
> OK. This looks like write is impacted, but this test doesn't
> actually perform any reads on the wire. Try iozone with -I,
> maybe? That would show results for both read and write.

Hum.

Stock v4.16-rc4:

./test5: read and write
	wrote 1048576 byte file 10 times in 0.2  seconds (350811642 bytes/sec)
	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
	./test5 ok.


v4.16-rc4 with my full set of patches:

./test5: read and write
	wrote 1048576 byte file 10 times in 0.2  seconds (354236681 bytes/sec)
	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
	./test5 ok.

I don't see a regression here. Let me know how it goes!


>> I haven't dug into this too deeply, but my best guess is that maybe it's due to adding a call to xprt_request_init() in net/sunrpc/clnt.c:call_reserveresult()
> 
> It wasn't added there, it was moved from xprt_alloc_slot. So,
> it's not new work per-RPC.
> 
> Any additional information would be appreciated!
> 
> 
>> Thoughts?
>> Anna
>> 
>> On 03/05/2018 03:13 PM, Chuck Lever wrote:
>>> alloc_slot is a transport-specific op, but initializing an rpc_rqst
>>> is common to all transports. Move initialization to common code in
>>> preparation for adding a transport-specific alloc_slot to xprtrdma.
>>> 
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>> include/linux/sunrpc/xprt.h |    1 +
>>> net/sunrpc/clnt.c           |    1 +
>>> net/sunrpc/xprt.c           |   12 +++++++-----
>>> 3 files changed, 9 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
>>> index 5fea0fb..9784e28 100644
>>> --- a/include/linux/sunrpc/xprt.h
>>> +++ b/include/linux/sunrpc/xprt.h
>>> @@ -324,6 +324,7 @@ struct xprt_class {
>>> struct rpc_xprt		*xprt_create_transport(struct xprt_create *args);
>>> void			xprt_connect(struct rpc_task *task);
>>> void			xprt_reserve(struct rpc_task *task);
>>> +void			xprt_request_init(struct rpc_task *task);
>>> void			xprt_retry_reserve(struct rpc_task *task);
>>> int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
>>> int			xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task);
>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>> index 6e432ec..226f558 100644
>>> --- a/net/sunrpc/clnt.c
>>> +++ b/net/sunrpc/clnt.c
>>> @@ -1546,6 +1546,7 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
>>> 	task->tk_status = 0;
>>> 	if (status >= 0) {
>>> 		if (task->tk_rqstp) {
>>> +			xprt_request_init(task);
>>> 			task->tk_action = call_refresh;
>>> 			return;
>>> 		}
>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>> index 70f0050..a394b46 100644
>>> --- a/net/sunrpc/xprt.c
>>> +++ b/net/sunrpc/xprt.c
>>> @@ -66,7 +66,7 @@
>>> * Local functions
>>> */
>>> static void	 xprt_init(struct rpc_xprt *xprt, struct net *net);
>>> -static void	xprt_request_init(struct rpc_task *, struct rpc_xprt *);
>>> +static __be32	xprt_alloc_xid(struct rpc_xprt *xprt);
>>> static void	xprt_connect_status(struct rpc_task *task);
>>> static int      __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
>>> static void     __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
>>> @@ -987,6 +987,8 @@ bool xprt_prepare_transmit(struct rpc_task *task)
>>> 		task->tk_status = -EAGAIN;
>>> 		goto out_unlock;
>>> 	}
>>> +	if (likely(!bc_prealloc(req)))
>>> +		req->rq_xid = xprt_alloc_xid(xprt);
>>> 	ret = true;
>>> out_unlock:
>>> 	spin_unlock_bh(&xprt->transport_lock);
>>> @@ -1163,10 +1165,10 @@ void xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
>>> out_init_req:
>>> 	xprt->stat.max_slots = max_t(unsigned int, xprt->stat.max_slots,
>>> 				     xprt->num_reqs);
>>> +	spin_unlock(&xprt->reserve_lock);
>>> +
>>> 	task->tk_status = 0;
>>> 	task->tk_rqstp = req;
>>> -	xprt_request_init(task, xprt);
>>> -	spin_unlock(&xprt->reserve_lock);
>>> }
>>> EXPORT_SYMBOL_GPL(xprt_alloc_slot);
>>> 
>>> @@ -1303,8 +1305,9 @@ static inline void xprt_init_xid(struct rpc_xprt *xprt)
>>> 	xprt->xid = prandom_u32();
>>> }
>>> 
>>> -static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>>> +void xprt_request_init(struct rpc_task *task)
>>> {
>>> +	struct rpc_xprt *xprt = task->tk_xprt;
>>> 	struct rpc_rqst	*req = task->tk_rqstp;
>>> 
>>> 	INIT_LIST_HEAD(&req->rq_list);
>>> @@ -1312,7 +1315,6 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>>> 	req->rq_task	= task;
>>> 	req->rq_xprt    = xprt;
>>> 	req->rq_buffer  = NULL;
>>> -	req->rq_xid     = xprt_alloc_xid(xprt);
>>> 	req->rq_connect_cookie = xprt->connect_cookie - 1;
>>> 	req->rq_bytes_sent = 0;
>>> 	req->rq_snd_buf.len = 0;
>>> 
> 
> --
> Chuck Lever
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
Chuck Lever




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  2018-03-06 22:30       ` Chuck Lever
@ 2018-03-07 20:00         ` Anna Schumaker
  2018-03-07 20:23           ` Chuck Lever
  0 siblings, 1 reply; 17+ messages in thread
From: Anna Schumaker @ 2018-03-07 20:00 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma, Linux NFS Mailing List



On 03/06/2018 05:30 PM, Chuck Lever wrote:
> 
> 
>> On Mar 6, 2018, at 5:07 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>>
>>
>>> On Mar 6, 2018, at 5:02 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
>>>
>>> Hi Chuck,
>>>
>>> I'm seeing a huge performance hit with this patch.  I'm just running cthon over TCP, and it goes from finishing in 22 seconds to taking well over 5 minutes.  I seem to only see this on the read and write tests, such as basic test5 taking a minute to finish:
>>>
>>>   ./test5: read and write                                                                                       
>>>           wrote 1048576 byte file 10 times in 60.35 seconds (173737 bytes/sec)                                  
>>>           read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)                               
>>>           ./test5 ok. 
>>
>> OK. This looks like write is impacted, but this test doesn't
>> actually perform any reads on the wire. Try iozone with -I,
>> maybe? That would show results for both read and write.
> 
> Hum.
> 
> Stock v4.16-rc4:
> 
> ./test5: read and write
> 	wrote 1048576 byte file 10 times in 0.2  seconds (350811642 bytes/sec)
> 	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
> 	./test5 ok.
> 
> 
> v4.16-rc4 with my full set of patches:
> 
> ./test5: read and write
> 	wrote 1048576 byte file 10 times in 0.2  seconds (354236681 bytes/sec)
> 	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
> 	./test5 ok.
> 
> I don't see a regression here. Let me know how it goes!

I'm using rc4 too, so maybe it's something different in my setup?  Making this change fixes the issue for me:

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index a394b4635f8e..273847f7e455 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -987,8 +987,6 @@ bool xprt_prepare_transmit(struct rpc_task *task)
                task->tk_status = -EAGAIN;
                goto out_unlock;
        }
-       if (likely(!bc_prealloc(req)))
-               req->rq_xid = xprt_alloc_xid(xprt);
        ret = true;
 out_unlock:
        spin_unlock_bh(&xprt->transport_lock);
@@ -1315,6 +1313,7 @@ void xprt_request_init(struct rpc_task *task)
        req->rq_task    = task;
        req->rq_xprt    = xprt;
        req->rq_buffer  = NULL;
+       req->rq_xid     = xprt_alloc_xid(xprt);
        req->rq_connect_cookie = xprt->connect_cookie - 1;
        req->rq_bytes_sent = 0;
        req->rq_snd_buf.len = 0;


Anna

> 
> 
>>> I haven't dug into this too deeply, but my best guess is that maybe it's due to adding a call to xprt_request_init() in net/sunrpc/clnt.c:call_reserveresult()
>>
>> It wasn't added there, it was moved from xprt_alloc_slot. So,
>> it's not new work per-RPC.
>>
>> Any additional information would be appreciated!
>>
>>
>>> Thoughts?
>>> Anna
>>>
>>> On 03/05/2018 03:13 PM, Chuck Lever wrote:
>>>> alloc_slot is a transport-specific op, but initializing an rpc_rqst
>>>> is common to all transports. Move initialization to common code in
>>>> preparation for adding a transport-specific alloc_slot to xprtrdma.
>>>>
>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>> include/linux/sunrpc/xprt.h |    1 +
>>>> net/sunrpc/clnt.c           |    1 +
>>>> net/sunrpc/xprt.c           |   12 +++++++-----
>>>> 3 files changed, 9 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
>>>> index 5fea0fb..9784e28 100644
>>>> --- a/include/linux/sunrpc/xprt.h
>>>> +++ b/include/linux/sunrpc/xprt.h
>>>> @@ -324,6 +324,7 @@ struct xprt_class {
>>>> struct rpc_xprt		*xprt_create_transport(struct xprt_create *args);
>>>> void			xprt_connect(struct rpc_task *task);
>>>> void			xprt_reserve(struct rpc_task *task);
>>>> +void			xprt_request_init(struct rpc_task *task);
>>>> void			xprt_retry_reserve(struct rpc_task *task);
>>>> int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
>>>> int			xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task);
>>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>>> index 6e432ec..226f558 100644
>>>> --- a/net/sunrpc/clnt.c
>>>> +++ b/net/sunrpc/clnt.c
>>>> @@ -1546,6 +1546,7 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
>>>> 	task->tk_status = 0;
>>>> 	if (status >= 0) {
>>>> 		if (task->tk_rqstp) {
>>>> +			xprt_request_init(task);
>>>> 			task->tk_action = call_refresh;
>>>> 			return;
>>>> 		}
>>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>>> index 70f0050..a394b46 100644
>>>> --- a/net/sunrpc/xprt.c
>>>> +++ b/net/sunrpc/xprt.c
>>>> @@ -66,7 +66,7 @@
>>>> * Local functions
>>>> */
>>>> static void	 xprt_init(struct rpc_xprt *xprt, struct net *net);
>>>> -static void	xprt_request_init(struct rpc_task *, struct rpc_xprt *);
>>>> +static __be32	xprt_alloc_xid(struct rpc_xprt *xprt);
>>>> static void	xprt_connect_status(struct rpc_task *task);
>>>> static int      __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
>>>> static void     __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
>>>> @@ -987,6 +987,8 @@ bool xprt_prepare_transmit(struct rpc_task *task)
>>>> 		task->tk_status = -EAGAIN;
>>>> 		goto out_unlock;
>>>> 	}
>>>> +	if (likely(!bc_prealloc(req)))
>>>> +		req->rq_xid = xprt_alloc_xid(xprt);
>>>> 	ret = true;
>>>> out_unlock:
>>>> 	spin_unlock_bh(&xprt->transport_lock);
>>>> @@ -1163,10 +1165,10 @@ void xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
>>>> out_init_req:
>>>> 	xprt->stat.max_slots = max_t(unsigned int, xprt->stat.max_slots,
>>>> 				     xprt->num_reqs);
>>>> +	spin_unlock(&xprt->reserve_lock);
>>>> +
>>>> 	task->tk_status = 0;
>>>> 	task->tk_rqstp = req;
>>>> -	xprt_request_init(task, xprt);
>>>> -	spin_unlock(&xprt->reserve_lock);
>>>> }
>>>> EXPORT_SYMBOL_GPL(xprt_alloc_slot);
>>>>
>>>> @@ -1303,8 +1305,9 @@ static inline void xprt_init_xid(struct rpc_xprt *xprt)
>>>> 	xprt->xid = prandom_u32();
>>>> }
>>>>
>>>> -static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>>>> +void xprt_request_init(struct rpc_task *task)
>>>> {
>>>> +	struct rpc_xprt *xprt = task->tk_xprt;
>>>> 	struct rpc_rqst	*req = task->tk_rqstp;
>>>>
>>>> 	INIT_LIST_HEAD(&req->rq_list);
>>>> @@ -1312,7 +1315,6 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>>>> 	req->rq_task	= task;
>>>> 	req->rq_xprt    = xprt;
>>>> 	req->rq_buffer  = NULL;
>>>> -	req->rq_xid     = xprt_alloc_xid(xprt);
>>>> 	req->rq_connect_cookie = xprt->connect_cookie - 1;
>>>> 	req->rq_bytes_sent = 0;
>>>> 	req->rq_snd_buf.len = 0;
>>>>
>>
>> --
>> Chuck Lever
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> Chuck Lever
> 
> 
> 

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  2018-03-07 20:00         ` Anna Schumaker
@ 2018-03-07 20:23           ` Chuck Lever
  2018-03-07 20:32             ` Anna Schumaker
  0 siblings, 1 reply; 17+ messages in thread
From: Chuck Lever @ 2018-03-07 20:23 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-rdma, Linux NFS Mailing List



> On Mar 7, 2018, at 3:00 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
> 
> 
> 
> On 03/06/2018 05:30 PM, Chuck Lever wrote:
>> 
>> 
>>> On Mar 6, 2018, at 5:07 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>> 
>>> 
>>> 
>>>> On Mar 6, 2018, at 5:02 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
>>>> 
>>>> Hi Chuck,
>>>> 
>>>> I'm seeing a huge performance hit with this patch.  I'm just running cthon over TCP, and it goes from finishing in 22 seconds to taking well over 5 minutes.  I seem to only see this on the read and write tests, such as basic test5 taking a minute to finish:
>>>> 
>>>>  ./test5: read and write
>>>>          wrote 1048576 byte file 10 times in 60.35 seconds (173737 bytes/sec)
>>>>          read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>>>>          ./test5 ok.
>>> 
>>> OK. This looks like write is impacted, but this test doesn't
>>> actually perform any reads on the wire. Try iozone with -I,
>>> maybe? That would show results for both read and write.
>> 
>> Hum.
>> 
>> Stock v4.16-rc4:
>> 
>> ./test5: read and write
>> 	wrote 1048576 byte file 10 times in 0.2  seconds (350811642 bytes/sec)
>> 	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>> 	./test5 ok.
>> 
>> 
>> v4.16-rc4 with my full set of patches:
>> 
>> ./test5: read and write
>> 	wrote 1048576 byte file 10 times in 0.2  seconds (354236681 bytes/sec)
>> 	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>> 	./test5 ok.
>> 
>> I don't see a regression here. Let me know how it goes!
> 
> I'm using rc4 too, so maybe it's something different in my setup?

What is your setup, exactly? I assume your client is maybe a
single CPU guest, and the server is the same, and both are
hosted on one system?


> Making this change fixes the issue for me:
> 
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index a394b4635f8e..273847f7e455 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -987,8 +987,6 @@ bool xprt_prepare_transmit(struct rpc_task *task)
>                task->tk_status = -EAGAIN;
>                goto out_unlock;
>        }
> -       if (likely(!bc_prealloc(req)))
> -               req->rq_xid = xprt_alloc_xid(xprt);
>        ret = true;
> out_unlock:
>        spin_unlock_bh(&xprt->transport_lock);
> @@ -1315,6 +1313,7 @@ void xprt_request_init(struct rpc_task *task)
>        req->rq_task    = task;
>        req->rq_xprt    = xprt;
>        req->rq_buffer  = NULL;
> +       req->rq_xid     = xprt_alloc_xid(xprt);

xprt_alloc_xid is just

1299 static inline __be32 xprt_alloc_xid(struct rpc_xprt *xprt)
1300 {
1301         return (__force __be32)xprt->xid++;
1302 }

I don't believe the new call site for xprt_request_init is
adequately serialized for this to be safe in general. That's why
I'm calling xprt_alloc_xid in xprt_prepare_transmit, behind the
transport_lock.

However, I think we need to explain why that helps your performance
issue, because it doesn't make sense to me that this would make a
difference. Why did you think to try this change? Is there evidence
on the wire of XID re-use, for example?


>        req->rq_connect_cookie = xprt->connect_cookie - 1;
>        req->rq_bytes_sent = 0;
>        req->rq_snd_buf.len = 0;
> 
> 
> Anna
> 
>> 
>> 
>>>> I haven't dug into this too deeply, but my best guess is that maybe it's due to adding a call to xprt_request_init() in net/sunrpc/clnt.c:call_reserveresult()
>>> 
>>> It wasn't added there, it was moved from xprt_alloc_slot. So,
>>> it's not new work per-RPC.
>>> 
>>> Any additional information would be appreciated!
>>> 
>>> 
>>>> Thoughts?
>>>> Anna

--
Chuck Lever




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  2018-03-07 20:23           ` Chuck Lever
@ 2018-03-07 20:32             ` Anna Schumaker
  2018-03-07 20:44               ` Chuck Lever
  0 siblings, 1 reply; 17+ messages in thread
From: Anna Schumaker @ 2018-03-07 20:32 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma, Linux NFS Mailing List



On 03/07/2018 03:23 PM, Chuck Lever wrote:
> 
> 
>> On Mar 7, 2018, at 3:00 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
>>
>>
>>
>> On 03/06/2018 05:30 PM, Chuck Lever wrote:
>>>
>>>
>>>> On Mar 6, 2018, at 5:07 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>>
>>>>
>>>>> On Mar 6, 2018, at 5:02 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
>>>>>
>>>>> Hi Chuck,
>>>>>
>>>>> I'm seeing a huge performance hit with this patch.  I'm just running cthon over TCP, and it goes from finishing in 22 seconds to taking well over 5 minutes.  I seem to only see this on the read and write tests, such as basic test5 taking a minute to finish:
>>>>>
>>>>>  ./test5: read and write                                                                                       
>>>>>          wrote 1048576 byte file 10 times in 60.35 seconds (173737 bytes/sec)                                  
>>>>>          read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)                               
>>>>>          ./test5 ok. 
>>>>
>>>> OK. This looks like write is impacted, but this test doesn't
>>>> actually perform any reads on the wire. Try iozone with -I,
>>>> maybe? That would show results for both read and write.
>>>
>>> Hum.
>>>
>>> Stock v4.16-rc4:
>>>
>>> ./test5: read and write
>>> 	wrote 1048576 byte file 10 times in 0.2  seconds (350811642 bytes/sec)
>>> 	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>>> 	./test5 ok.
>>>
>>>
>>> v4.16-rc4 with my full set of patches:
>>>
>>> ./test5: read and write
>>> 	wrote 1048576 byte file 10 times in 0.2  seconds (354236681 bytes/sec)
>>> 	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>>> 	./test5 ok.
>>>
>>> I don't see a regression here. Let me know how it goes!
>>
>> I'm using rc4 too, so maybe it's something different in my setup?
> 
> What is your setup, exactly? I assume your client is maybe a
> single CPU guest, and the server is the same, and both are
> hosted on one system?

Client is single CPU kvm guest with 1 gig ram, server is also kvm on the same system with 2 cpus and 4 gigs ram.

> 
> 
>> Making this change fixes the issue for me:
>>
>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>> index a394b4635f8e..273847f7e455 100644
>> --- a/net/sunrpc/xprt.c
>> +++ b/net/sunrpc/xprt.c
>> @@ -987,8 +987,6 @@ bool xprt_prepare_transmit(struct rpc_task *task)
>>                task->tk_status = -EAGAIN;
>>                goto out_unlock;
>>        }
>> -       if (likely(!bc_prealloc(req)))
>> -               req->rq_xid = xprt_alloc_xid(xprt);
>>        ret = true;
>> out_unlock:
>>        spin_unlock_bh(&xprt->transport_lock);
>> @@ -1315,6 +1313,7 @@ void xprt_request_init(struct rpc_task *task)
>>        req->rq_task    = task;
>>        req->rq_xprt    = xprt;
>>        req->rq_buffer  = NULL;
>> +       req->rq_xid     = xprt_alloc_xid(xprt);
> 
> xprt_alloc_xid is just 
> 
> 1299 static inline __be32 xprt_alloc_xid(struct rpc_xprt *xprt)
> 1300 {
> 1301         return (__force __be32)xprt->xid++;
> 1302 }
> 
> I don't believe the new call site for xprt_request_init is
> adequately serialized for this to be safe in general. That's why
> I'm calling xprt_alloc_xid in xprt_prepare_transmit, behind the
> transport_lock.

This makes sense.

> 
> However, I think we need to explain why that helps your performance
> issue, because it doesn't make sense to me that this would make a
> difference. Why did you think to try this change? Is there evidence
> on the wire of XID re-use, for example?

I selectively reverted parts of your original patch until I found the parts that kill my performance.

> 
> 
>>        req->rq_connect_cookie = xprt->connect_cookie - 1;
>>        req->rq_bytes_sent = 0;
>>        req->rq_snd_buf.len = 0;
>>
>>
>> Anna
>>
>>>
>>>
>>>>> I haven't dug into this too deeply, but my best guess is that maybe it's due to adding a call to xprt_request_init() in net/sunrpc/clnt.c:call_reserveresult()
>>>>
>>>> It wasn't added there, it was moved from xprt_alloc_slot. So,
>>>> it's not new work per-RPC.
>>>>
>>>> Any additional information would be appreciated!
>>>>
>>>>
>>>>> Thoughts?
>>>>> Anna
>>>>>
>>>>> On 03/05/2018 03:13 PM, Chuck Lever wrote:
>>>>>> alloc_slot is a transport-specific op, but initializing an rpc_rqst
>>>>>> is common to all transports. Move initialization to common code in
>>>>>> preparation for adding a transport-specific alloc_slot to xprtrdma.
>>>>>>
>>>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>>>> ---
>>>>>> include/linux/sunrpc/xprt.h |    1 +
>>>>>> net/sunrpc/clnt.c           |    1 +
>>>>>> net/sunrpc/xprt.c           |   12 +++++++-----
>>>>>> 3 files changed, 9 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
>>>>>> index 5fea0fb..9784e28 100644
>>>>>> --- a/include/linux/sunrpc/xprt.h
>>>>>> +++ b/include/linux/sunrpc/xprt.h
>>>>>> @@ -324,6 +324,7 @@ struct xprt_class {
>>>>>> struct rpc_xprt		*xprt_create_transport(struct xprt_create *args);
>>>>>> void			xprt_connect(struct rpc_task *task);
>>>>>> void			xprt_reserve(struct rpc_task *task);
>>>>>> +void			xprt_request_init(struct rpc_task *task);
>>>>>> void			xprt_retry_reserve(struct rpc_task *task);
>>>>>> int			xprt_reserve_xprt(struct rpc_xprt *xprt, struct rpc_task *task);
>>>>>> int			xprt_reserve_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task);
>>>>>> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
>>>>>> index 6e432ec..226f558 100644
>>>>>> --- a/net/sunrpc/clnt.c
>>>>>> +++ b/net/sunrpc/clnt.c
>>>>>> @@ -1546,6 +1546,7 @@ void rpc_force_rebind(struct rpc_clnt *clnt)
>>>>>> 	task->tk_status = 0;
>>>>>> 	if (status >= 0) {
>>>>>> 		if (task->tk_rqstp) {
>>>>>> +			xprt_request_init(task);
>>>>>> 			task->tk_action = call_refresh;
>>>>>> 			return;
>>>>>> 		}
>>>>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>>>>> index 70f0050..a394b46 100644
>>>>>> --- a/net/sunrpc/xprt.c
>>>>>> +++ b/net/sunrpc/xprt.c
>>>>>> @@ -66,7 +66,7 @@
>>>>>> * Local functions
>>>>>> */
>>>>>> static void	 xprt_init(struct rpc_xprt *xprt, struct net *net);
>>>>>> -static void	xprt_request_init(struct rpc_task *, struct rpc_xprt *);
>>>>>> +static __be32	xprt_alloc_xid(struct rpc_xprt *xprt);
>>>>>> static void	xprt_connect_status(struct rpc_task *task);
>>>>>> static int      __xprt_get_cong(struct rpc_xprt *, struct rpc_task *);
>>>>>> static void     __xprt_put_cong(struct rpc_xprt *, struct rpc_rqst *);
>>>>>> @@ -987,6 +987,8 @@ bool xprt_prepare_transmit(struct rpc_task *task)
>>>>>> 		task->tk_status = -EAGAIN;
>>>>>> 		goto out_unlock;
>>>>>> 	}
>>>>>> +	if (likely(!bc_prealloc(req)))
>>>>>> +		req->rq_xid = xprt_alloc_xid(xprt);
>>>>>> 	ret = true;
>>>>>> out_unlock:
>>>>>> 	spin_unlock_bh(&xprt->transport_lock);
>>>>>> @@ -1163,10 +1165,10 @@ void xprt_alloc_slot(struct rpc_xprt *xprt, struct rpc_task *task)
>>>>>> out_init_req:
>>>>>> 	xprt->stat.max_slots = max_t(unsigned int, xprt->stat.max_slots,
>>>>>> 				     xprt->num_reqs);
>>>>>> +	spin_unlock(&xprt->reserve_lock);
>>>>>> +
>>>>>> 	task->tk_status = 0;
>>>>>> 	task->tk_rqstp = req;
>>>>>> -	xprt_request_init(task, xprt);
>>>>>> -	spin_unlock(&xprt->reserve_lock);
>>>>>> }
>>>>>> EXPORT_SYMBOL_GPL(xprt_alloc_slot);
>>>>>>
>>>>>> @@ -1303,8 +1305,9 @@ static inline void xprt_init_xid(struct rpc_xprt *xprt)
>>>>>> 	xprt->xid = prandom_u32();
>>>>>> }
>>>>>>
>>>>>> -static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>>>>>> +void xprt_request_init(struct rpc_task *task)
>>>>>> {
>>>>>> +	struct rpc_xprt *xprt = task->tk_xprt;
>>>>>> 	struct rpc_rqst	*req = task->tk_rqstp;
>>>>>>
>>>>>> 	INIT_LIST_HEAD(&req->rq_list);
>>>>>> @@ -1312,7 +1315,6 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt)
>>>>>> 	req->rq_task	= task;
>>>>>> 	req->rq_xprt    = xprt;
>>>>>> 	req->rq_buffer  = NULL;
>>>>>> -	req->rq_xid     = xprt_alloc_xid(xprt);
>>>>>> 	req->rq_connect_cookie = xprt->connect_cookie - 1;
>>>>>> 	req->rq_bytes_sent = 0;
>>>>>> 	req->rq_snd_buf.len = 0;
>>>>>>
>>>>
>>>> --
>>>> Chuck Lever
>>>>
>>>>
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>> --
>>> Chuck Lever
> 
> --
> Chuck Lever
> 
> 
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  2018-03-07 20:32             ` Anna Schumaker
@ 2018-03-07 20:44               ` Chuck Lever
  0 siblings, 0 replies; 17+ messages in thread
From: Chuck Lever @ 2018-03-07 20:44 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: linux-rdma, Linux NFS Mailing List



> On Mar 7, 2018, at 3:32 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
> 
> 
> 
> On 03/07/2018 03:23 PM, Chuck Lever wrote:
>> 
>> 
>>> On Mar 7, 2018, at 3:00 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
>>> 
>>> 
>>> 
>>> On 03/06/2018 05:30 PM, Chuck Lever wrote:
>>>> 
>>>> 
>>>>> On Mar 6, 2018, at 5:07 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Mar 6, 2018, at 5:02 PM, Anna Schumaker <Anna.Schumaker@netapp.com> wrote:
>>>>>> 
>>>>>> Hi Chuck,
>>>>>> 
>>>>>> I'm seeing a huge performance hit with this patch.  I'm just running cthon over TCP, and it goes from finishing in 22 seconds to taking well over 5 minutes.  I seem to only see this on the read and write tests, such as basic test5 taking a minute to finish:
>>>>>> 
>>>>>> ./test5: read and write
>>>>>>         wrote 1048576 byte file 10 times in 60.35 seconds (173737 bytes/sec)
>>>>>>         read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>>>>>>         ./test5 ok.
>>>>> 
>>>>> OK. This looks like write is impacted, but this test doesn't
>>>>> actually perform any reads on the wire. Try iozone with -I,
>>>>> maybe? That would show results for both read and write.
>>>> 
>>>> Hum.
>>>> 
>>>> Stock v4.16-rc4:
>>>> 
>>>> ./test5: read and write
>>>> 	wrote 1048576 byte file 10 times in 0.2  seconds (350811642 bytes/sec)
>>>> 	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>>>> 	./test5 ok.
>>>> 
>>>> 
>>>> v4.16-rc4 with my full set of patches:
>>>> 
>>>> ./test5: read and write
>>>> 	wrote 1048576 byte file 10 times in 0.2  seconds (354236681 bytes/sec)
>>>> 	read 1048576 byte file 10 times in 0.0  seconds (-2147483648 bytes/sec)
>>>> 	./test5 ok.
>>>> 
>>>> I don't see a regression here. Let me know how it goes!
>>> 
>>> I'm using rc4 too, so maybe it's something different in my setup?
>> 
>> What is your setup, exactly? I assume your client is maybe a
>> single CPU guest, and the server is the same, and both are
>> hosted on one system?
> 
> Client is single CPU kvm guest with 1 gig ram, server is also kvm on the same system with 2 cpus and 4 gigs ram.
> 
>> 
>> 
>>> Making this change fixes the issue for me:
>>> 
>>> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
>>> index a394b4635f8e..273847f7e455 100644
>>> --- a/net/sunrpc/xprt.c
>>> +++ b/net/sunrpc/xprt.c
>>> @@ -987,8 +987,6 @@ bool xprt_prepare_transmit(struct rpc_task *task)
>>>               task->tk_status = -EAGAIN;
>>>               goto out_unlock;
>>>       }
>>> -       if (likely(!bc_prealloc(req)))
>>> -               req->rq_xid = xprt_alloc_xid(xprt);
>>>       ret = true;
>>> out_unlock:
>>>       spin_unlock_bh(&xprt->transport_lock);
>>> @@ -1315,6 +1313,7 @@ void xprt_request_init(struct rpc_task *task)
>>>       req->rq_task    = task;
>>>       req->rq_xprt    = xprt;
>>>       req->rq_buffer  = NULL;
>>> +       req->rq_xid     = xprt_alloc_xid(xprt);
>> 
>> xprt_alloc_xid is just
>> 
>> 1299 static inline __be32 xprt_alloc_xid(struct rpc_xprt *xprt)
>> 1300 {
>> 1301         return (__force __be32)xprt->xid++;
>> 1302 }
>> 
>> I don't believe the new call site for xprt_request_init is
>> adequately serialized for this to be safe in general. That's why
>> I'm calling xprt_alloc_xid in xprt_prepare_transmit, behind the
>> transport_lock.
> 
> This makes sense.
> 
>> 
>> However, I think we need to explain why that helps your performance
>> issue, because it doesn't make sense to me that this would make a
>> difference. Why did you think to try this change? Is there evidence
>> on the wire of XID re-use, for example?
> 
> I selectively reverted parts of your original patch until I found the parts that kill my performance.

Fair enough, but that doesn't explain why your change helps. :-)
Since I can't reproduce the regression here, try this:

0. Build a kernel with the regression

1. On your client:  # trace-cmd record -e sunrpc:* -e rpcrdma:*

2. Run the cthon04 basic tests

3. ^C the trace-cmd

4. Rename trace.dat

5. Repeat steps 1-4 with stock v4.16-rc4

6. tar and gzip the .dat files and send them to me


>>>       req->rq_connect_cookie = xprt->connect_cookie - 1;
>>>       req->rq_bytes_sent = 0;
>>>       req->rq_snd_buf.len = 0;
>>> 
>>> 
>>> Anna
>>> 
>>>> 
>>>> 
>>>>>> I haven't dug into this too deeply, but my best guess is that maybe it's due to adding a call to xprt_request_init() in net/sunrpc/clnt.c:call_reserveresult()
>>>>> 
>>>>> It wasn't added there, it was moved from xprt_alloc_slot. So,
>>>>> it's not new work per-RPC.
>>>>> 
>>>>> Any additional information would be appreciated!
>>>>> 
>>>>> 
>>>>>> Thoughts?
>>>>>> Anna

--
Chuck Lever




^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-03-07 20:59 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-05 20:12 [PATCH 0/9] Second round of v4.17 NFS/RDMA client patches Chuck Lever
2018-03-05 20:12 ` [PATCH 1/9] SUNRPC: Move xprt_update_rtt callsite Chuck Lever
2018-03-05 20:13 ` [PATCH 2/9] SUNRPC: Make RTT measurement more precise (Receive) Chuck Lever
2018-03-05 20:13 ` [PATCH 3/9] SUNRPC: Make RTT measurement more precise (Send) Chuck Lever
2018-03-05 20:13 ` [PATCH 4/9] SUNRPC: Make num_reqs a non-atomic integer Chuck Lever
2018-03-05 20:13 ` [PATCH 5/9] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock Chuck Lever
2018-03-06 22:02   ` Anna Schumaker
2018-03-06 22:07     ` Chuck Lever
2018-03-06 22:30       ` Chuck Lever
2018-03-07 20:00         ` Anna Schumaker
2018-03-07 20:23           ` Chuck Lever
2018-03-07 20:32             ` Anna Schumaker
2018-03-07 20:44               ` Chuck Lever
2018-03-05 20:13 ` [PATCH 6/9] SUNRPC: Add a ->free_slot transport callout Chuck Lever
2018-03-05 20:13 ` [PATCH 7/9] xprtrdma: Introduce ->alloc_slot call-out for xprtrdma Chuck Lever
2018-03-05 20:13 ` [PATCH 8/9] xprtrdma: Make rpc_rqst part of rpcrdma_req Chuck Lever
2018-03-05 20:13 ` [PATCH 9/9] xprtrdma: Allocate rpcrdma_reps during Receive completion Chuck Lever
