Linux-NFS Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v1 0/4] Fix more issues in new connect logic
@ 2020-06-27 16:34 Chuck Lever
  2020-06-27 16:35 ` [PATCH v1 1/4] xprtrdma: Fix double-free in rpcrdma_ep_create() Chuck Lever
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Chuck Lever @ 2020-06-27 16:34 UTC (permalink / raw)
  To: linux-nfs, linux-rdma; +Cc: dan

This series addresses several more flaws in the recent overhaul of
the client's RPC/RDMA connect logic. More testing is called for,
but these are ready for review. They apply on the fixes that were
pulled into Linus' tree yesterday.

See also the "cel-testing" topic branch in my kernel repo:

  git://git.linux-nfs.org/projects/cel/cel-2.6.git

---

Chuck Lever (4):
      xprtrdma: Fix double-free in rpcrdma_ep_create()
      xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()
      xprtrdma: Fix return code from rpcrdma_xprt_connect()
      xprtrdma: Fix handling of connect errors


 net/sunrpc/xprtrdma/transport.c |  5 +++++
 net/sunrpc/xprtrdma/verbs.c     | 28 ++++++++++++++--------------
 2 files changed, 19 insertions(+), 14 deletions(-)

--
Chuck Lever

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v1 1/4] xprtrdma: Fix double-free in rpcrdma_ep_create()
  2020-06-27 16:34 [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever
@ 2020-06-27 16:35 ` Chuck Lever
  2020-06-27 16:35 ` [PATCH v1 2/4] xprtrdma: Fix recursion into rpcrdma_xprt_disconnect() Chuck Lever
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Chuck Lever @ 2020-06-27 16:35 UTC (permalink / raw)
  To: linux-nfs, linux-rdma; +Cc: dan

In the error paths, there's no need to call kfree(ep) after calling
rpcrdma_ep_put(ep).

Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/verbs.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 2198c8ec8dff..e4c0df7c7d78 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -406,8 +406,8 @@ static int rpcrdma_ep_create(struct rpcrdma_xprt *r_xprt)
 
 	id = rpcrdma_create_id(r_xprt, ep);
 	if (IS_ERR(id)) {
-		rc = PTR_ERR(id);
-		goto out_free;
+		kfree(ep);
+		return PTR_ERR(id);
 	}
 	__module_get(THIS_MODULE);
 	device = id->device;
@@ -506,9 +506,6 @@ static int rpcrdma_ep_create(struct rpcrdma_xprt *r_xprt)
 out_destroy:
 	rpcrdma_ep_put(ep);
 	rdma_destroy_id(id);
-out_free:
-	kfree(ep);
-	r_xprt->rx_ep = NULL;
 	return rc;
 }
 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v1 2/4] xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()
  2020-06-27 16:34 [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever
  2020-06-27 16:35 ` [PATCH v1 1/4] xprtrdma: Fix double-free in rpcrdma_ep_create() Chuck Lever
@ 2020-06-27 16:35 ` Chuck Lever
  2020-06-27 16:35 ` [PATCH v1 3/4] xprtrdma: Fix return code from rpcrdma_xprt_connect() Chuck Lever
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Chuck Lever @ 2020-06-27 16:35 UTC (permalink / raw)
  To: linux-nfs, linux-rdma; +Cc: dan

Both Dan and I have observed two processes invoking
rpcrdma_xprt_disconnect() concurrently. In my case:

1. The connect worker invokes rpcrdma_xprt_disconnect(), which
   drains the QP and waits for the final completion
2. This causes the newly posted Receive to flush and invoke
   xprt_force_disconnect()
3. xprt_force_disconnect() sets CLOSE_WAIT and wakes up the RPC task
   that is holding the transport lock
4. The RPC task invokes xprt_connect(), which calls ->ops->close
5. xprt_rdma_close() invokes rpcrdma_xprt_disconnect(), which tries
   to destroy the QP.

Deadlock.

To prevent xprt_force_disconnect() from waking anything, handle the
clean up after a failed connection attempt in the xprt's sndtask.

The retry loop is removed from rpcrdma_xprt_connect() to ensure
that the newly allocated ep and id are properly released before
a REJECTED connection attempt can be retried.

Reported-by: Dan Aloni <dan@kernelim.com>
Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/transport.c |    5 +++++
 net/sunrpc/xprtrdma/verbs.c     |   10 ++--------
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 14165b673b20..053c8ab1265a 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -249,6 +249,11 @@ xprt_rdma_connect_worker(struct work_struct *work)
 					   xprt->stat.connect_start;
 		xprt_set_connected(xprt);
 		rc = -EAGAIN;
+	} else {
+		/* Force a call to xprt_rdma_close to clean up */
+		spin_lock(&xprt->transport_lock);
+		set_bit(XPRT_CLOSE_WAIT, &xprt->state);
+		spin_unlock(&xprt->transport_lock);
 	}
 	xprt_wake_pending_tasks(xprt, rc);
 }
diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index e4c0df7c7d78..641a3ca0fc8f 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -290,7 +290,7 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
 			sap, rdma_reject_msg(id, event->status));
 		ep->re_connect_status = -ECONNREFUSED;
 		if (event->status == IB_CM_REJ_STALE_CONN)
-			ep->re_connect_status = -EAGAIN;
+			ep->re_connect_status = -ENOTCONN;
 		goto disconnected;
 	case RDMA_CM_EVENT_DISCONNECTED:
 		ep->re_connect_status = -ECONNABORTED;
@@ -521,8 +521,6 @@ int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt)
 	struct rpcrdma_ep *ep;
 	int rc;
 
-retry:
-	rpcrdma_xprt_disconnect(r_xprt);
 	rc = rpcrdma_ep_create(r_xprt);
 	if (rc)
 		return rc;
@@ -550,17 +548,13 @@ int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt)
 	wait_event_interruptible(ep->re_connect_wait,
 				 ep->re_connect_status != 0);
 	if (ep->re_connect_status <= 0) {
-		if (ep->re_connect_status == -EAGAIN)
-			goto retry;
 		rc = ep->re_connect_status;
 		goto out;
 	}
 
 	rc = rpcrdma_reqs_setup(r_xprt);
-	if (rc) {
-		rpcrdma_xprt_disconnect(r_xprt);
+	if (rc)
 		goto out;
-	}
 	rpcrdma_mrs_create(r_xprt);
 
 out:


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v1 3/4] xprtrdma: Fix return code from rpcrdma_xprt_connect()
  2020-06-27 16:34 [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever
  2020-06-27 16:35 ` [PATCH v1 1/4] xprtrdma: Fix double-free in rpcrdma_ep_create() Chuck Lever
  2020-06-27 16:35 ` [PATCH v1 2/4] xprtrdma: Fix recursion into rpcrdma_xprt_disconnect() Chuck Lever
@ 2020-06-27 16:35 ` Chuck Lever
  2020-06-27 16:35 ` [PATCH v1 4/4] xprtrdma: Fix handling of connect errors Chuck Lever
  2020-07-06 18:41 ` [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever
  4 siblings, 0 replies; 6+ messages in thread
From: Chuck Lever @ 2020-06-27 16:35 UTC (permalink / raw)
  To: linux-nfs, linux-rdma; +Cc: dan

I noticed that when rpcrdma_xprt_connect() returns -ENOMEM,
instead of retrying the connect, the RPC client kills the
RPC task that requested the connection. We want a retry
here.

Fixes: cb586decbb88 ("xprtrdma: Make sendctx queue lifetime the same as connection lifetime")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/verbs.c |   16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 641a3ca0fc8f..13d671dccfd8 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -400,7 +400,7 @@ static int rpcrdma_ep_create(struct rpcrdma_xprt *r_xprt)
 
 	ep = kzalloc(sizeof(*ep), GFP_NOFS);
 	if (!ep)
-		return -EAGAIN;
+		return -ENOTCONN;
 	ep->re_xprt = &r_xprt->rx_xprt;
 	kref_init(&ep->re_kref);
 
@@ -535,10 +535,6 @@ int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt)
 	rpcrdma_ep_get(ep);
 	rpcrdma_post_recvs(r_xprt, true);
 
-	rc = rpcrdma_sendctxs_create(r_xprt);
-	if (rc)
-		goto out;
-
 	rc = rdma_connect(ep->re_id, &ep->re_remote_cma);
 	if (rc)
 		goto out;
@@ -552,9 +548,17 @@ int rpcrdma_xprt_connect(struct rpcrdma_xprt *r_xprt)
 		goto out;
 	}
 
+	rc = rpcrdma_sendctxs_create(r_xprt);
+	if (rc) {
+		rc = -ENOTCONN;
+		goto out;
+	}
+
 	rc = rpcrdma_reqs_setup(r_xprt);
-	if (rc)
+	if (rc) {
+		rc = -ENOTCONN;
 		goto out;
+	}
 	rpcrdma_mrs_create(r_xprt);
 
 out:


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v1 4/4] xprtrdma: Fix handling of connect errors
  2020-06-27 16:34 [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever
                   ` (2 preceding siblings ...)
  2020-06-27 16:35 ` [PATCH v1 3/4] xprtrdma: Fix return code from rpcrdma_xprt_connect() Chuck Lever
@ 2020-06-27 16:35 ` Chuck Lever
  2020-07-06 18:41 ` [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever
  4 siblings, 0 replies; 6+ messages in thread
From: Chuck Lever @ 2020-06-27 16:35 UTC (permalink / raw)
  To: linux-nfs, linux-rdma; +Cc: dan

Ensure that the connect worker is awoken if an attempt to establish
a connection is unsuccessful. Otherwise the worker waits forever
and the transport workload hangs.

Connect errors should not attempt to destroy the ep, since the
connect worker continues to use it after the handler runs, so these
errors are now handled independently of DISCONNECTED events.

Reported-by: Dan Aloni <dan@kernelim.com>
Fixes: e28ce90083f0 ("xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 net/sunrpc/xprtrdma/verbs.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/sunrpc/xprtrdma/verbs.c b/net/sunrpc/xprtrdma/verbs.c
index 13d671dccfd8..75c646743df3 100644
--- a/net/sunrpc/xprtrdma/verbs.c
+++ b/net/sunrpc/xprtrdma/verbs.c
@@ -281,17 +281,19 @@ rpcrdma_cm_event_handler(struct rdma_cm_id *id, struct rdma_cm_event *event)
 		break;
 	case RDMA_CM_EVENT_CONNECT_ERROR:
 		ep->re_connect_status = -ENOTCONN;
-		goto disconnected;
+		goto wake_connect_worker;
 	case RDMA_CM_EVENT_UNREACHABLE:
 		ep->re_connect_status = -ENETUNREACH;
-		goto disconnected;
+		goto wake_connect_worker;
 	case RDMA_CM_EVENT_REJECTED:
 		dprintk("rpcrdma: connection to %pISpc rejected: %s\n",
 			sap, rdma_reject_msg(id, event->status));
 		ep->re_connect_status = -ECONNREFUSED;
 		if (event->status == IB_CM_REJ_STALE_CONN)
 			ep->re_connect_status = -ENOTCONN;
-		goto disconnected;
+wake_connect_worker:
+		wake_up_all(&ep->re_connect_wait);
+		return 0;
 	case RDMA_CM_EVENT_DISCONNECTED:
 		ep->re_connect_status = -ECONNABORTED;
 disconnected:


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v1 0/4] Fix more issues in new connect logic
  2020-06-27 16:34 [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever
                   ` (3 preceding siblings ...)
  2020-06-27 16:35 ` [PATCH v1 4/4] xprtrdma: Fix handling of connect errors Chuck Lever
@ 2020-07-06 18:41 ` Chuck Lever
  4 siblings, 0 replies; 6+ messages in thread
From: Chuck Lever @ 2020-07-06 18:41 UTC (permalink / raw)
  To: Anna Schumaker; +Cc: Dan Aloni, linux-rdma, Linux NFS Mailing List

Anna, I haven't found any additional issues with this series that
can't wait until 5.9 or later. Can you see that it gets into 5.8-rc ?
Thanks!


> On Jun 27, 2020, at 12:34 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
> This series addresses several more flaws in the recent overhaul of
> the client's RPC/RDMA connect logic. More testing is called for,
> but these are ready for review. They apply on the fixes that were
> pulled into Linus' tree yesterday.
> 
> See also the "cel-testing" topic branch in my kernel repo:
> 
>  git://git.linux-nfs.org/projects/cel/cel-2.6.git
> 
> ---
> 
> Chuck Lever (4):
>      xprtrdma: Fix double-free in rpcrdma_ep_create()
>      xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()
>      xprtrdma: Fix return code from rpcrdma_xprt_connect()
>      xprtrdma: Fix handling of connect errors
> 
> 
> net/sunrpc/xprtrdma/transport.c |  5 +++++
> net/sunrpc/xprtrdma/verbs.c     | 28 ++++++++++++++--------------
> 2 files changed, 19 insertions(+), 14 deletions(-)
> 
> --
> Chuck Lever

--
Chuck Lever




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, back to index

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-27 16:34 [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever
2020-06-27 16:35 ` [PATCH v1 1/4] xprtrdma: Fix double-free in rpcrdma_ep_create() Chuck Lever
2020-06-27 16:35 ` [PATCH v1 2/4] xprtrdma: Fix recursion into rpcrdma_xprt_disconnect() Chuck Lever
2020-06-27 16:35 ` [PATCH v1 3/4] xprtrdma: Fix return code from rpcrdma_xprt_connect() Chuck Lever
2020-06-27 16:35 ` [PATCH v1 4/4] xprtrdma: Fix handling of connect errors Chuck Lever
2020-07-06 18:41 ` [PATCH v1 0/4] Fix more issues in new connect logic Chuck Lever

Linux-NFS Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-nfs/0 linux-nfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-nfs linux-nfs/ https://lore.kernel.org/linux-nfs \
		linux-nfs@vger.kernel.org
	public-inbox-index linux-nfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-nfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git