* [PATCH 00/11] For 2.6.33
@ 2009-11-05 18:22 Chuck Lever
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
  0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:22 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

Hi Trond-

Here is what I'd like to submit for 2.6.33.  These have had some
testing here, but not as much as I would like.  Maybe linux-next would
be appropriate.

---

Chuck Lever (11):
      SUNRPC: soft connect semantics for UDP
      SUNRPC: Use soft connect semantics when performing RPC ping
      SUNRPC: Use soft connects for autobinding over TCP
      SUNRPC: Use TCP for local rpcbind upcalls
      SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
      SUNRPC: Simplify synopsis of rpcb_local_clnt()
      SUNRPC: Allow RPCs to fail quickly if the server is unreachable
      SUNRPC: Check explicitly for tk_status == 0 in call_transmit_status()
      NFS: Revert default r/wsize behavior
      NFS: Display compressed (shorthand) IPv6 in /proc/mounts
      SUNRPC: Display compressed (shorthand) IPv6 presentation addresses


 fs/nfs/super.c               |    4 --
 include/linux/sunrpc/sched.h |    2 +
 net/sunrpc/addr.c            |   10 -----
 net/sunrpc/clnt.c            |   54 ++++++++++++++++++++++---
 net/sunrpc/rpcb_clnt.c       |   89 +++++++++++++++++++++++++++++++-----------
 net/sunrpc/sunrpc_syms.c     |    3 +
 net/sunrpc/xprtsock.c        |    2 -
 7 files changed, 120 insertions(+), 44 deletions(-)

-- 
Chuck Lever <chuck.lever@oracle.com>


* [PATCH 01/11] SUNRPC: Display compressed (shorthand) IPv6 presentation addresses
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
@ 2009-11-05 18:22   ` Chuck Lever
  2009-11-05 18:22   ` [PATCH 02/11] NFS: Display compressed (shorthand) IPv6 in /proc/mounts Chuck Lever
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:22 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

Recent changes to snprintf() introduced the %pI6c formatter, which can
display an IPv6 address with standard shorthanding.  Using a
shorthanded address can save us a few bytes of memory for each stored
presentation address, or a few bytes on the wire when sending these in
a universal address.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/addr.c |   10 +---------
 1 files changed, 1 insertions(+), 9 deletions(-)

diff --git a/net/sunrpc/addr.c b/net/sunrpc/addr.c
index 22e8fd8..d972ce4 100644
--- a/net/sunrpc/addr.c
+++ b/net/sunrpc/addr.c
@@ -55,16 +55,8 @@ static size_t rpc_ntop6_noscopeid(const struct sockaddr *sap,
 
 	/*
 	 * RFC 4291, Section 2.2.1
-	 *
-	 * To keep the result as short as possible, especially
-	 * since we don't shorthand, we don't want leading zeros
-	 * in each halfword, so avoid %pI6.
 	 */
-	return snprintf(buf, buflen, "%x:%x:%x:%x:%x:%x:%x:%x",
-		ntohs(addr->s6_addr16[0]), ntohs(addr->s6_addr16[1]),
-		ntohs(addr->s6_addr16[2]), ntohs(addr->s6_addr16[3]),
-		ntohs(addr->s6_addr16[4]), ntohs(addr->s6_addr16[5]),
-		ntohs(addr->s6_addr16[6]), ntohs(addr->s6_addr16[7]));
+	return snprintf(buf, buflen, "%pI6c", addr);
 }
 
 static size_t rpc_ntop6(const struct sockaddr *sap,
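
The saving described in this patch can be illustrated in user space. The sketch below is not kernel code: `inet_ntop()` stands in for the kernel's `%pI6c` formatter (an assumption for illustration only; they are separate implementations), and the helper names `format_ipv6_full()` and `format_ipv6_short()` are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>

/* Old style: eight hex halfwords, no leading zeros, no "::" compression. */
int format_ipv6_full(const struct in6_addr *addr, char *buf, size_t buflen)
{
	unsigned int hw[8];

	/* Assemble each 16-bit group from its two network-order bytes. */
	for (int i = 0; i < 8; i++)
		hw[i] = ((unsigned int)addr->s6_addr[2 * i] << 8) |
			addr->s6_addr[2 * i + 1];

	return snprintf(buf, buflen, "%x:%x:%x:%x:%x:%x:%x:%x",
			hw[0], hw[1], hw[2], hw[3],
			hw[4], hw[5], hw[6], hw[7]);
}

/* Shorthand style: inet_ntop() collapses a run of zero groups into "::". */
int format_ipv6_short(const struct in6_addr *addr, char *buf, size_t buflen)
{
	return inet_ntop(AF_INET6, addr, buf, (socklen_t)buflen) ? 0 : -1;
}
```

For the address 2001:db8:0:0:0:0:0:1, the first helper writes all eight groups, while the second writes 2001:db8::1, which is ten bytes shorter.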



* [PATCH 02/11] NFS: Display compressed (shorthand) IPv6 in /proc/mounts
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
  2009-11-05 18:22   ` [PATCH 01/11] SUNRPC: Display compressed (shorthand) IPv6 presentation addresses Chuck Lever
@ 2009-11-05 18:22   ` Chuck Lever
  2009-11-05 18:22   ` [PATCH 03/11] NFS: Revert default r/wsize behavior Chuck Lever
                     ` (8 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:22 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

Recent changes to snprintf() introduced the %pI6c formatter, which can
display an IPv6 address with standard shorthanding.  Use this new
formatter when displaying IPv6 server addresses in /proc/mounts.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 fs/nfs/super.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 90be551..968c7ac 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -505,7 +505,7 @@ static void nfs_show_mountd_options(struct seq_file *m, struct nfs_server *nfss,
 	}
 	case AF_INET6: {
 		struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)sap;
-		seq_printf(m, ",mountaddr=%pI6", &sin6->sin6_addr);
+		seq_printf(m, ",mountaddr=%pI6c", &sin6->sin6_addr);
 		break;
 	}
 	default:



* [PATCH 03/11] NFS: Revert default r/wsize behavior
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
  2009-11-05 18:22   ` [PATCH 01/11] SUNRPC: Display compressed (shorthand) IPv6 presentation addresses Chuck Lever
  2009-11-05 18:22   ` [PATCH 02/11] NFS: Display compressed (shorthand) IPv6 in /proc/mounts Chuck Lever
@ 2009-11-05 18:22   ` Chuck Lever
  2009-11-05 18:22   ` [PATCH 04/11] SUNRPC: Check explicitly for tk_status == 0 in call_transmit_status() Chuck Lever
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:22 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

When the "rsize=" or "wsize=" mount options are not specified,
text-based mounts have slightly different behavior than legacy binary
mounts.  Text-based mounts use the smaller of the server's maximum
and the client's maximum, but binary mounts use the smaller of the
server's _preferred_ size and the client's maximum.

This difference is actually pretty subtle.  Most servers advertise
the same value as their maximum and their preferred transfer size, so
the end result is the same in most cases.

The reason for this difference is that for text-based mounts, if
r/wsize are not specified, they are set to the largest value supported
by the client.  For legacy mounts, the values are set to zero if these
options are not specified.

nfs_server_set_fsinfo() can negotiate the transfer size defaults
correctly in any case.  There's no need to specify any particular
value as default in the text-based option parsing logic.

Note that nfs4 doesn't use nfs_server_set_fsinfo(), but the mount.nfs4
command does set rsize and wsize to 0 if the user didn't specify these
options.  So, make the same change for text-based NFSv4 mounts.

Thanks to James Pearson <james-p-5Ol4pYTxKWu0ML75eksnrtBPR1lH4CV8@public.gmane.org> for reporting and
diagnosing the problem.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 fs/nfs/super.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 968c7ac..7eaa41e 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -734,8 +734,6 @@ static struct nfs_parsed_mount_data *nfs_alloc_parsed_mount_data(unsigned int ve
 
 	data = kzalloc(sizeof(*data), GFP_KERNEL);
 	if (data) {
-		data->rsize		= NFS_MAX_FILE_IO_SIZE;
-		data->wsize		= NFS_MAX_FILE_IO_SIZE;
 		data->acregmin		= NFS_DEF_ACREGMIN;
 		data->acregmax		= NFS_DEF_ACREGMAX;
 		data->acdirmin		= NFS_DEF_ACDIRMIN;
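
The difference between the two defaults can be modeled with a small user-space sketch. It is a simplification under stated assumptions, not the logic of nfs_server_set_fsinfo(): `negotiate_io_size()`, `struct server_limits` and `CLIENT_MAX_IO_SIZE` are hypothetical names, and the 1MB client limit is an illustrative value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical client limit for this sketch (stands in for the client maximum). */
#define CLIENT_MAX_IO_SIZE (1024U * 1024U)

struct server_limits {
	unsigned int preferred;	/* server's preferred transfer size */
	unsigned int maximum;	/* server's maximum transfer size */
};

/*
 * requested == 0 means "rsize/wsize was not specified".  In that case
 * start from the server's preferred size; otherwise start from the
 * requested size.  Either way, clamp to both maxima.
 */
unsigned int negotiate_io_size(unsigned int requested,
			       const struct server_limits *srv)
{
	unsigned int size;

	assert(srv != NULL);

	size = requested ? requested : srv->preferred;
	if (size > srv->maximum)
		size = srv->maximum;
	if (size > CLIENT_MAX_IO_SIZE)
		size = CLIENT_MAX_IO_SIZE;
	return size;
}
```

With a server that prefers 32KB but allows 1MB, passing 0 (the behavior after this patch) yields 32KB, whereas passing the client maximum (the old text-based default) yields 1MB.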



* [PATCH 04/11] SUNRPC: Check explicitly for tk_status == 0 in call_transmit_status()
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
                     ` (2 preceding siblings ...)
  2009-11-05 18:22   ` [PATCH 03/11] NFS: Revert default r/wsize behavior Chuck Lever
@ 2009-11-05 18:22   ` Chuck Lever
  2009-11-05 18:23   ` [PATCH 05/11] SUNRPC: Allow RPCs to fail quickly if the server is unreachable Chuck Lever
                     ` (6 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:22 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

The success case, where task->tk_status == 0, is by far the most
frequent case in call_transmit_status().

The default: arm of the switch statement in call_transmit_status()
handles the 0 case.  default: was moved close to the top of the switch
statement in call_transmit_status() under the theory that the compiler
places object code for the earliest arms of a switch statement first,
making the CPU do less work.

The default: arm of a switch statement, however, is executed only
after all the other cases have been checked.  Even if the compiler
rearranges the object code, the default: arm is the "last resort",
meaning all of the other cases have been explicitly exhausted.  That
makes the current arrangement about as inefficient as it gets for the
common case.

To fix this, add an explicit check for zero before the switch
statement.  That forces the compiler to do the zero check first, no
matter what optimizations it might try to do to the switch statement.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/clnt.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 38829e2..7bcd931 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1180,10 +1180,22 @@ static void
 call_transmit_status(struct rpc_task *task)
 {
 	task->tk_action = call_status;
+
+	/*
+	 * Common case: success.  Force the compiler to put this
+	 * test first.
+	 */
+	if (task->tk_status == 0) {
+		xprt_end_transmit(task);
+		rpc_task_force_reencode(task);
+		return;
+	}
+
 	switch (task->tk_status) {
 	case -EAGAIN:
 		break;
 	default:
+		dprint_status(task);
 		xprt_end_transmit(task);
 		/*
 		 * Special cases: if we've been waiting on the



* [PATCH 05/11] SUNRPC: Allow RPCs to fail quickly if the server is unreachable
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
                     ` (3 preceding siblings ...)
  2009-11-05 18:22   ` [PATCH 04/11] SUNRPC: Check explicitly for tk_status == 0 in call_transmit_status() Chuck Lever
@ 2009-11-05 18:23   ` Chuck Lever
  2009-11-05 18:23   ` [PATCH 06/11] SUNRPC: Simplify synopsis of rpcb_local_clnt() Chuck Lever
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:23 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

The kernel sometimes makes RPC calls to services that aren't running.
Because the kernel's RPC client always assumes the hard retry semantic
when reconnecting a connection-oriented RPC transport, the underlying
reconnect logic takes a long while to time out, even though the remote
may have responded immediately with ECONNREFUSED.

In certain cases, like upcalls to our local rpcbind daemon, or for NFS
mount requests, we'd like the kernel to fail immediately if the remote
service isn't reachable.  This allows another transport to be tried
immediately, or the pending request can be abandoned quickly.

Introduce a per-request flag which controls how call_transmit_status()
behaves when request transmission fails because the server cannot be
reached.

We don't want soft connection semantics to apply to other errors.  The
default case of the switch statement in call_transmit_status() no
longer falls through; the fall through code is copied to the default
case, and a "break;" is added.

The transport's connection re-establishment timeout is also ignored for
such requests.  We want the request to fail immediately, so the
reconnect delay is skipped.  Additionally, we don't want a connect
failure here to further increase the reconnect timeout value, since
this request will not be retried.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 include/linux/sunrpc/sched.h |    2 ++
 net/sunrpc/clnt.c            |   11 +++++++++--
 net/sunrpc/xprtsock.c        |    2 +-
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h
index 4010977..1906782 100644
--- a/include/linux/sunrpc/sched.h
+++ b/include/linux/sunrpc/sched.h
@@ -130,12 +130,14 @@ struct rpc_task_setup {
 #define RPC_TASK_DYNAMIC	0x0080		/* task was kmalloc'ed */
 #define RPC_TASK_KILLED		0x0100		/* task was killed */
 #define RPC_TASK_SOFT		0x0200		/* Use soft timeouts */
+#define RPC_TASK_SOFTCONN	0x0400		/* Fail if can't connect */
 
 #define RPC_IS_ASYNC(t)		((t)->tk_flags & RPC_TASK_ASYNC)
 #define RPC_IS_SWAPPER(t)	((t)->tk_flags & RPC_TASK_SWAPPER)
 #define RPC_DO_ROOTOVERRIDE(t)	((t)->tk_flags & RPC_TASK_ROOTCREDS)
 #define RPC_ASSASSINATED(t)	((t)->tk_flags & RPC_TASK_KILLED)
 #define RPC_IS_SOFT(t)		((t)->tk_flags & RPC_TASK_SOFT)
+#define RPC_IS_SOFTCONN(t)	((t)->tk_flags & RPC_TASK_SOFTCONN)
 
 #define RPC_TASK_RUNNING	0
 #define RPC_TASK_QUEUED		1
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 7bcd931..68a2358 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1197,6 +1197,8 @@ call_transmit_status(struct rpc_task *task)
 	default:
 		dprint_status(task);
 		xprt_end_transmit(task);
+		rpc_task_force_reencode(task);
+		break;
 		/*
 		 * Special cases: if we've been waiting on the
 		 * socket's write_space() callback, or if the
@@ -1204,11 +1206,16 @@ call_transmit_status(struct rpc_task *task)
 		 * then hold onto the transport lock.
 		 */
 	case -ECONNREFUSED:
-	case -ECONNRESET:
-	case -ENOTCONN:
 	case -EHOSTDOWN:
 	case -EHOSTUNREACH:
 	case -ENETUNREACH:
+		if (RPC_IS_SOFTCONN(task)) {
+			xprt_end_transmit(task);
+			rpc_exit(task, task->tk_status);
+			break;
+		}
+	case -ECONNRESET:
+	case -ENOTCONN:
 	case -EPIPE:
 		rpc_task_force_reencode(task);
 	}
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 37c5475..ff312f8 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2033,7 +2033,7 @@ static void xs_connect(struct rpc_task *task)
 	if (xprt_test_and_set_connecting(xprt))
 		return;
 
-	if (transport->sock != NULL) {
+	if (transport->sock != NULL && !RPC_IS_SOFTCONN(task)) {
 		dprintk("RPC:       xs_connect delayed xprt %p for %lu "
 				"seconds\n",
 				xprt, xprt->reestablish_timeout / HZ);
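
The control flow of call_transmit_status() after this patch can be modeled in user space as a pure function from (status, task flags) to a set of actions. This is a sketch for reasoning about the switch statement, not kernel code: the `ACT_*` bits and `transmit_status_actions()` are hypothetical names, and only the flag value 0x0400 is taken from the patch.

```c
#include <errno.h>

#define RPC_TASK_SOFTCONN	0x0400	/* value from the sched.h hunk */

/* Hypothetical outcome bits for this model. */
#define ACT_END_TRANSMIT	0x1	/* xprt_end_transmit() is called */
#define ACT_REENCODE		0x2	/* rpc_task_force_reencode() is called */
#define ACT_EXIT		0x4	/* rpc_exit() ends the task with the error */

unsigned int transmit_status_actions(int status, unsigned int task_flags)
{
	/* Common case: success is tested before the switch. */
	if (status == 0)
		return ACT_END_TRANSMIT | ACT_REENCODE;

	switch (status) {
	case -EAGAIN:
		return 0;
	case -ECONNREFUSED:
	case -EHOSTDOWN:
	case -EHOSTUNREACH:
	case -ENETUNREACH:
		/* Soft connect: give up at once instead of retrying. */
		if (task_flags & RPC_TASK_SOFTCONN)
			return ACT_END_TRANSMIT | ACT_EXIT;
		/* fall through */
	case -ECONNRESET:
	case -ENOTCONN:
	case -EPIPE:
		return ACT_REENCODE;
	default:
		/* No longer falls through into the cases above. */
		return ACT_END_TRANSMIT | ACT_REENCODE;
	}
}
```

Note that -ECONNRESET, -ENOTCONN and -EPIPE keep the retry behavior even when RPC_TASK_SOFTCONN is set; only the "server cannot be reached" errors end the task early.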



* [PATCH 06/11] SUNRPC: Simplify synopsis of rpcb_local_clnt()
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
                     ` (4 preceding siblings ...)
  2009-11-05 18:23   ` [PATCH 05/11] SUNRPC: Allow RPCs to fail quickly if the server is unreachable Chuck Lever
@ 2009-11-05 18:23   ` Chuck Lever
  2009-11-05 18:23   ` [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls Chuck Lever
                     ` (4 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:23 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

Clean up: At one point, rpcb_create_local() handled IPv6 loopback
addresses too, but it doesn't any more; only IPv4 loopback is used
now.  Get rid of the @addr and @addrlen arguments to
rpcb_create_local().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/rpcb_clnt.c |   11 ++++-------
 1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 830faf4..28f50da 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -163,13 +163,12 @@ static const struct sockaddr_in rpcb_inaddr_loopback = {
 	.sin_port		= htons(RPCBIND_PORT),
 };
 
-static struct rpc_clnt *rpcb_create_local(struct sockaddr *addr,
-					  size_t addrlen, u32 version)
+static struct rpc_clnt *rpcb_create_local(u32 version)
 {
 	struct rpc_create_args args = {
 		.protocol	= XPRT_TRANSPORT_UDP,
-		.address	= addr,
-		.addrsize	= addrlen,
+		.address	= (struct sockaddr *)&rpcb_inaddr_loopback,
+		.addrsize	= sizeof(rpcb_inaddr_loopback),
 		.servername	= "localhost",
 		.program	= &rpcb_program,
 		.version	= version,
@@ -211,14 +210,12 @@ static struct rpc_clnt *rpcb_create(char *hostname, struct sockaddr *srvaddr,
 
 static int rpcb_register_call(const u32 version, struct rpc_message *msg)
 {
-	struct sockaddr *addr = (struct sockaddr *)&rpcb_inaddr_loopback;
-	size_t addrlen = sizeof(rpcb_inaddr_loopback);
 	struct rpc_clnt *rpcb_clnt;
 	int result, error = 0;
 
 	msg->rpc_resp = &result;
 
-	rpcb_clnt = rpcb_create_local(addr, addrlen, version);
+	rpcb_clnt = rpcb_create_local(version);
 	if (!IS_ERR(rpcb_clnt)) {
 		error = rpc_call_sync(rpcb_clnt, msg, 0);
 		rpc_shutdown_client(rpcb_clnt);



* [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
                     ` (5 preceding siblings ...)
  2009-11-05 18:23   ` [PATCH 06/11] SUNRPC: Simplify synopsis of rpcb_local_clnt() Chuck Lever
@ 2009-11-05 18:23   ` Chuck Lever
       [not found]     ` <20091105182319.2796.62305.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
  2009-11-05 18:23   ` [PATCH 08/11] SUNRPC: Use TCP for local " Chuck Lever
                     ` (3 subsequent siblings)
  10 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:23 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

The kernel's rpcbind client creates and deletes an rpc_clnt and its
underlying transport socket for every upcall to the local rpcbind
daemon.

When starting a typical NFS server on IPv4 and IPv6, the NFS service
itself does three upcalls (one per version) times two upcalls (one
per transport) times two upcalls (one per address family), making 12,
plus another one for the initial call to unregister previous NFS
services.  Starting the NLM service adds an additional 13 upcalls,
for similar reasons.

(Currently the NFS service doesn't start IPv6 listeners, but it will
soon enough.)

Instead, let's create an rpc_clnt for rpcbind upcalls during the
first local rpcbind query, and cache it.  This saves the overhead of
creating and destroying an rpc_clnt and a socket for every upcall.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/rpcb_clnt.c   |   78 +++++++++++++++++++++++++++++++++++++---------
 net/sunrpc/sunrpc_syms.c |    3 ++
 2 files changed, 65 insertions(+), 16 deletions(-)

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 28f50da..1ec4a1a 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -20,6 +20,7 @@
 #include <linux/in6.h>
 #include <linux/kernel.h>
 #include <linux/errno.h>
+#include <linux/spinlock.h>
 #include <net/ipv6.h>
 
 #include <linux/sunrpc/clnt.h>
@@ -110,6 +111,9 @@ static void			rpcb_getport_done(struct rpc_task *, void *);
 static void			rpcb_map_release(void *data);
 static struct rpc_program	rpcb_program;
 
+static struct rpc_clnt *	rpcb_local_clnt;
+static struct rpc_clnt *	rpcb_local_clnt4;
+
 struct rpcbind_args {
 	struct rpc_xprt *	r_xprt;
 
@@ -163,7 +167,7 @@ static const struct sockaddr_in rpcb_inaddr_loopback = {
 	.sin_port		= htons(RPCBIND_PORT),
 };
 
-static struct rpc_clnt *rpcb_create_local(u32 version)
+static int rpcb_create_local(void)
 {
 	struct rpc_create_args args = {
 		.protocol	= XPRT_TRANSPORT_UDP,
@@ -171,12 +175,37 @@ static struct rpc_clnt *rpcb_create_local(u32 version)
 		.addrsize	= sizeof(rpcb_inaddr_loopback),
 		.servername	= "localhost",
 		.program	= &rpcb_program,
-		.version	= version,
+		.version	= RPCBVERS_2,
 		.authflavor	= RPC_AUTH_UNIX,
 		.flags		= RPC_CLNT_CREATE_NOPING,
 	};
+	static DEFINE_SPINLOCK(rpcb_create_local_lock);
+	struct rpc_clnt *clnt, *clnt4;
+	int result = 0;
+
+	spin_lock(&rpcb_create_local_lock);
+	if (rpcb_local_clnt)
+		goto out;
+
+	clnt = rpc_create(&args);
+	if (IS_ERR(clnt)) {
+		result = PTR_ERR(clnt);
+		goto out;
+	}
 
-	return rpc_create(&args);
+	clnt4 = rpc_bind_new_program(clnt, &rpcb_program, RPCBVERS_4);
+	if (IS_ERR(clnt4)) {
+		result = PTR_ERR(clnt4);
+		rpc_shutdown_client(clnt);
+		goto out;
+	}
+
+	rpcb_local_clnt = clnt;
+	rpcb_local_clnt4 = clnt4;
+
+out:
+	spin_unlock(&rpcb_create_local_lock);
+	return result;
 }
 
 static struct rpc_clnt *rpcb_create(char *hostname, struct sockaddr *srvaddr,
@@ -208,20 +237,13 @@ static struct rpc_clnt *rpcb_create(char *hostname, struct sockaddr *srvaddr,
 	return rpc_create(&args);
 }
 
-static int rpcb_register_call(const u32 version, struct rpc_message *msg)
+static int rpcb_register_call(struct rpc_clnt *clnt, struct rpc_message *msg)
 {
-	struct rpc_clnt *rpcb_clnt;
 	int result, error = 0;
 
 	msg->rpc_resp = &result;
 
-	rpcb_clnt = rpcb_create_local(version);
-	if (!IS_ERR(rpcb_clnt)) {
-		error = rpc_call_sync(rpcb_clnt, msg, 0);
-		rpc_shutdown_client(rpcb_clnt);
-	} else
-		error = PTR_ERR(rpcb_clnt);
-
+	error = rpc_call_sync(clnt, msg, 0);
 	if (error < 0) {
 		dprintk("RPC:       failed to contact local rpcbind "
 				"server (errno %d).\n", -error);
@@ -276,6 +298,13 @@ int rpcb_register(u32 prog, u32 vers, int prot, unsigned short port)
 	struct rpc_message msg = {
 		.rpc_argp	= &map,
 	};
+	int error;
+
+	if (rpcb_local_clnt == NULL) {
+		error = rpcb_create_local();
+		if (error)
+			return error;
+	}
 
 	dprintk("RPC:       %sregistering (%u, %u, %d, %u) with local "
 			"rpcbind\n", (port ? "" : "un"),
@@ -285,7 +314,7 @@ int rpcb_register(u32 prog, u32 vers, int prot, unsigned short port)
 	if (port)
 		msg.rpc_proc = &rpcb_procedures2[RPCBPROC_SET];
 
-	return rpcb_register_call(RPCBVERS_2, &msg);
+	return rpcb_register_call(rpcb_local_clnt, &msg);
 }
 
 /*
@@ -310,7 +339,7 @@ static int rpcb_register_inet4(const struct sockaddr *sap,
 	if (port)
 		msg->rpc_proc = &rpcb_procedures4[RPCBPROC_SET];
 
-	result = rpcb_register_call(RPCBVERS_4, msg);
+	result = rpcb_register_call(rpcb_local_clnt4, msg);
 	kfree(map->r_addr);
 	return result;
 }
@@ -337,7 +366,7 @@ static int rpcb_register_inet6(const struct sockaddr *sap,
 	if (port)
 		msg->rpc_proc = &rpcb_procedures4[RPCBPROC_SET];
 
-	result = rpcb_register_call(RPCBVERS_4, msg);
+	result = rpcb_register_call(rpcb_local_clnt4, msg);
 	kfree(map->r_addr);
 	return result;
 }
@@ -353,7 +382,7 @@ static int rpcb_unregister_all_protofamilies(struct rpc_message *msg)
 	map->r_addr = "";
 	msg->rpc_proc = &rpcb_procedures4[RPCBPROC_UNSET];
 
-	return rpcb_register_call(RPCBVERS_4, msg);
+	return rpcb_register_call(rpcb_local_clnt4, msg);
 }
 
 /**
@@ -411,6 +440,13 @@ int rpcb_v4_register(const u32 program, const u32 version,
 	struct rpc_message msg = {
 		.rpc_argp	= &map,
 	};
+	int error;
+
+	if (rpcb_local_clnt4 == NULL) {
+		error = rpcb_create_local();
+		if (error)
+			return error;
+	}
 
 	if (address == NULL)
 		return rpcb_unregister_all_protofamilies(&msg);
@@ -1024,3 +1060,13 @@ static struct rpc_program rpcb_program = {
 	.version	= rpcb_version,
 	.stats		= &rpcb_stats,
 };
+
+/**
+ * cleanup_rpcb_clnt - shut down the cached local rpcbind clients
+ *
+ */
+void cleanup_rpcb_clnt(void)
+{
+	rpc_shutdown_client(rpcb_local_clnt4);
+	rpc_shutdown_client(rpcb_local_clnt);
+}
diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c
index 8cce921..f438347 100644
--- a/net/sunrpc/sunrpc_syms.c
+++ b/net/sunrpc/sunrpc_syms.c
@@ -24,6 +24,8 @@
 
 extern struct cache_detail ip_map_cache, unix_gid_cache;
 
+extern void cleanup_rpcb_clnt(void);
+
 static int __init
 init_sunrpc(void)
 {
@@ -53,6 +55,7 @@ out:
 static void __exit
 cleanup_sunrpc(void)
 {
+	cleanup_rpcb_clnt();
 	rpcauth_remove_module();
 	cleanup_socket_xprt();
 	svc_cleanup_xprt_sock();
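
The create-once-and-cache pattern used by rpcb_create_local() can be sketched in user space as follows. This is an analogy under stated assumptions, not the kernel implementation: `struct client`, `client_create()` and `get_cached_client()` are hypothetical names, and a pthread mutex is substituted for the patch's spinlock because the constructor here may block in malloc().

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct client { int version; };		/* stand-in for struct rpc_clnt */

static struct client *cached_clnt;	/* shared by all callers once set */
static int create_calls;		/* counts how often we really create */

/* Hypothetical constructor; returns NULL on allocation failure. */
static struct client *client_create(int version)
{
	struct client *c = malloc(sizeof(*c));

	create_calls++;
	if (c)
		c->version = version;
	return c;
}

/*
 * Returns 0 on success or a negative errno.  The lock ensures that two
 * racing callers cannot both create (and one of them leak) a client.
 */
int get_cached_client(struct client **out)
{
	static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
	int result = 0;

	pthread_mutex_lock(&lock);
	if (!cached_clnt) {
		cached_clnt = client_create(2);
		if (!cached_clnt)
			result = -ENOMEM;
	}
	*out = cached_clnt;
	pthread_mutex_unlock(&lock);
	return result;
}
```

Every call after the first returns the same pointer without invoking the constructor again, which is the overhead the patch aims to avoid for repeated rpcbind upcalls.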



* [PATCH 08/11] SUNRPC: Use TCP for local rpcbind upcalls
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
                     ` (6 preceding siblings ...)
  2009-11-05 18:23   ` [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls Chuck Lever
@ 2009-11-05 18:23   ` Chuck Lever
  2009-11-05 18:23   ` [PATCH 09/11] SUNRPC: Use soft connects for autobinding over TCP Chuck Lever
                     ` (2 subsequent siblings)
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:23 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

Use TCP with the soft connect semantic for local rpcbind upcalls so
the kernel can detect immediately if the local rpcbind daemon is not
running.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/rpcb_clnt.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 1ec4a1a..1698f50 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -170,7 +170,7 @@ static const struct sockaddr_in rpcb_inaddr_loopback = {
 static int rpcb_create_local(void)
 {
 	struct rpc_create_args args = {
-		.protocol	= XPRT_TRANSPORT_UDP,
+		.protocol	= XPRT_TRANSPORT_TCP,
 		.address	= (struct sockaddr *)&rpcb_inaddr_loopback,
 		.addrsize	= sizeof(rpcb_inaddr_loopback),
 		.servername	= "localhost",
@@ -243,7 +243,7 @@ static int rpcb_register_call(struct rpc_clnt *clnt, struct rpc_message *msg)
 
 	msg->rpc_resp = &result;
 
-	error = rpc_call_sync(clnt, msg, 0);
+	error = rpc_call_sync(clnt, msg, RPC_TASK_SOFTCONN);
 	if (error < 0) {
 		dprintk("RPC:       failed to contact local rpcbind "
 				"server (errno %d).\n", -error);



* [PATCH 09/11] SUNRPC: Use soft connects for autobinding over TCP
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
                     ` (7 preceding siblings ...)
  2009-11-05 18:23   ` [PATCH 08/11] SUNRPC: Use TCP for local " Chuck Lever
@ 2009-11-05 18:23   ` Chuck Lever
  2009-11-05 18:23   ` [PATCH 10/11] SUNRPC: Use soft connect semantics when performing RPC ping Chuck Lever
  2009-11-05 18:23   ` [PATCH 11/11] SUNRPC: soft connect semantics for UDP Chuck Lever
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:23 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

Autobinding is handled by the rpciod process, not in user processes
that are generating regular RPC requests.  Thus autobinding is usually
not affected by signals targeting user processes, such as KILL or
timer expiration events.

In addition, an RPC request generated by a user process that has
RPC_TASK_SOFTCONN set and needs to perform an autobind will hang if
the remote rpcbind service is not available.

For rpcbind queries on connection-oriented transports, let's use the
new soft connect semantic to return control to the user's process
quickly, if the kernel's rpcbind client can't connect to the remote
rpcbind service.

Logic is introduced in call_bind_status() to handle connection errors
that occurred during an asynchronous rpcbind query.  The logic
abandons the rpcbind query if the RPC request has SOFTCONN set, and
retries after a few seconds in the normal case.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/clnt.c      |   17 ++++++++++++++++-
 net/sunrpc/rpcb_clnt.c |    2 +-
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 68a2358..4b76ef9 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1060,7 +1060,7 @@ call_bind_status(struct rpc_task *task)
 		goto retry_timeout;
 	case -EPFNOSUPPORT:
 		/* server doesn't support any rpcbind version we know of */
-		dprintk("RPC: %5u remote rpcbind service unavailable\n",
+		dprintk("RPC: %5u unrecognized remote rpcbind service\n",
 				task->tk_pid);
 		break;
 	case -EPROTONOSUPPORT:
@@ -1069,6 +1069,21 @@ call_bind_status(struct rpc_task *task)
 		task->tk_status = 0;
 		task->tk_action = call_bind;
 		return;
+	case -ECONNREFUSED:		/* connection problems */
+	case -ECONNRESET:
+	case -ENOTCONN:
+	case -EHOSTDOWN:
+	case -EHOSTUNREACH:
+	case -ENETUNREACH:
+	case -EPIPE:
+		dprintk("RPC: %5u remote rpcbind unreachable: %d\n",
+				task->tk_pid, task->tk_status);
+		if (!RPC_IS_SOFTCONN(task)) {
+			rpc_delay(task, 5*HZ);
+			goto retry_timeout;
+		}
+		status = task->tk_status;
+		break;
 	default:
 		dprintk("RPC: %5u unrecognized rpcbind error (%d)\n",
 				task->tk_pid, -task->tk_status);
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
index 1698f50..9bfd13c 100644
--- a/net/sunrpc/rpcb_clnt.c
+++ b/net/sunrpc/rpcb_clnt.c
@@ -524,7 +524,7 @@ static struct rpc_task *rpcb_call_async(struct rpc_clnt *rpcb_clnt, struct rpcbi
 		.rpc_message = &msg,
 		.callback_ops = &rpcb_getport_ops,
 		.callback_data = map,
-		.flags = RPC_TASK_ASYNC,
+		.flags = RPC_TASK_ASYNC | RPC_TASK_SOFTCONN,
 	};
 
 	return rpc_run_task(&task_setup_data);
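
The new connection-error arm in call_bind_status() can be modeled as a small decision function. This is a user-space sketch, not kernel code: `struct bind_decision` and `bind_status_decision()` are hypothetical names, and statuses other than the listed connection errors are simply passed through, which is a simplification of the real switch statement.

```c
#include <errno.h>

#define RPC_TASK_SOFTCONN	0x0400	/* value from PATCH 05 */
#define BIND_RETRY_DELAY_SECS	5	/* mirrors rpc_delay(task, 5*HZ) */

struct bind_decision {
	int retry;		/* 1: retry the rpcbind query later */
	int delay_secs;		/* delay before the retry, in seconds */
	int status;		/* error handed back when not retrying */
};

/* The errors grouped under "connection problems" in the patch. */
static int is_connect_error(int status)
{
	switch (status) {
	case -ECONNREFUSED:
	case -ECONNRESET:
	case -ENOTCONN:
	case -EHOSTDOWN:
	case -EHOSTUNREACH:
	case -ENETUNREACH:
	case -EPIPE:
		return 1;
	default:
		return 0;
	}
}

struct bind_decision bind_status_decision(int status, unsigned int task_flags)
{
	struct bind_decision d = { 0, 0, status };

	/* Normal case: wait a few seconds, then try rpcbind again. */
	if (is_connect_error(status) && !(task_flags & RPC_TASK_SOFTCONN)) {
		d.retry = 1;
		d.delay_secs = BIND_RETRY_DELAY_SECS;
		d.status = 0;
	}
	/* Soft connect: abandon the query and report the error. */
	return d;
}
```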



* [PATCH 10/11] SUNRPC: Use soft connect semantics when performing RPC ping
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
                     ` (8 preceding siblings ...)
  2009-11-05 18:23   ` [PATCH 09/11] SUNRPC: Use soft connects for autobinding over TCP Chuck Lever
@ 2009-11-05 18:23   ` Chuck Lever
  2009-11-05 18:23   ` [PATCH 11/11] SUNRPC: soft connect semantics for UDP Chuck Lever
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:23 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

Currently, if a remote RPC service is unreachable, an RPC ping will
hang until the underlying transport connect attempt times out.  A more
desirable behavior might be to have the ping fail immediately so upper
layers can recover appropriately.

In the case of an NFS mount, for instance, this would mean the
mount(2) system call could fail immediately if the server isn't
listening, rather than hanging uninterruptibly for more than 3
minutes.

Change rpc_ping() so that it fails immediately for connection-oriented
transports.  rpc_create() will then fail immediately for such
transports if an RPC ping was requested.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/clnt.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 4b76ef9..97931d9 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -79,7 +79,7 @@ static void	call_connect_status(struct rpc_task *task);
 
 static __be32	*rpc_encode_header(struct rpc_task *task);
 static __be32	*rpc_verify_header(struct rpc_task *task);
-static int	rpc_ping(struct rpc_clnt *clnt, int flags);
+static int	rpc_ping(struct rpc_clnt *clnt);
 
 static void rpc_register_client(struct rpc_clnt *clnt)
 {
@@ -340,7 +340,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
 		return clnt;
 
 	if (!(args->flags & RPC_CLNT_CREATE_NOPING)) {
-		int err = rpc_ping(clnt, RPC_TASK_SOFT);
+		int err = rpc_ping(clnt);
 		if (err != 0) {
 			rpc_shutdown_client(clnt);
 			return ERR_PTR(err);
@@ -528,7 +528,7 @@ struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *old,
 	clnt->cl_prog     = program->number;
 	clnt->cl_vers     = version->number;
 	clnt->cl_stats    = program->stats;
-	err = rpc_ping(clnt, RPC_TASK_SOFT);
+	err = rpc_ping(clnt);
 	if (err != 0) {
 		rpc_shutdown_client(clnt);
 		clnt = ERR_PTR(err);
@@ -1709,14 +1709,14 @@ static struct rpc_procinfo rpcproc_null = {
 	.p_decode = rpcproc_decode_null,
 };
 
-static int rpc_ping(struct rpc_clnt *clnt, int flags)
+static int rpc_ping(struct rpc_clnt *clnt)
 {
 	struct rpc_message msg = {
 		.rpc_proc = &rpcproc_null,
 	};
 	int err;
 	msg.rpc_cred = authnull_ops.lookup_cred(NULL, NULL, 0);
-	err = rpc_call_sync(clnt, &msg, flags);
+	err = rpc_call_sync(clnt, &msg, RPC_TASK_SOFT | RPC_TASK_SOFTCONN);
 	put_rpccred(msg.rpc_cred);
 	return err;
 }



* [PATCH 11/11] SUNRPC: soft connect semantics for UDP
       [not found] ` <20091105181924.2796.9313.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
                     ` (9 preceding siblings ...)
  2009-11-05 18:23   ` [PATCH 10/11] SUNRPC: Use soft connect semantics when performing RPC ping Chuck Lever
@ 2009-11-05 18:23   ` Chuck Lever
  10 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-05 18:23 UTC (permalink / raw)
  To: trond.myklebust; +Cc: linux-nfs

Introduce soft connect behavior for UDP transports.  In this case, a
major timeout returns ETIMEDOUT instead of EIO.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/clnt.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 97931d9..154034b 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1380,6 +1380,10 @@ call_timeout(struct rpc_task *task)
 	dprintk("RPC: %5u call_timeout (major)\n", task->tk_pid);
 	task->tk_timeouts++;
 
+	if (RPC_IS_SOFTCONN(task)) {
+		rpc_exit(task, -ETIMEDOUT);
+		return;
+	}
 	if (RPC_IS_SOFT(task)) {
 		if (clnt->cl_chatty)
 			printk(KERN_NOTICE "%s: server %s not responding, timed out\n",


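The order of the two checks added to call_timeout() is what makes a soft
connect task report ETIMEDOUT even when RPC_TASK_SOFT is also set, as it is
in rpc_ping().  A minimal userspace sketch of that decision follows.  The
flag values and the helper name major_timeout_status() are assumptions made
for illustration only; they are not taken from the kernel source.

```c
#include <errno.h>

/* Illustrative flag bits; the numeric values are assumptions for this sketch. */
#define TASK_SOFT      0x0200u
#define TASK_SOFTCONN  0x0400u

/*
 * Toy model of the choice made on a major timeout.
 * Returns a negative errno when the task should exit, or 0 when the
 * task is "hard" and should simply keep retrying.
 */
static int major_timeout_status(unsigned int flags)
{
	/* Soft connect is checked first, so it wins over plain soft. */
	if (flags & TASK_SOFTCONN)
		return -ETIMEDOUT;
	if (flags & TASK_SOFT)
		return -EIO;
	return 0;
}
```

With these definitions, a task carrying both flags exits with -ETIMEDOUT, a
task carrying only the soft flag exits with -EIO, and a task with neither
flag keeps retrying.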

* Re: [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
       [not found]     ` <20091105182319.2796.62305.stgit-RytpoXr2tKZ9HhUboXbp9zCvJB+x5qRC@public.gmane.org>
@ 2009-11-20 20:18       ` Trond Myklebust
  2009-11-20 20:19         ` Chuck Lever
  2009-11-20 21:50         ` Chuck Lever
  0 siblings, 2 replies; 19+ messages in thread
From: Trond Myklebust @ 2009-11-20 20:18 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-nfs

On Thu, 2009-11-05 at 13:23 -0500, Chuck Lever wrote: 
> The kernel's rpcbind client creates and deletes an rpc_clnt and its
> underlying transport socket for every upcall to the local rpcbind
> daemon.
> 
> When starting a typical NFS server on IPv4 and IPv6, the NFS service
> itself does three upcalls (one per version) times two upcalls (one
> per transport) times two upcalls (one per address family), making 12,
> plus another one for the initial call to unregister previous NFS
> services.  Starting the NLM service adds an additional 13 upcalls,
> for similar reasons.
> 
> (Currently the NFS service doesn't start IPv6 listeners, but it will
> soon enough).
> 
> Instead, let's create an rpc_clnt for rpcbind upcalls during the
> first local rpcbind query, and cache it.  This saves the overhead of
> creating and destroying an rpc_clnt and a socket for every upcall.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> 
>  net/sunrpc/rpcb_clnt.c   |   78 +++++++++++++++++++++++++++++++++++++---------
>  net/sunrpc/sunrpc_syms.c |    3 ++
>  2 files changed, 65 insertions(+), 16 deletions(-)
> 
> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
> index 28f50da..1ec4a1a 100644
> --- a/net/sunrpc/rpcb_clnt.c
> +++ b/net/sunrpc/rpcb_clnt.c
> @@ -20,6 +20,7 @@
>  #include <linux/in6.h>
>  #include <linux/kernel.h>
>  #include <linux/errno.h>
> +#include <linux/spinlock.h>
>  #include <net/ipv6.h>
>  
>  #include <linux/sunrpc/clnt.h>
> @@ -110,6 +111,9 @@ static void			rpcb_getport_done(struct rpc_task *, void *);
>  static void			rpcb_map_release(void *data);
>  static struct rpc_program	rpcb_program;
>  
> +static struct rpc_clnt *	rpcb_local_clnt;
> +static struct rpc_clnt *	rpcb_local_clnt4;
> +
>  struct rpcbind_args {
>  	struct rpc_xprt *	r_xprt;
>  
> @@ -163,7 +167,7 @@ static const struct sockaddr_in rpcb_inaddr_loopback = {
>  	.sin_port		= htons(RPCBIND_PORT),
>  };
>  
> -static struct rpc_clnt *rpcb_create_local(u32 version)
> +static int rpcb_create_local(void)
>  {
>  	struct rpc_create_args args = {
>  		.protocol	= XPRT_TRANSPORT_UDP,
> @@ -171,12 +175,37 @@ static struct rpc_clnt *rpcb_create_local(u32 version)
>  		.addrsize	= sizeof(rpcb_inaddr_loopback),
>  		.servername	= "localhost",
>  		.program	= &rpcb_program,
> -		.version	= version,
> +		.version	= RPCBVERS_2,
>  		.authflavor	= RPC_AUTH_UNIX,
>  		.flags		= RPC_CLNT_CREATE_NOPING,
>  	};
> +	static DEFINE_SPINLOCK(rpcb_create_local_lock);
> +	struct rpc_clnt *clnt, *clnt4;
> +	int result = 0;
> +
> +	spin_lock(&rpcb_create_local_lock);
> +	if (rpcb_local_clnt)
> +		goto out;
> +
> +	clnt = rpc_create(&args);
> +	if (IS_ERR(clnt)) {
> +		result = -PTR_ERR(clnt);
> +		goto out;
> +	}
>  
> -	return rpc_create(&args);
> +	clnt4 = rpc_bind_new_program(clnt, &rpcb_program, RPCBVERS_4);
> +	if (IS_ERR(clnt4)) {
> +		result = -PTR_ERR(clnt4);
> +		rpc_shutdown_client(clnt);
> +		goto out;
> +	}
> +
> +	rpcb_local_clnt = clnt;
> +	rpcb_local_clnt4 = clnt4;
> +
> +out:
> +	spin_unlock(&rpcb_create_local_lock);
> +	return result;
>  }

You can't have tested this. At the very least you cannot have done so
with spinlock debugging enabled...

Trond


* Re: [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
  2009-11-20 20:18       ` Trond Myklebust
@ 2009-11-20 20:19         ` Chuck Lever
  2009-11-20 21:50         ` Chuck Lever
  1 sibling, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2009-11-20 20:19 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs


On Nov 20, 2009, at 3:18 PM, Trond Myklebust wrote:

> On Thu, 2009-11-05 at 13:23 -0500, Chuck Lever wrote:
>> The kernel's rpcbind client creates and deletes an rpc_clnt and its
>> underlying transport socket for every upcall to the local rpcbind
>> daemon.
>>
>> When starting a typical NFS server on IPv4 and IPv6, the NFS service
>> itself does three upcalls (one per version) times two upcalls (one
>> per transport) times two upcalls (one per address family), making 12,
>> plus another one for the initial call to unregister previous NFS
>> services.  Starting the NLM service adds an additional 13 upcalls,
>> for similar reasons.
>>
>> (Currently the NFS service doesn't start IPv6 listeners, but it will
>> soon enough).
>>
>> Instead, let's create an rpc_clnt for rpcbind upcalls during the
>> first local rpcbind query, and cache it.  This saves the overhead of
>> creating and destroying an rpc_clnt and a socket for every upcall.
>>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>
>> net/sunrpc/rpcb_clnt.c   |   78 +++++++++++++++++++++++++++++++++++++---------
>> net/sunrpc/sunrpc_syms.c |    3 ++
>> 2 files changed, 65 insertions(+), 16 deletions(-)
>>
>> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
>> index 28f50da..1ec4a1a 100644
>> --- a/net/sunrpc/rpcb_clnt.c
>> +++ b/net/sunrpc/rpcb_clnt.c
>> @@ -20,6 +20,7 @@
>> #include <linux/in6.h>
>> #include <linux/kernel.h>
>> #include <linux/errno.h>
>> +#include <linux/spinlock.h>
>> #include <net/ipv6.h>
>>
>> #include <linux/sunrpc/clnt.h>
>> @@ -110,6 +111,9 @@ static void			rpcb_getport_done(struct rpc_task *, void *);
>> static void			rpcb_map_release(void *data);
>> static struct rpc_program	rpcb_program;
>>
>> +static struct rpc_clnt *	rpcb_local_clnt;
>> +static struct rpc_clnt *	rpcb_local_clnt4;
>> +
>> struct rpcbind_args {
>> 	struct rpc_xprt *	r_xprt;
>>
>> @@ -163,7 +167,7 @@ static const struct sockaddr_in rpcb_inaddr_loopback = {
>> 	.sin_port		= htons(RPCBIND_PORT),
>> };
>>
>> -static struct rpc_clnt *rpcb_create_local(u32 version)
>> +static int rpcb_create_local(void)
>> {
>> 	struct rpc_create_args args = {
>> 		.protocol	= XPRT_TRANSPORT_UDP,
>> @@ -171,12 +175,37 @@ static struct rpc_clnt *rpcb_create_local(u32 version)
>> 		.addrsize	= sizeof(rpcb_inaddr_loopback),
>> 		.servername	= "localhost",
>> 		.program	= &rpcb_program,
>> -		.version	= version,
>> +		.version	= RPCBVERS_2,
>> 		.authflavor	= RPC_AUTH_UNIX,
>> 		.flags		= RPC_CLNT_CREATE_NOPING,
>> 	};
>> +	static DEFINE_SPINLOCK(rpcb_create_local_lock);
>> +	struct rpc_clnt *clnt, *clnt4;
>> +	int result = 0;
>> +
>> +	spin_lock(&rpcb_create_local_lock);
>> +	if (rpcb_local_clnt)
>> +		goto out;
>> +
>> +	clnt = rpc_create(&args);
>> +	if (IS_ERR(clnt)) {
>> +		result = -PTR_ERR(clnt);
>> +		goto out;
>> +	}
>>
>> -	return rpc_create(&args);
>> +	clnt4 = rpc_bind_new_program(clnt, &rpcb_program, RPCBVERS_4);
>> +	if (IS_ERR(clnt4)) {
>> +		result = -PTR_ERR(clnt4);
>> +		rpc_shutdown_client(clnt);
>> +		goto out;
>> +	}
>> +
>> +	rpcb_local_clnt = clnt;
>> +	rpcb_local_clnt4 = clnt4;
>> +
>> +out:
>> +	spin_unlock(&rpcb_create_local_lock);
>> +	return result;
>> }
>
> You can't have tested this. At the very least you cannot have done so
> with spinlock debugging enabled...

I did test it, but not with spinlock debugging.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





* Re: [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
  2009-11-20 20:18       ` Trond Myklebust
  2009-11-20 20:19         ` Chuck Lever
@ 2009-11-20 21:50         ` Chuck Lever
  2009-11-20 22:05           ` J. Bruce Fields
  1 sibling, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2009-11-20 21:50 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs

On Nov 20, 2009, at 3:18 PM, Trond Myklebust wrote:
> On Thu, 2009-11-05 at 13:23 -0500, Chuck Lever wrote:
>> The kernel's rpcbind client creates and deletes an rpc_clnt and its
>> underlying transport socket for every upcall to the local rpcbind
>> daemon.
>>
>> When starting a typical NFS server on IPv4 and IPv6, the NFS service
>> itself does three upcalls (one per version) times two upcalls (one
>> per transport) times two upcalls (one per address family), making 12,
>> plus another one for the initial call to unregister previous NFS
>> services.  Starting the NLM service adds an additional 13 upcalls,
>> for similar reasons.
>>
>> (Currently the NFS service doesn't start IPv6 listeners, but it will
>> soon enough).
>>
>> Instead, let's create an rpc_clnt for rpcbind upcalls during the
>> first local rpcbind query, and cache it.  This saves the overhead of
>> creating and destroying an rpc_clnt and a socket for every upcall.
>>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>
>> net/sunrpc/rpcb_clnt.c   |   78 +++++++++++++++++++++++++++++++++++++---------
>> net/sunrpc/sunrpc_syms.c |    3 ++
>> 2 files changed, 65 insertions(+), 16 deletions(-)
>>
>> diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c
>> index 28f50da..1ec4a1a 100644
>> --- a/net/sunrpc/rpcb_clnt.c
>> +++ b/net/sunrpc/rpcb_clnt.c
>> @@ -20,6 +20,7 @@
>> #include <linux/in6.h>
>> #include <linux/kernel.h>
>> #include <linux/errno.h>
>> +#include <linux/spinlock.h>
>> #include <net/ipv6.h>
>>
>> #include <linux/sunrpc/clnt.h>
>> @@ -110,6 +111,9 @@ static void			rpcb_getport_done(struct rpc_task *, void *);
>> static void			rpcb_map_release(void *data);
>> static struct rpc_program	rpcb_program;
>>
>> +static struct rpc_clnt *	rpcb_local_clnt;
>> +static struct rpc_clnt *	rpcb_local_clnt4;
>> +
>> struct rpcbind_args {
>> 	struct rpc_xprt *	r_xprt;
>>
>> @@ -163,7 +167,7 @@ static const struct sockaddr_in rpcb_inaddr_loopback = {
>> 	.sin_port		= htons(RPCBIND_PORT),
>> };
>>
>> -static struct rpc_clnt *rpcb_create_local(u32 version)
>> +static int rpcb_create_local(void)
>> {
>> 	struct rpc_create_args args = {
>> 		.protocol	= XPRT_TRANSPORT_UDP,
>> @@ -171,12 +175,37 @@ static struct rpc_clnt *rpcb_create_local(u32 version)
>> 		.addrsize	= sizeof(rpcb_inaddr_loopback),
>> 		.servername	= "localhost",
>> 		.program	= &rpcb_program,
>> -		.version	= version,
>> +		.version	= RPCBVERS_2,
>> 		.authflavor	= RPC_AUTH_UNIX,
>> 		.flags		= RPC_CLNT_CREATE_NOPING,
>> 	};
>> +	static DEFINE_SPINLOCK(rpcb_create_local_lock);
>> +	struct rpc_clnt *clnt, *clnt4;
>> +	int result = 0;
>> +
>> +	spin_lock(&rpcb_create_local_lock);
>> +	if (rpcb_local_clnt)
>> +		goto out;
>> +
>> +	clnt = rpc_create(&args);
>> +	if (IS_ERR(clnt)) {
>> +		result = -PTR_ERR(clnt);
>> +		goto out;
>> +	}
>>
>> -	return rpc_create(&args);
>> +	clnt4 = rpc_bind_new_program(clnt, &rpcb_program, RPCBVERS_4);
>> +	if (IS_ERR(clnt4)) {
>> +		result = -PTR_ERR(clnt4);
>> +		rpc_shutdown_client(clnt);
>> +		goto out;
>> +	}
>> +
>> +	rpcb_local_clnt = clnt;
>> +	rpcb_local_clnt4 = clnt4;
>> +
>> +out:
>> +	spin_unlock(&rpcb_create_local_lock);
>> +	return result;
>> }
>
> You can't have tested this. At the very least you cannot have done so
> with spinlock debugging enabled...

I moved the rpcb_create_local_lock spinlock out of the function,  
enabled every spinlock checkbox I could under kernel hacking, and gave  
the guest 2 CPUs.  The spinlock checker reported a problem almost  
immediately with XFS (even with just one virtual CPU), so I know it's  
enabled and working.

I can't reproduce any problems with the rpcbind upcall here.  Do you  
have anything more specific?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





* Re: [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
  2009-11-20 21:50         ` Chuck Lever
@ 2009-11-20 22:05           ` J. Bruce Fields
  2009-11-20 22:24             ` Chuck Lever
  0 siblings, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-11-20 22:05 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Trond Myklebust, linux-nfs

On Fri, Nov 20, 2009 at 04:50:34PM -0500, Chuck Lever wrote:
> On Nov 20, 2009, at 3:18 PM, Trond Myklebust wrote:
>> On Thu, 2009-11-05 at 13:23 -0500, Chuck Lever wrote:
>>> +	static DEFINE_SPINLOCK(rpcb_create_local_lock);
>>> +	struct rpc_clnt *clnt, *clnt4;
>>> +	int result = 0;
>>> +
>>> +	spin_lock(&rpcb_create_local_lock);
>>> +	if (rpcb_local_clnt)
>>> +		goto out;
>>> +
>>> +	clnt = rpc_create(&args);
>>> +	if (IS_ERR(clnt)) {
>>> +		result = -PTR_ERR(clnt);
>>> +		goto out;
>>> +	}
>>>
>>> -	return rpc_create(&args);
>>> +	clnt4 = rpc_bind_new_program(clnt, &rpcb_program, RPCBVERS_4);
>>> +	if (IS_ERR(clnt4)) {
>>> +		result = -PTR_ERR(clnt4);
>>> +		rpc_shutdown_client(clnt);
>>> +		goto out;
>>> +	}
>>> +
>>> +	rpcb_local_clnt = clnt;
>>> +	rpcb_local_clnt4 = clnt4;
>>> +
>>> +out:
>>> +	spin_unlock(&rpcb_create_local_lock);
>>> +	return result;
>>> }
>>
>> You can't have tested this. At the very least you cannot have done so
>> with spinlock debugging enabled...
>
> I moved the rpcb_create_local_lock spinlock out of the function, enabled 
> every spinlock checkbox I could under kernel hacking,

Including CONFIG_DEBUG_SPINLOCK_SLEEP?

> and gave the guest 
> 2 CPUs.  The spinlock checker reported a problem almost immediately with 
> XFS (even with just one virtual CPU), so I know it's enabled and working.
>
> I can't reproduce any problems with the rpcbind upcall here.  Do you  
> have anything more specific?

Isn't there an rpc ping in rpc_bind_new_program?

--b.


* Re: [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
  2009-11-20 22:05           ` J. Bruce Fields
@ 2009-11-20 22:24             ` Chuck Lever
  2009-11-20 22:36               ` J. Bruce Fields
  0 siblings, 1 reply; 19+ messages in thread
From: Chuck Lever @ 2009-11-20 22:24 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: NFSv3 list, Trond Myklebust

On Nov 20, 2009, at 5:05 PM, J. Bruce Fields wrote:
> On Fri, Nov 20, 2009 at 04:50:34PM -0500, Chuck Lever wrote:
>> On Nov 20, 2009, at 3:18 PM, Trond Myklebust wrote:
>>> On Thu, 2009-11-05 at 13:23 -0500, Chuck Lever wrote:
>>>> +	static DEFINE_SPINLOCK(rpcb_create_local_lock);
>>>> +	struct rpc_clnt *clnt, *clnt4;
>>>> +	int result = 0;
>>>> +
>>>> +	spin_lock(&rpcb_create_local_lock);
>>>> +	if (rpcb_local_clnt)
>>>> +		goto out;
>>>> +
>>>> +	clnt = rpc_create(&args);
>>>> +	if (IS_ERR(clnt)) {
>>>> +		result = -PTR_ERR(clnt);
>>>> +		goto out;
>>>> +	}
>>>>
>>>> -	return rpc_create(&args);
>>>> +	clnt4 = rpc_bind_new_program(clnt, &rpcb_program, RPCBVERS_4);
>>>> +	if (IS_ERR(clnt4)) {
>>>> +		result = -PTR_ERR(clnt4);
>>>> +		rpc_shutdown_client(clnt);
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	rpcb_local_clnt = clnt;
>>>> +	rpcb_local_clnt4 = clnt4;
>>>> +
>>>> +out:
>>>> +	spin_unlock(&rpcb_create_local_lock);
>>>> +	return result;
>>>> }
>>>
>>> You can't have tested this. At the very least you cannot have done so
>>> with spinlock debugging enabled...
>>
>> I moved the rpcb_create_local_lock spinlock out of the function, enabled
>> every spinlock checkbox I could under kernel hacking,
>
> Including CONFIG_DEBUG_SPINLOCK_SLEEP?

Yes.  I even rebuilt the kernel under test from scratch.

>> and gave the guest 2 CPUs.  The spinlock checker reported a problem
>> almost immediately with XFS (even with just one virtual CPU), so I know
>> it's enabled and working.
>>
>> I can't reproduce any problems with the rpcbind upcall here.  Do you
>> have anything more specific?
>
> Isn't there an rpc ping in rpc_bind_new_program?

Hrm, I suppose there is.  That's weird, clearly I didn't see the  
rpc_ping() call, even though I was looking for it when I wrote this.   
A GFP_KERNEL memory allocation can sleep too, can't it?

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





* Re: [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
  2009-11-20 22:24             ` Chuck Lever
@ 2009-11-20 22:36               ` J. Bruce Fields
  2009-11-20 23:47                 ` Trond Myklebust
  0 siblings, 1 reply; 19+ messages in thread
From: J. Bruce Fields @ 2009-11-20 22:36 UTC (permalink / raw)
  To: Chuck Lever; +Cc: NFSv3 list, Trond Myklebust

On Fri, Nov 20, 2009 at 05:24:43PM -0500, Chuck Lever wrote:
> On Nov 20, 2009, at 5:05 PM, J. Bruce Fields wrote:
>> On Fri, Nov 20, 2009 at 04:50:34PM -0500, Chuck Lever wrote:
>>> On Nov 20, 2009, at 3:18 PM, Trond Myklebust wrote:
>>>> On Thu, 2009-11-05 at 13:23 -0500, Chuck Lever wrote:
>>>>> +	static DEFINE_SPINLOCK(rpcb_create_local_lock);
>>>>> +	struct rpc_clnt *clnt, *clnt4;
>>>>> +	int result = 0;
>>>>> +
>>>>> +	spin_lock(&rpcb_create_local_lock);
>>>>> +	if (rpcb_local_clnt)
>>>>> +		goto out;
>>>>> +
>>>>> +	clnt = rpc_create(&args);
>>>>> +	if (IS_ERR(clnt)) {
>>>>> +		result = -PTR_ERR(clnt);
>>>>> +		goto out;
>>>>> +	}
>>>>>
>>>>> -	return rpc_create(&args);
>>>>> +	clnt4 = rpc_bind_new_program(clnt, &rpcb_program, RPCBVERS_4);
>>>>> +	if (IS_ERR(clnt4)) {
>>>>> +		result = -PTR_ERR(clnt4);
>>>>> +		rpc_shutdown_client(clnt);
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	rpcb_local_clnt = clnt;
>>>>> +	rpcb_local_clnt4 = clnt4;
>>>>> +
>>>>> +out:
>>>>> +	spin_unlock(&rpcb_create_local_lock);
>>>>> +	return result;
>>>>> }
>>>>
>>>> You can't have tested this. At the very least you cannot have done so
>>>> with spinlock debugging enabled...
>>>
>>> I moved the rpcb_create_local_lock spinlock out of the function, enabled
>>> every spinlock checkbox I could under kernel hacking,
>>
>> Including CONFIG_DEBUG_SPINLOCK_SLEEP?
>
> Yes.  I even rebuilt the kernel under test from scratch.
>
>>> and gave the guest 2 CPUs.  The spinlock checker reported a problem
>>> almost immediately with XFS (even with just one virtual CPU), so I know
>>> it's enabled and working.
>>>
>>> I can't reproduce any problems with the rpcbind upcall here.  Do you
>>> have anything more specific?
>>
>> Isn't there an rpc ping in rpc_bind_new_program?
>
> Hrm, I suppose there is.  That's weird, clearly I didn't see the  
> rpc_ping() call, even though I was looking for it when I wrote this.  A 
> GFP_KERNEL memory allocation can sleep too, can't it?

Yes.  I'd be really curious to know how that got through--if
CONFIG_DEBUG_SPINLOCK_SLEEP can't catch a case that cut-and-dried, then
it's totally broken....

--b.


* Re: [PATCH 07/11] SUNRPC: Use a cached RPC client and transport for rpcbind upcalls
  2009-11-20 22:36               ` J. Bruce Fields
@ 2009-11-20 23:47                 ` Trond Myklebust
  0 siblings, 0 replies; 19+ messages in thread
From: Trond Myklebust @ 2009-11-20 23:47 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Chuck Lever, NFSv3 list

On Fri, 2009-11-20 at 17:36 -0500, J. Bruce Fields wrote: 
> On Fri, Nov 20, 2009 at 05:24:43PM -0500, Chuck Lever wrote:
> > On Nov 20, 2009, at 5:05 PM, J. Bruce Fields wrote:
> >> On Fri, Nov 20, 2009 at 04:50:34PM -0500, Chuck Lever wrote:
> >>> I can't reproduce any problems with the rpcbind upcall here.  Do you
> >>> have anything more specific?
> >>
> >> Isn't there an rpc ping in rpc_bind_new_program?
> >
> > Hrm, I suppose there is.  That's weird, clearly I didn't see the  
> > rpc_ping() call, even though I was looking for it when I wrote this.  A 
> > GFP_KERNEL memory allocation can sleep too, can't it?
> 
> Yes.  I'd be really curious to know how that got through--if
> CONFIG_DEBUG_SPINLOCK_SLEEP can't catch a case that cut-and-dried, then
> it's totally broken....
> 
> --b.

Furthermore, there are memory allocations galore in the call to
rpc_create(). Any attempt to run that while holding a spinlock should
cause CONFIG_DEBUG_SPINLOCK_SLEEP to throw a series of fits...

Trond

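The objection raised in this thread is that rpc_create() and
rpc_bind_new_program() can sleep (memory allocation, the RPC ping) while
rpcb_create_local_lock is held.  One common way to avoid that, sketched here
in userspace C and not taken from any patch posted in this thread, is to do
the blocking construction with no lock held and take the lock only to publish
the result.  All names below (struct client, client_create(), cache_client())
are hypothetical, and a pthread mutex stands in for the kernel spinlock; the
point is only that nothing blocking runs inside the critical section.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct client {
	int version;
};

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static struct client *cached_clnt;

/* Stand-in for a constructor that may block (allocation, network I/O). */
static struct client *client_create(int version)
{
	struct client *c = malloc(sizeof(*c));

	if (c)
		c->version = version;
	return c;
}

/* Returns 0 if a cached client exists on return, -ENOMEM otherwise. */
static int cache_client(void)
{
	struct client *c;

	/* Fast path: only peek at the cached pointer under the lock. */
	pthread_mutex_lock(&cache_lock);
	c = cached_clnt;
	pthread_mutex_unlock(&cache_lock);
	if (c)
		return 0;

	/* Potentially blocking work happens with no lock held. */
	c = client_create(2);
	if (!c)
		return -ENOMEM;

	/* Publish under the lock; the critical section never blocks. */
	pthread_mutex_lock(&cache_lock);
	if (!cached_clnt) {
		cached_clnt = c;
		c = NULL;
	}
	pthread_mutex_unlock(&cache_lock);

	/* If another caller won the race, discard the duplicate. */
	free(c);
	return 0;
}
```

The cost of this arrangement is that two racing callers may each build a
client and one of them throws its copy away.  Serializing the whole function
with a sleeping lock would avoid the wasted work, at the price of making
every caller wait for the first one to finish.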

