All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] rpc: xs_bind - do not bind when requesting a random ephemeral port
@ 2014-09-05 19:40 Chris Perl
  2014-09-09  0:08 ` Trond Myklebust
  0 siblings, 1 reply; 4+ messages in thread
From: Chris Perl @ 2014-09-05 19:40 UTC (permalink / raw)
  To: linux-nfs; +Cc: Chris Perl

When attempting to establish a local ephemeral endpoint for a TCP or UDP
socket, do not explicitly call bind, instead let it happen implicilty when the
socket is first used.

The main motivating factor for this change is when TCP runs out of unique
ephemeral ports (i.e.  cannot find any ephemeral ports which are not a part of
*any* TCP connection).  In this situation if you explicitly call bind, then the
call will fail with EADDRINUSE.  However, if you allow the allocation of an
ephemeral port to happen implicitly as part of connect (or other functions),
then ephemeral ports can be reused, so long as the combination of (local_ip,
local_port, remote_ip, remote_port) is unique for TCP sockets on the system.

This doesn't matter for UDP sockets, but it seemed easiest to treat TCP and UDP
sockets the same.

This can allow mount.nfs(8) to continue to function successfully, even in the
face of misbehaving applications which are creating a large number of TCP
connections.

Signed-off-by: Chris Perl <chris.perl@gmail.com>
---
 net/sunrpc/xprtsock.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 43cd89e..7ed47b4 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1746,13 +1746,29 @@ static int xs_bind(struct sock_xprt *transport, struct socket *sock)
 	unsigned short port = xs_get_srcport(transport);
 	unsigned short last;
 
+	/*
+	 * If we are asking for any ephemeral port (i.e. port == 0 &&
+	 * transport->xprt.resvport == 0), don't bind.  Let the local
+	 * port selection happen implicitly when the socket is used
+	 * (for example at connect time).
+	 *
+	 * This ensures that we can continue to establish TCP
+	 * connections even when all local ephemeral ports are already
+	 * a part of some TCP connection.  This makes no difference
+	 * for UDP sockets, but also doens't harm them.
+	 *
+	 * If we're asking for any reserved port (i.e. port == 0 &&
+	 * transport->xprt.resvport == 1) xs_get_srcport above will
+	 * ensure that port is non-zero and we will bind as needed.
+	 */
+	if (port == 0)
+		return 0;
+
 	memcpy(&myaddr, &transport->srcaddr, transport->xprt.addrlen);
 	do {
 		rpc_set_port((struct sockaddr *)&myaddr, port);
 		err = kernel_bind(sock, (struct sockaddr *)&myaddr,
 				transport->xprt.addrlen);
-		if (port == 0)
-			break;
 		if (err == 0) {
 			transport->srcport = port;
 			break;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] rpc: xs_bind - do not bind when requesting a random ephemeral port
  2014-09-05 19:40 [PATCH] rpc: xs_bind - do not bind when requesting a random ephemeral port Chris Perl
@ 2014-09-09  0:08 ` Trond Myklebust
  2014-09-09 12:12   ` Chris Perl
  0 siblings, 1 reply; 4+ messages in thread
From: Trond Myklebust @ 2014-09-09  0:08 UTC (permalink / raw)
  To: Chris Perl; +Cc: Linux NFS Mailing List

On Fri, Sep 5, 2014 at 12:40 PM, Chris Perl <chris.perl@gmail.com> wrote:
> When attempting to establish a local ephemeral endpoint for a TCP or UDP
> socket, do not explicitly call bind, instead let it happen implicilty when the
> socket is first used.
>
> The main motivating factor for this change is when TCP runs out of unique
> ephemeral ports (i.e.  cannot find any ephemeral ports which are not a part of
> *any* TCP connection).  In this situation if you explicitly call bind, then the
> call will fail with EADDRINUSE.  However, if you allow the allocation of an
> ephemeral port to happen implicitly as part of connect (or other functions),
> then ephemeral ports can be reused, so long as the combination of (local_ip,
> local_port, remote_ip, remote_port) is unique for TCP sockets on the system.
>
> This doesn't matter for UDP sockets, but it seemed easiest to treat TCP and UDP
> sockets the same.
>
> This can allow mount.nfs(8) to continue to function successfully, even in the
> face of misbehaving applications which are creating a large number of TCP
> connections.
>
> Signed-off-by: Chris Perl <chris.perl@gmail.com>
> ---
>  net/sunrpc/xprtsock.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 43cd89e..7ed47b4 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -1746,13 +1746,29 @@ static int xs_bind(struct sock_xprt *transport, struct socket *sock)
>         unsigned short port = xs_get_srcport(transport);
>         unsigned short last;
>
> +       /*
> +        * If we are asking for any ephemeral port (i.e. port == 0 &&
> +        * transport->xprt.resvport == 0), don't bind.  Let the local
> +        * port selection happen implicitly when the socket is used
> +        * (for example at connect time).
> +        *
> +        * This ensures that we can continue to establish TCP
> +        * connections even when all local ephemeral ports are already
> +        * a part of some TCP connection.  This makes no difference
> +        * for UDP sockets, but also doens't harm them.
> +        *
> +        * If we're asking for any reserved port (i.e. port == 0 &&
> +        * transport->xprt.resvport == 1) xs_get_srcport above will
> +        * ensure that port is non-zero and we will bind as needed.
> +        */
> +       if (port == 0)
> +               return 0;
> +
>         memcpy(&myaddr, &transport->srcaddr, transport->xprt.addrlen);
>         do {
>                 rpc_set_port((struct sockaddr *)&myaddr, port);
>                 err = kernel_bind(sock, (struct sockaddr *)&myaddr,
>                                 transport->xprt.addrlen);
> -               if (port == 0)
> -                       break;
>                 if (err == 0) {
>                         transport->srcport = port;
>                         break;
> --
> 1.8.3.1
>

This change looks harmless to me, but since we've lived with the
current situation for several years, I'm not prepared to consider it a
must-fix for 3.17 quite yet. If it's OK with you, I'll therefore queue
it for 3.18.

-- 
Trond Myklebust

Linux NFS client maintainer, PrimaryData

trond.myklebust@primarydata.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] rpc: xs_bind - do not bind when requesting a random ephemeral port
  2014-09-09  0:08 ` Trond Myklebust
@ 2014-09-09 12:12   ` Chris Perl
  0 siblings, 0 replies; 4+ messages in thread
From: Chris Perl @ 2014-09-09 12:12 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: Linux NFS Mailing List

Sounds reasonable to me.

On Mon, Sep 8, 2014 at 8:08 PM, Trond Myklebust
<trond.myklebust@primarydata.com> wrote:
> On Fri, Sep 5, 2014 at 12:40 PM, Chris Perl <chris.perl@gmail.com> wrote:
>> When attempting to establish a local ephemeral endpoint for a TCP or UDP
>> socket, do not explicitly call bind, instead let it happen implicilty when the
>> socket is first used.
>>
>> The main motivating factor for this change is when TCP runs out of unique
>> ephemeral ports (i.e.  cannot find any ephemeral ports which are not a part of
>> *any* TCP connection).  In this situation if you explicitly call bind, then the
>> call will fail with EADDRINUSE.  However, if you allow the allocation of an
>> ephemeral port to happen implicitly as part of connect (or other functions),
>> then ephemeral ports can be reused, so long as the combination of (local_ip,
>> local_port, remote_ip, remote_port) is unique for TCP sockets on the system.
>>
>> This doesn't matter for UDP sockets, but it seemed easiest to treat TCP and UDP
>> sockets the same.
>>
>> This can allow mount.nfs(8) to continue to function successfully, even in the
>> face of misbehaving applications which are creating a large number of TCP
>> connections.
>>
>> Signed-off-by: Chris Perl <chris.perl@gmail.com>
>> ---
>>  net/sunrpc/xprtsock.c | 20 ++++++++++++++++++--
>>  1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>> index 43cd89e..7ed47b4 100644
>> --- a/net/sunrpc/xprtsock.c
>> +++ b/net/sunrpc/xprtsock.c
>> @@ -1746,13 +1746,29 @@ static int xs_bind(struct sock_xprt *transport, struct socket *sock)
>>         unsigned short port = xs_get_srcport(transport);
>>         unsigned short last;
>>
>> +       /*
>> +        * If we are asking for any ephemeral port (i.e. port == 0 &&
>> +        * transport->xprt.resvport == 0), don't bind.  Let the local
>> +        * port selection happen implicitly when the socket is used
>> +        * (for example at connect time).
>> +        *
>> +        * This ensures that we can continue to establish TCP
>> +        * connections even when all local ephemeral ports are already
>> +        * a part of some TCP connection.  This makes no difference
>> +        * for UDP sockets, but also doens't harm them.
>> +        *
>> +        * If we're asking for any reserved port (i.e. port == 0 &&
>> +        * transport->xprt.resvport == 1) xs_get_srcport above will
>> +        * ensure that port is non-zero and we will bind as needed.
>> +        */
>> +       if (port == 0)
>> +               return 0;
>> +
>>         memcpy(&myaddr, &transport->srcaddr, transport->xprt.addrlen);
>>         do {
>>                 rpc_set_port((struct sockaddr *)&myaddr, port);
>>                 err = kernel_bind(sock, (struct sockaddr *)&myaddr,
>>                                 transport->xprt.addrlen);
>> -               if (port == 0)
>> -                       break;
>>                 if (err == 0) {
>>                         transport->srcport = port;
>>                         break;
>> --
>> 1.8.3.1
>>
>
> This change looks harmless to me, but since we've lived with the
> current situation for several years, I'm not prepared to consider it a
> must-fix for 3.17 quite yet. If it's OK with you, I'll therefore queue
> it for 3.18.
>
> --
> Trond Myklebust
>
> Linux NFS client maintainer, PrimaryData
>
> trond.myklebust@primarydata.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH] rpc: xs_bind - do not bind when requesting a random ephemeral port
@ 2014-09-05 20:17 Chris Perl
  0 siblings, 0 replies; 4+ messages in thread
From: Chris Perl @ 2014-09-05 20:17 UTC (permalink / raw)
  To: linux-nfs; +Cc: Chris Perl

When attempting to establish a local ephemeral endpoint for a TCP or UDP
socket, do not explicitly call bind, instead let it happen implicilty when the
socket is first used.

The main motivating factor for this change is when TCP runs out of unique
ephemeral ports (i.e.  cannot find any ephemeral ports which are not a part of
*any* TCP connection).  In this situation if you explicitly call bind, then the
call will fail with EADDRINUSE.  However, if you allow the allocation of an
ephemeral port to happen implicitly as part of connect (or other functions),
then ephemeral ports can be reused, so long as the combination of (local_ip,
local_port, remote_ip, remote_port) is unique for TCP sockets on the system.

This doesn't matter for UDP sockets, but it seemed easiest to treat TCP and UDP
sockets the same.

This can allow mount.nfs(8) to continue to function successfully, even in the
face of misbehaving applications which are creating a large number of TCP
connections.

Signed-off-by: Chris Perl <chris.perl@gmail.com>
---
 net/sunrpc/xprtsock.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 43cd89e..7ed47b4 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1746,13 +1746,29 @@ static int xs_bind(struct sock_xprt *transport, struct socket *sock)
 	unsigned short port = xs_get_srcport(transport);
 	unsigned short last;
 
+	/*
+	 * If we are asking for any ephemeral port (i.e. port == 0 &&
+	 * transport->xprt.resvport == 0), don't bind.  Let the local
+	 * port selection happen implicitly when the socket is used
+	 * (for example at connect time).
+	 *
+	 * This ensures that we can continue to establish TCP
+	 * connections even when all local ephemeral ports are already
+	 * a part of some TCP connection.  This makes no difference
+	 * for UDP sockets, but also doens't harm them.
+	 *
+	 * If we're asking for any reserved port (i.e. port == 0 &&
+	 * transport->xprt.resvport == 1) xs_get_srcport above will
+	 * ensure that port is non-zero and we will bind as needed.
+	 */
+	if (port == 0)
+		return 0;
+
 	memcpy(&myaddr, &transport->srcaddr, transport->xprt.addrlen);
 	do {
 		rpc_set_port((struct sockaddr *)&myaddr, port);
 		err = kernel_bind(sock, (struct sockaddr *)&myaddr,
 				transport->xprt.addrlen);
-		if (port == 0)
-			break;
 		if (err == 0) {
 			transport->srcport = port;
 			break;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-09-09 12:12 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-05 19:40 [PATCH] rpc: xs_bind - do not bind when requesting a random ephemeral port Chris Perl
2014-09-09  0:08 ` Trond Myklebust
2014-09-09 12:12   ` Chris Perl
2014-09-05 20:17 Chris Perl

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.