All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport
@ 2016-12-15  0:54 Tom Herbert
  2016-12-15  0:54 ` [PATCH net-next 1/2] inet: Don't go into port scan when looking for specific bind port Tom Herbert
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Tom Herbert @ 2016-12-15  0:54 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team, jbacik, eric.dumazet, raigatgoog

This patch set fixes a couple of issues I noticed while debugging our
softlockup issue in inet_csk_get_port.

- Don't allow jump into port scan in inet_csk_get_port if function
  was called with non-zero port number (looking up explicit port
  number).
- When inet_csk_get_port is called with zero port number (ie. perform
  scan) an reuseport is set on the socket, don't match sockets that
  also have reuseport set. The intent from the user should be
  to get a new port number and then explictly bind other
  sockets to that number using soreuseport.

Tested:

Ran first patch on production workload with no ill effect.

For second patch, ran a little listener application and first
demonstrated that unbound sockets with soreuseport can indeed
be bound to unrelated soreuseport sockets.


Tom Herbert (2):
  inet: Don't go into port scan when looking for specific bind port
  inet: Fix get port to handle zero port number with soreuseport set

 include/net/inet6_connection_sock.h |  3 ++-
 include/net/inet_connection_sock.h  |  6 ++++--
 net/ipv4/inet_connection_sock.c     | 16 ++++++++++------
 net/ipv6/inet6_connection_sock.c    |  7 ++++---
 4 files changed, 20 insertions(+), 12 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH net-next 1/2] inet: Don't go into port scan when looking for specific bind port
  2016-12-15  0:54 [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport Tom Herbert
@ 2016-12-15  0:54 ` Tom Herbert
  2016-12-15  0:54 ` [PATCH net-next 2/2] inet: Fix get port to handle zero port number with soreuseport set Tom Herbert
  2016-12-17 16:13 ` [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Tom Herbert @ 2016-12-15  0:54 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team, jbacik, eric.dumazet, raigatgoog

inet_csk_get_port is called with port number (snum argument) that may be
zero or nonzero. If it is zero, then the intent is to find an available
ephemeral port number to bind to. If snum is non-zero then the caller
is asking to allocate a specific port number. In the latter case we
never want to perform the scan in ephemeral port range. It is
conceivable that this can happen if the "goto again" in "tb_found:"
is done. This patch adds a check that snum is zero before doing
the "goto again".

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 net/ipv4/inet_connection_sock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index d5d3ead..f59838a6 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -212,7 +212,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 			      sk->sk_reuseport &&
 			      !rcu_access_pointer(sk->sk_reuseport_cb) &&
 			      uid_eq(tb->fastuid, uid))) &&
-			    smallest_size != -1 && --attempts >= 0) {
+			    !snum && smallest_size != -1 && --attempts >= 0) {
 				spin_unlock_bh(&head->lock);
 				goto again;
 			}
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH net-next 2/2] inet: Fix get port to handle zero port number with soreuseport set
  2016-12-15  0:54 [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport Tom Herbert
  2016-12-15  0:54 ` [PATCH net-next 1/2] inet: Don't go into port scan when looking for specific bind port Tom Herbert
@ 2016-12-15  0:54 ` Tom Herbert
  2016-12-17 16:13 ` [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Tom Herbert @ 2016-12-15  0:54 UTC (permalink / raw)
  To: davem, netdev; +Cc: kernel-team, jbacik, eric.dumazet, raigatgoog

A user may call listen with binding an explicit port with the intent
that the kernel will assign an available port to the socket. In this
case inet_csk_get_port does a port scan. For such sockets, the user may
also set soreuseport with the intent a creating more sockets for the
port that is selected. The problem is that the initial socket being
opened could inadvertently choose an existing and unreleated port
number that was already created with soreuseport.

This patch adds a boolean parameter to inet_bind_conflict that indicates
rather soreuseport is allowed for the check (in addition to
sk->sk_reuseport). In calls to inet_bind_conflict from inet_csk_get_port
the argument is set to true if an explicit port is being looked up (snum
argument is nonzero), and is false if port scan is done.

Signed-off-by: Tom Herbert <tom@herbertland.com>
---
 include/net/inet6_connection_sock.h |  3 ++-
 include/net/inet_connection_sock.h  |  6 ++++--
 net/ipv4/inet_connection_sock.c     | 14 +++++++++-----
 net/ipv6/inet6_connection_sock.c    |  7 ++++---
 4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/include/net/inet6_connection_sock.h b/include/net/inet6_connection_sock.h
index 954ad6b..3212b39 100644
--- a/include/net/inet6_connection_sock.h
+++ b/include/net/inet6_connection_sock.h
@@ -22,7 +22,8 @@ struct sock;
 struct sockaddr;
 
 int inet6_csk_bind_conflict(const struct sock *sk,
-			    const struct inet_bind_bucket *tb, bool relax);
+			    const struct inet_bind_bucket *tb, bool relax,
+			    bool soreuseport_ok);
 
 struct dst_entry *inet6_csk_route_req(const struct sock *sk, struct flowi6 *fl6,
 				      const struct request_sock *req, u8 proto);
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 146054c..85ee387 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -63,7 +63,8 @@ struct inet_connection_sock_af_ops {
 #endif
 	void	    (*addr2sockaddr)(struct sock *sk, struct sockaddr *);
 	int	    (*bind_conflict)(const struct sock *sk,
-				     const struct inet_bind_bucket *tb, bool relax);
+				     const struct inet_bind_bucket *tb,
+				     bool relax, bool soreuseport_ok);
 	void	    (*mtu_reduced)(struct sock *sk);
 };
 
@@ -261,7 +262,8 @@ inet_csk_rto_backoff(const struct inet_connection_sock *icsk,
 struct sock *inet_csk_accept(struct sock *sk, int flags, int *err);
 
 int inet_csk_bind_conflict(const struct sock *sk,
-			   const struct inet_bind_bucket *tb, bool relax);
+			   const struct inet_bind_bucket *tb, bool relax,
+			   bool soreuseport_ok);
 int inet_csk_get_port(struct sock *sk, unsigned short snum);
 
 struct dst_entry *inet_csk_route_req(const struct sock *sk, struct flowi4 *fl4,
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index f59838a6..19ea045 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -45,11 +45,12 @@ void inet_get_local_port_range(struct net *net, int *low, int *high)
 EXPORT_SYMBOL(inet_get_local_port_range);
 
 int inet_csk_bind_conflict(const struct sock *sk,
-			   const struct inet_bind_bucket *tb, bool relax)
+			   const struct inet_bind_bucket *tb, bool relax,
+			   bool reuseport_ok)
 {
 	struct sock *sk2;
-	int reuse = sk->sk_reuse;
-	int reuseport = sk->sk_reuseport;
+	bool reuse = sk->sk_reuse;
+	bool reuseport = !!sk->sk_reuseport && reuseport_ok;
 	kuid_t uid = sock_i_uid((struct sock *)sk);
 
 	/*
@@ -105,6 +106,7 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 	struct inet_bind_bucket *tb;
 	kuid_t uid = sock_i_uid(sk);
 	u32 remaining, offset;
+	bool reuseport_ok = !!snum;
 
 	if (port) {
 have_port:
@@ -165,7 +167,8 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 					smallest_size = tb->num_owners;
 					smallest_port = port;
 				}
-				if (!inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, false))
+				if (!inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, false,
+									      reuseport_ok))
 					goto tb_found;
 				goto next_port;
 			}
@@ -206,7 +209,8 @@ int inet_csk_get_port(struct sock *sk, unsigned short snum)
 		      sk->sk_reuseport && uid_eq(tb->fastuid, uid))) &&
 		    smallest_size == -1)
 			goto success;
-		if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, true)) {
+		if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, true,
+							     reuseport_ok)) {
 			if ((reuse ||
 			     (tb->fastreuseport > 0 &&
 			      sk->sk_reuseport &&
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index 1c86c47..7396e75 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -29,11 +29,12 @@
 #include <net/sock_reuseport.h>
 
 int inet6_csk_bind_conflict(const struct sock *sk,
-			    const struct inet_bind_bucket *tb, bool relax)
+			    const struct inet_bind_bucket *tb, bool relax,
+			    bool reuseport_ok)
 {
 	const struct sock *sk2;
-	int reuse = sk->sk_reuse;
-	int reuseport = sk->sk_reuseport;
+	bool reuse = !!sk->sk_reuse;
+	bool reuseport = !!sk->sk_reuseport && reuseport_ok;
 	kuid_t uid = sock_i_uid((struct sock *)sk);
 
 	/* We must walk the whole port owner list in this case. -DaveM */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport
  2016-12-15  0:54 [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport Tom Herbert
  2016-12-15  0:54 ` [PATCH net-next 1/2] inet: Don't go into port scan when looking for specific bind port Tom Herbert
  2016-12-15  0:54 ` [PATCH net-next 2/2] inet: Fix get port to handle zero port number with soreuseport set Tom Herbert
@ 2016-12-17 16:13 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2016-12-17 16:13 UTC (permalink / raw)
  To: tom; +Cc: netdev, kernel-team, jbacik, eric.dumazet, raigatgoog

From: Tom Herbert <tom@herbertland.com>
Date: Wed, 14 Dec 2016 16:54:14 -0800

> This patch set fixes a couple of issues I noticed while debugging our
> softlockup issue in inet_csk_get_port.
> 
> - Don't allow jump into port scan in inet_csk_get_port if function
>   was called with non-zero port number (looking up explicit port
>   number).
> - When inet_csk_get_port is called with zero port number (ie. perform
>   scan) an reuseport is set on the socket, don't match sockets that
>   also have reuseport set. The intent from the user should be
>   to get a new port number and then explictly bind other
>   sockets to that number using soreuseport.
> 
> Tested:
> 
> Ran first patch on production workload with no ill effect.
> 
> For second patch, ran a little listener application and first
> demonstrated that unbound sockets with soreuseport can indeed
> be bound to unrelated soreuseport sockets.

Series applied, thanks Tom.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-12-17 16:13 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-15  0:54 [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport Tom Herbert
2016-12-15  0:54 ` [PATCH net-next 1/2] inet: Don't go into port scan when looking for specific bind port Tom Herbert
2016-12-15  0:54 ` [PATCH net-next 2/2] inet: Fix get port to handle zero port number with soreuseport set Tom Herbert
2016-12-17 16:13 ` [PATCH net-next 0/2] inet: Fixes for inet_csk_get_port and soreusport David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.