netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/5] Rework ip_ra_chain protection
@ 2018-03-15 13:19 Kirill Tkhai
  2018-03-15 13:20 ` [PATCH net-next 1/5] net: Revert "ipv4: get rid of ip_ra_lock" Kirill Tkhai
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Kirill Tkhai @ 2018-03-15 13:19 UTC (permalink / raw)
  To: davem, yoshfuji, edumazet, yanhaishuang, nikolay, yotamg, soheil,
	ktkhai, avagin, nicolas.dichtel, ebiederm, fw, roman.kapl,
	netdev, xiyou.wangcong, dvyukov, andreyknvl

Commit 1215e51edad1 "ipv4: fix a deadlock in ip_ra_control"
made rtnl_lock() be used in raw_close(). This function is called
on every RAW socket destruction, so that rtnl_mutex is taken
every time. This scales very sadly. I observe cleanup_net()
spending a lot of time in rtnl_lock() and raw_close() is one
of the biggest rtnl user (since we have percpu net->ipv4.icmp_sk).

Another problem is that commit does not explain actual call path
mrtsock_destruct() takes sk lock and the way to deadlock.
But there is no sk lock taking is visible in mrtsock_destruct().
So, it is a question does we need it at all.

This patchset reworks the locking: reverts the problem commit
and its descendant, and introduces rtnl-independent locking.
This may have a continuation, and someone may work on killing
rtnl_lock() in mrtsock_destruct() in the future.

Thanks,
Kirill
---

Kirill Tkhai (5):
      net: Revert "ipv4: get rid of ip_ra_lock"
      net: Revert "ipv4: fix a deadlock in ip_ra_control"
      net: Move IP_ROUTER_ALERT out of lock_sock(sk)
      net: Make ip_ra_chain per struct net
      net: Replace ip_ra_lock with per-net mutex


 include/net/ip.h         |   13 +++++++++++--
 include/net/netns/ipv4.h |    2 ++
 net/core/net_namespace.c |    1 +
 net/ipv4/ip_input.c      |    5 ++---
 net/ipv4/ip_sockglue.c   |   34 +++++++++++++---------------------
 net/ipv4/ipmr.c          |   11 +++++++++--
 net/ipv4/raw.c           |    2 --
 7 files changed, 38 insertions(+), 30 deletions(-)

--
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH net-next 1/5] net: Revert "ipv4: get rid of ip_ra_lock"
  2018-03-15 13:19 [PATCH net-next 0/5] Rework ip_ra_chain protection Kirill Tkhai
@ 2018-03-15 13:20 ` Kirill Tkhai
  2018-03-15 13:20 ` [PATCH net-next 2/5] net: Revert "ipv4: fix a deadlock in ip_ra_control" Kirill Tkhai
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Kirill Tkhai @ 2018-03-15 13:20 UTC (permalink / raw)
  To: davem, yoshfuji, edumazet, yanhaishuang, nikolay, yotamg, soheil,
	ktkhai, avagin, nicolas.dichtel, ebiederm, fw, roman.kapl,
	netdev, xiyou.wangcong, dvyukov, andreyknvl

This reverts commit ba3f571d5dde. The commit was made
after 1215e51edad1 "ipv4: fix a deadlock in ip_ra_control",
and killed ip_ra_lock, which became useless after rtnl_lock()
made used to destroy every raw ipv4 socket. This scales
very bad, and next patch in series reverts 1215e51edad1.
ip_ra_lock will be used again.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/ipv4/ip_sockglue.c |   12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 74c962b9b09c..be7c3b71914d 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -334,6 +334,7 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
    sent to multicast group to reach destination designated router.
  */
 struct ip_ra_chain __rcu *ip_ra_chain;
+static DEFINE_SPINLOCK(ip_ra_lock);
 
 
 static void ip_ra_destroy_rcu(struct rcu_head *head)
@@ -355,17 +356,21 @@ int ip_ra_control(struct sock *sk, unsigned char on,
 
 	new_ra = on ? kmalloc(sizeof(*new_ra), GFP_KERNEL) : NULL;
 
+	spin_lock_bh(&ip_ra_lock);
 	for (rap = &ip_ra_chain;
-	     (ra = rtnl_dereference(*rap)) != NULL;
+	     (ra = rcu_dereference_protected(*rap,
+			lockdep_is_held(&ip_ra_lock))) != NULL;
 	     rap = &ra->next) {
 		if (ra->sk == sk) {
 			if (on) {
+				spin_unlock_bh(&ip_ra_lock);
 				kfree(new_ra);
 				return -EADDRINUSE;
 			}
 			/* dont let ip_call_ra_chain() use sk again */
 			ra->sk = NULL;
 			RCU_INIT_POINTER(*rap, ra->next);
+			spin_unlock_bh(&ip_ra_lock);
 
 			if (ra->destructor)
 				ra->destructor(sk);
@@ -379,14 +384,17 @@ int ip_ra_control(struct sock *sk, unsigned char on,
 			return 0;
 		}
 	}
-	if (!new_ra)
+	if (!new_ra) {
+		spin_unlock_bh(&ip_ra_lock);
 		return -ENOBUFS;
+	}
 	new_ra->sk = sk;
 	new_ra->destructor = destructor;
 
 	RCU_INIT_POINTER(new_ra->next, ra);
 	rcu_assign_pointer(*rap, new_ra);
 	sock_hold(sk);
+	spin_unlock_bh(&ip_ra_lock);
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next 2/5] net: Revert "ipv4: fix a deadlock in ip_ra_control"
  2018-03-15 13:19 [PATCH net-next 0/5] Rework ip_ra_chain protection Kirill Tkhai
  2018-03-15 13:20 ` [PATCH net-next 1/5] net: Revert "ipv4: get rid of ip_ra_lock" Kirill Tkhai
@ 2018-03-15 13:20 ` Kirill Tkhai
  2018-03-15 13:20 ` [PATCH net-next 3/5] net: Move IP_ROUTER_ALERT out of lock_sock(sk) Kirill Tkhai
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Kirill Tkhai @ 2018-03-15 13:20 UTC (permalink / raw)
  To: davem, yoshfuji, edumazet, yanhaishuang, nikolay, yotamg, soheil,
	ktkhai, avagin, nicolas.dichtel, ebiederm, fw, roman.kapl,
	netdev, xiyou.wangcong, dvyukov, andreyknvl

This reverts commit 1215e51edad1.
Since raw_close() is used on every RAW socket destruction,
the changes made by 1215e51edad1 scale sadly. This clearly
seen on endless unshare(CLONE_NEWNET) test, and cleanup_net()
kwork spends a lot of time waiting for rtnl_lock() introduced
by this commit.

Next patches in series will rework this in another way,
so now we revert 1215e51edad1. Also, it doesn't seen
mrtsock_destruct() takes sk_lock, and the comment to the commit
does not show the actual stack dump. So, there is a question
did we really need in it.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/ipv4/ip_sockglue.c |    1 -
 net/ipv4/ipmr.c        |   11 +++++++++--
 net/ipv4/raw.c         |    2 --
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index be7c3b71914d..b7bac7351409 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -594,7 +594,6 @@ static bool setsockopt_needs_rtnl(int optname)
 	case MCAST_LEAVE_GROUP:
 	case MCAST_LEAVE_SOURCE_GROUP:
 	case MCAST_UNBLOCK_SOURCE:
-	case IP_ROUTER_ALERT:
 		return true;
 	}
 	return false;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index d752a70855d8..f6be5db16da2 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1399,7 +1399,7 @@ static void mrtsock_destruct(struct sock *sk)
 	struct net *net = sock_net(sk);
 	struct mr_table *mrt;
 
-	ASSERT_RTNL();
+	rtnl_lock();
 	ipmr_for_each_table(mrt, net) {
 		if (sk == rtnl_dereference(mrt->mroute_sk)) {
 			IPV4_DEVCONF_ALL(net, MC_FORWARDING)--;
@@ -1411,6 +1411,7 @@ static void mrtsock_destruct(struct sock *sk)
 			mroute_clean_tables(mrt, false);
 		}
 	}
+	rtnl_unlock();
 }
 
 /* Socket options and virtual interface manipulation. The whole
@@ -1475,8 +1476,13 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval,
 		if (sk != rcu_access_pointer(mrt->mroute_sk)) {
 			ret = -EACCES;
 		} else {
+			/* We need to unlock here because mrtsock_destruct takes
+			 * care of rtnl itself and we can't change that due to
+			 * the IP_ROUTER_ALERT setsockopt which runs without it.
+			 */
+			rtnl_unlock();
 			ret = ip_ra_control(sk, 0, NULL);
-			goto out_unlock;
+			goto out;
 		}
 		break;
 	case MRT_ADD_VIF:
@@ -1588,6 +1594,7 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval,
 	}
 out_unlock:
 	rtnl_unlock();
+out:
 	return ret;
 }
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 54648d20bf0f..720bef7da2f6 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -711,9 +711,7 @@ static void raw_close(struct sock *sk, long timeout)
 	/*
 	 * Raw sockets may have direct kernel references. Kill them.
 	 */
-	rtnl_lock();
 	ip_ra_control(sk, 0, NULL);
-	rtnl_unlock();
 
 	sk_common_release(sk);
 }

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next 3/5] net: Move IP_ROUTER_ALERT out of lock_sock(sk)
  2018-03-15 13:19 [PATCH net-next 0/5] Rework ip_ra_chain protection Kirill Tkhai
  2018-03-15 13:20 ` [PATCH net-next 1/5] net: Revert "ipv4: get rid of ip_ra_lock" Kirill Tkhai
  2018-03-15 13:20 ` [PATCH net-next 2/5] net: Revert "ipv4: fix a deadlock in ip_ra_control" Kirill Tkhai
@ 2018-03-15 13:20 ` Kirill Tkhai
  2018-03-15 13:20 ` [PATCH net-next 4/5] net: Make ip_ra_chain per struct net Kirill Tkhai
  2018-03-15 13:20 ` [PATCH net-next 5/5] net: Replace ip_ra_lock with per-net mutex Kirill Tkhai
  4 siblings, 0 replies; 8+ messages in thread
From: Kirill Tkhai @ 2018-03-15 13:20 UTC (permalink / raw)
  To: davem, yoshfuji, edumazet, yanhaishuang, nikolay, yotamg, soheil,
	ktkhai, avagin, nicolas.dichtel, ebiederm, fw, roman.kapl,
	netdev, xiyou.wangcong, dvyukov, andreyknvl

ip_ra_control() does not need sk_lock. Who are the another
users of ip_ra_chain? ip_mroute_setsockopt() doesn't take
sk_lock, while parallel IP_ROUTER_ALERT syscalls are
synchronized by ip_ra_lock. So, we may move this command
out of sk_lock.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/ipv4/ip_sockglue.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index b7bac7351409..bf5f44b27b7e 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -646,6 +646,8 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 
 	/* If optlen==0, it is equivalent to val == 0 */
 
+	if (optname == IP_ROUTER_ALERT)
+		return ip_ra_control(sk, val ? 1 : 0, NULL);
 	if (ip_mroute_opt(optname))
 		return ip_mroute_setsockopt(sk, optname, optval, optlen);
 
@@ -1156,9 +1158,6 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 			goto e_inval;
 		inet->mc_all = val;
 		break;
-	case IP_ROUTER_ALERT:
-		err = ip_ra_control(sk, val ? 1 : 0, NULL);
-		break;
 
 	case IP_FREEBIND:
 		if (optlen < 1)

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next 4/5] net: Make ip_ra_chain per struct net
  2018-03-15 13:19 [PATCH net-next 0/5] Rework ip_ra_chain protection Kirill Tkhai
                   ` (2 preceding siblings ...)
  2018-03-15 13:20 ` [PATCH net-next 3/5] net: Move IP_ROUTER_ALERT out of lock_sock(sk) Kirill Tkhai
@ 2018-03-15 13:20 ` Kirill Tkhai
  2018-03-16 23:22   ` kbuild test robot
  2018-03-15 13:20 ` [PATCH net-next 5/5] net: Replace ip_ra_lock with per-net mutex Kirill Tkhai
  4 siblings, 1 reply; 8+ messages in thread
From: Kirill Tkhai @ 2018-03-15 13:20 UTC (permalink / raw)
  To: davem, yoshfuji, edumazet, yanhaishuang, nikolay, yotamg, soheil,
	ktkhai, avagin, nicolas.dichtel, ebiederm, fw, roman.kapl,
	netdev, xiyou.wangcong, dvyukov, andreyknvl

This is optimization, which makes ip_call_ra_chain()
iterate less sockets to find the sockets it's looking for.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/net/ip.h         |   13 +++++++++++--
 include/net/netns/ipv4.h |    1 +
 net/ipv4/ip_input.c      |    5 ++---
 net/ipv4/ip_sockglue.c   |   15 ++-------------
 4 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index fe63ba95d12b..d53b5a9eae34 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -91,6 +91,17 @@ static inline int inet_sdif(struct sk_buff *skb)
 	return 0;
 }
 
+/* Special input handler for packets caught by router alert option.
+   They are selected only by protocol field, and then processed likely
+   local ones; but only if someone wants them! Otherwise, router
+   not running rsvpd will kill RSVP.
+
+   It is user level problem, what it will make with them.
+   I have no idea, how it will masquearde or NAT them (it is joke, joke :-)),
+   but receiver should be enough clever f.e. to forward mtrace requests,
+   sent to multicast group to reach destination designated router.
+ */
+
 struct ip_ra_chain {
 	struct ip_ra_chain __rcu *next;
 	struct sock		*sk;
@@ -101,8 +112,6 @@ struct ip_ra_chain {
 	struct rcu_head		rcu;
 };
 
-extern struct ip_ra_chain __rcu *ip_ra_chain;
-
 /* IP flags. */
 #define IP_CE		0x8000		/* Flag: "Congestion"		*/
 #define IP_DF		0x4000		/* Flag: "Don't Fragment"	*/
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 3a970e429ab6..23c208fcf1a0 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -49,6 +49,7 @@ struct netns_ipv4 {
 #endif
 	struct ipv4_devconf	*devconf_all;
 	struct ipv4_devconf	*devconf_dflt;
+	struct ip_ra_chain	*ra_chain;
 #ifdef CONFIG_IP_MULTIPLE_TABLES
 	struct fib_rules_ops	*rules_ops;
 	bool			fib_has_custom_rules;
diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 57fc13c6ab2b..7582713dd18f 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -159,7 +159,7 @@ bool ip_call_ra_chain(struct sk_buff *skb)
 	struct net_device *dev = skb->dev;
 	struct net *net = dev_net(dev);
 
-	for (ra = rcu_dereference(ip_ra_chain); ra; ra = rcu_dereference(ra->next)) {
+	for (ra = rcu_dereference(net->ipv4.ra_chain); ra; ra = rcu_dereference(ra->next)) {
 		struct sock *sk = ra->sk;
 
 		/* If socket is bound to an interface, only report
@@ -167,8 +167,7 @@ bool ip_call_ra_chain(struct sk_buff *skb)
 		 */
 		if (sk && inet_sk(sk)->inet_num == protocol &&
 		    (!sk->sk_bound_dev_if ||
-		     sk->sk_bound_dev_if == dev->ifindex) &&
-		    net_eq(sock_net(sk), net)) {
+		     sk->sk_bound_dev_if == dev->ifindex)) {
 			if (ip_is_fragment(ip_hdr(skb))) {
 				if (ip_defrag(net, skb, IP_DEFRAG_CALL_RA_CHAIN))
 					return true;
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index bf5f44b27b7e..f36d35fe924b 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -322,18 +322,6 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
 	return 0;
 }
 
-
-/* Special input handler for packets caught by router alert option.
-   They are selected only by protocol field, and then processed likely
-   local ones; but only if someone wants them! Otherwise, router
-   not running rsvpd will kill RSVP.
-
-   It is user level problem, what it will make with them.
-   I have no idea, how it will masquearde or NAT them (it is joke, joke :-)),
-   but receiver should be enough clever f.e. to forward mtrace requests,
-   sent to multicast group to reach destination designated router.
- */
-struct ip_ra_chain __rcu *ip_ra_chain;
 static DEFINE_SPINLOCK(ip_ra_lock);
 
 
@@ -350,6 +338,7 @@ int ip_ra_control(struct sock *sk, unsigned char on,
 {
 	struct ip_ra_chain *ra, *new_ra;
 	struct ip_ra_chain __rcu **rap;
+	struct net *net = sock_net(sk);
 
 	if (sk->sk_type != SOCK_RAW || inet_sk(sk)->inet_num == IPPROTO_RAW)
 		return -EINVAL;
@@ -357,7 +346,7 @@ int ip_ra_control(struct sock *sk, unsigned char on,
 	new_ra = on ? kmalloc(sizeof(*new_ra), GFP_KERNEL) : NULL;
 
 	spin_lock_bh(&ip_ra_lock);
-	for (rap = &ip_ra_chain;
+	for (rap = &net->ipv4.ra_chain;
 	     (ra = rcu_dereference_protected(*rap,
 			lockdep_is_held(&ip_ra_lock))) != NULL;
 	     rap = &ra->next) {

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH net-next 5/5] net: Replace ip_ra_lock with per-net mutex
  2018-03-15 13:19 [PATCH net-next 0/5] Rework ip_ra_chain protection Kirill Tkhai
                   ` (3 preceding siblings ...)
  2018-03-15 13:20 ` [PATCH net-next 4/5] net: Make ip_ra_chain per struct net Kirill Tkhai
@ 2018-03-15 13:20 ` Kirill Tkhai
  4 siblings, 0 replies; 8+ messages in thread
From: Kirill Tkhai @ 2018-03-15 13:20 UTC (permalink / raw)
  To: davem, yoshfuji, edumazet, yanhaishuang, nikolay, yotamg, soheil,
	ktkhai, avagin, nicolas.dichtel, ebiederm, fw, roman.kapl,
	netdev, xiyou.wangcong, dvyukov, andreyknvl

Since ra_chain is per-net, we may use per-net mutexes
to protect them in ip_ra_control(). This improves
scalability.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 include/net/netns/ipv4.h |    1 +
 net/core/net_namespace.c |    1 +
 net/ipv4/ip_sockglue.c   |   15 ++++++---------
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 23c208fcf1a0..7295429732c6 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -50,6 +50,7 @@ struct netns_ipv4 {
 	struct ipv4_devconf	*devconf_all;
 	struct ipv4_devconf	*devconf_dflt;
 	struct ip_ra_chain	*ra_chain;
+	struct mutex		ra_mutex;
 #ifdef CONFIG_IP_MULTIPLE_TABLES
 	struct fib_rules_ops	*rules_ops;
 	bool			fib_has_custom_rules;
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index c340d5cfbdec..95ba2c53bd9a 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -301,6 +301,7 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 	net->user_ns = user_ns;
 	idr_init(&net->netns_ids);
 	spin_lock_init(&net->nsid_lock);
+	mutex_init(&net->ipv4.ra_mutex);
 
 	list_for_each_entry(ops, &pernet_list, list) {
 		error = ops_init(ops, net);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index f36d35fe924b..5ad2d8ed3a3f 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -322,9 +322,6 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
 	return 0;
 }
 
-static DEFINE_SPINLOCK(ip_ra_lock);
-
-
 static void ip_ra_destroy_rcu(struct rcu_head *head)
 {
 	struct ip_ra_chain *ra = container_of(head, struct ip_ra_chain, rcu);
@@ -345,21 +342,21 @@ int ip_ra_control(struct sock *sk, unsigned char on,
 
 	new_ra = on ? kmalloc(sizeof(*new_ra), GFP_KERNEL) : NULL;
 
-	spin_lock_bh(&ip_ra_lock);
+	mutex_lock(&net->ipv4.ra_mutex);
 	for (rap = &net->ipv4.ra_chain;
 	     (ra = rcu_dereference_protected(*rap,
-			lockdep_is_held(&ip_ra_lock))) != NULL;
+			lockdep_is_held(&net->ipv4.ra_mutex))) != NULL;
 	     rap = &ra->next) {
 		if (ra->sk == sk) {
 			if (on) {
-				spin_unlock_bh(&ip_ra_lock);
+				mutex_unlock(&net->ipv4.ra_mutex);
 				kfree(new_ra);
 				return -EADDRINUSE;
 			}
 			/* dont let ip_call_ra_chain() use sk again */
 			ra->sk = NULL;
 			RCU_INIT_POINTER(*rap, ra->next);
-			spin_unlock_bh(&ip_ra_lock);
+			mutex_unlock(&net->ipv4.ra_mutex);
 
 			if (ra->destructor)
 				ra->destructor(sk);
@@ -374,7 +371,7 @@ int ip_ra_control(struct sock *sk, unsigned char on,
 		}
 	}
 	if (!new_ra) {
-		spin_unlock_bh(&ip_ra_lock);
+		mutex_unlock(&net->ipv4.ra_mutex);
 		return -ENOBUFS;
 	}
 	new_ra->sk = sk;
@@ -383,7 +380,7 @@ int ip_ra_control(struct sock *sk, unsigned char on,
 	RCU_INIT_POINTER(new_ra->next, ra);
 	rcu_assign_pointer(*rap, new_ra);
 	sock_hold(sk);
-	spin_unlock_bh(&ip_ra_lock);
+	mutex_unlock(&net->ipv4.ra_mutex);
 
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next 4/5] net: Make ip_ra_chain per struct net
  2018-03-15 13:20 ` [PATCH net-next 4/5] net: Make ip_ra_chain per struct net Kirill Tkhai
@ 2018-03-16 23:22   ` kbuild test robot
  2018-03-17 10:21     ` Kirill Tkhai
  0 siblings, 1 reply; 8+ messages in thread
From: kbuild test robot @ 2018-03-16 23:22 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: kbuild-all, davem, yoshfuji, edumazet, yanhaishuang, nikolay,
	yotamg, soheil, ktkhai, avagin, nicolas.dichtel, ebiederm, fw,
	roman.kapl, netdev, xiyou.wangcong, dvyukov, andreyknvl

Hi Kirill,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v4.16-rc4]
[also build test WARNING on next-20180316]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Kirill-Tkhai/Rework-ip_ra_chain-protection/20180317-032841
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> net/ipv4/ip_input.c:162:19: sparse: incompatible types in comparison expression (different address spaces)
--
>> net/ipv4/ip_sockglue.c:347:18: sparse: incorrect type in assignment (different address spaces) @@    expected struct ip_ra_chain [noderef] <asn:4>**rap @@    got n [noderef] <asn:4>**rap @@
   net/ipv4/ip_sockglue.c:347:18:    expected struct ip_ra_chain [noderef] <asn:4>**rap
   net/ipv4/ip_sockglue.c:347:18:    got struct ip_ra_chain **<noident>

vim +162 net/ipv4/ip_input.c

   150	
   151	/*
   152	 *	Process Router Attention IP option (RFC 2113)
   153	 */
   154	bool ip_call_ra_chain(struct sk_buff *skb)
   155	{
   156		struct ip_ra_chain *ra;
   157		u8 protocol = ip_hdr(skb)->protocol;
   158		struct sock *last = NULL;
   159		struct net_device *dev = skb->dev;
   160		struct net *net = dev_net(dev);
   161	
 > 162		for (ra = rcu_dereference(net->ipv4.ra_chain); ra; ra = rcu_dereference(ra->next)) {
   163			struct sock *sk = ra->sk;
   164	
   165			/* If socket is bound to an interface, only report
   166			 * the packet if it came  from that interface.
   167			 */
   168			if (sk && inet_sk(sk)->inet_num == protocol &&
   169			    (!sk->sk_bound_dev_if ||
   170			     sk->sk_bound_dev_if == dev->ifindex)) {
   171				if (ip_is_fragment(ip_hdr(skb))) {
   172					if (ip_defrag(net, skb, IP_DEFRAG_CALL_RA_CHAIN))
   173						return true;
   174				}
   175				if (last) {
   176					struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
   177					if (skb2)
   178						raw_rcv(last, skb2);
   179				}
   180				last = sk;
   181			}
   182		}
   183	
   184		if (last) {
   185			raw_rcv(last, skb);
   186			return true;
   187		}
   188		return false;
   189	}
   190	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net-next 4/5] net: Make ip_ra_chain per struct net
  2018-03-16 23:22   ` kbuild test robot
@ 2018-03-17 10:21     ` Kirill Tkhai
  0 siblings, 0 replies; 8+ messages in thread
From: Kirill Tkhai @ 2018-03-17 10:21 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, davem, yoshfuji, edumazet, yanhaishuang, nikolay,
	yotamg, soheil, avagin, nicolas.dichtel, ebiederm, fw,
	roman.kapl, netdev, xiyou.wangcong, dvyukov, andreyknvl

Hi,

On 17.03.2018 02:22, kbuild test robot wrote:
> Hi Kirill,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on v4.16-rc4]
> [also build test WARNING on next-20180316]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Kirill-Tkhai/Rework-ip_ra_chain-protection/20180317-032841
> reproduce:
>         # apt-get install sparse
>         make ARCH=x86_64 allmodconfig
>         make C=1 CF=-D__CHECK_ENDIAN__
> 
> 
> sparse warnings: (new ones prefixed by >>)
> 
>>> net/ipv4/ip_input.c:162:19: sparse: incompatible types in comparison expression (different address spaces)

thanks for reporting this! Forgot to add __rcu modifier to member type.
I'll send v2 version.

Kirill

> --
>>> net/ipv4/ip_sockglue.c:347:18: sparse: incorrect type in assignment (different address spaces) @@    expected struct ip_ra_chain [noderef] <asn:4>**rap @@    got n [noderef] <asn:4>**rap @@
>    net/ipv4/ip_sockglue.c:347:18:    expected struct ip_ra_chain [noderef] <asn:4>**rap
>    net/ipv4/ip_sockglue.c:347:18:    got struct ip_ra_chain **<noident>
> 
> vim +162 net/ipv4/ip_input.c
> 
>    150	
>    151	/*
>    152	 *	Process Router Attention IP option (RFC 2113)
>    153	 */
>    154	bool ip_call_ra_chain(struct sk_buff *skb)
>    155	{
>    156		struct ip_ra_chain *ra;
>    157		u8 protocol = ip_hdr(skb)->protocol;
>    158		struct sock *last = NULL;
>    159		struct net_device *dev = skb->dev;
>    160		struct net *net = dev_net(dev);
>    161	
>  > 162		for (ra = rcu_dereference(net->ipv4.ra_chain); ra; ra = rcu_dereference(ra->next)) {
>    163			struct sock *sk = ra->sk;
>    164	
>    165			/* If socket is bound to an interface, only report
>    166			 * the packet if it came  from that interface.
>    167			 */
>    168			if (sk && inet_sk(sk)->inet_num == protocol &&
>    169			    (!sk->sk_bound_dev_if ||
>    170			     sk->sk_bound_dev_if == dev->ifindex)) {
>    171				if (ip_is_fragment(ip_hdr(skb))) {
>    172					if (ip_defrag(net, skb, IP_DEFRAG_CALL_RA_CHAIN))
>    173						return true;
>    174				}
>    175				if (last) {
>    176					struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
>    177					if (skb2)
>    178						raw_rcv(last, skb2);
>    179				}
>    180				last = sk;
>    181			}
>    182		}
>    183	
>    184		if (last) {
>    185			raw_rcv(last, skb);
>    186			return true;
>    187		}
>    188		return false;
>    189	}
>    190	
> 
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-03-17 10:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-15 13:19 [PATCH net-next 0/5] Rework ip_ra_chain protection Kirill Tkhai
2018-03-15 13:20 ` [PATCH net-next 1/5] net: Revert "ipv4: get rid of ip_ra_lock" Kirill Tkhai
2018-03-15 13:20 ` [PATCH net-next 2/5] net: Revert "ipv4: fix a deadlock in ip_ra_control" Kirill Tkhai
2018-03-15 13:20 ` [PATCH net-next 3/5] net: Move IP_ROUTER_ALERT out of lock_sock(sk) Kirill Tkhai
2018-03-15 13:20 ` [PATCH net-next 4/5] net: Make ip_ra_chain per struct net Kirill Tkhai
2018-03-16 23:22   ` kbuild test robot
2018-03-17 10:21     ` Kirill Tkhai
2018-03-15 13:20 ` [PATCH net-next 5/5] net: Replace ip_ra_lock with per-net mutex Kirill Tkhai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).