From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-db5eur01on0096.outbound.protection.outlook.com ([104.47.2.96]:35168 "EHLO EUR01-DB5-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932735AbeCSJPD (ORCPT ); Mon, 19 Mar 2018 05:15:03 -0400 Subject: [PATCH net-next v2 2/5] net: Revert "ipv4: fix a deadlock in ip_ra_control" From: Kirill Tkhai To: davem@davemloft.net, yoshfuji@linux-ipv6.org, edumazet@google.com, yanhaishuang@cmss.chinamobile.com, nikolay@cumulusnetworks.com, yotamg@mellanox.com, soheil@google.com, avagin@virtuozzo.com, nicolas.dichtel@6wind.com, ebiederm@xmission.com, fw@strlen.de, roman.kapl@sysgo.com, netdev@vger.kernel.org, xiyou.wangcong@gmail.com, dvyukov@google.com, andreyknvl@google.com, lkp@intel.com, ktkhai@virtuozzo.com Date: Mon, 19 Mar 2018 12:14:54 +0300 Message-ID: <152145089432.7718.3981942805167545803.stgit@localhost.localdomain> In-Reply-To: <152145065475.7718.16297762717744383072.stgit@localhost.localdomain> References: <152145065475.7718.16297762717744383072.stgit@localhost.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: netdev-owner@vger.kernel.org List-ID: This reverts commit 1215e51edad1. Since raw_close() is used on every RAW socket destruction, the changes made by 1215e51edad1 scale sadly. This clearly seen on endless unshare(CLONE_NEWNET) test, and cleanup_net() kwork spends a lot of time waiting for rtnl_lock() introduced by this commit. Next patches in series will rework this in another way, so now we revert 1215e51edad1. Also, it doesn't seen mrtsock_destruct() takes sk_lock, and the comment to the commit does not show the actual stack dump. So, there is a question did we really need in it. Signed-off-by: Kirill Tkhai --- net/ipv4/ip_sockglue.c | 1 - net/ipv4/ipmr.c | 11 +++++++++-- net/ipv4/raw.c | 2 -- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index be7c3b71914d..b7bac7351409 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -594,7 +594,6 @@ static bool setsockopt_needs_rtnl(int optname) case MCAST_LEAVE_GROUP: case MCAST_LEAVE_SOURCE_GROUP: case MCAST_UNBLOCK_SOURCE: - case IP_ROUTER_ALERT: return true; } return false; diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index d752a70855d8..f6be5db16da2 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -1399,7 +1399,7 @@ static void mrtsock_destruct(struct sock *sk) struct net *net = sock_net(sk); struct mr_table *mrt; - ASSERT_RTNL(); + rtnl_lock(); ipmr_for_each_table(mrt, net) { if (sk == rtnl_dereference(mrt->mroute_sk)) { IPV4_DEVCONF_ALL(net, MC_FORWARDING)--; @@ -1411,6 +1411,7 @@ static void mrtsock_destruct(struct sock *sk) mroute_clean_tables(mrt, false); } } + rtnl_unlock(); } /* Socket options and virtual interface manipulation. The whole @@ -1475,8 +1476,13 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, if (sk != rcu_access_pointer(mrt->mroute_sk)) { ret = -EACCES; } else { + /* We need to unlock here because mrtsock_destruct takes + * care of rtnl itself and we can't change that due to + * the IP_ROUTER_ALERT setsockopt which runs without it. + */ + rtnl_unlock(); ret = ip_ra_control(sk, 0, NULL); - goto out_unlock; + goto out; } break; case MRT_ADD_VIF: @@ -1588,6 +1594,7 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval, } out_unlock: rtnl_unlock(); +out: return ret; } diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index 54648d20bf0f..720bef7da2f6 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -711,9 +711,7 @@ static void raw_close(struct sock *sk, long timeout) /* * Raw sockets may have direct kernel references. Kill them. */ - rtnl_lock(); ip_ra_control(sk, 0, NULL); - rtnl_unlock(); sk_common_release(sk); }