From mboxrd@z Thu Jan 1 00:00:00 1970 From: Kirill Tkhai Subject: Re: [Patch net-next] ipv4: initialize ra_mutex in inet_init_net() Date: Wed, 19 Sep 2018 11:25:35 +0300 Message-ID: References: <20180914203242.2712-1-xiyou.wangcong@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Cc: Linux Kernel Network Developers To: Cong Wang Return-path: Received: from mail-eopbgr60120.outbound.protection.outlook.com ([40.107.6.120]:58087 "EHLO EUR04-DB3-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1727669AbeISOFW (ORCPT ); Wed, 19 Sep 2018 10:05:22 -0400 In-Reply-To: Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 18.09.2018 23:17, Cong Wang wrote: > On Mon, Sep 17, 2018 at 12:25 AM Kirill Tkhai wrote: >> In inet_init() the order of registration is: >> >> ip_mr_init(); >> init_inet_pernet_ops(); >> >> This means, ipmr_net_ops pernet operations are before af_inet_ops >> in pernet_list. So, there is a theoretical probability, sometimes >> in the future, we will have a problem during a fail of net initialization. >> >> Say, >> >> setup_net(): >> ipmr_net_ops->init() returns 0 >> xxx->init() returns error >> and then we do: >> ipmr_net_ops->exit(), >> >> which could touch ra_mutex (theoretically). > > How could ra_mutex be touched in this scenario? > > ra_mutex is only used in ip_ra_control() which is called > only by {get,set}sockopt(). I don't see anything related > to netns exit() path here. Currently, it is not touched. But it's an ordinary practice, someone closes sockets in pernet ->exit methods. For example, we close percpu icmp sockets in icmp_sk_exit(), which are also of RAW type, and there is also called ip_ra_control() for them. Yes, they differ by their protocol; icmp sockets are of IPPROTO_ICMP protocol, while ip_ra_control() acts on IPPROTO_RAW sockets, but it's not good anyway. This does not look reliable for the future. In case of someone changes something here, we may do not notice this for the long time, while some users will meet bugs on their places. Problems on error paths is not easy to detect on testing, while user may meet them. We had issue of same type with uninitialized xfrm_policy_lock. It was introduced in 2013, while the problem was found only in 2017: introduced by 283bc9f35bbb fixed by c282222a45cb (Last week I met it on RH7 kernel, which still has no a fix. But this talk is not about distribution kernels, just about the way). I just want to say if someone makes some creativity on top of this code, it will be to more friendly from us to him/her to not force this person to think about such not obvious details, but just to implement nice architecture right now. Thanks, Kirill