From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.2 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BBCFEC4338F for ; Mon, 2 Aug 2021 11:18:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9782860F36 for ; Mon, 2 Aug 2021 11:18:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233411AbhHBLSe (ORCPT ); Mon, 2 Aug 2021 07:18:34 -0400 Received: from szxga01-in.huawei.com ([45.249.212.187]:7774 "EHLO szxga01-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231881AbhHBLSc (ORCPT ); Mon, 2 Aug 2021 07:18:32 -0400 Received: from dggeme703-chm.china.huawei.com (unknown [172.30.72.53]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4Gdb8p1TkNzYh96; Mon, 2 Aug 2021 19:18:18 +0800 (CST) Received: from [10.174.179.25] (10.174.179.25) by dggeme703-chm.china.huawei.com (10.1.199.99) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Mon, 2 Aug 2021 19:18:21 +0800 Subject: Re: [PATCH 4/5] mm, memcg: avoid possible NULL pointer dereferencing in mem_cgroup_init() To: Michal Hocko CC: Roman Gushchin , , , , , , , , , , , References: <20210729125755.16871-1-linmiaohe@huawei.com> <20210729125755.16871-5-linmiaohe@huawei.com> <47daf062-f510-edb3-6ec7-f8e7615ad8a0@huawei.com> From: Miaohe Lin Message-ID: <59122ae4-52c9-4ff9-104d-872d770dec0c@huawei.com> Date: Mon, 2 Aug 2021 19:18:20 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.174.179.25] X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To dggeme703-chm.china.huawei.com (10.1.199.99) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2021/8/2 18:42, Michal Hocko wrote: > On Mon 02-08-21 18:00:10, Miaohe Lin wrote: >> On 2021/8/2 14:43, Michal Hocko wrote: >>> On Sat 31-07-21 10:05:51, Miaohe Lin wrote: >>>> On 2021/7/30 14:44, Michal Hocko wrote: >>>>> On Thu 29-07-21 20:12:43, Roman Gushchin wrote: >>>>>> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote: >>>>>>> rtpn might be NULL in very rare case. We have better to check it before >>>>>>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in >>>>>>> soft_limit_tree, warn this case and continue. >>>>>>> >>>>>>> Signed-off-by: Miaohe Lin >>>>>>> --- >>>>>>> mm/memcontrol.c | 2 ++ >>>>>>> 1 file changed, 2 insertions(+) >>>>>>> >>>>>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >>>>>>> index 5b4592d1e0f2..70a32174e7c4 100644 >>>>>>> --- a/mm/memcontrol.c >>>>>>> +++ b/mm/memcontrol.c >>>>>>> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void) >>>>>>> rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, >>>>>>> node_online(node) ? node : NUMA_NO_NODE); >>>>>>> >>>>>>> + if (WARN_ON_ONCE(!rtpn)) >>>>>>> + continue; >>>>>> >>>>>> I also really doubt that it makes any sense to continue in this case. >>>>>> If this allocations fails (at the very beginning of the system's life, it's an __init function), >>>>>> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like >>>>>> a perfect choice. >>>>> >>>>> Moreover this is 24B allocation during early boot. Kernel will OOM and >>>>> panic when not being able to find any victim. I do not think we need to >>>> >>>> Agree with you. But IMO it may not be a good idea to leave the rtpn without NULL check. We should defend >>>> it though it could hardly happen. But I'm not insist on this check. I will drop this patch if you insist. >>> >>> It is not that I would insist. I just do not see any point in the code >>> churn. This check is not going to ever trigger and there is nothing you >>> can do to recover anyway so crashing the kernel is likely the only >>> choice left. >>> >> >> I hope I get the point now. What you mean is nothing we can do to recover and panic'ing on a >> NULL-pointer dereference is a perfect choice ? Should we declare that we leave the rtpn without >> NULL check on purpose like below ? >> >> Many thanks. >> >> @@ -7109,8 +7109,12 @@ static int __init mem_cgroup_init(void) >> rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, >> node_online(node) ? node : NUMA_NO_NODE); >> >> - if (WARN_ON_ONCE(!rtpn)) >> - continue; >> + /* >> + * If this allocation fails (at the very beginning of the >> + * system's life, it's an __init function), something is >> + * terribly wrong and panic'ing on a NULL-pointer >> + * dereference sounds like a perfect choice. >> + */ > > I am not really sure this is really worth it. Really we do not really > want to have similar comments all over the early init code, do we? Maybe not. Will drop this patch. Thanks. > >> rtpn->rb_root = RB_ROOT; >> rtpn->rb_rightmost = NULL; >> spin_lock_init(&rtpn->lock); > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Miaohe Lin Subject: Re: [PATCH 4/5] mm, memcg: avoid possible NULL pointer dereferencing in mem_cgroup_init() Date: Mon, 2 Aug 2021 19:18:20 +0800 Message-ID: <59122ae4-52c9-4ff9-104d-872d770dec0c@huawei.com> References: <20210729125755.16871-1-linmiaohe@huawei.com> <20210729125755.16871-5-linmiaohe@huawei.com> <47daf062-f510-edb3-6ec7-f8e7615ad8a0@huawei.com> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: Content-Language: en-US List-ID: Content-Type: text/plain; charset="us-ascii" To: Michal Hocko Cc: Roman Gushchin , hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org, vdavydov.dev-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org, shakeelb-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org, willy-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org, alexs-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org, richard.weiyang-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org, songmuchun-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org, linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On 2021/8/2 18:42, Michal Hocko wrote: > On Mon 02-08-21 18:00:10, Miaohe Lin wrote: >> On 2021/8/2 14:43, Michal Hocko wrote: >>> On Sat 31-07-21 10:05:51, Miaohe Lin wrote: >>>> On 2021/7/30 14:44, Michal Hocko wrote: >>>>> On Thu 29-07-21 20:12:43, Roman Gushchin wrote: >>>>>> On Thu, Jul 29, 2021 at 08:57:54PM +0800, Miaohe Lin wrote: >>>>>>> rtpn might be NULL in very rare case. We have better to check it before >>>>>>> dereferencing it. Since memcg can live with NULL rb_tree_per_node in >>>>>>> soft_limit_tree, warn this case and continue. >>>>>>> >>>>>>> Signed-off-by: Miaohe Lin >>>>>>> --- >>>>>>> mm/memcontrol.c | 2 ++ >>>>>>> 1 file changed, 2 insertions(+) >>>>>>> >>>>>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c >>>>>>> index 5b4592d1e0f2..70a32174e7c4 100644 >>>>>>> --- a/mm/memcontrol.c >>>>>>> +++ b/mm/memcontrol.c >>>>>>> @@ -7109,6 +7109,8 @@ static int __init mem_cgroup_init(void) >>>>>>> rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, >>>>>>> node_online(node) ? node : NUMA_NO_NODE); >>>>>>> >>>>>>> + if (WARN_ON_ONCE(!rtpn)) >>>>>>> + continue; >>>>>> >>>>>> I also really doubt that it makes any sense to continue in this case. >>>>>> If this allocations fails (at the very beginning of the system's life, it's an __init function), >>>>>> something is terribly wrong and panic'ing on a NULL-pointer dereference sounds like >>>>>> a perfect choice. >>>>> >>>>> Moreover this is 24B allocation during early boot. Kernel will OOM and >>>>> panic when not being able to find any victim. I do not think we need to >>>> >>>> Agree with you. But IMO it may not be a good idea to leave the rtpn without NULL check. We should defend >>>> it though it could hardly happen. But I'm not insist on this check. I will drop this patch if you insist. >>> >>> It is not that I would insist. I just do not see any point in the code >>> churn. This check is not going to ever trigger and there is nothing you >>> can do to recover anyway so crashing the kernel is likely the only >>> choice left. >>> >> >> I hope I get the point now. What you mean is nothing we can do to recover and panic'ing on a >> NULL-pointer dereference is a perfect choice ? Should we declare that we leave the rtpn without >> NULL check on purpose like below ? >> >> Many thanks. >> >> @@ -7109,8 +7109,12 @@ static int __init mem_cgroup_init(void) >> rtpn = kzalloc_node(sizeof(*rtpn), GFP_KERNEL, >> node_online(node) ? node : NUMA_NO_NODE); >> >> - if (WARN_ON_ONCE(!rtpn)) >> - continue; >> + /* >> + * If this allocation fails (at the very beginning of the >> + * system's life, it's an __init function), something is >> + * terribly wrong and panic'ing on a NULL-pointer >> + * dereference sounds like a perfect choice. >> + */ > > I am not really sure this is really worth it. Really we do not really > want to have similar comments all over the early init code, do we? Maybe not. Will drop this patch. Thanks. > >> rtpn->rb_root = RB_ROOT; >> rtpn->rb_rightmost = NULL; >> spin_lock_init(&rtpn->lock); >