From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-23.3 required=3.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C341C4338F for ; Wed, 28 Jul 2021 14:10:44 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id C2F5F60F5E for ; Wed, 28 Jul 2021 14:10:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C2F5F60F5E Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 226A76B0036; Wed, 28 Jul 2021 10:10:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1D6F76B005D; Wed, 28 Jul 2021 10:10:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 09FC58D0001; Wed, 28 Jul 2021 10:10:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0253.hostedemail.com [216.40.44.253]) by kanga.kvack.org (Postfix) with ESMTP id E579F6B0036 for ; Wed, 28 Jul 2021 10:10:41 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 78B091CB2E for ; Wed, 28 Jul 2021 14:10:41 +0000 (UTC) X-FDA: 78412182282.05.C8E9296 Received: from mail-lf1-f43.google.com (mail-lf1-f43.google.com [209.85.167.43]) by imf25.hostedemail.com (Postfix) with ESMTP id 346D4B003D6D for ; Wed, 28 Jul 2021 14:10:41 +0000 (UTC) Received: by mail-lf1-f43.google.com with SMTP id d18so4219137lfb.6 for ; Wed, 28 Jul 2021 07:10:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=tzqXN6ljVfKCtav6KN2V4ySvtOzmVHwY4YSZdyei1YM=; b=Oo1K10dAv82MrkV06wPn/Fowbs0gd5e4MK+SCHnF6WBc2tJT776SsOjQ9oa51NSN9I 2HdSBKBQxbU1dQzGLVdk35huure6rKro16w/NbfDjq/X25fl3MbORFd6jxeUcv9OL2v1 lXRxkNhF8X39IZbcWr+ZHYlJQoLYNWyyNe27ZcIt1+NcuNv853sDZQotIK7VfXVRz4FH fMqTm8yb6XFPe5oV9qx0zaxwt/xXCrMWtQcxKr3Eq50UZ+KboD8QgOPUdSaNCK6tbGY9 7okaH0hHvWs2Ri0r2VhWTlicxpfY2Dl8Cz8umzSmfn4T2jDAs5L0Mcr8wYfozs/qdCFn L1YQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=tzqXN6ljVfKCtav6KN2V4ySvtOzmVHwY4YSZdyei1YM=; b=QqXXuy0X2uYTQG8NOxwXYnTdcnc28XEX2azZ45ocuU45g+9q4my5DwZK7uKbhxvKdh 23RlUgWyRli5pzdUgropLQ7dysy/1ErrkyvaYAcgX9OJ+NLSouYyL8b7sXB3MjqMMLpj /a4DDKxBTdbpIq96hrEskrojy6hZyIkRaMgrO6ao6rFvw0+cCsN/6nN3nJivw+WQr2bA Q+LGzE01noqEGNsgSrzRIpKoEwtPzSFfVmpVICom0OnnFr3Bcoy5Z+Lj3p46+LdPC4GQ q52Z7cnokX4HRuKyWnVU10hbYw4kXnDqp1UYRfp2armYmFJTSMxBIcG0L9VSzQ3T9nef VWFg== X-Gm-Message-State: AOAM530ojrD6vxcygVqydMjXCyWhmZffXjDqs/6kf7KGDmamce31ZTAt f/HpmcgimUE3u+lxFVHbFiuI2N8n/L/ufPFthDwCog== X-Google-Smtp-Source: ABdhPJzTI7d889Y4BXdkFlMPZMemCacSNNOf3Te13xQ2wTaAeBbx2u61OgGcXH/twHLkbeRYLs0ieAdyK/ZpYgWxumc= X-Received: by 2002:a19:ae0f:: with SMTP id f15mr20952757lfc.117.1627481438086; Wed, 28 Jul 2021 07:10:38 -0700 (PDT) MIME-Version: 1.0 References: <20210728091348.272714-1-wanghai38@huawei.com> In-Reply-To: From: Shakeel Butt Date: Wed, 28 Jul 2021 07:10:26 -0700 Message-ID: Subject: Re: [PATCH] mm/memcg: fix NULL pointer dereference in memcg_slab_free_hook() To: Michal Hocko , Roman Gushchin Cc: Wang Hai , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Vlastimil Babka , Johannes Weiner , Alexei Starovoitov , wangkefeng.wang@huawei.com, Linux MM , LKML Content-Type: text/plain; charset="UTF-8" Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=google.com header.s=20161025 header.b=Oo1K10dA; spf=pass (imf25.hostedemail.com: domain of shakeelb@google.com designates 209.85.167.43 as permitted sender) smtp.mailfrom=shakeelb@google.com; dmarc=pass (policy=reject) header.from=google.com X-Rspamd-Server: rspam02 X-Stat-Signature: zjrxyy4wxr4qqi87q3s3cbdq6jq376z5 X-Rspamd-Queue-Id: 346D4B003D6D X-HE-Tag: 1627481441-362331 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: +Roman On Wed, Jul 28, 2021 at 6:23 AM Michal Hocko wrote: > > On Wed 28-07-21 17:13:48, Wang Hai wrote: > > When I use kfree_rcu() to free a large memory allocated by > > kmalloc_node(), the following dump occurs. > > > > BUG: kernel NULL pointer dereference, address: 0000000000000020 > > [...] > > Oops: 0000 [#1] SMP > > [...] > > Workqueue: events kfree_rcu_work > > RIP: 0010:__obj_to_index include/linux/slub_def.h:182 [inline] > > RIP: 0010:obj_to_index include/linux/slub_def.h:191 [inline] > > RIP: 0010:memcg_slab_free_hook+0x120/0x260 mm/slab.h:363 > > [...] > > Call Trace: > > kmem_cache_free_bulk+0x58/0x630 mm/slub.c:3293 > > kfree_bulk include/linux/slab.h:413 [inline] > > kfree_rcu_work+0x1ab/0x200 kernel/rcu/tree.c:3300 > > process_one_work+0x207/0x530 kernel/workqueue.c:2276 > > worker_thread+0x320/0x610 kernel/workqueue.c:2422 > > kthread+0x13d/0x160 kernel/kthread.c:313 > > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 > > > > When kmalloc_node() a large memory, page is allocated, not slab, > > so when freeing memory via kfree_rcu(), this large memory should not > > be used by memcg_slab_free_hook(), because memcg_slab_free_hook() is > > is used for slab. > > > > So in this case, there is no need to do anything with this large > > page in memcg_slab_free_hook(), just skip it. > > > > Fixes: 270c6a71460e ("mm: memcontrol/slab: Use helpers to access slab page's memcg_data") > > Are you sure that this commit is really breaking the code. Unless I have > missed something there shouldn't be any real change wrt. large > allocations here. page_has_obj_cgroups is just a different name for what > what page_objcgs is giving us. Actually they are different. For MEMCG_DATA_KMEM page, page_has_obj_cgroups() will return false while page_objcgs() on non-VM_DEBUG kernels will return "struct obj_cgroup *" instead of "struct obj_cgroup **". > > I haven't studied the kfree_rcu part but isn't the problem its use of > kmem_cache_free_bulk or isn't the problem right there in the bulk free? > SLUB's kmem_cache_free_bulk() is doing two passes. In first pass uncharges all the objects and in the second pass frees them to the slub infra. There is nothing wrong with that. It is just that SLUB mixes page (for higher order) and slab allocations and requires special handling. > > Signed-off-by: Wang Hai > > --- > > mm/slab.h | 15 ++++++++++----- > > 1 file changed, 10 insertions(+), 5 deletions(-) > > > > diff --git a/mm/slab.h b/mm/slab.h > > index 67e06637ff2e..247d3f9c21f7 100644 > > --- a/mm/slab.h > > +++ b/mm/slab.h > > @@ -339,15 +339,20 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s_orig, > > continue; > > > > page = virt_to_head_page(p[i]); > > + if (!s_orig) { > > + if (unlikely(!PageSlab(page))) { > > + BUG_ON(!PageCompound(page)); > > BUG_ON is not really a good idea here. Why should we crash the kernel > just because of an unexpected page showing up. Leaking it would be more > appropriate (the same would apply to kfree btw). I would just warn > here. The simplest fix would be to not call memcg_slab_free_hook() in kmem_cache_free_bulk() because slab_free() will call memcg_slab_free_hook() for individual objects. Not sure if there will be any performance impact. > Also don't we need any hookd here. Looking at kfree path it does > call kfree_hook. Why is that not needed here? These are called later in build_detached_freelist().