From: Muchun Song
Date: Wed, 21 Apr 2021 16:15:00 +0800
Subject: Re: [External] Re: [PATCH] mm: hugetlb: fix a race between memory-failure/soft_offline and gather_surplus_pages
To: Michal Hocko
Cc: Mike Kravetz, Andrew Morton, Oscar Salvador, Linux Memory Management List, LKML, Naoya Horiguchi
References: <20210421060259.67554-1-songmuchun@bytedance.com>

On Wed, Apr 21, 2021 at 4:03 PM Michal Hocko wrote:
>
> [Cc Naoya]
>
> On Wed 21-04-21 14:02:59, Muchun Song wrote:
> > The possible bad scenario:
> >
> > CPU0:                           CPU1:
> >
> >  gather_surplus_pages()
> >    page = alloc_surplus_huge_page()
> >                                 memory_failure_hugetlb()
> >                                   get_hwpoison_page(page)
> >                                     __get_hwpoison_page(page)
> >                                       get_page_unless_zero(page)
> >    zero = put_page_testzero(page)
> >    VM_BUG_ON_PAGE(!zero, page)
> >    enqueue_huge_page(h, page)
> >                                 put_page(page)
> >
> > The refcount can possibly be increased by memory-failure or soft_offline
> > handlers, we can trigger VM_BUG_ON_PAGE and wrongly add the page to the
> > hugetlb pool list.
>
> The hwpoison side of this looks really suspicious to me. It shouldn't
> really touch the reference count of hugetlb pages without being very
> careful (and having hugetlb_lock held). What would happen if the
> reference count was increased after the page has been enqueued into the
> pool? This can just blow up later.

If the page has been enqueued into the pool, then the page can be
allocated to other users. The page reference count will be reset to 1
in dequeue_huge_page_node_exact(). Then memory-failure will free the
page because of its put_page(). This is wrong because there is
another user.

>
> > Signed-off-by: Muchun Song
> > ---
> >  mm/hugetlb.c | 11 ++++-------
> >  1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 3476aa06da70..6c96332db34b 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -2145,17 +2145,14 @@ static int gather_surplus_pages(struct hstate *h, long delta)
> >
> >  	/* Free the needed pages to the hugetlb pool */
> >  	list_for_each_entry_safe(page, tmp, &surplus_list, lru) {
> > -		int zeroed;
> > -
> >  		if ((--needed) < 0)
> >  			break;
> >  		/*
> > -		 * This page is now managed by the hugetlb allocator and has
> > -		 * no users -- drop the buddy allocator's reference.
> > +		 * The refcount can possibly be increased by memory-failure or
> > +		 * soft_offline handlers.
> >  		 */
> > -		zeroed = put_page_testzero(page);
> > -		VM_BUG_ON_PAGE(!zeroed, page);
> > -		enqueue_huge_page(h, page);
> > +		if (likely(put_page_testzero(page)))
> > +			enqueue_huge_page(h, page);
> >  	}
> >  free:
> >  	spin_unlock_irq(&hugetlb_lock);
> > --
> > 2.11.0
> >
>
> --
> Michal Hocko
> SUSE Labs
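
For readers following along, the interleaving discussed in this thread can be replayed in a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct fake_page, fake_get_page_unless_zero(), fake_put_page_testzero(), replay() and the outcome names are all hypothetical stand-ins invented for illustration. It also assumes a build without CONFIG_DEBUG_VM, so that the old VM_BUG_ON_PAGE() check does not stop the unpatched path before the enqueue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct fake_page {
	int refcount;
};

enum outcome {
	ENQUEUED_CLEAN,     /* page entered the pool with refcount 0 */
	NOT_ENQUEUED,       /* page was kept out of the pool */
	FREED_WHILE_IN_USE, /* hwpoison's put dropped the new user's only ref */
};

/* Models get_page_unless_zero(): take a reference unless refcount is 0. */
static bool fake_get_page_unless_zero(struct fake_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/* Models put_page_testzero(): drop a reference, report whether it hit 0. */
static bool fake_put_page_testzero(struct fake_page *p)
{
	assert(p->refcount > 0);
	return --p->refcount == 0;
}

/*
 * Replay the quoted CPU0/CPU1 diagram sequentially.
 * patched: use the "enqueue only if refcount dropped to 0" check.
 * race:    let the (modelled) memory-failure handler pin the page first.
 */
static enum outcome replay(bool patched, bool race)
{
	struct fake_page page = { .refcount = 1 }; /* CPU0: fresh surplus page */
	bool enqueued, in_use = false;

	if (race)
		fake_get_page_unless_zero(&page);  /* CPU1: 1 -> 2 */

	/* CPU0: the old code enqueues regardless of the test result. */
	enqueued = fake_put_page_testzero(&page) || !patched;

	if (!race)
		return enqueued ? ENQUEUED_CLEAN : NOT_ENQUEUED;

	if (enqueued) {
		/* Another user dequeues the page; refcount is reset to 1. */
		page.refcount = 1;
		in_use = true;
	}

	/* CPU1: memory-failure drops the reference it took earlier. */
	if (fake_put_page_testzero(&page) && in_use)
		return FREED_WHILE_IN_USE;
	return NOT_ENQUEUED;
}
```

In this model, replay(false, true) walks the unpatched path and ends with the page freed underneath its new user, while replay(true, true) keeps the pinned page out of the pool. The model runs the two CPUs sequentially and ignores hugetlb_lock, so it illustrates the refcount arithmetic only and says nothing about whether the patched check closes every window.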