From: Muchun Song
Date: Wed, 14 Jul 2021 18:57:36 +0800
Subject: Re: [External] [PATCH 3/3] hugetlb: before freeing hugetlb page set dtor to appropriate value
To: Mike Kravetz
Cc: Linux Memory Management List, LKML, Michal Hocko, Oscar Salvador, David Hildenbrand, Matthew Wilcox, Naoya Horiguchi, Mina Almasry, Andrew Morton
References: <20210710002441.167759-1-mike.kravetz@oracle.com> <20210710002441.167759-4-mike.kravetz@oracle.com>
In-Reply-To: <20210710002441.167759-4-mike.kravetz@oracle.com>

On Sat, Jul 10, 2021 at 8:25 AM Mike Kravetz wrote:
>
> When removing a hugetlb page from the pool the ref count is set to
> one (as the free page has no ref count) and compound page destructor
> is set to NULL_COMPOUND_DTOR.  Since a subsequent call to free the
> hugetlb page will call __free_pages for non-gigantic pages and
> free_gigantic_page for gigantic pages the destructor is not used.
>
> However, consider the following race with code taking a speculative
> reference on the page:
>
> Thread 0                                Thread 1
> --------                                --------
> remove_hugetlb_page
>   set_page_refcounted(page);
>   set_compound_page_dtor(page,
>            NULL_COMPOUND_DTOR);
>                                         get_page_unless_zero(page)
> __update_and_free_page
>   __free_pages(page,
>            huge_page_order(h));
>
>                 /* Note that __free_pages() will simply drop
>                    the reference to the page. */
>
>                                         put_page(page)
>                                           __put_compound_page()
>                                             destroy_compound_page
>                                               NULL_COMPOUND_DTOR
>                                                 BUG: kernel NULL pointer
>                                                 dereference, address:
>                                                 0000000000000000
>
> To address this race, set the dtor to the normal compound page dtor
> for non-gigantic pages.
> The dtor for gigantic pages does not matter
> as gigantic pages are changed from a compound page to 'just a group of
> pages' before freeing.  Hence, the destructor is not used.
>
> Signed-off-by: Mike Kravetz
> ---
>  mm/hugetlb.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 3132c7395743..fa8ec2072949 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1370,8 +1370,28 @@ static void remove_hugetlb_page(struct hstate *h, struct page *page,
>                 h->surplus_huge_pages_node[nid]--;
>         }
>
> +       /*
> +        * Very subtle
> +        *
> +        * For non-gigantic pages set the destructor to the normal compound
> +        * page dtor.  This is needed in case someone takes an additional
> +        * temporary ref to the page, and freeing is delayed until they drop
> +        * their reference.
> +        *
> +        * For gigantic pages set the destructor to the null dtor.  This
> +        * destructor will never be called.  Before freeing the gigantic
> +        * page destroy_compound_gigantic_page will turn the compound page
> +        * into a simple group of pages.  After this the destructor does not
> +        * apply.
> +        *
> +        * This handles the case where more than one ref is held when and
> +        * after update_and_free_page is called.
> +        */
>         set_page_refcounted(page);
> -       set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
> +       if (hstate_is_gigantic(h))
> +               set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
> +       else
> +               set_compound_page_dtor(page, COMPOUND_PAGE_DTOR);

Hi Mike,

The race is really subtle. But we should also remove the WARN from
free_contig_range, right? With a speculative reference held, the
refcount of the head page of a gigantic page can be greater than one
when it is freed, yet free_contig_range has the following warning:

    WARN(count != 0, "%lu pages are still in use!\n", count);

Thanks.

>
>         h->nr_huge_pages--;
>         h->nr_huge_pages_node[nid]--;
> --
> 2.31.1
>
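
To make the interleaving from the commit message easier to follow, here
is a minimal userspace sketch of it. This is not kernel code: struct
fake_page, dtor_table, get_unless_zero, put_ref, replay_race and
pages_still_in_use are hypothetical stand-ins invented for
illustration, and the sketch runs the steps sequentially rather than on
real threads. The last helper assumes the warning in free_contig_range
counts pages whose refcount is not exactly one, which is how I read the
concern above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the compound page destructor ids. */
enum dtor_id { NULL_DTOR, NORMAL_DTOR, NR_DTORS };

/* Hypothetical, heavily reduced model of a page. */
struct fake_page {
	int refcount;
	enum dtor_id dtor;
	bool freed;
};

typedef void (*dtor_fn)(struct fake_page *);

static void normal_dtor(struct fake_page *p)
{
	p->freed = true;
}

/* Destructor table whose NULL_DTOR slot holds a NULL function pointer. */
static const dtor_fn dtor_table[NR_DTORS] = { NULL, normal_dtor };

/* Speculative reference: only succeeds if the count is not already zero. */
static bool get_unless_zero(struct fake_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/*
 * Drop one reference. Whoever drops the last one runs the destructor.
 * Returns false where the real code would jump through a NULL pointer.
 */
static bool put_ref(struct fake_page *p)
{
	if (--p->refcount != 0)
		return true;
	if (dtor_table[p->dtor] == NULL)
		return false;
	dtor_table[p->dtor](p);
	return true;
}

/* Replays the Thread 0 / Thread 1 steps with the given destructor. */
static bool replay_race(enum dtor_id dtor_for_removed_page)
{
	struct fake_page page = { 0, NULL_DTOR, false };

	/* Thread 0: remove the page from the pool. */
	page.refcount = 1;
	page.dtor = dtor_for_removed_page;

	/* Thread 1: take a speculative reference. */
	if (!get_unless_zero(&page))
		return false;

	/* Thread 0: the free path only drops its own reference. */
	if (!put_ref(&page))
		return false;

	/* Thread 1: the last put has to run the destructor. */
	return put_ref(&page) && page.freed;
}

/* Assumed shape of the check behind the "still in use" warning. */
static size_t pages_still_in_use(const struct fake_page *pages, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		count += pages[i].refcount != 1;
	return count;
}
```

In this model replay_race(NULL_DTOR) reports failure at the final put,
while replay_race(NORMAL_DTOR) frees the page. pages_still_in_use
returns a non-zero count as soon as any page in the range carries an
extra reference, which is the situation the WARN would flag.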