From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.5 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2357C433DB for ; Fri, 22 Jan 2021 06:54:07 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9FDA723381 for ; Fri, 22 Jan 2021 06:54:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9FDA723381 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id A7E356B0006; Fri, 22 Jan 2021 01:54:05 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A2F056B0007; Fri, 22 Jan 2021 01:54:05 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 943396B0008; Fri, 22 Jan 2021 01:54:05 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0045.hostedemail.com [216.40.44.45]) by kanga.kvack.org (Postfix) with ESMTP id 7F1206B0006 for ; Fri, 22 Jan 2021 01:54:05 -0500 (EST) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DC78E181AF5D8 for ; Fri, 22 Jan 2021 06:54:04 +0000 (UTC) X-FDA: 77732496408.24.vein06_5310d1f27569 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin24.hostedemail.com (Postfix) with ESMTP id BAA651A4A0 for ; Fri, 22 Jan 2021 06:54:04 +0000 (UTC) X-HE-Tag: vein06_5310d1f27569 X-Filterd-Recvd-Size: 9941 Received: from szxga06-in.huawei.com (szxga06-in.huawei.com [45.249.212.32]) by imf15.hostedemail.com (Postfix) with ESMTP for ; Fri, 22 Jan 2021 06:54:03 +0000 (UTC) Received: from DGGEMS413-HUB.china.huawei.com (unknown [172.30.72.58]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4DMVMN4dhyzj62d; Fri, 22 Jan 2021 14:53:04 +0800 (CST) Received: from [10.174.177.2] (10.174.177.2) by DGGEMS413-HUB.china.huawei.com (10.3.19.213) with Microsoft SMTP Server id 14.3.498.0; Fri, 22 Jan 2021 14:53:54 +0800 Subject: Re: [PATCH v2 2/5] hugetlb: convert page_huge_active() HPageMigratable flag To: Mike Kravetz CC: Michal Hocko , Naoya Horiguchi , Muchun Song , "David Hildenbrand" , Oscar Salvador , "Matthew Wilcox" , Andrew Morton , , References: <20210120013049.311822-1-mike.kravetz@oracle.com> <20210120013049.311822-3-mike.kravetz@oracle.com> From: Miaohe Lin Message-ID: <0159cbc4-6d98-fefe-2288-5ba28b3f519f@huawei.com> Date: Fri, 22 Jan 2021 14:53:54 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.6.0 MIME-Version: 1.0 In-Reply-To: <20210120013049.311822-3-mike.kravetz@oracle.com> Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-Originating-IP: [10.174.177.2] X-CFilter-Loop: Reflected X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hi: On 2021/1/20 9:30, Mike Kravetz wrote: > Use the new hugetlb page specific flag HPageMigratable to replace the > page_huge_active interfaces. By it's name, page_huge_active implied > that a huge page was on the active list. However, that is not really > what code checking the flag wanted to know. It really wanted to determine > if the huge page could be migrated. This happens when the page is actually > added the page cache and/or task page table. This is the reasoning behind s/added/added to/ > the name change. > > The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not > really necessary as we KNOW the page is a hugetlb page. Therefore, they > are removed. > > The routine page_huge_active checked for PageHeadHuge before testing the > active bit. This is unnecessary in the case where we hold a reference or > lock and know it is a hugetlb head page. page_huge_active is also called > without holding a reference or lock (scan_movable_pages), and can race with > code freeing the page. The extra check in page_huge_active shortened the > race window, but did not prevent the race. Offline code calling > scan_movable_pages already deals with these races, so removing the check > is acceptable. Add comment to racy code. > > Signed-off-by: Mike Kravetz > --- > fs/hugetlbfs/inode.c | 2 +- > include/linux/hugetlb.h | 5 +++++ > include/linux/page-flags.h | 6 ----- > mm/hugetlb.c | 45 ++++++++++---------------------------- > mm/memory_hotplug.c | 8 ++++++- > 5 files changed, 24 insertions(+), 42 deletions(-) > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c > index b8a661780c4a..ec9f03aa2738 100644 > --- a/fs/hugetlbfs/inode.c > +++ b/fs/hugetlbfs/inode.c > @@ -735,7 +735,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset, > > mutex_unlock(&hugetlb_fault_mutex_table[hash]); > > - set_page_huge_active(page); > + SetHPageMigratable(page); > /* > * unlock_page because locked by add_to_page_cache() > * put_page() due to reference from alloc_huge_page() > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h > index be71a00ee2a0..ce3d03da0133 100644 > --- a/include/linux/hugetlb.h > +++ b/include/linux/hugetlb.h > @@ -480,9 +480,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, > * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at > * allocation time. Cleared when page is fully instantiated. Free > * routine checks flag to restore a reservation on error paths. > + * HPG_migratable - Set after a newly allocated page is added to the page > + * cache and/or page tables. Indicates the page is a candidate for > + * migration. > */ > enum hugetlb_page_flags { > HPG_restore_reserve = 0, > + HPG_migratable, > __NR_HPAGEFLAGS, > }; > > @@ -529,6 +533,7 @@ static inline void ClearHPage##uname(struct page *page) \ > * Create functions associated with hugetlb page flags > */ > HPAGEFLAG(RestoreReserve, restore_reserve) > +HPAGEFLAG(Migratable, migratable) > > #ifdef CONFIG_HUGETLB_PAGE > > diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h > index bc6fd1ee7dd6..04a34c08e0a6 100644 > --- a/include/linux/page-flags.h > +++ b/include/linux/page-flags.h > @@ -592,15 +592,9 @@ static inline void ClearPageCompound(struct page *page) > #ifdef CONFIG_HUGETLB_PAGE > int PageHuge(struct page *page); > int PageHeadHuge(struct page *page); > -bool page_huge_active(struct page *page); > #else > TESTPAGEFLAG_FALSE(Huge) > TESTPAGEFLAG_FALSE(HeadHuge) > - > -static inline bool page_huge_active(struct page *page) > -{ > - return 0; > -} > #endif > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 8bed6b5202d2..c24da40626d3 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -1353,30 +1353,6 @@ struct hstate *size_to_hstate(unsigned long size) > return NULL; > } > > -/* > - * Test to determine whether the hugepage is "active/in-use" (i.e. being linked > - * to hstate->hugepage_activelist.) > - * > - * This function can be called for tail pages, but never returns true for them. > - */ > -bool page_huge_active(struct page *page) > -{ > - return PageHeadHuge(page) && PagePrivate(&page[1]); > -} > - > -/* never called for tail page */ > -void set_page_huge_active(struct page *page) > -{ > - VM_BUG_ON_PAGE(!PageHeadHuge(page), page); > - SetPagePrivate(&page[1]); > -} > - > -static void clear_page_huge_active(struct page *page) > -{ > - VM_BUG_ON_PAGE(!PageHeadHuge(page), page); > - ClearPagePrivate(&page[1]); > -} > - > /* > * Internal hugetlb specific page flag. Do not use outside of the hugetlb > * code > @@ -1438,7 +1414,7 @@ static void __free_huge_page(struct page *page) > } > > spin_lock(&hugetlb_lock); > - clear_page_huge_active(page); > + ClearHPageMigratable(page); > hugetlb_cgroup_uncharge_page(hstate_index(h), > pages_per_huge_page(h), page); > hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h), > @@ -4220,7 +4196,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma, > make_huge_pte(vma, new_page, 1)); > page_remove_rmap(old_page, true); > hugepage_add_new_anon_rmap(new_page, vma, haddr); > - set_page_huge_active(new_page); > + SetHPageMigratable(new_page); > /* Make the old page be freed below */ > new_page = old_page; > } > @@ -4457,12 +4433,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, > spin_unlock(ptl); > > /* > - * Only make newly allocated pages active. Existing pages found > - * in the pagecache could be !page_huge_active() if they have been > - * isolated for migration. > + * Only set HPageMigratable in newly allocated pages. Existing pages > + * found in the pagecache may not have HPageMigratableset if they have > + * been isolated for migration. > */ > if (new_page) > - set_page_huge_active(page); > + SetHPageMigratable(page); > > unlock_page(page); > out: > @@ -4773,7 +4749,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, > update_mmu_cache(dst_vma, dst_addr, dst_pte); > > spin_unlock(ptl); > - set_page_huge_active(page); > + SetHPageMigratable(page); > if (vm_shared) > unlock_page(page); > ret = 0; > @@ -5591,12 +5567,13 @@ bool isolate_huge_page(struct page *page, struct list_head *list) > bool ret = true; > > spin_lock(&hugetlb_lock); > - if (!PageHeadHuge(page) || !page_huge_active(page) || > + if (!PageHeadHuge(page) || > + !HPageMigratable(page) || > !get_page_unless_zero(page)) { > ret = false; > goto unlock; > } > - clear_page_huge_active(page); > + ClearHPageMigratable(page); > list_move_tail(&page->lru, list); > unlock: > spin_unlock(&hugetlb_lock); > @@ -5607,7 +5584,7 @@ void putback_active_hugepage(struct page *page) > { > VM_BUG_ON_PAGE(!PageHead(page), page); > spin_lock(&hugetlb_lock); > - set_page_huge_active(page); > + SetHPageMigratable(page); > list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist); > spin_unlock(&hugetlb_lock); > put_page(page); > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index f9d57b9be8c7..563da803e0e0 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1260,7 +1260,13 @@ static int scan_movable_pages(unsigned long start, unsigned long end, > if (!PageHuge(page)) > continue; > head = compound_head(page); > - if (page_huge_active(head)) > + /* > + * This test is racy as we hold no reference or lock. The > + * hugetlb page could have been free'ed and head is no longer > + * a hugetlb page before the following check. In such unlikely > + * cases false positives and negatives are possible. > + */ Is it necessary to mention that: "offline_pages() could handle these racy cases." in the comment ? > + if (HPageMigratable(head)) > goto found; > skip = compound_nr(head) - (page - head); > pfn += skip - 1; > Looks good to me. Thanks. Reviewed-by: Miaohe Lin