All of lore.kernel.org
 help / color / mirror / Atom feed
From: Muchun Song <songmuchun@bytedance.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Oscar Salvador <osalvador@suse.de>,
	Michal Hocko <mhocko@suse.com>,
	"Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>,
	David Hildenbrand <david@redhat.com>,
	Chen Huang <chenhuang5@huawei.com>,
	"Bodeddula, Balasubramaniam" <bodeddub@amazon.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Xiongchun duan <duanxiongchun@bytedance.com>,
	fam.zheng@bytedance.com, linux-doc@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Qi Zheng <zhengqi.arch@bytedance.com>
Subject: Re: [PATCH 3/5] mm: hugetlb: free the 2nd vmemmap page associated with each HugeTLB page
Date: Tue, 27 Jul 2021 15:15:52 +0800	[thread overview]
Message-ID: <CAMZfGtUKKcduZb5w7NED53Ke8UwNtiNZRp8ttbUb=TH-K1zgPA@mail.gmail.com> (raw)
In-Reply-To: <YP8mKV4wTp5sPIZg@casper.infradead.org>

On Tue, Jul 27, 2021 at 5:17 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Jul 14, 2021 at 05:17:58PM +0800, Muchun Song wrote:
> > +#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
> > +extern bool hugetlb_free_vmemmap_enabled;
> > +
> > +/*
> > + * If the feature of freeing some vmemmap pages associated with each HugeTLB
> > + * page is enabled, the head vmemmap page frame is reused and all of the tail
> > + * vmemmap addresses map to the head vmemmap page frame (furture details can
> > + * refer to the figure at the head of the mm/hugetlb_vmemmap.c).  In other
> > + * word, there are more than one page struct with PG_head associated with each
> > + * HugeTLB page.  We __know__ that there is only one head page struct, the tail
> > + * page structs with PG_head are fake head page structs.  We need an approach
> > + * to distinguish between those two different types of page structs so that
> > + * compound_head() can return the real head page struct when the parameter is
> > + * the tail page struct but with PG_head. This is what page_head_if_fake()
> > + * does.
> > + *
> > + * The page_head_if_fake() returns the real head page struct iff the @page may
> > + * be fake, otherwise, returns NULL if the @page cannot be a fake page struct.
> > + * The following figure describes how to distinguish between real and fake head
> > + * page struct.
> > + *
> > + *   if (test_bit(PG_head, &page->flags)) {
> > + *           unsigned long head = READ_ONCE(page[1].compound_head);
> > + *
> > + *           if (head & 1) {
> > + *                   if (head == (unsigned long)page + 1)
> > + *                           ==> head page struct
> > + *                   else
> > + *                           ==> tail page struct
> > + *           } else
> > + *                   ==> head page struct
> > + *   } else
> > + *           ==> cannot be fake head page struct
>
> I'm not sure we need the pseudocode when the code is right there ...

Maybe it is redundant. I'll remove this in the next version.

>
> > + * We can safely access the field of the @page[1] with PG_head because it means
> > + * that the @page is a compound page composed with at least two contiguous
> > + * pages.
> > + */
> > +static __always_inline struct page *page_head_if_fake(const struct page *page)
> > +{
> > +     if (!hugetlb_free_vmemmap_enabled)
> > +             return NULL;
> > +
> > +     /*
> > +      * Only addresses aligned with PAGE_SIZE of struct page may be fake head
> > +      * struct page. The alignment check aims to avoid access the fields (
> > +      * e.g. compound_head) of the @page[1]. It can avoid touch a (possibly)
> > +      * cold cacheline in some cases.
> > +      */
> > +     if (IS_ALIGNED((unsigned long)page, PAGE_SIZE) &&
> > +         test_bit(PG_head, &page->flags)) {
> > +             unsigned long head = READ_ONCE(page[1].compound_head);
> > +
> > +             if (likely(head & 1))
> > +                     return (struct page *)(head - 1);
> > +     }
> > +
> > +     return NULL;
> > +}
>
> Why return 'NULL' instead of 'page'?

Returning @page is also fine. Will do in the next version.

>
> This is going to significantly increase the cost of calling
> compound_page() (by whichever spelling it has).  That will make
> the folio patchset more compelling ;-)

As Mike mentationed, do you have any recommended
benchmark (suspect you have a lot of experience on
this)?

Thanks.

  parent reply	other threads:[~2021-07-27  7:16 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-14  9:17 [PATCH 0/5] Free the 2nd vmemmap page associated with each HugeTLB page Muchun Song
2021-07-14  9:17 ` [PATCH 1/5] mm: introduce PAGEFLAGS_MASK to replace ((1UL << NR_PAGEFLAGS) - 1) Muchun Song
2021-07-26 21:04   ` Mike Kravetz
2021-07-27  6:27     ` Muchun Song
2021-07-27  6:27       ` Muchun Song
2021-07-29  6:00       ` Muchun Song
2021-07-29  6:00         ` Muchun Song
2021-07-14  9:17 ` [PATCH 2/5] mm: introduce save_page_flags to cooperate with show_page_flags Muchun Song
2021-07-26 23:18   ` Mike Kravetz
2021-07-27  7:06     ` Muchun Song
2021-07-27  7:06       ` Muchun Song
2021-07-14  9:17 ` [PATCH 3/5] mm: hugetlb: free the 2nd vmemmap page associated with each HugeTLB page Muchun Song
2021-07-26 21:16   ` Matthew Wilcox
2021-07-26 23:56     ` Mike Kravetz
2021-07-27  7:15     ` Muchun Song [this message]
2021-07-27  7:15       ` Muchun Song
2021-07-14  9:17 ` [PATCH 4/5] mm: hugetlb: replace hugetlb_free_vmemmap_enabled with a static_key Muchun Song
2021-07-14  9:18 ` [PATCH 5/5] mm: sparsemem: use page table lock to protect kernel pmd operations Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMZfGtUKKcduZb5w7NED53Ke8UwNtiNZRp8ttbUb=TH-K1zgPA@mail.gmail.com' \
    --to=songmuchun@bytedance.com \
    --cc=akpm@linux-foundation.org \
    --cc=bodeddub@amazon.com \
    --cc=chenhuang5@huawei.com \
    --cc=corbet@lwn.net \
    --cc=david@redhat.com \
    --cc=duanxiongchun@bytedance.com \
    --cc=fam.zheng@bytedance.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=osalvador@suse.de \
    --cc=song.bao.hua@hisilicon.com \
    --cc=willy@infradead.org \
    --cc=zhengqi.arch@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.