All of lore.kernel.org
 help / color / mirror / Atom feed
From: Muchun Song <songmuchun@bytedance.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: "Xiongchun duan" <duanxiongchun@bytedance.com>,
	linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	"Linux Memory Management List" <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Mike Kravetz" <mike.kravetz@oracle.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
	dave.hansen@linux.intel.com, luto@kernel.org,
	"Peter Zijlstra" <peterz@infradead.org>,
	viro@zeniv.linux.org.uk,
	"Andrew Morton" <akpm@linux-foundation.org>,
	paulmck@kernel.org, mchehab+huawei@kernel.org,
	pawan.kumar.gupta@linux.intel.com,
	"Randy Dunlap" <rdunlap@infradead.org>,
	oneukum@suse.com, anshuman.khandual@arm.com, jroedel@suse.de,
	"Mina Almasry" <almasrymina@google.com>,
	"David Rientjes" <rientjes@google.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	"Oscar Salvador" <osalvador@suse.de>,
	"Michal Hocko" <mhocko@suse.com>,
	"Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>,
	"David Hildenbrand" <david@redhat.com>,
	"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>
Subject: Re: [External] Re: [PATCH v14 0/8] Free some vmemmap pages of HugeTLB page
Date: Sat, 6 Feb 2021 00:13:22 +0800	[thread overview]
Message-ID: <CAMZfGtXyWkeO9gGKGpEXYA9DA75mMZUaHboTXH6dGxZgEHvMpA@mail.gmail.com> (raw)
In-Reply-To: <a14113c5-08ae-2819-7e24-3d2687ef88da@oracle.com>

On Sat, Feb 6, 2021 at 12:01 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> On 2/4/21 3:50 AM, Muchun Song wrote:
> > Hi all,
> >
>
> [...]
>
> > When a HugeTLB is freed to the buddy system, we should allocate 6 pages for
> > vmemmap pages and restore the previous mapping relationship.
> >
> > Apart from 2MB HugeTLB page, we also have 1GB HugeTLB page. It is similar
> > to the 2MB HugeTLB page. We also can use this approach to free the vmemmap
> > pages.
> >
> > In this case, for the 1GB HugeTLB page, we can save 4094 pages. This is a
> > very substantial gain. On our server, run some SPDK/QEMU applications which
> > will use 1024GB hugetlbpage. With this feature enabled, we can save ~16GB
> > (1G hugepage)/~12GB (2MB hugepage) memory.
> >
> > Because there are vmemmap page tables reconstruction on the freeing/allocating
> > path, it increases some overhead. Here are some overhead analysis.
>
> [...]
>
> > Although the overhead has increased, the overhead is not significant. Like Mike
> > said, "However, remember that the majority of use cases create hugetlb pages at
> > or shortly after boot time and add them to the pool. So, additional overhead is
> > at pool creation time. There is no change to 'normal run time' operations of
> > getting a page from or returning a page to the pool (think page fault/unmap)".
> >
>
> Despite the overhead and in addition to the memory gains from this series ...
> there's an additional benefit there isn't talked here with your vmemmap page
> reuse trick. That is page (un)pinners will see an improvement and I presume because
> there are fewer memmap pages and thus the tail/head pages are staying in cache more
> often.
>
> Out of the box I saw (when comparing linux-next against linux-next + this series)
> with gup_test and pinning a 16G hugetlb file (with 1G pages):
>
>         get_user_pages(): ~32k -> ~9k
>         unpin_user_pages(): ~75k -> ~70k
>
> Usually any tight loop fetching compound_head(), or reading tail pages data (e.g.
> compound_head) benefit a lot. There's some unpinning inefficiencies I am fixing[0], but
> with that in added it shows even more:
>
>         unpin_user_pages(): ~27k -> ~3.8k
>
> FWIW, I was also seeing that with devdax and the ZONE_DEVICE vmemmap page reuse equivalent
> series[1] but it was mixed with other numbers.

It's really a surprise. Thank you very much for the test data.
Very nice. Thanks again.


>
> Anyways, JFYI :)
>
>         Joao
>
> [0] https://lore.kernel.org/linux-mm/20210204202500.26474-1-joao.m.martins@oracle.com/
> [1] https://lore.kernel.org/linux-mm/20201208172901.17384-1-joao.m.martins@oracle.com/

      reply	other threads:[~2021-02-05 23:06 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-04  3:50 [PATCH v14 0/8] Free some vmemmap pages of HugeTLB page Muchun Song
2021-02-04  3:50 ` [PATCH v14 1/8] mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c Muchun Song
2021-02-04  3:50 ` [PATCH v14 2/8] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP Muchun Song
2021-02-04 11:44   ` Miaohe Lin
2021-02-04  3:50 ` [PATCH v14 3/8] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page Muchun Song
2021-02-05  8:54   ` Oscar Salvador
2021-02-05 16:01     ` [External] " Muchun Song
2021-02-04  3:50 ` [PATCH v14 4/8] mm: hugetlb: alloc " Muchun Song
2021-02-05  9:29   ` Muchun Song
2021-02-05 11:54   ` Oscar Salvador
2021-02-06  8:01     ` [External] " Muchun Song
2021-02-04  3:50 ` [PATCH v14 5/8] mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap Muchun Song
2021-02-05  7:25   ` Miaohe Lin
2021-02-04  3:50 ` [PATCH v14 6/8] mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate Muchun Song
2021-02-05  7:29   ` Miaohe Lin
2021-02-05  8:22     ` Oscar Salvador
2021-02-05  8:39       ` Miaohe Lin
2021-02-05  8:56         ` Oscar Salvador
2021-02-05  9:12           ` Miaohe Lin
2021-02-04  3:50 ` [PATCH v14 7/8] mm: hugetlb: gather discrete indexes of tail page Muchun Song
2021-02-05  7:30   ` Miaohe Lin
2021-02-04  3:50 ` [PATCH v14 8/8] mm: hugetlb: optimize the code with the help of the compiler Muchun Song
2021-02-04  6:33   ` Miaohe Lin
2021-02-05  9:09   ` Oscar Salvador
2021-02-05  9:16     ` [External] " Muchun Song
2021-02-05  8:59 ` [PATCH v14 0/8] Free some vmemmap pages of HugeTLB page Oscar Salvador
2021-02-05  9:30   ` [External] " Muchun Song
2021-02-05 16:00 ` Joao Martins
2021-02-05 16:13   ` Muchun Song [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAMZfGtXyWkeO9gGKGpEXYA9DA75mMZUaHboTXH6dGxZgEHvMpA@mail.gmail.com \
    --to=songmuchun@bytedance.com \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=anshuman.khandual@arm.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=duanxiongchun@bytedance.com \
    --cc=hpa@zytor.com \
    --cc=joao.m.martins@oracle.com \
    --cc=jroedel@suse.de \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mchehab+huawei@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=oneukum@suse.com \
    --cc=osalvador@suse.de \
    --cc=paulmck@kernel.org \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rientjes@google.com \
    --cc=song.bao.hua@hisilicon.com \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.