From: Muchun Song <songmuchun@bytedance.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>,
Thomas Gleixner <tglx@linutronix.de>,
mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
dave.hansen@linux.intel.com, luto@kernel.org,
Peter Zijlstra <peterz@infradead.org>,
viro@zeniv.linux.org.uk,
Andrew Morton <akpm@linux-foundation.org>,
paulmck@kernel.org, mchehab+huawei@kernel.org,
pawan.kumar.gupta@linux.intel.com,
Randy Dunlap <rdunlap@infradead.org>,
oneukum@suse.com, anshuman.khandual@arm.com, jroedel@suse.de,
Mina Almasry <almasrymina@google.com>,
David Rientjes <rientjes@google.com>,
Matthew Wilcox <willy@infradead.org>,
Oscar Salvador <osalvador@suse.de>,
Michal Hocko <mhocko@suse.com>,
Xiongchun duan <duanxiongchun@bytedance.com>,
linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
Linux Memory Management List <linux-mm@kvack.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [External] Re: [PATCH v4 04/21] mm/hugetlb: Introduce nr_free_vmemmap_pages in the struct hstate
Date: Thu, 19 Nov 2020 11:00:07 +0800 [thread overview]
Message-ID: <CAMZfGtXRqTpqJoGonMdTcE4HjPEy98FBFiry3Rry5=Jpfen1xw@mail.gmail.com> (raw)
In-Reply-To: <88af8545-14b7-08de-f121-e12295d5d5b9@oracle.com>
On Thu, Nov 19, 2020 at 7:48 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 11/13/20 2:59 AM, Muchun Song wrote:
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > new file mode 100644
> > index 000000000000..a6c9948302e2
> > --- /dev/null
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -0,0 +1,108 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Free some vmemmap pages of HugeTLB
> > + *
> > + * Copyright (c) 2020, Bytedance. All rights reserved.
> > + *
> > + * Author: Muchun Song <songmuchun@bytedance.com>
> > + *
>
> Oscar has already made some suggestions to change comments. I would suggest
> changing the below text to something like the following.
Thanks Mike. I will change the below comments.
>
> > + * Nowadays we track the status of physical page frames using struct page
> > + * structures arranged in one or more arrays. And here exists one-to-one
> > + * mapping between the physical page frame and the corresponding struct page
> > + * structure.
> > + *
> > + * The HugeTLB support is built on top of multiple page size support that
> > + * is provided by most modern architectures. For example, x86 CPUs normally
> > + * support 4K and 2M (1G if architecturally supported) page sizes. Every
> > + * HugeTLB has more than one struct page structure. The 2M HugeTLB has 512
> > + * struct page structure and 1G HugeTLB has 4096 struct page structures. But
> > + * in the core of HugeTLB only uses the first 4 (Use of first 4 struct page
> > + * structures comes from HUGETLB_CGROUP_MIN_ORDER.) struct page structures to
> > + * store metadata associated with each HugeTLB. The rest of the struct page
> > + * structures are usually read the compound_head field which are all the same
> > + * value. If we can free some struct page memory to buddy system so that we
> > + * can save a lot of memory.
> > + *
>
> struct page structures (page structs) are used to describe a physical page
> frame. By default, there is a one-to-one mapping from a page frame to
> it's corresponding page struct.
>
> HugeTLB pages consist of multiple base page size pages and is supported by
> many architectures. See hugetlbpage.rst in the Documentation directory for
> more details. On the x86 architecture, HugeTLB pages of size 2MB and 1GB
> are currently supported. Since the base page size on x86 is 4KB, a 2MB
> HugeTLB page consists of 512 base pages and a 1GB HugeTLB page consists of
> 4096 base pages. For each base page, there is a corresponding page struct.
>
> Within the HugeTLB subsystem, only the first 4 page structs are used to
> contain unique information about a HugeTLB page. HUGETLB_CGROUP_MIN_ORDER
> provides this upper limit. The only 'useful' information in the remaining
> page structs is the compound_head field, and this field is the same for all
> tail pages.
>
> By removing redundant page structs for HugeTLB pages, memory can returned
> to the buddy allocator for other uses.
>
> > + * When the system boot up, every 2M HugeTLB has 512 struct page structures
> > + * which size is 8 pages(sizeof(struct page) * 512 / PAGE_SIZE).
> > + *
> > + * HugeTLB struct pages(8 pages) page frame(8 pages)
> > + * +-----------+ ---virt_to_page---> +-----------+ mapping to +-----------+
> > + * | | | 0 | -------------> | 0 |
> > + * | | | 1 | -------------> | 1 |
> > + * | | | 2 | -------------> | 2 |
> > + * | | | 3 | -------------> | 3 |
> > + * | | | 4 | -------------> | 4 |
> > + * | 2M | | 5 | -------------> | 5 |
> > + * | | | 6 | -------------> | 6 |
> > + * | | | 7 | -------------> | 7 |
> > + * | | +-----------+ +-----------+
> > + * | |
> > + * | |
> > + * +-----------+
> > + *
> > + *
>
> I think we want the description before the next diagram.
>
> Reworded description here:
>
> The value of compound_head is the same for all tail pages. The first page of
> page structs (page 0) associated with the HugeTLB page contains the 4 page
> structs necessary to describe the HugeTLB. The only use of the remaining pages
> of page structs (page 1 to page 7) is to point to compound_head. Therefore,
> we can remap pages 2 to 7 to page 1. Only 2 pages of page structs will be used
> for each HugeTLB page. This will allow us to free the remaining 6 pages to
> the buddy allocator.
>
> Here is how things look after remapping.
>
> > + *
> > + * HugeTLB struct pages(8 pages) page frame(8 pages)
> > + * +-----------+ ---virt_to_page---> +-----------+ mapping to +-----------+
> > + * | | | 0 | -------------> | 0 |
> > + * | | | 1 | -------------> | 1 |
> > + * | | | 2 | -------------> +-----------+
> > + * | | | 3 | -----------------^ ^ ^ ^ ^
> > + * | | | 4 | -------------------+ | | |
> > + * | 2M | | 5 | ---------------------+ | |
> > + * | | | 6 | -----------------------+ |
> > + * | | | 7 | -------------------------+
> > + * | | +-----------+
> > + * | |
> > + * | |
> > + * +-----------+
>
> --
> Mike Kravetz
--
Yours,
Muchun
next prev parent reply other threads:[~2020-11-19 3:00 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-13 10:59 [PATCH v4 00/21] Free some vmemmap pages of hugetlb page Muchun Song
2020-11-13 10:59 ` [PATCH v4 01/21] mm/memory_hotplug: Move bootmem info registration API to bootmem_info.c Muchun Song
2020-11-16 13:50 ` Oscar Salvador
2020-11-13 10:59 ` [PATCH v4 02/21] mm/memory_hotplug: Move {get,put}_page_bootmem() " Muchun Song
2020-11-16 13:52 ` Oscar Salvador
2020-11-13 10:59 ` [PATCH v4 03/21] mm/hugetlb: Introduce a new config HUGETLB_PAGE_FREE_VMEMMAP Muchun Song
2020-11-18 22:38 ` Mike Kravetz
2020-11-19 2:57 ` [External] " Muchun Song
2020-11-13 10:59 ` [PATCH v4 04/21] mm/hugetlb: Introduce nr_free_vmemmap_pages in the struct hstate Muchun Song
2020-11-16 13:33 ` Oscar Salvador
2020-11-16 15:40 ` [External] " Muchun Song
2020-11-18 22:54 ` Mike Kravetz
2020-11-18 23:48 ` Mike Kravetz
2020-11-19 3:00 ` Muchun Song [this message]
2020-11-13 10:59 ` [PATCH v4 05/21] mm/hugetlb: Introduce pgtable allocation/freeing helpers Muchun Song
2020-11-17 15:06 ` Oscar Salvador
2020-11-17 15:29 ` [External] " Muchun Song
2020-11-19 6:17 ` Muchun Song
2020-11-19 23:21 ` Mike Kravetz
2020-11-20 2:52 ` Muchun Song
2020-11-19 23:37 ` Mike Kravetz
2020-11-13 10:59 ` [PATCH v4 06/21] mm/bootmem_info: Introduce {free,prepare}_vmemmap_page() Muchun Song
2020-11-13 10:59 ` [PATCH v4 07/21] mm/bootmem_info: Combine bootmem info and type into page->freelist Muchun Song
2020-11-13 10:59 ` [PATCH v4 08/21] mm/hugetlb: Initialize page table lock for vmemmap Muchun Song
2020-11-13 10:59 ` [PATCH v4 09/21] mm/hugetlb: Free the vmemmap pages associated with each hugetlb page Muchun Song
2020-11-17 9:54 ` Song Bao Hua (Barry Song)
2020-11-17 10:26 ` [External] " Muchun Song
2020-11-18 3:21 ` Song Bao Hua (Barry Song)
2020-11-13 10:59 ` [PATCH v4 10/21] mm/hugetlb: Defer freeing of hugetlb pages Muchun Song
2020-11-13 10:59 ` [PATCH v4 11/21] mm/hugetlb: Allocate the vmemmap pages associated with each hugetlb page Muchun Song
2020-11-13 10:59 ` [PATCH v4 12/21] mm/hugetlb: Introduce remap_huge_page_pmd_vmemmap helper Muchun Song
2020-11-13 10:59 ` [PATCH v4 13/21] mm/hugetlb: Use PG_slab to indicate split pmd Muchun Song
2020-11-13 10:59 ` [PATCH v4 14/21] mm/hugetlb: Support freeing vmemmap pages of gigantic page Muchun Song
2020-11-13 10:59 ` [PATCH v4 15/21] mm/hugetlb: Set the PageHWPoison to the raw error page Muchun Song
2020-11-13 10:59 ` [PATCH v4 16/21] mm/hugetlb: Flush work when dissolving hugetlb page Muchun Song
2020-11-13 10:59 ` [PATCH v4 17/21] mm/hugetlb: Add a kernel parameter hugetlb_free_vmemmap Muchun Song
2020-11-13 10:59 ` [PATCH v4 18/21] mm/hugetlb: Merge pte to huge pmd only for gigantic page Muchun Song
2020-11-13 10:59 ` [PATCH v4 19/21] mm/hugetlb: Gather discrete indexes of tail page Muchun Song
2020-11-13 10:59 ` [PATCH v4 20/21] mm/hugetlb: Add BUILD_BUG_ON to catch invalid usage of tail struct page Muchun Song
2020-11-13 10:59 ` [PATCH v4 21/21] mm/hugetlb: Disable freeing vmemmap if struct page size is not power of two Muchun Song
2020-11-17 10:15 ` [PATCH v4 00/21] Free some vmemmap pages of hugetlb page Song Bao Hua (Barry Song)
2020-11-17 10:49 ` [External] " Muchun Song
2020-11-17 11:07 ` Song Bao Hua (Barry Song)
2020-11-17 16:29 ` Muchun Song
2020-11-17 19:22 ` Matthew Wilcox
2020-11-18 2:43 ` Muchun Song
2020-11-17 19:45 ` Oscar Salvador
2020-11-18 3:27 ` Muchun Song
2020-11-18 3:27 ` Song Bao Hua (Barry Song)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAMZfGtXRqTpqJoGonMdTcE4HjPEy98FBFiry3Rry5=Jpfen1xw@mail.gmail.com' \
--to=songmuchun@bytedance.com \
--cc=akpm@linux-foundation.org \
--cc=almasrymina@google.com \
--cc=anshuman.khandual@arm.com \
--cc=bp@alien8.de \
--cc=corbet@lwn.net \
--cc=dave.hansen@linux.intel.com \
--cc=duanxiongchun@bytedance.com \
--cc=hpa@zytor.com \
--cc=jroedel@suse.de \
--cc=linux-doc@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=luto@kernel.org \
--cc=mchehab+huawei@kernel.org \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=mingo@redhat.com \
--cc=oneukum@suse.com \
--cc=osalvador@suse.de \
--cc=paulmck@kernel.org \
--cc=pawan.kumar.gupta@linux.intel.com \
--cc=peterz@infradead.org \
--cc=rdunlap@infradead.org \
--cc=rientjes@google.com \
--cc=tglx@linutronix.de \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).