linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Muchun Song <songmuchun@bytedance.com>,
	akpm@linux-foundation.org, osalvador@suse.de, mhocko@suse.com,
	song.bao.hua@hisilicon.com, david@redhat.com,
	chenhuang5@huawei.com, bodeddub@amazon.com, corbet@lwn.net
Cc: duanxiongchun@bytedance.com, fam.zheng@bytedance.com,
	zhengqi.arch@bytedance.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 2/5] mm: hugetlb: introduce helpers to preallocate page tables from bootmem allocator
Date: Thu, 10 Jun 2021 15:13:03 -0700	[thread overview]
Message-ID: <7c04420e-c564-7e19-6c2e-ea81dd51d396@oracle.com> (raw)
In-Reply-To: <20210609121310.62229-3-songmuchun@bytedance.com>

On 6/9/21 5:13 AM, Muchun Song wrote:
> If we want to split the huge PMD of vmemmap pages associated with each
> gigantic page allocated from bootmem allocator, we should pre-allocate
> the page tables from bootmem allocator.

Just curious why this is necessary and a good idea?  Why not wait until
the gigantic pages allocated from bootmem are added to the pool to
allocate any necessary vmemmmap pages?

> the page tables from bootmem allocator. In this patch, we introduce
> some helpers to preallocate page tables for gigantic pages.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> ---
>  include/linux/hugetlb.h |  3 +++
>  mm/hugetlb_vmemmap.c    | 63 +++++++++++++++++++++++++++++++++++++++++++++++++
>  mm/hugetlb_vmemmap.h    | 13 ++++++++++
>  3 files changed, 79 insertions(+)
> 
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 03ca83db0a3e..c27a299c4211 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -622,6 +622,9 @@ struct hstate {
>  struct huge_bootmem_page {
>  	struct list_head list;
>  	struct hstate *hstate;
> +#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
> +	pte_t *vmemmap_pte;
> +#endif
>  };
>  
>  int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index 628e2752714f..6f3a47b4ebd3 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -171,6 +171,7 @@
>  #define pr_fmt(fmt)	"HugeTLB: " fmt
>  
>  #include <linux/list.h>
> +#include <linux/memblock.h>
>  #include <asm/pgalloc.h>
>  
>  #include "hugetlb_vmemmap.h"
> @@ -263,6 +264,68 @@ int vmemmap_pgtable_prealloc(struct hstate *h, struct list_head *pgtables)
>  	return -ENOMEM;
>  }
>  
> +unsigned long __init gigantic_vmemmap_pgtable_prealloc(void)
> +{
> +	struct huge_bootmem_page *m, *tmp;
> +	unsigned long nr_free = 0;
> +
> +	list_for_each_entry_safe(m, tmp, &huge_boot_pages, list) {
> +		struct hstate *h = m->hstate;
> +		unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h);
> +		unsigned long size;
> +
> +		if (!nr)
> +			continue;
> +
> +		size = nr << PAGE_SHIFT;
> +		m->vmemmap_pte = memblock_alloc_try_nid(size, PAGE_SIZE, 0,
> +							MEMBLOCK_ALLOC_ACCESSIBLE,
> +							NUMA_NO_NODE);
> +		if (!m->vmemmap_pte) {
> +			nr_free++;
> +			list_del(&m->list);
> +			memblock_free_early(__pa(m), huge_page_size(h));

If we can not allocate the vmmmemap pages to split the PMD, then we will
not add the huge page to the pool.  Correct?

Perhaps I am thinking about this incorrectly, but this seems wrong.  We
already have everything we need to add the page to the pool.  vmemmap
reduction is an optimization.  So, the allocation failure is associated
with an optimization.  In this case, it seems like we should just skip
the optimization (vmemmap reduction) and proceed to add the page to the
pool?  It seems we do the same thing in subsequent patches.

Again, I could be thinking about this incorrectly.
-- 
Mike Kravetz

  reply	other threads:[~2021-06-10 22:13 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-09 12:13 [PATCH 0/5] Split huge PMD mapping of vmemmap pages Muchun Song
2021-06-09 12:13 ` [PATCH 1/5] mm: hugetlb: introduce helpers to preallocate/free page tables Muchun Song
2021-06-10 21:49   ` Mike Kravetz
2021-06-09 12:13 ` [PATCH 2/5] mm: hugetlb: introduce helpers to preallocate page tables from bootmem allocator Muchun Song
2021-06-10 22:13   ` Mike Kravetz [this message]
2021-06-09 12:13 ` [PATCH 3/5] mm: sparsemem: split the huge PMD mapping of vmemmap pages Muchun Song
2021-06-10 22:35   ` Mike Kravetz
2021-06-11  7:52     ` [External] " Muchun Song
2021-06-11 12:35       ` Muchun Song
2021-06-09 12:13 ` [PATCH 4/5] mm: sparsemem: use huge PMD mapping for " Muchun Song
2021-06-10 22:49   ` Mike Kravetz
2021-06-09 12:13 ` [PATCH 5/5] mm: hugetlb: introduce CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON Muchun Song
2021-06-10 21:32 ` [PATCH 0/5] Split huge PMD mapping of vmemmap pages Mike Kravetz
2021-06-11  3:23   ` [External] " Muchun Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7c04420e-c564-7e19-6c2e-ea81dd51d396@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=bodeddub@amazon.com \
    --cc=chenhuang5@huawei.com \
    --cc=corbet@lwn.net \
    --cc=david@redhat.com \
    --cc=duanxiongchun@bytedance.com \
    --cc=fam.zheng@bytedance.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=song.bao.hua@hisilicon.com \
    --cc=songmuchun@bytedance.com \
    --cc=zhengqi.arch@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).