All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: Muchun Song <songmuchun@bytedance.com>
Cc: corbet@lwn.net, mike.kravetz@oracle.com, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
	dave.hansen@linux.intel.com, luto@kernel.org,
	peterz@infradead.org, viro@zeniv.linux.org.uk,
	akpm@linux-foundation.org, paulmck@kernel.org,
	mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com,
	rdunlap@infradead.org, oneukum@suse.com,
	anshuman.khandual@arm.com, jroedel@suse.de,
	almasrymina@google.com, rientjes@google.com, willy@infradead.org,
	mhocko@suse.com, song.bao.hua@hisilicon.com, david@redhat.com,
	naoya.horiguchi@nec.com, joao.m.martins@oracle.com,
	duanxiongchun@bytedance.com, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, Chen Huang <chenhuang5@huawei.com>,
	Bodeddula Balasubramaniam <bodeddub@amazon.com>
Subject: Re: [PATCH v18 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page
Date: Wed, 10 Mar 2021 15:21:02 +0100	[thread overview]
Message-ID: <20210310142057.GA12777@linux> (raw)
In-Reply-To: <20210308102807.59745-5-songmuchun@bytedance.com>

On Mon, Mar 08, 2021 at 06:28:02PM +0800, Muchun Song wrote:
> When we free a HugeTLB page to the buddy allocator, we need to allocate
> the vmemmap pages associated with it. However, we may not be able to
> allocate the vmemmap pages when the system is under memory pressure. In
> this case, we just refuse to free the HugeTLB page. This changes behavior
> in some corner cases as listed below:
> 
>  1) Failing to free a huge page triggered by the user (decrease nr_pages).
> 
>     User needs to try again later.
> 
>  2) Failing to free a surplus huge page when freed by the application.
> 
>     Try again later when freeing a huge page next time.
> 
>  3) Failing to dissolve a free huge page on ZONE_MOVABLE via
>     offline_pages().
> 
>     This can happen when we have plenty of ZONE_MOVABLE memory, but
>     not enough kernel memory to allocate vmemmmap pages.  We may even
>     be able to migrate huge page contents, but will not be able to
>     dissolve the source huge page.  This will prevent an offline
>     operation and is unfortunate as memory offlining is expected to
>     succeed on movable zones.  Users that depend on memory hotplug
>     to succeed for movable zones should carefully consider whether the
>     memory savings gained from this feature are worth the risk of
>     possibly not being able to offline memory in certain situations.

This is nice to have it here, but a normal user won't dig in the kernel to
figure this out, so my question is: Do we have this documented somewhere under
Documentation/?
If not, could we document it there? It is nice to warn about this things were
sysadmins can find them.

>  4) Failing to dissolve a huge page on CMA/ZONE_MOVABLE via
>     alloc_contig_range() - once we have that handling in place. Mainly
>     affects CMA and virtio-mem.
> 
>     Similar to 3). virito-mem will handle migration errors gracefully.
>     CMA might be able to fallback on other free areas within the CMA
>     region.
> 
> Vmemmap pages are allocated from the page freeing context. In order for
> those allocations to be not disruptive (e.g. trigger oom killer)
> __GFP_NORETRY is used. hugetlb_lock is dropped for the allocation
> because a non sleeping allocation would be too fragile and it could fail
> too easily under memory pressure. GFP_ATOMIC or other modes to access
> memory reserves is not used because we want to prevent consuming
> reserves under heavy hugetlb freeing.
> 
> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> Tested-by: Chen Huang <chenhuang5@huawei.com>
> Tested-by: Bodeddula Balasubramaniam <bodeddub@amazon.com>

Sorry for jumping in late.
It looks good to me:

Reviewed-by: Oscar Salvador <osalvador@suse.de>

Minor request above and below:

> ---
>  Documentation/admin-guide/mm/hugetlbpage.rst |  8 +++
>  include/linux/mm.h                           |  2 +
>  mm/hugetlb.c                                 | 92 +++++++++++++++++++++-------
>  mm/hugetlb_vmemmap.c                         | 32 ++++++----
>  mm/hugetlb_vmemmap.h                         | 23 +++++++
>  mm/sparse-vmemmap.c                          | 75 ++++++++++++++++++++++-
>  6 files changed, 197 insertions(+), 35 deletions(-)

[...]



Could we place a brief comment about what we expect to return here?

> -static inline unsigned long free_vmemmap_pages_size_per_hpage(struct hstate *h)
> +int alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
>  {
> -	return (unsigned long)free_vmemmap_pages_per_hpage(h) << PAGE_SHIFT;
> +	unsigned long vmemmap_addr = (unsigned long)head;
> +	unsigned long vmemmap_end, vmemmap_reuse;
> +
> +	if (!free_vmemmap_pages_per_hpage(h))
> +		return 0;
> +
> +	vmemmap_addr += RESERVE_VMEMMAP_SIZE;
> +	vmemmap_end = vmemmap_addr + free_vmemmap_pages_size_per_hpage(h);
> +	vmemmap_reuse = vmemmap_addr - PAGE_SIZE;
> +	/*
> +	 * The pages which the vmemmap virtual address range [@vmemmap_addr,
> +	 * @vmemmap_end) are mapped to are freed to the buddy allocator, and
> +	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
> +	 * When a HugeTLB page is freed to the buddy allocator, previously
> +	 * discarded vmemmap pages must be allocated and remapping.
> +	 */
> +	return vmemmap_remap_alloc(vmemmap_addr, vmemmap_end, vmemmap_reuse,
> +				   GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
>  }

-- 
Oscar Salvador
SUSE L3

  reply	other threads:[~2021-03-10 14:21 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-08 10:27 [PATCH v18 0/9] Free some vmemmap pages of HugeTLB page Muchun Song
2021-03-08 10:27 ` [PATCH v18 1/9] mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c Muchun Song
2021-03-10 14:14   ` Michal Hocko
2021-03-11  2:58     ` [External] " Muchun Song
2021-03-11  8:45       ` Muchun Song
2021-03-11  8:53         ` Michal Hocko
2021-03-11  9:05           ` Muchun Song
2021-03-08 10:28 ` [PATCH v18 2/9] mm: hugetlb: introduce a new config HUGETLB_PAGE_FREE_VMEMMAP Muchun Song
2021-03-08 10:28 ` [PATCH v18 3/9] mm: hugetlb: free the vmemmap pages associated with each HugeTLB page Muchun Song
2021-03-10 14:32   ` Michal Hocko
2021-03-11  3:35     ` [External] " Muchun Song
2021-03-08 10:28 ` [PATCH v18 4/9] mm: hugetlb: alloc " Muchun Song
2021-03-10 14:21   ` Oscar Salvador [this message]
2021-03-11  4:13     ` [External] " Muchun Song
2021-03-10 15:19   ` Michal Hocko
2021-03-10 18:56     ` Mike Kravetz
2021-03-10 21:11       ` Michal Hocko
2021-03-10 21:49         ` Paul E. McKenney
2021-03-10 22:10           ` Mike Kravetz
2021-03-10 23:28             ` Paul E. McKenney
2021-03-11  8:40               ` Michal Hocko
2021-03-11 12:17                 ` Michal Hocko
2021-03-11 17:59                   ` Mike Kravetz
2021-03-11 22:53                     ` Mike Kravetz
2021-03-12  8:15                       ` Michal Hocko
2021-03-12 17:50                         ` Mike Kravetz
2021-03-11  4:26     ` [External] " Muchun Song
2021-03-11  8:46       ` Michal Hocko
2021-03-11  8:49         ` Muchun Song
2021-03-08 10:28 ` [PATCH v18 5/9] mm: hugetlb: set the PageHWPoison to the raw error page Muchun Song
2021-03-10 15:27   ` Michal Hocko
2021-03-11  6:34     ` [External] " Muchun Song
2021-03-11  8:50       ` Michal Hocko
2021-03-11  9:13         ` Muchun Song
2021-03-08 10:28 ` [PATCH v18 6/9] mm: hugetlb: add a kernel parameter hugetlb_free_vmemmap Muchun Song
2021-03-10 15:37   ` Michal Hocko
2021-03-10 17:15     ` Randy Dunlap
2021-03-11  6:36       ` [External] " Muchun Song
2021-03-11  6:36     ` Muchun Song
2021-03-08 10:28 ` [PATCH v18 7/9] mm: hugetlb: introduce nr_free_vmemmap_pages in the struct hstate Muchun Song
2021-03-08 10:28 ` [PATCH v18 8/9] mm: hugetlb: gather discrete indexes of tail page Muchun Song
2021-03-10 15:39   ` Michal Hocko
2021-03-08 10:28 ` [PATCH v18 9/9] mm: hugetlb: optimize the code with the help of the compiler Muchun Song
2021-03-10 15:41   ` Michal Hocko
2021-03-11  7:33     ` [External] " Muchun Song
2021-03-11  8:55       ` Michal Hocko
2021-03-11  9:08         ` Muchun Song
2021-03-11  9:39           ` Michal Hocko
2021-03-11 10:00             ` Muchun Song
2021-03-11 12:16               ` Michal Hocko
2021-03-11 13:00                 ` Muchun Song
2021-03-11 13:45                 ` Oscar Salvador

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210310142057.GA12777@linux \
    --to=osalvador@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=almasrymina@google.com \
    --cc=anshuman.khandual@arm.com \
    --cc=bodeddub@amazon.com \
    --cc=bp@alien8.de \
    --cc=chenhuang5@huawei.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=duanxiongchun@bytedance.com \
    --cc=hpa@zytor.com \
    --cc=joao.m.martins@oracle.com \
    --cc=jroedel@suse.de \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mchehab+huawei@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=mingo@redhat.com \
    --cc=naoya.horiguchi@nec.com \
    --cc=oneukum@suse.com \
    --cc=paulmck@kernel.org \
    --cc=pawan.kumar.gupta@linux.intel.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rientjes@google.com \
    --cc=song.bao.hua@hisilicon.com \
    --cc=songmuchun@bytedance.com \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.