linux-mm.kvack.org archive mirror
From: Mike Kravetz <mike.kravetz@oracle.com>
To: David Rientjes <rientjes@google.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, muchun.song@linux.dev,
	souravpanda@google.com, Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocaction gurantees
Date: Wed, 12 Apr 2023 12:57:23 -0700
Message-ID: <20230412195723.GA4759@monkey>
In-Reply-To: <63736432-5cef-f67c-c809-cc19b236a7f4@google.com>

On 04/12/23 10:54, David Rientjes wrote:
> On Wed, 12 Apr 2023, Pasha Tatashin wrote:
> 
> > HugeTLB pages have a struct page optimization where struct pages for tail
> > pages are freed. However, when HugeTLB pages are destroyed, the memory for
> > struct pages (vmemmap) needs to be allocated again.
> > 
> > Currently, the __GFP_NORETRY flag is used to allocate the memory for vmemmap,
> > but given that this flag makes very little effort to actually reclaim
> > memory, returning huge pages back to the system can be a problem. Let's
> > use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful
> > reclaim without causing OOMs, but at least it may perform a few retries,
> > and will fail only when there is genuinely little unused memory
> > in the system.
> > 
> 
> Thanks Pasha, this definitely makes sense.  We want to free the hugetlb 
> page back to the system so it would be a shame to have to strand it in the 
> hugetlb pool because we can't allocate the tail pages (we want to free 
> more memory than we're allocating).

Agree.

The hugetlb vmemmap freeing series went through more than 20 revisions
before being merged.  One issue with much discussion was the need to
allocate vmemmap pages when hugetlb pages were returned to buddy.

It looks like the current set of GFP flags was suggested here:
https://lore.kernel.org/linux-mm/YC4ji+pMhtOs+KVM@dhcp22.suse.cz/

Although, it was also mentioned that __GFP_RETRY_MAYFAIL could be used
instead of __GFP_NORETRY here:
https://lore.kernel.org/linux-mm/YCafit5ruRJ+SL8I@dhcp22.suse.cz/

Adding Michal on Cc: since these were his suggestions.

> 
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > Suggested-by: David Rientjes <rientjes@google.com>
> > ---
> >  mm/hugetlb_vmemmap.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index a559037cce00..c4226d2af7cc 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
> >  	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
> >  	 * When a HugeTLB page is freed to the buddy allocator, previously
> >  	 * discarded vmemmap pages must be allocated and remapping.
> > +	 *
> > +	 * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> > +	 * unused memory in the system.
> >  	 */
> >  	ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> > -				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> > +				  GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
> >  	if (!ret) {
> >  		ClearHPageVmemmapOptimized(head);
> >  		static_branch_dec(&hugetlb_optimize_vmemmap_key);
> 
> The behavior of __GFP_RETRY_MAYFAIL is different for high-order memory (at 
> least larger than PAGE_ALLOC_COSTLY_ORDER).  The order that we're 
> allocating would depend on the implementation of alloc_vmemmap_page_list() 
> so likely best to move the gfp mask to that function.

Good point.

-- 
Mike Kravetz

