linux-mm.kvack.org archive mirror
* [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees
@ 2023-04-12 15:23 Pasha Tatashin
  2023-04-12 17:54 ` David Rientjes
  0 siblings, 1 reply; 5+ messages in thread
From: Pasha Tatashin @ 2023-04-12 15:23 UTC (permalink / raw)
  To: pasha.tatashin, linux-kernel, linux-mm, akpm, mike.kravetz,
	muchun.song, rientjes, souravpanda

HugeTLB pages have a struct page optimization in which the struct pages
for tail pages are freed. However, when HugeTLB pages are destroyed, the
memory for those struct pages (vmemmap) needs to be allocated again.

Currently, the __GFP_NORETRY flag is used to allocate the memory for
vmemmap, but because this flag makes very little effort to actually
reclaim memory, returning huge pages to the system can be a problem.
Let's use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful
reclaim without causing OOMs, but it may at least perform a few retries,
and it will fail only when there is genuinely little unused memory in
the system.

Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Suggested-by: David Rientjes <rientjes@google.com>
---
 mm/hugetlb_vmemmap.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index a559037cce00..c4226d2af7cc 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
 	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
 	 * When a HugeTLB page is freed to the buddy allocator, previously
 	 * discarded vmemmap pages must be allocated and remapping.
+	 *
+	 * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
+	 * unused memory in the system.
 	 */
 	ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
-				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
+				  GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
 	if (!ret) {
 		ClearHPageVmemmapOptimized(head);
 		static_branch_dec(&hugetlb_optimize_vmemmap_key);
-- 
2.40.0.577.gac1e443424-goog




* Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees
  2023-04-12 15:23 [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees Pasha Tatashin
@ 2023-04-12 17:54 ` David Rientjes
  2023-04-12 19:57   ` Mike Kravetz
  2023-04-12 19:57   ` Pasha Tatashin
  0 siblings, 2 replies; 5+ messages in thread
From: David Rientjes @ 2023-04-12 17:54 UTC (permalink / raw)
  To: Pasha Tatashin
  Cc: linux-kernel, linux-mm, akpm, mike.kravetz, muchun.song, souravpanda

On Wed, 12 Apr 2023, Pasha Tatashin wrote:

> HugeTLB pages have a struct page optimization in which the struct pages
> for tail pages are freed. However, when HugeTLB pages are destroyed, the
> memory for those struct pages (vmemmap) needs to be allocated again.
> 
> Currently, the __GFP_NORETRY flag is used to allocate the memory for
> vmemmap, but because this flag makes very little effort to actually
> reclaim memory, returning huge pages to the system can be a problem.
> Let's use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful
> reclaim without causing OOMs, but it may at least perform a few retries,
> and it will fail only when there is genuinely little unused memory in
> the system.
> 

Thanks Pasha, this definitely makes sense.  We want to free the hugetlb 
page back to the system so it would be a shame to have to strand it in the 
hugetlb pool because we can't allocate the tail pages (we want to free 
more memory than we're allocating).
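
As a rough illustration of that imbalance, the arithmetic can be sketched
in C. The constants below are assumptions, not values taken from this
patch: 4 KiB base pages, a 64-byte struct page, and one vmemmap page
retained per HugeTLB page. The helper names are hypothetical.

```c
#include <assert.h>

/*
 * Assumptions for illustration only (not taken from the patch):
 * 4 KiB base pages, a 64-byte struct page, and one vmemmap page
 * kept per HugeTLB page while the optimization is active.
 */
#define BASE_PAGE_SIZE      4096UL
#define STRUCT_PAGE_SIZE    64UL
#define VMEMMAP_PAGES_KEPT  1UL

/* Number of vmemmap pages that describe one huge page of the given size. */
static unsigned long vmemmap_pages_total(unsigned long huge_page_size)
{
	unsigned long nr_struct_pages;

	assert(huge_page_size % BASE_PAGE_SIZE == 0);
	nr_struct_pages = huge_page_size / BASE_PAGE_SIZE;
	return nr_struct_pages * STRUCT_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* Pages that must be allocated again before the huge page can be freed. */
static unsigned long vmemmap_pages_to_restore(unsigned long huge_page_size)
{
	return vmemmap_pages_total(huge_page_size) - VMEMMAP_PAGES_KEPT;
}
```

Under these assumptions a 2 MiB page needs 7 vmemmap pages (28 KiB)
allocated before 2 MiB can be returned, and a 1 GiB page needs 4095
(just under 16 MiB).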

> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> Suggested-by: David Rientjes <rientjes@google.com>
> ---
>  mm/hugetlb_vmemmap.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index a559037cce00..c4226d2af7cc 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
>  	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
>  	 * When a HugeTLB page is freed to the buddy allocator, previously
>  	 * discarded vmemmap pages must be allocated and remapping.
> +	 *
> +	 * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> +	 * unused memory in the system.
>  	 */
>  	ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> -				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> +				  GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
>  	if (!ret) {
>  		ClearHPageVmemmapOptimized(head);
>  		static_branch_dec(&hugetlb_optimize_vmemmap_key);

The behavior of __GFP_RETRY_MAYFAIL is different for high-order memory (at 
least larger than PAGE_ALLOC_COSTLY_ORDER).  The order that we're 
allocating would depend on the implementation of alloc_vmemmap_page_list() 
so likely best to move the gfp mask to that function.
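
One way to read this suggestion: if the helper that picks the allocation
order also picks the gfp mask, the retry flag cannot end up paired with an
order nobody considered. The following is a userspace model only, not
kernel code. All MODEL_* constants and model_* names are hypothetical
stand-ins, malloc() stands in for the page allocator, and order 0 is an
assumption about how the helper allocates.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real values are defined by the kernel. */
#define MODEL_GFP_KERNEL         0x1u
#define MODEL_GFP_RETRY_MAYFAIL  0x2u
#define MODEL_GFP_THISNODE       0x4u
#define MODEL_COSTLY_ORDER       3u
#define MODEL_PAGE_SIZE          4096u

struct model_page {
	struct model_page *next;
	void *mem;
};

/* Record the last request so a caller can inspect what was asked for. */
static unsigned int model_last_gfp;
static unsigned int model_last_order;

static void *model_alloc_pages(unsigned int gfp, unsigned int order)
{
	model_last_gfp = gfp;
	model_last_order = order;
	return malloc((size_t)MODEL_PAGE_SIZE << order);
}

static void model_free_page_list(struct model_page *list)
{
	while (list) {
		struct model_page *next = list->next;

		free(list->mem);
		free(list);
		list = next;
	}
}

/*
 * The helper owns both the order and the gfp mask, so callers no longer
 * pass a mask in. Returns a list of nr_pages pages, or NULL if nr_pages
 * is zero or any allocation fails.
 */
static struct model_page *model_alloc_vmemmap_page_list(unsigned long nr_pages)
{
	const unsigned int order = 0;
	const unsigned int gfp = MODEL_GFP_KERNEL | MODEL_GFP_RETRY_MAYFAIL |
				 MODEL_GFP_THISNODE;
	struct model_page *list = NULL;

	/* Keep the flag paired with an order below the costly threshold. */
	assert(order <= MODEL_COSTLY_ORDER);

	while (nr_pages--) {
		struct model_page *page = malloc(sizeof(*page));

		if (!page)
			goto fail;
		page->mem = model_alloc_pages(gfp, order);
		if (!page->mem) {
			free(page);
			goto fail;
		}
		page->next = list;
		list = page;
	}
	return list;

fail:
	model_free_page_list(list);
	return NULL;
}
```

A caller would ask only for a page count, for example
model_alloc_vmemmap_page_list(7), and release the result with
model_free_page_list().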



* Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees
  2023-04-12 17:54 ` David Rientjes
@ 2023-04-12 19:57   ` Mike Kravetz
  2023-04-12 20:00     ` Pasha Tatashin
  2023-04-12 19:57   ` Pasha Tatashin
  1 sibling, 1 reply; 5+ messages in thread
From: Mike Kravetz @ 2023-04-12 19:57 UTC (permalink / raw)
  To: David Rientjes
  Cc: Pasha Tatashin, linux-kernel, linux-mm, akpm, muchun.song,
	souravpanda, Michal Hocko

On 04/12/23 10:54, David Rientjes wrote:
> On Wed, 12 Apr 2023, Pasha Tatashin wrote:
> 
> > HugeTLB pages have a struct page optimization in which the struct pages
> > for tail pages are freed. However, when HugeTLB pages are destroyed, the
> > memory for those struct pages (vmemmap) needs to be allocated again.
> > 
> > Currently, the __GFP_NORETRY flag is used to allocate the memory for
> > vmemmap, but because this flag makes very little effort to actually
> > reclaim memory, returning huge pages to the system can be a problem.
> > Let's use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful
> > reclaim without causing OOMs, but it may at least perform a few retries,
> > and it will fail only when there is genuinely little unused memory in
> > the system.
> > 
> 
> Thanks Pasha, this definitely makes sense.  We want to free the hugetlb 
> page back to the system so it would be a shame to have to strand it in the 
> hugetlb pool because we can't allocate the tail pages (we want to free 
> more memory than we're allocating).

Agree.

The hugetlb vmemmap freeing series went through more than 20 revisions
before being merged.  One issue with much discussion was the need to
allocate vmemmap pages when hugetlb pages were returned to buddy.

It looks like the current set of GFP flags was suggested here:
https://lore.kernel.org/linux-mm/YC4ji+pMhtOs+KVM@dhcp22.suse.cz/

Although, it was also mentioned that __GFP_RETRY_MAYFAIL could be used
instead of __GFP_NORETRY here:
https://lore.kernel.org/linux-mm/YCafit5ruRJ+SL8I@dhcp22.suse.cz/

Adding Michal on Cc: since these were his suggestions.

> 
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > Suggested-by: David Rientjes <rientjes@google.com>
> > ---
> >  mm/hugetlb_vmemmap.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index a559037cce00..c4226d2af7cc 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
> >  	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
> >  	 * When a HugeTLB page is freed to the buddy allocator, previously
> >  	 * discarded vmemmap pages must be allocated and remapping.
> > +	 *
> > +	 * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> > +	 * unused memory in the system.
> >  	 */
> >  	ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> > -				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> > +				  GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
> >  	if (!ret) {
> >  		ClearHPageVmemmapOptimized(head);
> >  		static_branch_dec(&hugetlb_optimize_vmemmap_key);
> 
> The behavior of __GFP_RETRY_MAYFAIL is different for high-order memory (at 
> least larger than PAGE_ALLOC_COSTLY_ORDER).  The order that we're 
> allocating would depend on the implementation of alloc_vmemmap_page_list() 
> so likely best to move the gfp mask to that function.

Good point.

-- 
Mike Kravetz



* Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees
  2023-04-12 17:54 ` David Rientjes
  2023-04-12 19:57   ` Mike Kravetz
@ 2023-04-12 19:57   ` Pasha Tatashin
  1 sibling, 0 replies; 5+ messages in thread
From: Pasha Tatashin @ 2023-04-12 19:57 UTC (permalink / raw)
  To: David Rientjes
  Cc: linux-kernel, linux-mm, akpm, mike.kravetz, muchun.song, souravpanda

On Wed, Apr 12, 2023 at 1:54 PM David Rientjes <rientjes@google.com> wrote:
>
> On Wed, 12 Apr 2023, Pasha Tatashin wrote:
>
> > HugeTLB pages have a struct page optimization in which the struct pages
> > for tail pages are freed. However, when HugeTLB pages are destroyed, the
> > memory for those struct pages (vmemmap) needs to be allocated again.
> >
> > Currently, the __GFP_NORETRY flag is used to allocate the memory for
> > vmemmap, but because this flag makes very little effort to actually
> > reclaim memory, returning huge pages to the system can be a problem.
> > Let's use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful
> > reclaim without causing OOMs, but it may at least perform a few retries,
> > and it will fail only when there is genuinely little unused memory in
> > the system.
> >
>
> Thanks Pasha, this definitely makes sense.  We want to free the hugetlb
> page back to the system so it would be a shame to have to strand it in the
> hugetlb pool because we can't allocate the tail pages (we want to free
> more memory than we're allocating).
>
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com>
> > Suggested-by: David Rientjes <rientjes@google.com>
> > ---
> >  mm/hugetlb_vmemmap.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index a559037cce00..c4226d2af7cc 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
> >        * the range is mapped to the page which @vmemmap_reuse is mapped to.
> >        * When a HugeTLB page is freed to the buddy allocator, previously
> >        * discarded vmemmap pages must be allocated and remapping.
> > +      *
> > +      * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> > +      * unused memory in the system.
> >        */
> >       ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> > -                               GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> > +                               GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
> >       if (!ret) {
> >               ClearHPageVmemmapOptimized(head);
> >               static_branch_dec(&hugetlb_optimize_vmemmap_key);
>
> The behavior of __GFP_RETRY_MAYFAIL is different for high-order memory (at
> least larger than PAGE_ALLOC_COSTLY_ORDER).  The order that we're
> allocating would depend on the implementation of alloc_vmemmap_page_list()
> so likely best to move the gfp mask to that function.

Thank you David. This makes sense, I will send the 2nd version soon.

Pasha



* Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees
  2023-04-12 19:57   ` Mike Kravetz
@ 2023-04-12 20:00     ` Pasha Tatashin
  0 siblings, 0 replies; 5+ messages in thread
From: Pasha Tatashin @ 2023-04-12 20:00 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: David Rientjes, linux-kernel, linux-mm, akpm, muchun.song,
	souravpanda, Michal Hocko

On Wed, Apr 12, 2023 at 3:57 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 04/12/23 10:54, David Rientjes wrote:
> > On Wed, 12 Apr 2023, Pasha Tatashin wrote:
> >
> > > HugeTLB pages have a struct page optimization in which the struct pages
> > > for tail pages are freed. However, when HugeTLB pages are destroyed, the
> > > memory for those struct pages (vmemmap) needs to be allocated again.
> > >
> > > Currently, the __GFP_NORETRY flag is used to allocate the memory for
> > > vmemmap, but because this flag makes very little effort to actually
> > > reclaim memory, returning huge pages to the system can be a problem.
> > > Let's use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful
> > > reclaim without causing OOMs, but it may at least perform a few retries,
> > > and it will fail only when there is genuinely little unused memory in
> > > the system.
> > >
> >
> > Thanks Pasha, this definitely makes sense.  We want to free the hugetlb
> > page back to the system so it would be a shame to have to strand it in the
> > hugetlb pool because we can't allocate the tail pages (we want to free
> > more memory than we're allocating).
>
> Agree.
>
> > The hugetlb vmemmap freeing series went through more than 20 revisions
> before being merged.  One issue with much discussion was the need to
> allocate vmemmap pages when hugetlb pages were returned to buddy.
>
> It looks like the current set of GFP flags was suggested here:
> https://lore.kernel.org/linux-mm/YC4ji+pMhtOs+KVM@dhcp22.suse.cz/
>
> Although, it was also mentioned that __GFP_RETRY_MAYFAIL could be used
> instead of __GFP_NORETRY here:
> https://lore.kernel.org/linux-mm/YCafit5ruRJ+SL8I@dhcp22.suse.cz/
>
> Adding Michal on Cc: since these were his suggestions.

Thank you for the background Mike. I have sent the 2nd version, and
added Michal into that patch.

Pasha

