xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [Xen-devel] [PATCH] mm: fix regression with deferred struct page init
@ 2019-06-20 16:08 Juergen Gross
  2019-06-20 16:10 ` Alexander Duyck
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Juergen Gross @ 2019-06-20 16:08 UTC (permalink / raw)
  To: xen-devel, linux-mm, linux-kernel; +Cc: Juergen Gross, Alexander Duyck

Commit 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time
instead of doing larger sections") is causing a regression on some
systems when the kernel is booted as Xen dom0.

The system will just hang in early boot.

Reason is an endless loop in get_page_from_freelist() in case the first
zone looked at has no free memory. deferred_grow_zone() is always
returning true due to the following code snipplet:

  /* If the zone is empty somebody else may have cleared out the zone */
  if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
                                           first_deferred_pfn)) {
          pgdat->first_deferred_pfn = ULONG_MAX;
          pgdat_resize_unlock(pgdat, &flags);
          return true;
  }

This in turn results in the loop as get_page_from_freelist() is
assuming forward progress can be made by doing some more struct page
initialization.

Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Fixes: 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections")
Suggested-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
---
 mm/page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d66bc8abe0af..8e3bc949ebcc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1826,7 +1826,8 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
 						 first_deferred_pfn)) {
 		pgdat->first_deferred_pfn = ULONG_MAX;
 		pgdat_resize_unlock(pgdat, &flags);
-		return true;
+		/* Retry only once. */
+		return first_deferred_pfn != ULONG_MAX;
 	}
 
 	/*
-- 
2.16.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] mm: fix regression with deferred struct page init
  2019-06-20 16:08 [Xen-devel] [PATCH] mm: fix regression with deferred struct page init Juergen Gross
@ 2019-06-20 16:10 ` Alexander Duyck
  2019-06-25  8:25 ` Juergen Gross
  2019-06-28 15:17 ` Michal Hocko
  2 siblings, 0 replies; 7+ messages in thread
From: Alexander Duyck @ 2019-06-20 16:10 UTC (permalink / raw)
  To: Juergen Gross, xen-devel, linux-mm, linux-kernel

On Thu, 2019-06-20 at 18:08 +0200, Juergen Gross wrote:
> Commit 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time
> instead of doing larger sections") is causing a regression on some
> systems when the kernel is booted as Xen dom0.
> 
> The system will just hang in early boot.
> 
> Reason is an endless loop in get_page_from_freelist() in case the first
> zone looked at has no free memory. deferred_grow_zone() is always
> returning true due to the following code snipplet:
> 
>   /* If the zone is empty somebody else may have cleared out the zone */
>   if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
>                                            first_deferred_pfn)) {
>           pgdat->first_deferred_pfn = ULONG_MAX;
>           pgdat_resize_unlock(pgdat, &flags);
>           return true;
>   }
> 
> This in turn results in the loop as get_page_from_freelist() is
> assuming forward progress can be made by doing some more struct page
> initialization.
> 
> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Fixes: 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections")
> Suggested-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Acked-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>

> ---
>  mm/page_alloc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d66bc8abe0af..8e3bc949ebcc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1826,7 +1826,8 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
>  						 first_deferred_pfn)) {
>  		pgdat->first_deferred_pfn = ULONG_MAX;
>  		pgdat_resize_unlock(pgdat, &flags);
> -		return true;
> +		/* Retry only once. */
> +		return first_deferred_pfn != ULONG_MAX;
>  	}
>  
>  	/*



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] mm: fix regression with deferred struct page init
  2019-06-20 16:08 [Xen-devel] [PATCH] mm: fix regression with deferred struct page init Juergen Gross
  2019-06-20 16:10 ` Alexander Duyck
@ 2019-06-25  8:25 ` Juergen Gross
  2019-06-27 15:35   ` Juergen Gross
  2019-06-28 15:17 ` Michal Hocko
  2 siblings, 1 reply; 7+ messages in thread
From: Juergen Gross @ 2019-06-25  8:25 UTC (permalink / raw)
  To: xen-devel, linux-mm, linux-kernel; +Cc: Alexander Duyck

Gentle ping.

I'd really like to have that in 5.2 in order to avoid the regression
introduced with 5.2-rc1.


Juergen

On 20.06.19 18:08, Juergen Gross wrote:
> Commit 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time
> instead of doing larger sections") is causing a regression on some
> systems when the kernel is booted as Xen dom0.
> 
> The system will just hang in early boot.
> 
> Reason is an endless loop in get_page_from_freelist() in case the first
> zone looked at has no free memory. deferred_grow_zone() is always
> returning true due to the following code snipplet:
> 
>    /* If the zone is empty somebody else may have cleared out the zone */
>    if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
>                                             first_deferred_pfn)) {
>            pgdat->first_deferred_pfn = ULONG_MAX;
>            pgdat_resize_unlock(pgdat, &flags);
>            return true;
>    }
> 
> This in turn results in the loop as get_page_from_freelist() is
> assuming forward progress can be made by doing some more struct page
> initialization.
> 
> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Fixes: 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections")
> Suggested-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
>   mm/page_alloc.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d66bc8abe0af..8e3bc949ebcc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1826,7 +1826,8 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
>   						 first_deferred_pfn)) {
>   		pgdat->first_deferred_pfn = ULONG_MAX;
>   		pgdat_resize_unlock(pgdat, &flags);
> -		return true;
> +		/* Retry only once. */
> +		return first_deferred_pfn != ULONG_MAX;
>   	}
>   
>   	/*
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] mm: fix regression with deferred struct page init
  2019-06-25  8:25 ` Juergen Gross
@ 2019-06-27 15:35   ` Juergen Gross
  0 siblings, 0 replies; 7+ messages in thread
From: Juergen Gross @ 2019-06-27 15:35 UTC (permalink / raw)
  To: xen-devel, linux-mm, linux-kernel
  Cc: rppt, Alexander Duyck, Michal Hocko, Andrew Morton, pasha.tatashin

On 25.06.19 10:25, Juergen Gross wrote:
> Gentle ping.
> 
> I'd really like to have that in 5.2 in order to avoid the regression
> introduced with 5.2-rc1.

Adding some maintainers directly...


Juergen

> 
> 
> Juergen
> 
> On 20.06.19 18:08, Juergen Gross wrote:
>> Commit 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time
>> instead of doing larger sections") is causing a regression on some
>> systems when the kernel is booted as Xen dom0.
>>
>> The system will just hang in early boot.
>>
>> Reason is an endless loop in get_page_from_freelist() in case the first
>> zone looked at has no free memory. deferred_grow_zone() is always
>> returning true due to the following code snipplet:
>>
>>    /* If the zone is empty somebody else may have cleared out the zone */
>>    if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
>>                                             first_deferred_pfn)) {
>>            pgdat->first_deferred_pfn = ULONG_MAX;
>>            pgdat_resize_unlock(pgdat, &flags);
>>            return true;
>>    }
>>
>> This in turn results in the loop as get_page_from_freelist() is
>> assuming forward progress can be made by doing some more struct page
>> initialization.
>>
>> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>> Fixes: 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time 
>> instead of doing larger sections")
>> Suggested-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>>   mm/page_alloc.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index d66bc8abe0af..8e3bc949ebcc 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1826,7 +1826,8 @@ deferred_grow_zone(struct zone *zone, unsigned 
>> int order)
>>                            first_deferred_pfn)) {
>>           pgdat->first_deferred_pfn = ULONG_MAX;
>>           pgdat_resize_unlock(pgdat, &flags);
>> -        return true;
>> +        /* Retry only once. */
>> +        return first_deferred_pfn != ULONG_MAX;
>>       }
>>       /*
>>
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] mm: fix regression with deferred struct page init
  2019-06-20 16:08 [Xen-devel] [PATCH] mm: fix regression with deferred struct page init Juergen Gross
  2019-06-20 16:10 ` Alexander Duyck
  2019-06-25  8:25 ` Juergen Gross
@ 2019-06-28 15:17 ` Michal Hocko
  2019-06-28 17:38   ` Juergen Gross
  2 siblings, 1 reply; 7+ messages in thread
From: Michal Hocko @ 2019-06-28 15:17 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel, Alexander Duyck, linux-kernel, linux-mm

On Thu 20-06-19 18:08:21, Juergen Gross wrote:
> Commit 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time
> instead of doing larger sections") is causing a regression on some
> systems when the kernel is booted as Xen dom0.
> 
> The system will just hang in early boot.
> 
> Reason is an endless loop in get_page_from_freelist() in case the first
> zone looked at has no free memory. deferred_grow_zone() is always

Could you explain how we ended up with the zone having no memory? Is
xen "stealing" memblock memory without adding it to memory.reserved?
In other words, how do we end up with an empty zone that has non zero
end_pfn?

> returning true due to the following code snipplet:
> 
>   /* If the zone is empty somebody else may have cleared out the zone */
>   if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
>                                            first_deferred_pfn)) {
>           pgdat->first_deferred_pfn = ULONG_MAX;
>           pgdat_resize_unlock(pgdat, &flags);
>           return true;
>   }
> 
> This in turn results in the loop as get_page_from_freelist() is
> assuming forward progress can be made by doing some more struct page
> initialization.

The patch looks correct. The code is subtle but the comment helps.

> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Fixes: 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections")
> Suggested-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> Signed-off-by: Juergen Gross <jgross@suse.com>

Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  mm/page_alloc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d66bc8abe0af..8e3bc949ebcc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1826,7 +1826,8 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
>  						 first_deferred_pfn)) {
>  		pgdat->first_deferred_pfn = ULONG_MAX;
>  		pgdat_resize_unlock(pgdat, &flags);
> -		return true;
> +		/* Retry only once. */
> +		return first_deferred_pfn != ULONG_MAX;
>  	}
>  
>  	/*
> -- 
> 2.16.4

-- 
Michal Hocko
SUSE Labs

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] mm: fix regression with deferred struct page init
  2019-06-28 15:17 ` Michal Hocko
@ 2019-06-28 17:38   ` Juergen Gross
  2019-06-28 18:47     ` Michal Hocko
  0 siblings, 1 reply; 7+ messages in thread
From: Juergen Gross @ 2019-06-28 17:38 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-mm, Alexander Duyck, linux-kernel, xen-devel

On 28.06.19 17:17, Michal Hocko wrote:
> On Thu 20-06-19 18:08:21, Juergen Gross wrote:
>> Commit 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time
>> instead of doing larger sections") is causing a regression on some
>> systems when the kernel is booted as Xen dom0.
>>
>> The system will just hang in early boot.
>>
>> Reason is an endless loop in get_page_from_freelist() in case the first
>> zone looked at has no free memory. deferred_grow_zone() is always
> 
> Could you explain how we ended up with the zone having no memory? Is
> xen "stealing" memblock memory without adding it to memory.reserved?
> In other words, how do we end up with an empty zone that has non zero
> end_pfn?

Why do you think Xen is stealing the memory in an odd way?

Doesn't deferred_init_mem_pfn_range_in_zone() return false when no free
memory is found? So exactly if the memory was added to memory.reserved
that will happen.

I guess the difference to a bare metal boot is that a Xen dom0 will need
probably more memory in early boot phase, so that issue is more likely
to occur.

In my case the system had two zones, where the 2nd zone had some free
memory. The search never made it to the 2nd zone as the search ended in
an endless loop for the 1st zone.

> 
>> returning true due to the following code snipplet:
>>
>>    /* If the zone is empty somebody else may have cleared out the zone */
>>    if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
>>                                             first_deferred_pfn)) {
>>            pgdat->first_deferred_pfn = ULONG_MAX;
>>            pgdat_resize_unlock(pgdat, &flags);
>>            return true;
>>    }
>>
>> This in turn results in the loop as get_page_from_freelist() is
>> assuming forward progress can be made by doing some more struct page
>> initialization.
> 
> The patch looks correct. The code is subtle but the comment helps.
> 
>> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>> Fixes: 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections")
>> Suggested-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>

Thanks,

Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Xen-devel] [PATCH] mm: fix regression with deferred struct page init
  2019-06-28 17:38   ` Juergen Gross
@ 2019-06-28 18:47     ` Michal Hocko
  0 siblings, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2019-06-28 18:47 UTC (permalink / raw)
  To: Juergen Gross; +Cc: linux-mm, Alexander Duyck, linux-kernel, xen-devel

On Fri 28-06-19 19:38:13, Juergen Gross wrote:
> On 28.06.19 17:17, Michal Hocko wrote:
> > On Thu 20-06-19 18:08:21, Juergen Gross wrote:
> > > Commit 0e56acae4b4dd4a9 ("mm: initialize MAX_ORDER_NR_PAGES at a time
> > > instead of doing larger sections") is causing a regression on some
> > > systems when the kernel is booted as Xen dom0.
> > > 
> > > The system will just hang in early boot.
> > > 
> > > Reason is an endless loop in get_page_from_freelist() in case the first
> > > zone looked at has no free memory. deferred_grow_zone() is always
> > 
> > Could you explain how we ended up with the zone having no memory? Is
> > xen "stealing" memblock memory without adding it to memory.reserved?
> > In other words, how do we end up with an empty zone that has non zero
> > end_pfn?
> 
> Why do you think Xen is stealing the memory in an odd way?
> 
> Doesn't deferred_init_mem_pfn_range_in_zone() return false when no free
> memory is found? So exactly if the memory was added to memory.reserved
> that will happen.

You are right. I managed to confuse myself and thought that __next_mem_range
return index to both memblock types. But I am wrong here and it excludes
type_b regions. I should have read the documentation. My bad and sorry
for the confusion.
-- 
Michal Hocko
SUSE Labs

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-06-28 18:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-20 16:08 [Xen-devel] [PATCH] mm: fix regression with deferred struct page init Juergen Gross
2019-06-20 16:10 ` Alexander Duyck
2019-06-25  8:25 ` Juergen Gross
2019-06-27 15:35   ` Juergen Gross
2019-06-28 15:17 ` Michal Hocko
2019-06-28 17:38   ` Juergen Gross
2019-06-28 18:47     ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).