linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk()
@ 2020-08-10 16:10 Charan Teja Reddy
  2020-08-10 19:36 ` David Rientjes
  2020-08-11  8:29 ` David Hildenbrand
  0 siblings, 2 replies; 7+ messages in thread
From: Charan Teja Reddy @ 2020-08-10 16:10 UTC (permalink / raw)
  To: akpm, mhocko, vbabka, david, linux-mm
  Cc: linux-kernel, vinmenon, Charan Teja Reddy

The following race is observed with repeated online and offline operations
on memory blocks of the movable zone, with a delay between two successive
onlines.

P1						P2

Online the first memory block in
the movable zone. The pcp struct
values are initialized to the
defaults, i.e., pcp->high = 0 and
pcp->batch = 1.

					Allocate the pages from the
					movable zone.

Try to online the second memory
block in the movable zone: it has
entered online_pages() but is yet
to call zone_pcp_update().
					This process has entered the
					exit path, so it tries to
					release its order-0 pages to
					the pcp lists through
					free_unref_page_commit().
					As pcp->high = 0 and
					pcp->count = 1, it proceeds
					to call free_pcppages_bulk().
Update the pcp values; the new
values are now, say,
pcp->high = 378, pcp->batch = 63.
					Read the pcp's batch value
					using READ_ONCE() and pass it
					to free_pcppages_bulk(); the
					values passed here are
					batch = 63, count = 1.

					Since the number of pages on
					the pcp lists is less than
					->batch, it gets stuck in the
					while (list_empty(list)) loop
					with interrupts disabled,
					hanging the core.

Avoid this by ensuring that free_pcppages_bulk() is called with a proper
count of pcp list pages.

The mentioned race is somewhat easily reproducible without [1] because the
pcp values are not updated on the first memory block online, and thus
there is a wide enough race window for P2 between the alloc+free and the
pcp struct values update through the onlining of the second memory block.

With [1], the race still exists, but it is much narrower as we update the
pcp struct values upon the first memory block online itself.

[1]: https://patchwork.kernel.org/patch/11696389/

Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
---
 mm/page_alloc.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e4896e6..25e7e12 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3106,6 +3106,7 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn)
 	struct zone *zone = page_zone(page);
 	struct per_cpu_pages *pcp;
 	int migratetype;
+	int high;
 
 	migratetype = get_pcppage_migratetype(page);
 	__count_vm_event(PGFREE);
@@ -3128,8 +3129,19 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn)
 	pcp = &this_cpu_ptr(zone->pageset)->pcp;
 	list_add(&page->lru, &pcp->lists[migratetype]);
 	pcp->count++;
-	if (pcp->count >= pcp->high) {
-		unsigned long batch = READ_ONCE(pcp->batch);
+	high = READ_ONCE(pcp->high);
+	if (pcp->count >= high) {
+		int batch;
+
+		batch = READ_ONCE(pcp->batch);
+		/*
+		 * For non-default pcp struct values, high is always
+		 * greater than the batch. If high < batch then pass
+		 * proper count to free the pcp's list pages.
+		 */
+		if (unlikely(high < batch))
+			batch = min(pcp->count, batch);
+
 		free_pcppages_bulk(zone, batch, pcp);
 	}
 }
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk()
  2020-08-10 16:10 [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk() Charan Teja Reddy
@ 2020-08-10 19:36 ` David Rientjes
  2020-08-11 13:01   ` Charan Teja Kalla
  2020-08-11  8:29 ` David Hildenbrand
  1 sibling, 1 reply; 7+ messages in thread
From: David Rientjes @ 2020-08-10 19:36 UTC (permalink / raw)
  To: Charan Teja Reddy
  Cc: akpm, mhocko, vbabka, david, linux-mm, linux-kernel, vinmenon

On Mon, 10 Aug 2020, Charan Teja Reddy wrote:

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e4896e6..25e7e12 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3106,6 +3106,7 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn)
>  	struct zone *zone = page_zone(page);
>  	struct per_cpu_pages *pcp;
>  	int migratetype;
> +	int high;
>  
>  	migratetype = get_pcppage_migratetype(page);
>  	__count_vm_event(PGFREE);
> @@ -3128,8 +3129,19 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn)
>  	pcp = &this_cpu_ptr(zone->pageset)->pcp;
>  	list_add(&page->lru, &pcp->lists[migratetype]);
>  	pcp->count++;
> -	if (pcp->count >= pcp->high) {
> -		unsigned long batch = READ_ONCE(pcp->batch);
> +	high = READ_ONCE(pcp->high);
> +	if (pcp->count >= high) {
> +		int batch;
> +
> +		batch = READ_ONCE(pcp->batch);
> +		/*
> +		 * For non-default pcp struct values, high is always
> +		 * greater than the batch. If high < batch then pass
> +		 * proper count to free the pcp's list pages.
> +		 */
> +		if (unlikely(high < batch))
> +			batch = min(pcp->count, batch);
> +
>  		free_pcppages_bulk(zone, batch, pcp);
>  	}
>  }

I'm wondering if a fix to free_pcppages_bulk() is more appropriate here 
because the count passed into it seems otherwise fragile if this results 
in a hung core?


* Re: [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk()
  2020-08-10 16:10 [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk() Charan Teja Reddy
  2020-08-10 19:36 ` David Rientjes
@ 2020-08-11  8:29 ` David Hildenbrand
  2020-08-11 13:11   ` Charan Teja Kalla
  2020-08-13  7:07   ` Michal Hocko
  1 sibling, 2 replies; 7+ messages in thread
From: David Hildenbrand @ 2020-08-11  8:29 UTC (permalink / raw)
  To: Charan Teja Reddy, akpm, mhocko, vbabka, linux-mm; +Cc: linux-kernel, vinmenon

On 10.08.20 18:10, Charan Teja Reddy wrote:
> [...]

I was wondering if we should rather set all pageblocks to
MIGRATE_ISOLATE in online_pages() before doing the online_pages_range()
call, and do undo_isolate_page_range() after onlining is done.

move_pfn_range_to_zone()->memmap_init_zone() marks all pageblocks
MIGRATE_MOVABLE, and as that function is used also during boot, we could
supply a parameter to configure this.

This would prevent another race from happening: Having pages exposed to
the buddy ready for allocation in online_pages_range() before the
sections are marked online.

This would avoid any pages from getting allocated before we're
completely done onlining.

We would need MIGRATE_ISOLATE/CONFIG_MEMORY_ISOLATION also for
CONFIG_MEMORY_HOTPLUG.

-- 
Thanks,

David / dhildenb



* Re: [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk()
  2020-08-10 19:36 ` David Rientjes
@ 2020-08-11 13:01   ` Charan Teja Kalla
  0 siblings, 0 replies; 7+ messages in thread
From: Charan Teja Kalla @ 2020-08-11 13:01 UTC (permalink / raw)
  To: David Rientjes
  Cc: akpm, mhocko, vbabka, david, linux-mm, linux-kernel, vinmenon

Thanks David.

On 8/11/2020 1:06 AM, David Rientjes wrote:
> On Mon, 10 Aug 2020, Charan Teja Reddy wrote:
> 
>> [...]
> 
> I'm wondering if a fix to free_pcppages_bulk() is more appropriate here 
> because the count passed into it seems otherwise fragile if this results 
> in a hung core?
> 

Agreed that free_pcppages_bulk() is the appropriate place to fix this,
and it is actually much cleaner. Raised V2:
https://patchwork.kernel.org/patch/11709225/

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project


* Re: [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk()
  2020-08-11  8:29 ` David Hildenbrand
@ 2020-08-11 13:11   ` Charan Teja Kalla
  2020-08-11 13:13     ` David Hildenbrand
  2020-08-13  7:07   ` Michal Hocko
  1 sibling, 1 reply; 7+ messages in thread
From: Charan Teja Kalla @ 2020-08-11 13:11 UTC (permalink / raw)
  To: David Hildenbrand, akpm, mhocko, vbabka, linux-mm; +Cc: linux-kernel, vinmenon

Thanks David for the comments.

On 8/11/2020 1:59 PM, David Hildenbrand wrote:
> On 10.08.20 18:10, Charan Teja Reddy wrote:
>> [...]
> 
> I was wondering if we should rather set all pageblocks to
> MIGRATE_ISOLATE in online_pages() before doing the online_pages_range()
> call, and do undo_isolate_page_range() after onlining is done.
> 
> move_pfn_range_to_zone()->memmap_init_zone() marks all pageblocks
> MIGRATE_MOVABLE, and as that function is used also during boot, we could
> supply a parameter to configure this.
> 
> This would prevent another race from happening: Having pages exposed to
> the buddy ready for allocation in online_pages_range() before the
> sections are marked online.

Yeah, this is another bug. And the idea of isolating first, onlining,
and undoing the isolation after the zonelist and pcp struct updates
should work even for the mentioned issue. Should this go in as a
separate fix?

However, IMO, the issue in free_pcppages_bulk() should be fixed by
checking whether a sane count value is passed, no?
Posted V2: https://patchwork.kernel.org/patch/11709225/

> 
> This would avoid any pages from getting allocated before we're
> completely done onlining.
> 
> We would need MIGRATE_ISOLATE/CONFIG_MEMORY_ISOLATION also for
> CONFIG_MEMORY_HOTPLUG.
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project


* Re: [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk()
  2020-08-11 13:11   ` Charan Teja Kalla
@ 2020-08-11 13:13     ` David Hildenbrand
  0 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2020-08-11 13:13 UTC (permalink / raw)
  To: Charan Teja Kalla, akpm, mhocko, vbabka, linux-mm; +Cc: linux-kernel, vinmenon

On 11.08.20 15:11, Charan Teja Kalla wrote:
> Thanks David for the comments.
> 
> On 8/11/2020 1:59 PM, David Hildenbrand wrote:
>> On 10.08.20 18:10, Charan Teja Reddy wrote:
>>> [...]
>>
>> I was wondering if we should rather set all pageblocks to
>> MIGRATE_ISOLATE in online_pages() before doing the online_pages_range()
>> call, and do undo_isolate_page_range() after onlining is done.
>>
>> move_pfn_range_to_zone()->memmap_init_zone() marks all pageblocks
>> MIGRATE_MOVABLE, and as that function is used also during boot, we could
>> supply a parameter to configure this.
>>
>> This would prevent another race from happening: Having pages exposed to
>> the buddy ready for allocation in online_pages_range() before the
>> sections are marked online.
> 
> Yeah, this is another bug. And the idea of isolating first, onlining,
> and undoing the isolation after the zonelist and pcp struct updates
> should work even for the mentioned issue. Should this go in as a
> separate fix?

Yeah, also requires more work to be done. Will add it to my list of TODOs.

> 
> However, IMO, the issue in free_pcppages_bulk() should be fixed by
> checking whether a sane count value is passed, no?
> Posted V2: https://patchwork.kernel.org/patch/11709225/

Yeah, I'm fine with fixing this issue that can actually be reproduced.


-- 
Thanks,

David / dhildenb



* Re: [PATCH] mm, page_alloc: fix core hung in free_pcppages_bulk()
  2020-08-11  8:29 ` David Hildenbrand
  2020-08-11 13:11   ` Charan Teja Kalla
@ 2020-08-13  7:07   ` Michal Hocko
  1 sibling, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2020-08-13  7:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Charan Teja Reddy, akpm, vbabka, linux-mm, linux-kernel, vinmenon

On Tue 11-08-20 10:29:24, David Hildenbrand wrote:
[...]
> I was wondering if we should rather set all pageblocks to
> MIGRATE_ISOLATE in online_pages() before doing the online_pages_range()
> call, and do undo_isolate_page_range() after onlining is done.
> 
> move_pfn_range_to_zone()->memmap_init_zone() marks all pageblocks
> MIGRATE_MOVABLE, and as that function is used also during boot, we could
> supply a parameter to configure this.
> 
> This would prevent another race from happening: Having pages exposed to
> the buddy ready for allocation in online_pages_range() before the
> sections are marked online.
> 
> This would avoid any pages from getting allocated before we're
> completely done onlining.

This sounds like a reasonable idea to me.

> We would need MIGRATE_ISOLATE/CONFIG_MEMORY_ISOLATION also for
> CONFIG_MEMORY_HOTPLUG.

We already do depend on the memory isolation in the hotremove. Doing the
same for hotplug in general makes sense as well.

Thanks!
-- 
Michal Hocko
SUSE Labs
