From: David Hildenbrand <david@redhat.com>
To: Charan Teja Reddy <charante@codeaurora.org>,
	akpm@linux-foundation.org, mhocko@suse.com, vbabka@suse.cz,
	rientjes@google.com, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, vinmenon@codeaurora.org
Subject: Re: [PATCH V2] mm, page_alloc: fix core hung in free_pcppages_bulk()
Date: Tue, 11 Aug 2020 23:05:37 +0200
Message-ID: <fdf574c8-82be-6bde-b73b-c97055f530a8@redhat.com>
In-Reply-To: <1597150703-19003-1-git-send-email-charante@codeaurora.org>

On 11.08.20 14:58, Charan Teja Reddy wrote:
> The following race is observed when memory blocks of the movable zone are
> repeatedly onlined and offlined, with a delay between two successive
> online operations.
> 
> P1						P2
> 
> Online the first memory block in
> the movable zone. The pcp struct
> values are initialized to default
> values, i.e., pcp->high = 0 &
> pcp->batch = 1.
> 
> 					Allocate the pages from the
> 					movable zone.
> 
> Try to online the second memory
> block in the movable zone; it has
> entered online_pages() but has yet
> to call zone_pcp_update().
> 					This process enters the exit
> 					path and thus tries to release
> 					the order-0 pages to the pcp
> 					lists through
> 					free_unref_page_commit().
> 					As pcp->high = 0, pcp->count = 1,
> 					it proceeds to call
> 					free_pcppages_bulk().
> Update the pcp values thus the
> new pcp values are like, say,
> pcp->high = 378, pcp->batch = 63.
> 					Read the pcp's batch value using
> 					READ_ONCE() and pass the same to
> 					free_pcppages_bulk(), pcp values
> 					passed here are, batch = 63,
> 					count = 1.
> 
> 					Since the number of pages on the
> 					pcp lists is less than ->batch,
> 					it gets stuck in the
> 					while (list_empty(list)) loop
> 					with interrupts disabled, thus
> 					a core hang.
> 
> Avoid this by ensuring free_pcppages_bulk() is called with proper count
> of pcp list pages.
> 
> The mentioned race is fairly easy to reproduce without [1] because the
> pcps are not updated when the first memory block is onlined, and thus
> there is a large enough race window for P2 between alloc+free and the
> pcp struct values update through onlining of the second memory block.
> 
> With [1], the race still exists but the window is much narrower, as we
> update the pcp struct values when the first memory block is onlined.
> 
> [1]: https://patchwork.kernel.org/patch/11696389/
> 

IIUC, this is not limited to the movable zone; it could also happen in
corner cases with the normal zone (e.g., hotplug to a node that only has
DMA memory, or no other memory yet).
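
To make the window concrete, here is a minimal user-space sketch of the
threshold check and the later batch read as described in the race above.
It is not the kernel code: struct pcp_model, model_commit() and the
racing_update parameter are hypothetical names used only for illustration,
and the concurrent zone_pcp_update() is modelled as a plain assignment
between the two steps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for the per-cpu pages struct. */
struct pcp_model {
	int count;	/* pages currently on the pcp lists */
	int high;	/* drain threshold */
	int batch;	/* number of pages to drain at once */
};

/*
 * Models the tail of free_unref_page_commit() as described above: queue
 * one page and, if the threshold is reached, return the batch value that
 * would be handed to free_pcppages_bulk(). Returns 0 if no drain is
 * triggered. If racing_update is non-NULL it is applied between the
 * threshold check and the batch read, standing in for a concurrent
 * zone_pcp_update().
 */
static int model_commit(struct pcp_model *pcp,
			const struct pcp_model *racing_update)
{
	pcp->count++;
	if (pcp->count < pcp->high)
		return 0;

	if (racing_update != NULL) {
		pcp->high = racing_update->high;
		pcp->batch = racing_update->batch;
	}

	assert(pcp->batch > 0);
	return pcp->batch;	/* the kernel reads this with READ_ONCE() */
}
```

Starting from the defaults (high = 0, batch = 1) and applying an update of
high = 378, batch = 63 in the window, the model returns a batch of 63 while
only one page is queued, which is the mismatch the changelog describes.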

> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
> ---
> 
> v1: https://patchwork.kernel.org/patch/11707637/
> 
>  mm/page_alloc.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index e4896e6..839039f 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1304,6 +1304,11 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>  	struct page *page, *tmp;
>  	LIST_HEAD(head);
>  
> +	/*
> +	 * Ensure a proper count is passed; otherwise we would get stuck in
> +	 * the while (list_empty(list)) loop below.
> +	 */
> +	count = min(pcp->count, count);
>  	while (count) {
>  		struct list_head *list;
>  
> 

Should this carry Fixes: and Cc: stable tags?
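
For reference, a rough user-space model of why the clamp in the hunk above
is sufficient. This is a sketch under assumptions, not the real
free_pcppages_bulk(): it frees one page per scan instead of a batch,
NR_LISTS stands in for the number of pcp migratetype lists, and SPIN_LIMIT
is a hypothetical guard that exists only so the model can report the hang
instead of actually spinning.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_LISTS	3	/* stand-in for the number of pcp lists */
#define SPIN_LIMIT	1000	/* hypothetical guard; the kernel has none */

/*
 * lists[i] holds the number of pages on pcp list i. Returns true if the
 * requested pages were freed, false if the scan for a non-empty list
 * would never terminate. With clamp set, count is limited to the pages
 * actually present, like count = min(pcp->count, count) in the patch.
 */
static bool model_bulk_free(int lists[NR_LISTS], int count, bool clamp)
{
	int total = 0;
	int idx = 0;

	assert(count >= 0);

	for (int i = 0; i < NR_LISTS; i++)
		total += lists[i];
	if (clamp && count > total)
		count = total;

	while (count) {
		int spins = 0;

		/* Round-robin search for a non-empty list. */
		do {
			if (++idx == NR_LISTS)
				idx = 0;
			if (++spins > SPIN_LIMIT)
				return false;	/* would hang with IRQs off */
		} while (lists[idx] == 0);

		lists[idx]--;	/* "free" one page */
		count--;
	}
	return true;
}
```

With a single queued page and count = 63, the unclamped model frees that
page and then reports the endless scan; with the clamp it frees the one
page and returns.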

-- 
Thanks,

David / dhildenb

