linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Rientjes <rientjes@google.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	 Mel Gorman <mgorman@techsingularity.net>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org
Subject: Re: [patch] mm, page_alloc: move_freepages should not examine struct page of reserved memory
Date: Tue, 13 Aug 2019 10:22:41 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.21.1908131018450.230426@chino.kir.corp.google.com> (raw)
In-Reply-To: <3aadeed1-3f38-267d-8dae-839e10a2f9d2@suse.cz>

On Tue, 13 Aug 2019, Vlastimil Babka wrote:

> > After commit 907ec5fca3dc ("mm: zero remaining unavailable struct pages"),
> > struct page of reserved memory is zeroed.  This causes page->flags to be 0
> > and fixes issues related to reading /proc/kpageflags, for example, of
> > reserved memory.
> > 
> > The VM_BUG_ON() in move_freepages_block(), however, assumes that
> > page_zone() is meaningful even for reserved memory.  That assumption is no
> > longer true after the aforementioned commit.
> 
> How comes that move_freepages_block() gets called on reserved memory in
> the first place?
> 

It's simply math after finding a valid free page from the per-zone free 
area to use as fallback.  We find the beginning and end of the pageblock 
of the valid page and that can bring us into memory that was reserved per 
the e820.  pfn_valid() is still true (it's backed by a struct page), but 
since it's zero'd we shouldn't make any inferences here about comparing 
its node or zone.  The current node check just happens to succeed most of 
the time by luck because reserved memory typically appears on node 0.

The fix here is to validate that we actually have buddy pages before 
testing if there's any type of zone or node strangeness going on.

> > There's no reason why move_freepages_block() should be testing the
> > legitimacy of page_zone() for reserved memory; its scope is limited only
> > to pages on the zone's freelist.
> > 
> > Note that pfn_valid() can be true for reserved memory: there is a backing
> > struct page.  The check for page_to_nid(page) is also buggy but reserved
> > memory normally only appears on node 0 so the zeroing doesn't affect this.
> > 
> > Move the debug checks to after verifying PageBuddy is true.  This isolates
> > the scope of the checks to only be for buddy pages which are on the zone's
> > freelist which move_freepages_block() is operating on.  In this case, an
> > incorrect node or zone is a bug worthy of being warned about (and the
> > examination of struct page is acceptable bcause this memory is not
> > reserved).
> > 
> > Signed-off-by: David Rientjes <rientjes@google.com>
> > ---
> >  mm/page_alloc.c | 19 ++++---------------
> >  1 file changed, 4 insertions(+), 15 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2238,27 +2238,12 @@ static int move_freepages(struct zone *zone,
> >  	unsigned int order;
> >  	int pages_moved = 0;
> >  
> > -#ifndef CONFIG_HOLES_IN_ZONE
> > -	/*
> > -	 * page_zone is not safe to call in this context when
> > -	 * CONFIG_HOLES_IN_ZONE is set. This bug check is probably redundant
> > -	 * anyway as we check zone boundaries in move_freepages_block().
> > -	 * Remove at a later date when no bug reports exist related to
> > -	 * grouping pages by mobility
> > -	 */
> > -	VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) &&
> > -	          pfn_valid(page_to_pfn(end_page)) &&
> > -	          page_zone(start_page) != page_zone(end_page));
> > -#endif
> >  	for (page = start_page; page <= end_page;) {
> >  		if (!pfn_valid_within(page_to_pfn(page))) {
> >  			page++;
> >  			continue;
> >  		}
> >  
> > -		/* Make sure we are not inadvertently changing nodes */
> > -		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
> > -
> >  		if (!PageBuddy(page)) {
> >  			/*
> >  			 * We assume that pages that could be isolated for
> > @@ -2273,6 +2258,10 @@ static int move_freepages(struct zone *zone,
> >  			continue;
> >  		}
> >  
> > +		/* Make sure we are not inadvertently changing nodes */
> > +		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
> > +		VM_BUG_ON_PAGE(page_zone(page) != zone, page);
> 
> The later check implies the former check, so if it's to stay, the first
> one could be removed and comment adjusted s/nodes/zones/
> 

Does it?  The first is checking for a corrupted page_to_nid the second is 
checking for a corrupted or unexpected page_zone.  What's being tested 
here is the state of struct page, as it was previous to this patch, not 
the state of struct zone.


  reply	other threads:[~2019-08-13 17:22 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-13  3:37 [patch] mm, page_alloc: move_freepages should not examine struct page of reserved memory David Rientjes
2019-08-13 13:03 ` Vlastimil Babka
2019-08-13 17:22   ` David Rientjes [this message]
2019-08-14  7:42     ` Vlastimil Babka
2019-08-13 21:16 ` Andrew Morton
2019-08-13 23:31   ` David Rientjes
2019-08-14 22:49     ` Andrew Morton
2019-08-19 13:35       ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.1908131018450.230426@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).