mm-commits.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@linux.ibm.com>
To: akpm@linux-foundation.org
Cc: aarcange@redhat.com, bhe@redhat.com, cai@lca.pw,
	david@redhat.com, mgorman@suse.de, mhocko@kernel.org,
	mm-commits@vger.kernel.org, stable@vger.kernel.org,
	vbabka@suse.cz
Subject: Re: + mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch added to -mm tree
Date: Sun, 6 Dec 2020 08:48:49 +0200	[thread overview]
Message-ID: <20201206064849.GW123287@linux.ibm.com> (raw)
In-Reply-To: <20201206005401.qKuAVgOXr%akpm@linux-foundation.org>

Hi,

On Sat, Dec 05, 2020 at 04:54:01PM -0800, akpm@linux-foundation.org wrote:
> 
> The patch titled
>      Subject: mm: initialize struct pages in reserved regions outside of the zone ranges
> has been added to the -mm tree.  Its filename is
>      mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> 
> This patch should soon appear at
>     https://ozlabs.org/~akpm/mmots/broken-out/mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> and later at
>     https://ozlabs.org/~akpm/mmotm/broken-out/mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> ------------------------------------------------------
> From: Andrea Arcangeli <aarcange@redhat.com>
> Subject: mm: initialize struct pages in reserved regions outside of the zone ranges
> 
> Without this change, the pfn 0 isn't in any zone spanned range, and it's
> also not in any memory.memblock range, so the struct page of pfn 0 wasn't
> initialized and the PagePoison remained set when reserve_bootmem_region
> called __SetPageReserved, inducing a silent boot failure with DEBUG_VM
> (and correctly so, because the crash signaled the nodeid/nid of pfn 0
> would be again wrong).
> 
> There's no enforcement that all memblock.reserved ranges must overlap
> memblock.memory ranges, so the memblock.reserved ranges also require an
> explicit initialization and the zones ranges need to be extended to
> include all memblock.reserved ranges with struct pages too or they'll be
> left uninitialized with PagePoison as it happened to pfn 0.
> 
> Link: https://lkml.kernel.org/r/20201205013238.21663-2-aarcange@redhat.com
> Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mike Rapoport <rppt@linux.ibm.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  include/linux/memblock.h |   17 ++++++++---
>  mm/debug.c               |    3 +
>  mm/memblock.c            |    4 +-
>  mm/page_alloc.c          |   57 +++++++++++++++++++++++++++++--------
>  4 files changed, 63 insertions(+), 18 deletions(-)

I don't see why we need all this complexity when a simple fixup was
enough.

> --- a/include/linux/memblock.h~mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges
> +++ a/include/linux/memblock.h

...

> --- a/mm/page_alloc.c~mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges
> +++ a/mm/page_alloc.c

...

> @@ -6227,7 +6233,7 @@ void __init __weak memmap_init(unsigned
>  			       unsigned long zone,
>  			       unsigned long range_start_pfn)
>  {
> -	unsigned long start_pfn, end_pfn, next_pfn = 0;
> +	unsigned long start_pfn, end_pfn, prev_pfn = 0;
>  	unsigned long range_end_pfn = range_start_pfn + size;
>  	u64 pgcnt = 0;
>  	int i;
> @@ -6235,7 +6241,7 @@ void __init __weak memmap_init(unsigned
>  	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
>  		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
>  		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
> -		next_pfn = clamp(next_pfn, range_start_pfn, range_end_pfn);
> +		prev_pfn = clamp(prev_pfn, range_start_pfn, range_end_pfn);
>  
>  		if (end_pfn > start_pfn) {
>  			size = end_pfn - start_pfn;
> @@ -6243,10 +6249,10 @@ void __init __weak memmap_init(unsigned
>  					 MEMINIT_EARLY, NULL, MIGRATE_MOVABLE);
>  		}
>  
> -		if (next_pfn < start_pfn)
> -			pgcnt += init_unavailable_range(next_pfn, start_pfn,
> +		if (prev_pfn < start_pfn)
> +			pgcnt += init_unavailable_range(prev_pfn, start_pfn,
>  							zone, nid);
> -		next_pfn = end_pfn;
> +		prev_pfn = end_pfn;
>  	}
>  
>  	/*
> @@ -6256,12 +6262,31 @@ void __init __weak memmap_init(unsigned
>  	 * considered initialized. Make sure that memmap has a well defined
>  	 * state.
>  	 */
> -	if (next_pfn < range_end_pfn)
> -		pgcnt += init_unavailable_range(next_pfn, range_end_pfn,
> +	if (prev_pfn < range_end_pfn)
> +		pgcnt += init_unavailable_range(prev_pfn, range_end_pfn,
>  						zone, nid);
>  
> +	/*
> +	 * memblock.reserved isn't enforced to overlap with
> +	 * memblock.memory so initialize the struct pages for
> +	 * memblock.reserved too in case it wasn't overlapping.
> +	 *
> +	 * If any struct page associated with a memblock.reserved
> +	 * range isn't overlapping with a zone range, it'll be left
> +	 * uninitialized, ideally with PagePoison, and it'll be a more
> +	 * easily detectable error.
> +	 */
> +	for_each_res_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
> +		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
> +		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
> +
> +		if (end_pfn > start_pfn)
> +			pgcnt += init_unavailable_range(start_pfn, end_pfn,
> +							zone, nid);
> +	}

This means we are going iterate over all memory allocated before
free_area_ini() from memblock extra time. One time here and another time
in reserve_bootmem_region().
And this can be substantial for CMA and alloc_large_system_hash().

> +
>  	if (pgcnt)
> -		pr_info("%s: Zeroed struct page in unavailable ranges: %lld\n",
> +		pr_info("%s: pages in unavailable ranges: %lld\n",
>  			zone_names[zone], pgcnt);
>  }
>  
> @@ -6499,6 +6524,10 @@ void __init get_pfn_range_for_nid(unsign
>  		*start_pfn = min(*start_pfn, this_start_pfn);
>  		*end_pfn = max(*end_pfn, this_end_pfn);
>  	}
> +	for_each_res_pfn_range(i, nid, &this_start_pfn, &this_end_pfn, NULL) {
> +		*start_pfn = min(*start_pfn, this_start_pfn);
> +		*end_pfn = max(*end_pfn, this_end_pfn);
> +	}
>  
>  	if (*start_pfn == -1UL)
>  		*start_pfn = 0;
> @@ -7126,7 +7155,13 @@ unsigned long __init node_map_pfn_alignm
>   */
>  unsigned long __init find_min_pfn_with_active_regions(void)
>  {
> -	return PHYS_PFN(memblock_start_of_DRAM());
> +	/*
> +	 * reserved regions must be included so that their page
> +	 * structure can be part of a zone and obtain a valid zoneid
> +	 * before __SetPageReserved().
> +	 */
> +	return min(PHYS_PFN(memblock_start_of_DRAM()),
> +		   PHYS_PFN(memblock.reserved.regions[0].base));

So this implies that reserved memory starts before memory. Don't you
find this weird?

>  }
>  
>  /*
> _
> 
> Patches currently in -mm which might be from aarcange@redhat.com are
> 
> mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> 

-- 
Sincerely yours,
Mike.

  reply	other threads:[~2020-12-06  6:49 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-06  0:54 + mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch added to -mm tree akpm
2020-12-06  6:48 ` Mike Rapoport [this message]
2020-12-06 19:30   ` Andrea Arcangeli
2020-12-07 16:50     ` Mike Rapoport
2020-12-08  2:57       ` Andrea Arcangeli
2020-12-08 21:46         ` Mike Rapoport
2020-12-08 23:13           ` Andrea Arcangeli
2020-12-09 21:42             ` Mike Rapoport
2020-12-07  8:58 ` David Hildenbrand
2020-12-07  9:45   ` Mike Rapoport

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201206064849.GW123287@linux.ibm.com \
    --to=rppt@linux.ibm.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=cai@lca.pw \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=mm-commits@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).