From: Mike Rapoport <rppt@linux.ibm.com>
To: akpm@linux-foundation.org
Cc: aarcange@redhat.com, bhe@redhat.com, cai@lca.pw,
	david@redhat.com, mgorman@suse.de, mhocko@kernel.org,
	mm-commits@vger.kernel.org, stable@vger.kernel.org,
	vbabka@suse.cz
Subject: Re: + mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch added to -mm tree
Date: Sun, 6 Dec 2020 08:48:49 +0200
Message-ID: <20201206064849.GW123287@linux.ibm.com>
In-Reply-To: <20201206005401.qKuAVgOXr%akpm@linux-foundation.org>

Hi,

On Sat, Dec 05, 2020 at 04:54:01PM -0800, akpm@linux-foundation.org wrote:
> 
> The patch titled
>      Subject: mm: initialize struct pages in reserved regions outside of the zone ranges
> has been added to the -mm tree.  Its filename is
>      mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> 
> This patch should soon appear at
>     https://ozlabs.org/~akpm/mmots/broken-out/mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> and later at
>     https://ozlabs.org/~akpm/mmotm/broken-out/mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> ------------------------------------------------------
> From: Andrea Arcangeli <aarcange@redhat.com>
> Subject: mm: initialize struct pages in reserved regions outside of the zone ranges
> 
> Without this change, the pfn 0 isn't in any zone spanned range, and it's
> also not in any memory.memblock range, so the struct page of pfn 0 wasn't
> initialized and the PagePoison remained set when reserve_bootmem_region
> called __SetPageReserved, inducing a silent boot failure with DEBUG_VM
> (and correctly so, because the crash signaled the nodeid/nid of pfn 0
> would be again wrong).
> 
> There's no enforcement that all memblock.reserved ranges must overlap
> memblock.memory ranges, so the memblock.reserved ranges also require an
> explicit initialization and the zones ranges need to be extended to
> include all memblock.reserved ranges with struct pages too or they'll be
> left uninitialized with PagePoison as it happened to pfn 0.
> 
> Link: https://lkml.kernel.org/r/20201205013238.21663-2-aarcange@redhat.com
> Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mike Rapoport <rppt@linux.ibm.com>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  include/linux/memblock.h |   17 ++++++++---
>  mm/debug.c               |    3 +
>  mm/memblock.c            |    4 +-
>  mm/page_alloc.c          |   57 +++++++++++++++++++++++++++++--------
>  4 files changed, 63 insertions(+), 18 deletions(-)

I don't see why we need all this complexity when a simple fixup was
enough.

> --- a/include/linux/memblock.h~mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges
> +++ a/include/linux/memblock.h

...

> --- a/mm/page_alloc.c~mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges
> +++ a/mm/page_alloc.c

...

> @@ -6227,7 +6233,7 @@ void __init __weak memmap_init(unsigned
>  			       unsigned long zone,
>  			       unsigned long range_start_pfn)
>  {
> -	unsigned long start_pfn, end_pfn, next_pfn = 0;
> +	unsigned long start_pfn, end_pfn, prev_pfn = 0;
>  	unsigned long range_end_pfn = range_start_pfn + size;
>  	u64 pgcnt = 0;
>  	int i;
> @@ -6235,7 +6241,7 @@ void __init __weak memmap_init(unsigned
>  	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
>  		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
>  		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
> -		next_pfn = clamp(next_pfn, range_start_pfn, range_end_pfn);
> +		prev_pfn = clamp(prev_pfn, range_start_pfn, range_end_pfn);
>  
>  		if (end_pfn > start_pfn) {
>  			size = end_pfn - start_pfn;
> @@ -6243,10 +6249,10 @@ void __init __weak memmap_init(unsigned
>  					 MEMINIT_EARLY, NULL, MIGRATE_MOVABLE);
>  		}
>  
> -		if (next_pfn < start_pfn)
> -			pgcnt += init_unavailable_range(next_pfn, start_pfn,
> +		if (prev_pfn < start_pfn)
> +			pgcnt += init_unavailable_range(prev_pfn, start_pfn,
>  							zone, nid);
> -		next_pfn = end_pfn;
> +		prev_pfn = end_pfn;
>  	}
>  
>  	/*
> @@ -6256,12 +6262,31 @@ void __init __weak memmap_init(unsigned
>  	 * considered initialized. Make sure that memmap has a well defined
>  	 * state.
>  	 */
> -	if (next_pfn < range_end_pfn)
> -		pgcnt += init_unavailable_range(next_pfn, range_end_pfn,
> +	if (prev_pfn < range_end_pfn)
> +		pgcnt += init_unavailable_range(prev_pfn, range_end_pfn,
>  						zone, nid);
>  
> +	/*
> +	 * memblock.reserved isn't enforced to overlap with
> +	 * memblock.memory so initialize the struct pages for
> +	 * memblock.reserved too in case it wasn't overlapping.
> +	 *
> +	 * If any struct page associated with a memblock.reserved
> +	 * range isn't overlapping with a zone range, it'll be left
> +	 * uninitialized, ideally with PagePoison, and it'll be a more
> +	 * easily detectable error.
> +	 */
> +	for_each_res_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
> +		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
> +		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
> +
> +		if (end_pfn > start_pfn)
> +			pgcnt += init_unavailable_range(start_pfn, end_pfn,
> +							zone, nid);
> +	}

This means we are going to iterate over all memory allocated from memblock
before free_area_init() one extra time: once here and once more in
reserve_bootmem_region().
And this can be substantial for CMA and alloc_large_system_hash().
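
To put a rough number on that extra pass, here is a small userspace sketch
(not kernel code) that counts how many struct pages a loop shaped like the
added for_each_res_pfn_range() block would touch for one zone. The struct
pfn_range type and the clamp_pfn() and reserved_pages_in_zone() helpers are
hypothetical names made up for this illustration; only the clamp-then-count
logic mirrors the quoted hunk.

```c
#include <stddef.h>

/* Hypothetical stand-in for one memblock.reserved region, in pfns: [start, end). */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/* Userspace equivalent of the kernel's clamp() for unsigned long pfns. */
static unsigned long clamp_pfn(unsigned long pfn, unsigned long lo,
			       unsigned long hi)
{
	if (pfn < lo)
		return lo;
	if (pfn > hi)
		return hi;
	return pfn;
}

/*
 * Count the pages that a pass over the reserved ranges would visit within
 * the zone [zone_start, zone_end). Each of these pages is visited again
 * later when the reserved regions are marked as reserved, so the returned
 * value is the size of the extra pass.
 */
unsigned long reserved_pages_in_zone(const struct pfn_range *res, size_t nr,
				     unsigned long zone_start,
				     unsigned long zone_end)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		unsigned long s = clamp_pfn(res[i].start, zone_start, zone_end);
		unsigned long e = clamp_pfn(res[i].end, zone_start, zone_end);

		/* Ranges entirely outside the zone collapse to s == e. */
		if (e > s)
			pages += e - s;
	}
	return pages;
}
```

With a single large reservation the returned count is dominated by that one
range, which is the CMA and alloc_large_system_hash() case mentioned above.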

> +
>  	if (pgcnt)
> -		pr_info("%s: Zeroed struct page in unavailable ranges: %lld\n",
> +		pr_info("%s: pages in unavailable ranges: %lld\n",
>  			zone_names[zone], pgcnt);
>  }
>  
> @@ -6499,6 +6524,10 @@ void __init get_pfn_range_for_nid(unsign
>  		*start_pfn = min(*start_pfn, this_start_pfn);
>  		*end_pfn = max(*end_pfn, this_end_pfn);
>  	}
> +	for_each_res_pfn_range(i, nid, &this_start_pfn, &this_end_pfn, NULL) {
> +		*start_pfn = min(*start_pfn, this_start_pfn);
> +		*end_pfn = max(*end_pfn, this_end_pfn);
> +	}
>  
>  	if (*start_pfn == -1UL)
>  		*start_pfn = 0;
> @@ -7126,7 +7155,13 @@ unsigned long __init node_map_pfn_alignm
>   */
>  unsigned long __init find_min_pfn_with_active_regions(void)
>  {
> -	return PHYS_PFN(memblock_start_of_DRAM());
> +	/*
> +	 * reserved regions must be included so that their page
> +	 * structure can be part of a zone and obtain a valid zoneid
> +	 * before __SetPageReserved().
> +	 */
> +	return min(PHYS_PFN(memblock_start_of_DRAM()),
> +		   PHYS_PFN(memblock.reserved.regions[0].base));

So this implies that reserved memory starts before memory. Don't you
find this weird?
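
For illustration only, a userspace sketch of what the new return value
computes. SKETCH_PAGE_SHIFT (4 KiB pages assumed), sketch_phys_pfn() and
sketch_min_pfn() are hypothetical names; the two arguments stand in for
memblock_start_of_DRAM() and memblock.reserved.regions[0].base.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Userspace stand-in for PHYS_PFN(): physical address to page frame number. */
static unsigned long sketch_phys_pfn(uint64_t phys)
{
	return (unsigned long)(phys >> SKETCH_PAGE_SHIFT);
}

/*
 * Lowest pfn as computed by the patched find_min_pfn_with_active_regions():
 * the smaller of the first memory pfn and the first reserved pfn.
 */
unsigned long sketch_min_pfn(uint64_t start_of_dram,
			     uint64_t first_reserved_base)
{
	unsigned long mem_pfn = sketch_phys_pfn(start_of_dram);
	unsigned long res_pfn = sketch_phys_pfn(first_reserved_base);

	return mem_pfn < res_pfn ? mem_pfn : res_pfn;
}
```

The result differs from the unpatched code only when the first reserved
region begins below the first memory region, e.g. memory starting at 0x1000
with a reservation at 0x0, which is the pfn 0 layout the changelog describes.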

>  }
>  
>  /*
> _
> 
> Patches currently in -mm which might be from aarcange@redhat.com are
> 
> mm-initialize-struct-pages-in-reserved-regions-outside-of-the-zone-ranges.patch
> 

-- 
Sincerely yours,
Mike.
