All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Baoquan He <bhe@redhat.com>, Mel Gorman <mgorman@suse.de>,
	Michal Hocko <mhocko@kernel.org>,
	Mike Rapoport <rppt@linux.ibm.com>, Qian Cai <cai@lca.pw>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	stable@vger.kernel.org, Dan Williams <dan.j.williams@intel.com>
Subject: Re: [PATCH v2 1/2] mm: memblock: enforce overlap of memory.memblock and memory.reserved
Date: Mon, 14 Dec 2020 12:18:07 +0100	[thread overview]
Message-ID: <a512bd63-b171-3ed5-6996-2c99b6c9a226@redhat.com> (raw)
In-Reply-To: <20201214111221.GC198219@kernel.org>

On 14.12.20 12:12, Mike Rapoport wrote:
> On Mon, Dec 14, 2020 at 11:11:35AM +0100, David Hildenbrand wrote:
>> On 09.12.20 22:43, Mike Rapoport wrote:
>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>
>>> memblock does not require that the reserved memory ranges will be a subset
>>> of memblock.memory.
>>>
>>> As the result there maybe reserved pages that are not in the range of any
>>> zone or node because zone and node boundaries are detected based on
>>> memblock.memory and pages that only present in memblock.reserved are not
>>> taken into account during zone/node size detection.
>>>
>>> Make sure that all ranges in memblock.reserved are added to memblock.memory
>>> before calculating node and zone boundaries.
>>>
>>> Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
>>> Reported-by: Andrea Arcangeli <aarcange@redhat.com>
>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>> ---
>>>  include/linux/memblock.h |  1 +
>>>  mm/memblock.c            | 24 ++++++++++++++++++++++++
>>>  mm/page_alloc.c          |  7 +++++++
>>>  3 files changed, 32 insertions(+)
>>>
>>> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
>>> index ef131255cedc..e64dae2dd1ce 100644
>>> --- a/include/linux/memblock.h
>>> +++ b/include/linux/memblock.h
>>> @@ -120,6 +120,7 @@ int memblock_clear_nomap(phys_addr_t base, phys_addr_t size);
>>>  unsigned long memblock_free_all(void);
>>>  void reset_node_managed_pages(pg_data_t *pgdat);
>>>  void reset_all_zones_managed_pages(void);
>>> +void memblock_enforce_memory_reserved_overlap(void);
>>>  
>>>  /* Low level functions */
>>>  void __next_mem_range(u64 *idx, int nid, enum memblock_flags flags,
>>> diff --git a/mm/memblock.c b/mm/memblock.c
>>> index b68ee86788af..9277aca642b2 100644
>>> --- a/mm/memblock.c
>>> +++ b/mm/memblock.c
>>> @@ -1857,6 +1857,30 @@ void __init_memblock memblock_trim_memory(phys_addr_t align)
>>>  	}
>>>  }
>>>  
>>> +/**
>>> + * memblock_enforce_memory_reserved_overlap - make sure every range in
>>> + * @memblock.reserved is covered by @memblock.memory
>>> + *
>>> + * The data in @memblock.memory is used to detect zone and node boundaries
>>> + * during initialization of the memory map and the page allocator. Make
>>> + * sure that every memory range present in @memblock.reserved is also added
>>> + * to @memblock.memory even if the architecture specific memory
>>> + * initialization failed to do so
>>> + */
>>> +void __init memblock_enforce_memory_reserved_overlap(void)
>>> +{
>>> +	phys_addr_t start, end;
>>> +	int nid;
>>> +	u64 i;
>>> +
>>> +	__for_each_mem_range(i, &memblock.reserved, &memblock.memory,
>>> +			     NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end, &nid) {
>>> +		pr_warn("memblock: reserved range [%pa-%pa] is not in memory\n",
>>> +			&start, &end);
>>> +		memblock_add_node(start, (end - start), nid);
>>> +	}
>>> +}
>>> +
>>>  void __init_memblock memblock_set_current_limit(phys_addr_t limit)
>>>  {
>>>  	memblock.current_limit = limit;
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index eaa227a479e4..dbc57dbbacd8 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -7436,6 +7436,13 @@ void __init free_area_init(unsigned long *max_zone_pfn)
>>>  	memset(arch_zone_highest_possible_pfn, 0,
>>>  				sizeof(arch_zone_highest_possible_pfn));
>>>  
>>> +	/*
>>> +	 * Some architectures (e.g. x86) have reserved pages outside of
>>> +	 * memblock.memory. Make sure these pages are taken into account
>>> +	 * when detecting zone and node boundaries
>>> +	 */
>>> +	memblock_enforce_memory_reserved_overlap();
>>> +
>>>  	start_pfn = find_min_pfn_with_active_regions();
>>>  	descending = arch_has_descending_max_zone_pfns();
>>>  
>>>
>>
>> CCing Dan.
>>
>> This implies that any memory that is E820_TYPE_SOFT_RESERVED that was
>> reserved via memblock_reserve() will be added via memblock_add_node() as
>> well, resulting in all such memory getting a memmap allocated right when
>> booting up, right?
>>
>> IIRC, there are use cases where that is absolutely not desired.
> 
> Hmm, if this is the case we need entirely different solution to ensure
> that we don't have partial pageblocks in a zone and we have all the
> memory map initialized to a known state.
> 
>> Am I missing something? (@Dan?)
> 
> BTW, @Dan, why did you need to memblock_reserve(E820_TYPE_SOFT_RESERVED)
> without memblock_add()ing it?

I suspect to cover cases where it might partially span memory sections
(or even sub-sections). Maybe we should focus on initializing that part
only - meaning, not adding all memory to .memory but only !section
aligned pieces.


-- 
Thanks,

David / dhildenb


  reply	other threads:[~2020-12-14 11:19 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-09 21:43 [PATCH v2 0/2] mm: fix initialization of struct page for holes in memory layout Mike Rapoport
2020-12-09 21:43 ` [PATCH v2 1/2] mm: memblock: enforce overlap of memory.memblock and memory.reserved Mike Rapoport
2020-12-10  9:28   ` Greg KH
2020-12-14 10:11   ` David Hildenbrand
2020-12-14 11:12     ` Mike Rapoport
2020-12-14 11:18       ` David Hildenbrand [this message]
2020-12-14 13:58         ` Andrea Arcangeli
2020-12-09 21:43 ` [PATCH v2 2/2] mm: fix initialization of struct page for holes in memory layout Mike Rapoport
2020-12-10  1:51   ` Andrea Arcangeli
2020-12-10  9:29   ` Greg KH
2021-01-04 19:03   ` Qian Cai
2021-01-04 19:03     ` Qian Cai
2021-01-05  8:24     ` Mike Rapoport
2021-01-05 18:45       ` Qian Cai
2021-01-05 18:45         ` Qian Cai
2021-01-06  8:05         ` Mike Rapoport
2021-01-06 21:04           ` Qian Cai
2021-01-06 21:04             ` Qian Cai
2021-01-10 15:39             ` Mike Rapoport
2021-01-11 15:06               ` Qian Cai
2021-01-11 15:06                 ` Qian Cai
2021-01-11 17:47                 ` Mike Rapoport

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a512bd63-b171-3ed5-6996-2c99b6c9a226@redhat.com \
    --to=david@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=cai@lca.pw \
    --cc=dan.j.williams@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=rppt@kernel.org \
    --cc=rppt@linux.ibm.com \
    --cc=stable@vger.kernel.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.