linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, mhocko@kernel.org,
	Pavel.Tatashin@microsoft.com, schwidefsky@de.ibm.com,
	heiko.carstens@de.ibm.com
Subject: Re: [PATCH v2 1/1] mm, memory_hotplug: Initialize struct pages for the full memory section
Date: Mon, 17 Dec 2018 10:38:32 +0100	[thread overview]
Message-ID: <cffab731-81e0-b80c-665e-c9a62faed4ec@redhat.com> (raw)
In-Reply-To: <20181214202315.1c685f1e@thinkpad>

On 14.12.18 20:23, Gerald Schaefer wrote:
> On Fri, 14 Dec 2018 16:49:14 +0100
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 14.12.18 16:22, David Hildenbrand wrote:
>>> On 12.12.18 18:27, Mikhail Zaslonko wrote:  
>>>> If memory end is not aligned with the sparse memory section boundary, the
>>>> mapping of such a section is only partly initialized. This may lead to
>>>> VM_BUG_ON due to uninitialized struct page access from
>>>> is_mem_section_removable() or test_pages_in_a_zone() function triggered by
>>>> memory_hotplug sysfs handlers:
>>>>
>>>> Here are the the panic examples:
>>>>  CONFIG_DEBUG_VM=y
>>>>  CONFIG_DEBUG_VM_PGFLAGS=y
>>>>
>>>>  kernel parameter mem=2050M
>>>>  --------------------------
>>>>  page:000003d082008000 is uninitialized and poisoned
>>>>  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
>>>>  Call Trace:
>>>>  ([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
>>>>   [<00000000008f15c4>] show_valid_zones+0x5c/0x190
>>>>   [<00000000008cf9c4>] dev_attr_show+0x34/0x70
>>>>   [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
>>>>   [<00000000003e4194>] seq_read+0x204/0x480
>>>>   [<00000000003b53ea>] __vfs_read+0x32/0x178
>>>>   [<00000000003b55b2>] vfs_read+0x82/0x138
>>>>   [<00000000003b5be2>] ksys_read+0x5a/0xb0
>>>>   [<0000000000b86ba0>] system_call+0xdc/0x2d8
>>>>  Last Breaking-Event-Address:
>>>>   [<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
>>>>  Kernel panic - not syncing: Fatal exception: panic_on_oops
>>>>
>>>>  kernel parameter mem=3075M
>>>>  --------------------------
>>>>  page:000003d08300c000 is uninitialized and poisoned
>>>>  page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
>>>>  Call Trace:
>>>>  ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
>>>>   [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
>>>>   [<00000000008cf9c4>] dev_attr_show+0x34/0x70
>>>>   [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
>>>>   [<00000000003e4194>] seq_read+0x204/0x480
>>>>   [<00000000003b53ea>] __vfs_read+0x32/0x178
>>>>   [<00000000003b55b2>] vfs_read+0x82/0x138
>>>>   [<00000000003b5be2>] ksys_read+0x5a/0xb0
>>>>   [<0000000000b86ba0>] system_call+0xdc/0x2d8
>>>>  Last Breaking-Event-Address:
>>>>   [<000000000038596c>] is_mem_section_removable+0xb4/0x190
>>>>  Kernel panic - not syncing: Fatal exception: panic_on_oops
>>>>
>>>> Fix the problem by initializing the last memory section of each zone
>>>> in memmap_init_zone() till the very end, even if it goes beyond the zone
>>>> end.
>>>>
>>>> Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
>>>> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
>>>> Cc: <stable@vger.kernel.org>
>>>> ---
>>>>  mm/page_alloc.c | 12 ++++++++++++
>>>>  1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 2ec9cc407216..e2afdb2dc2c5 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -5542,6 +5542,18 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>>>>  			cond_resched();
>>>>  		}
>>>>  	}
>>>> +#ifdef CONFIG_SPARSEMEM
>>>> +	/*
>>>> +	 * If the zone does not span the rest of the section then
>>>> +	 * we should at least initialize those pages. Otherwise we
>>>> +	 * could blow up on a poisoned page in some paths which depend
>>>> +	 * on full sections being initialized (e.g. memory hotplug).
>>>> +	 */
>>>> +	while (end_pfn % PAGES_PER_SECTION) {
>>>> +		__init_single_page(pfn_to_page(end_pfn), end_pfn, zone, nid);
>>>> +		end_pfn++;  
>>>
>>> This page will not be marked as PG_reserved - although it is a physical
>>> memory gap. Do we care?
>>>   
>>
>> Hm, or do we even have any idea what this is (e.g. could it also be
>> something not a gap)?
> 
> In the "mem=" restriction scenario it would be a gap, and probably fall
> into the PG_reserved categorization from your recent patch:
>  * - Pages falling into physical memory gaps - not IORESOURCE_SYSRAM. Trying
>  *   to read/write these pages might end badly. Don't touch!
> 
> Not sure if it could be something else. In theory, if it is possible to have
> a scenario where memory zones are not section-aligned, then this
> end_pfn % PAGES_PER_SECTION part could be part of another zone. But then it
> should not matter if the pages get pre-initialized here, with or w/o
> PG_reseved, because they should later be properly initialized in their zone.
> 
> So marking them as PG_reserved sounds right, especially in the light of your
> current PG_reserved clean-up.
> 
>>
>> For physical memory gaps within a section, architectures usually exclude
>> that memory from getting passed to e.g. the page allocator by
>> memblock_reserve().
>>
>> Before handing all free pages to the page allocator, all such reserved
>> memblocks will be marked reserved.
>>
>> But this here seems to be different. We don't have a previous
>> memblock_reserve(), because otherwise these pages would have properly
>> been initialized already when marking them reserved.
> 
> Not sure how memblock_reserve() and struct page initialization are
> related, but at least on s390 there is a memblock_reserve() on the range
> in question in setup_arch() -> reserve_memory_end(). However, in this
> "mem=" scenario, the range is also removed later with memblock_remove()
> in setup_memory_end(), because it is beyond memory_end.
> 

I am wondering if we should fix this on the memblock level instead than.
Something like, before handing memory over to the page allocator, add
memory as reserved up to the last section boundary. Or even when setting
the physical memory limit (mem= scenario).

-- 

Thanks,

David / dhildenb

  reply	other threads:[~2018-12-17  9:38 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-12 17:27 [PATCH v2 0/1] Initialize struct pages for the full section Mikhail Zaslonko
2018-12-12 17:27 ` [PATCH v2 1/1] mm, memory_hotplug: Initialize struct pages for the full memory section Mikhail Zaslonko
2018-12-13  3:46   ` Wei Yang
     [not found]     ` <e4cebbae-3fcb-f03c-3d0e-a1a44ff0675a@linux.bm.com>
2018-12-13 15:12       ` Wei Yang
     [not found]         ` <674c53e2-e4b3-f21f-4613-b149acef7e53@linux.bm.com>
2018-12-14 10:19           ` Michal Hocko
2018-12-15  0:26             ` Wei Yang
2018-12-13 12:59   ` Michal Hocko
2018-12-14 15:22   ` David Hildenbrand
2018-12-14 15:49     ` David Hildenbrand
2018-12-14 19:23       ` Gerald Schaefer
2018-12-17  9:38         ` David Hildenbrand [this message]
2018-12-17 12:28           ` Michal Hocko
2018-12-17 13:29             ` David Hildenbrand
2018-12-17 13:35               ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cffab731-81e0-b80c-665e-c9a62faed4ec@redhat.com \
    --to=david@redhat.com \
    --cc=Pavel.Tatashin@microsoft.com \
    --cc=akpm@linux-foundation.org \
    --cc=gerald.schaefer@de.ibm.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=schwidefsky@de.ibm.com \
    --cc=zaslonko@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).