linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Baoquan He <bhe@redhat.com>,
	kkabe@vega.pgw.jp, bugzilla-daemon@bugzilla.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Wei Yang <richardw.yang@linux.intel.com>,
	Michal Hocko <mhocko@kernel.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	linux-mm@kvack.org, Dan Williams <dan.j.williams@intel.com>
Subject: Re: [Bug 206401] kernel panic on Hyper-V after 5 minutes due to memory hot-add
Date: Wed, 12 Feb 2020 09:21:56 +0100
Message-ID: <544bd3d0-c962-e97c-7c1c-5d4ffdd2046b@redhat.com>
In-Reply-To: <20200212073123.GG8965@MiWiFi-R3L-srv>

On 12.02.20 08:31, Baoquan He wrote:
> On 02/11/20 at 04:41pm, Andrew Morton wrote:
>> On Tue, 11 Feb 2020 07:07:41 +0800 Wei Yang <richardw.yang@linux.intel.com> wrote:
>>
>>> On Mon, Feb 10, 2020 at 02:15:51PM +0800, Baoquan He wrote:
>>>> On 02/10/20 at 02:09pm, Baoquan He wrote:
>>>>> On 02/09/20 at 09:56pm, Andrew Morton wrote:
>>>>>> On Mon, 10 Feb 2020 13:40:27 +0800 Baoquan He <bhe@redhat.com> wrote:
>>>>>>
>>>>>>> Hi Andrew,
>>>>>>>
>>>>>>> On 02/09/20 at 09:32pm, Andrew Morton wrote:
>>>>>>>> On Tue, 04 Feb 2020 11:25:48 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
>>>>>>>>
>>>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=206401
>>>>>>>>>
>>>>>>>>
>>>>>>>> An oops during mem hotadd.  Could someone please take a look when
>>>>>>>> convenient?
>>>>>>>
>>>>>>> This has been addressed by Wei Yang's patch, please check it here:
>>>>>>>
>>>>>>> http://lkml.kernel.org/r/20200209104826.3385-7-bhe@redhat.com
>>>>>>>
>>>>>>
>>>>>> hm, OK, thanks.  It's unfortunate that a 5.5 fix is buried in a
>>>>>> six-patch series which is still in progress!  Can we please merge that
>>>>>> as a standalone fix with a cc:stable, Fixes:, etc?
>>>>
>>>> Maybe can add Fixes tag as follow when merge:
>>>>
>>>> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
>>>>
>>
>> The reporter (cc'ed here) is still seeing issues:
>> https://bugzilla.kernel.org/show_bug.cgi?id=206401
>>
>> Could we please continue this investigation via emailed reply-to-all,
>> rather than via the bugzilla interface?
> 
> Yes, people prefer mailing list to discuss issues.
> 
> Hi T.Kabe, 
> 
> Could you provide the call trace again after below patch is applied?
> The comment #9 in bugzilla is not very clear to me.
> 
> mm/sparsemem: pfn_to_page is not valid yet on SPARSEMEM
> http://lkml.kernel.org/r/20200209104826.3385-7-bhe@redhat.com
> 
> And, as you said, applying above patch, and do not call
> __free_pages_core() in generic_online_page() will work. I doubt it,
> because without __free_pages_core(), your added pages are not added
> into buddy for managing. 

Removing __free_pages_core() from generic_online_page() is just
plain wrong and would break memory hotplug in general. So that is
certainly not the right fix.

HV supports memory sections that are fully added, but where only parts
of them are actually backed by the hypervisor, "online", and exposed to
the buddy.

When onlining memory, it will online the backed parts via
hv_online_page()->generic_online_page(). When requested to hot-add
more memory, the guest will online the remaining parts that are now
backed via handle_pg_range()->hv_bring_pgs_online().
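
To make the scheme above a bit more concrete, here is a toy user-space
model. It is not kernel code and not how hv_balloon is implemented; all
names (toy_section, toy_back_range, toy_online_backed) and the tiny
section size are made up for illustration. It only shows the idea that a
section is fully added, while pages get onlined step by step as they
become backed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Tiny value for illustration only; real sections are much larger. */
#define TOY_PAGES_PER_SECTION 8

/* Hypothetical model of one fully added memory section. */
struct toy_section {
	bool backed[TOY_PAGES_PER_SECTION];	/* backed by the hypervisor */
	bool online[TOY_PAGES_PER_SECTION];	/* handed to the (toy) buddy */
};

/* Mark pages [start, start + count) as backed by the hypervisor. */
static void toy_back_range(struct toy_section *s, size_t start, size_t count)
{
	for (size_t i = start; i < start + count && i < TOY_PAGES_PER_SECTION; i++)
		s->backed[i] = true;
}

/*
 * Online every page that is backed but not yet online.
 * Returns the number of pages newly onlined.
 */
static size_t toy_online_backed(struct toy_section *s)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_PAGES_PER_SECTION; i++) {
		if (s->backed[i] && !s->online[i]) {
			s->online[i] = true;
			n++;
		}
	}
	return n;
}
```

In this model, backing pages 0-2 and onlining would online three pages;
backing the remaining five later and onlining again would online only
those five. Onlining a page that is not backed never happens here, which
corresponds to the invariant the real driver and hypervisor have to
maintain.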

So if generic_online_page() fails, it's because either

1. The HV guest driver has a bug and tries to online something it shouldn't
2. The HV hypervisor has a bug and does not back memory properly before hot-adding
3. The memory hotplug code has a bug and does not properly add the memory block/sections


Please note that we switched to using generic_online_page() in

commit 30a9c246b9f6fe0591e8afb05758a3e3b096fabe
Author: David Hildenbrand <david@redhat.com>
Date:   Sat Nov 30 17:53:55 2019 -0800

    hv_balloon: use generic_online_page()
    
    Let's use the generic onlining function - which will now also take care
    of calling kernel_map_pages().

However, the old code ended up calling
	__free_pages_core() -> __free_pages()
And the new one ends up calling
	__online_page_free() -> __free_reserved_page() -> __free_page()
So I don't think it's related to that.


Especially, looking at the kernel messages, I can see that the kernel crashes
when adding memory, not when onlining it? So I do think there is still
something wrong in the SPARSE hot-add code if you keep seeing issues.

-- 
Thanks,

David / dhildenb


