linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Qian Cai <cai@lca.pw>
To: David Hildenbrand <david@redhat.com>, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"mhocko@suse.com" <mhocko@suse.com>
Subject: Re: [PATCH] mm/page_owner: fix a crash after memory offline
Date: Thu, 10 Oct 2019 15:12:00 -0400	[thread overview]
Message-ID: <1570734720.5937.32.camel@lca.pw> (raw)
In-Reply-To: <2e36a929-0fc7-d32a-d838-de746ff071fc@redhat.com>

On Thu, 2019-10-10 at 20:55 +0200, David Hildenbrand wrote:
> On 10.10.19 20:32, Qian Cai wrote:
> > The linux-next series "mm/memory_hotplug: Shrink zones before removing
> > memory" [1] seems make a crash easier to reproduce while reading
> > /proc/pagetypeinfo after offlining a memory section. Fix it by using
> > pfn_to_online_page() in the PFN walker.
> 
> Can you please rephrase the subject+description to describe the actual 
> problem and drop the reference to the series?

I'd figure it is better for you to post this as you are on the top of this whole
mess. What do you think?

> 
> E.g., similar to my recent patches:
> 
> "mm/page_owner: Don't access uninitialized memmaps when reading 
> /proc/pagetypeinfo
> 
> Uninitialized memmaps contain garbage and in the worst case trigger 
> kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get
> touched.
> 
> For example, when not onlining a memory block that is spanned by a zone 
> and reading /proc/pagetypeinfo, we can trigger a kernel BUG: ...
> "
> 
> However, you also have to justify why it is okay to no longer consider 
> ZONE_DEVICE (I think walk_zones_in_node() will skip ZONE_DEVICE due to 
> assert_populated == true and ZONE_DEVICE will never be populated, 
> Therefore, we will never end in this code path with ZONE_DEVICE).
> 
> 
> > 
> > [1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@redhat.com/
> > 
> > page:ffffea0021200000 is uninitialized and poisoned
> > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> > page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
> > There is not page extension available.
> > ------------[ cut here ]------------
> > kernel BUG at include/linux/mm.h:1107!
> > RIP: 0010:pagetypeinfo_showmixedcount_print+0x4fb/0x680
> > Call Trace:
> >   walk_zones_in_node+0x3a/0xc0
> >   pagetypeinfo_show+0x260/0x2c0
> >   seq_read+0x27e/0x710
> >   proc_reg_read+0x12e/0x190
> >   __vfs_read+0x50/0xa0
> >   vfs_read+0xcb/0x1e0
> >   ksys_read+0xc6/0x160
> >   __x64_sys_read+0x43/0x50
> >   do_syscall_64+0xcc/0xaec
> >   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > 
> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> >   mm/page_owner.c | 5 ++---
> >   1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/mm/page_owner.c b/mm/page_owner.c
> > index dee931184788..03a6b19b3cdd 100644
> > --- a/mm/page_owner.c
> > +++ b/mm/page_owner.c
> > @@ -296,11 +296,10 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
> >   		pageblock_mt = get_pageblock_migratetype(page);
> >   
> 
> What about the pfn_valid() in the outermost loop? You can skip over the 
> whole pageblock if the first page is not online.
> 
> >   		for (; pfn < block_end_pfn; pfn++) {
> > -			if (!pfn_valid_within(pfn))
> > +			page = pfn_to_online_page(pfn);
> > +			if (!page)
> >   				continue;
> >   
> > -			page = pfn_to_page(pfn);
> > -
> >   			if (page_zone(page) != zone)
> >   				continue;
> >   
> > 
> 
> 


  reply	other threads:[~2019-10-10 19:12 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-10 18:32 [PATCH] mm/page_owner: fix a crash after memory offline Qian Cai
2019-10-10 18:55 ` David Hildenbrand
2019-10-10 19:12   ` Qian Cai [this message]
2019-10-11  9:57     ` David Hildenbrand
2019-10-11 11:07       ` Qian Cai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1570734720.5937.32.camel@lca.pw \
    --to=cai@lca.pw \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).