From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	Huang Ying <ying.huang@intel.com>,
	Pavel Tatashin <pasha.tatashin@oracle.com>
Subject: Re: kernel panic in reading /proc/kpageflags when enabling RAM-simulated PMEM
Date: Wed, 6 Jun 2018 05:16:24 +0000
Message-ID: <20180606051624.GA16021@hori1.linux.bs1.fc.nec.co.jp>
In-Reply-To: <20180605073500.GA23766@hori1.linux.bs1.fc.nec.co.jp>

On Tue, Jun 05, 2018 at 07:35:01AM +0000, Horiguchi Naoya(堀口 直也) wrote:
> On Mon, Jun 04, 2018 at 06:18:36PM -0700, Matthew Wilcox wrote:
> > On Tue, Jun 05, 2018 at 12:54:03AM +0000, Naoya Horiguchi wrote:
> > > Reproduction procedure is like this:
> > >  - enable RAM based PMEM (with a kernel boot parameter like memmap=1G!4G)
> > >  - read /proc/kpageflags (or call tools/vm/page-types with no arguments)
> > >  (- my kernel config is attached)
> > > 
> > > I spent a few days on this, but didn't reach any solutions.
> > > So let me report this with some details below ...
> > > 
> > > In the critical page request, stable_page_flags() is called with an argument
> > > page whose ->compound_head was somehow filled with '0xffffffffffffffff'.
> > > And compound_head() returns (struct page *)(head - 1), which explains the
> > > address 0xfffffffffffffffe in the above message.
> > 
> > Hm.  compound_head shares with:
> > 
> >                         struct list_head lru;
> >                                 struct list_head slab_list;     /* uses lru */
> >                                 struct {        /* Partial pages */
> >                                         struct page *next;
> >                         unsigned long _compound_pad_1;  /* compound_head */
> >                         unsigned long _pt_pad_1;        /* compound_head */
> >                         struct dev_pagemap *pgmap;
> >                 struct rcu_head rcu_head;
> > 
> > None of them should be -1.
> > 
> > > It seems that this kernel panic happens when reading kpageflags of pfn range
> > > [0xbffd7, 0xc0000), which corresponds to a 'reserved' range.
> > > 
> > > [    0.000000] user-defined physical RAM map:
> > > [    0.000000] user: [mem 0x0000000000000000-0x000000000009fbff] usable
> > > [    0.000000] user: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> > > [    0.000000] user: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> > > [    0.000000] user: [mem 0x0000000000100000-0x00000000bffd6fff] usable
> > > [    0.000000] user: [mem 0x00000000bffd7000-0x00000000bfffffff] reserved
> > > [    0.000000] user: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> > > [    0.000000] user: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> > > [    0.000000] user: [mem 0x0000000100000000-0x000000013fffffff] persistent (type 12)
> > > 
> > > So I guess 'memmap=' parameter might badly affect the memory initialization process.
> > > 
> > > This problem doesn't reproduce on v4.17, so some pre-released patch introduces it.
> > > I hope this info helps you find the solution/workaround.
> > 
> > Can you try bisecting this?  It could be one of my patches to reorder struct
> > page, or it could be one of Pavel's deferred page initialisation patches.
> > Or something else ;-)
> 
> Thank you for the comment. I'm bisecting now and will let you know the result later.
> 
> And I found that my statement "not reproduce on v4.17" was wrong (I used
> different kvm guests, which changed the test conditions and misled me);
> this seems to be an older (at least < 4.15) bug.

(Cc: Pavel)

Bisection showed that the following commit introduced this issue:

  commit f7f99100d8d95dbcf09e0216a143211e79418b9f
  Author: Pavel Tatashin <pasha.tatashin@oracle.com>
  Date:   Wed Nov 15 17:36:44 2017 -0800
  
      mm: stop zeroing memory during allocation in vmemmap

This patch postpones struct page zeroing to a later stage of memory initialization.
My kernel config disables CONFIG_DEFERRED_STRUCT_PAGE_INIT, so the two call sites of
__init_single_page() were never reached. In that case, could struct pages populated
by vmemmap_pte_populate() be left uninitialized?
I'm not sure yet how the memmap= setting makes this issue visible.
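
To make the arithmetic in the quoted report concrete, here is a minimal
userspace sketch (not kernel code). struct sketch_page and the sketch_*
helpers are hypothetical names for illustration only, and the 4 KiB page
size is an assumption. It shows how a ->compound_head of all ones decodes
to the address 0xfffffffffffffffe, and how the 'reserved' range
0xbffd7000-0xbfffffff maps to pfn range [0xbffd7, 0xc0000).

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct page; only the fields needed here. */
struct sketch_page {
	uint64_t flags;
	uint64_t compound_head;	/* bit 0 set: tail page, rest is head pointer */
};

/*
 * Same decision as described in the report: if bit 0 of ->compound_head
 * is set, the page is treated as a tail page and (head - 1) is returned.
 */
static uint64_t sketch_compound_head(const struct sketch_page *page)
{
	uint64_t head = page->compound_head;

	if (head & 1)
		return head - 1;
	return (uint64_t)(uintptr_t)page;
}

/* Physical address to page frame number. */
static uint64_t sketch_phys_to_pfn(uint64_t phys)
{
	return phys >> SKETCH_PAGE_SHIFT;
}
```

Filling a struct sketch_page with 0xff bytes (memset) and passing it to
sketch_compound_head() gives 0xfffffffffffffffe, since bit 0 is set. The
0xff fill only stands in for "never initialized"; what the real memory
contained is not established here.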

Thanks,
Naoya Horiguchi

