linux-kernel.vger.kernel.org archive mirror
From: "Łukasz Majczak" <lma@semihalf.com>
To: Mike Rapoport <rppt@linux.ibm.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	"Radosław Biernacki" <rad@semihalf.com>,
	"Marcin Wojtas" <mw@semihalf.com>,
	"Alex Levin" <levinale@google.com>,
	"Guenter Roeck" <groeck@google.com>,
	"Jesse Barnes" <jsbarnes@google.com>,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	"Sarvela, Tomi P" <tomi.p.sarvela@intel.com>,
	"Łukasz Bartosik" <lb@semihalf.com>
Subject: Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
Date: Wed, 27 Jan 2021 20:18:34 +0100
Message-ID: <CAFJ_xbqHq-P8z-xpkVDAQsG1s4j5FMmXFnyXLzd=ja8y_=8LfA@mail.gmail.com>
In-Reply-To: <20210127182651.GA281042@linux.ibm.com>

Hi Mike,

Great! It seems to work well - I have built a vanilla v5.11-rc5 kernel
with your patch and it boots properly.
Full log available here:
https://gist.github.com/semihalf-majczak-lukasz/ea89bf52f6fad7907a18d1870e7ce9bd

Best regards,
Lukasz

Wed, 27 Jan 2021 at 19:27 Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> Hi Lukasz,
>
> On Wed, Jan 27, 2021 at 02:15:53PM +0100, Łukasz Majczak wrote:
> > Hi Mike,
> >
> > I have started bisecting your patch and I have figured out that there
> > might be something wrong with clamping - with these lines commented
> > out, it started to work.
> > The full log (with logs from below patch) can be found here:
> > https://gist.github.com/semihalf-majczak-lukasz/3cecbab0ddc59a6c3ce11ddc29645725
> > It's fresh - I haven't analyzed it yet, just sharing in the hope it will help.
>
> Thanks, that helps!
>
> The first page is never considered by the kernel as memory and so
> arch_zone_lowest_possible_pfn[ZONE_DMA] is set to 0x1000. As a result,
> init_unavailable_mem() skips pfn 0 and then __SetPageReserved(page) in
> reserve_bootmem_region() panics because the struct page for pfn 0 remains
> poisoned.
>
> Can you please try the below patch on top of v5.11-rc5?
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 783913e41f65..3ce9ef238dfc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7083,10 +7083,11 @@ void __init free_area_init_memoryless_node(int nid)
>  static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn,
>                                          int zone, int nid)
>  {
> -       unsigned long pfn, zone_spfn, zone_epfn;
> +       unsigned long pfn, zone_spfn = 0, zone_epfn;
>         u64 pgcnt = 0;
>
> -       zone_spfn = arch_zone_lowest_possible_pfn[zone];
> +       if (zone > 0)
> +               zone_spfn = arch_zone_highest_possible_pfn[zone - 1];
>         zone_epfn = arch_zone_highest_possible_pfn[zone];
>
>         spfn = clamp(spfn, zone_spfn, zone_epfn);
>
>
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index eed54ce26ad1..9f4468c413a1 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7093,9 +7093,11 @@ static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn,
> >         zone_spfn = arch_zone_lowest_possible_pfn[zone];
> >         zone_epfn = arch_zone_highest_possible_pfn[zone];
> >
> > -       spfn = clamp(spfn, zone_spfn, zone_epfn);
> > -       epfn = clamp(epfn, zone_spfn, zone_epfn);
> > -
> > +       //spfn = clamp(spfn, zone_spfn, zone_epfn);
> > +       //epfn = clamp(epfn, zone_spfn, zone_epfn);
> > +       pr_info("LMA DBG: zone_spfn: %llx, zone_epfn %llx\n", zone_spfn, zone_epfn);
> > +       pr_info("LMA DBG: spfn: %llx, epfn %llx\n", spfn, epfn);
> > +       pr_info("LMA DBG: clamp_spfn: %llx, clamp_epfn %llx\n", clamp(spfn, zone_spfn, zone_epfn), clamp(epfn, zone_spfn, zone_epfn));
> >         for (pfn = spfn; pfn < epfn; pfn++) {
> >                 if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) {
> >                         pfn = ALIGN_DOWN(pfn, pageblock_nr_pages)
> >
> > Best regards,
> > Lukasz
> >
> >
> > Wed, 27 Jan 2021 at 13:15 Łukasz Majczak <lma@semihalf.com> wrote:
> > >
> > > Unfortunately nothing :( My current kernel command line contains:
> > > console=ttyS0,115200n8 debug earlyprintk=serial loglevel=7
> > >
> > > I was thinking about using earlycon, but it seems to be blocked.
> > > (I think the lack of earlycon might be related to Chromebook HW
> > > security design. There is an EC controller which is a part of AP ->
> > > serial chain as kernel messages are considered sensitive from a
> > > security standpoint.)
> > >
> > > Best regards,
> > > Lukasz
> > >
> > > Wed, 27 Jan 2021 at 12:19 Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > >
> > > > On Wed, Jan 27, 2021 at 11:08:17AM +0100, Łukasz Majczak wrote:
> > > > > Hi Mike,
> > > > >
> > > > > Actually I have a serial console attached (via servo device), but
> > > > > there is no output :( and also the reboot/crash is very fast/immediate
> > > > > after power on.
> > > >
> > > > If you boot with earlyprintk=serial are there any messages?
> > > >
> > > > > Best regards
> > > > > Lukasz
> > > > >
> > > > > Wed, 27 Jan 2021 at 11:05 Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > > > >
> > > > > > Hi Lukasz,
> > > > > >
> > > > > > On Wed, Jan 27, 2021 at 10:22:29AM +0100, Łukasz Majczak wrote:
> > > > > > > Crash after mm: fix initialization of struct page for holes in memory layout
> > > > > > >
> > > > > > > Hi,
> > > > > > > I was trying to run v5.11-rc5 on my Samsung Chromebook Pro (Caroline),
> > > > > > > but I've noticed it has crashed - unfortunately it seems to happen at
> > > > > > > a very early stage - no output to the console nor to the screen, so I
> > > > > > > have started a bisect (between 5.11-rc4, which works just fine, and
> > > > > > > 5.11-rc5).
> > > > > > > The bisect result points to:
> > > > > > >
> > > > > > > d3921cb8be29 mm: fix initialization of struct page for holes in memory layout
> > > > > > >
> > > > > > > Reproduction is just to build and load the kernel.
> > > > > > >
> > > > > > > In case it helps, I am attaching:
> > > > > > > - /proc/cpuinfo (from healthy system):
> > > > > > > https://gist.github.com/semihalf-majczak-lukasz/3517867bf39f07377c1a785b64a97066
> > > > > > > - my .config file (for a broken system):
> > > > > > > https://gist.github.com/semihalf-majczak-lukasz/584b329f1bf3e43b53efe8e18b5da33c
> > > > > > >
> > > > > > > If there is anything I could add/do/test to help fix this please let me know.
> > > > > >
> > > > > > Chris Wilson also reported boot failures on several Chromebooks:
> > > > > >
> > > > > > https://lore.kernel.org/lkml/161160687463.28991.354987542182281928@build.alporthouse.com
> > > > > >
> > > > > > I presume serial console is not an option, so if you could boot with
> > > > > > earlyprintk=vga and see if there is anything useful printed on the screen
> > > > > > it would be really helpful.
> > > > > >
> > > > > > > Best regards
> > > > > > > Lukasz
> > > > > >
> > > > > > --
> > > > > > Sincerely yours,
> > > > > > Mike.
> > > >
> > > > --
> > > > Sincerely yours,
> > > > Mike.
>
> --
> Sincerely yours,
> Mike.

Thread overview: 10+ messages
2021-01-27  9:22 PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout Łukasz Majczak
2021-01-27 10:04 ` Mike Rapoport
2021-01-27 10:08   ` Łukasz Majczak
2021-01-27 11:18     ` Mike Rapoport
2021-01-27 12:15       ` Łukasz Majczak
2021-01-27 13:15         ` Łukasz Majczak
2021-01-27 18:26           ` Mike Rapoport
2021-01-27 19:18             ` Łukasz Majczak [this message]
2021-01-28  2:45             ` Baoquan He
2021-01-28  9:31               ` Mike Rapoport
