linux-kernel.vger.kernel.org archive mirror
* PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
@ 2021-01-27  9:22 Łukasz Majczak
  2021-01-27 10:04 ` Mike Rapoport
  0 siblings, 1 reply; 10+ messages in thread
From: Łukasz Majczak @ 2021-01-27  9:22 UTC (permalink / raw)
  To: Andrew Morton, linux-mm, linux-kernel, Mike Rapoport
  Cc: Radosław Biernacki, Marcin Wojtas, Alex Levin,
	Guenter Roeck, Jesse Barnes

Crash after mm: fix initialization of struct page for holes in memory layout

Hi,
I was trying to run v5.11-rc5 on my Samsung Chromebook Pro (Caroline),
but I've noticed it has crashed - unfortunately it seems to happen at
a very early stage - no output to the console nor to the screen, so I
have started a bisect (between 5.11-rc4 - which works just fine - and
5.11-rc5).
The bisect result points to:

d3921cb8be29 mm: fix initialization of struct page for holes in memory layout

Reproduction is just to build and load the kernel.

In case it helps, I am attaching:
- /proc/cpuinfo (from healthy system):
https://gist.github.com/semihalf-majczak-lukasz/3517867bf39f07377c1a785b64a97066
- my .config file (for a broken system):
https://gist.github.com/semihalf-majczak-lukasz/584b329f1bf3e43b53efe8e18b5da33c

If there is anything I could add/do/test to help fix this please let me know.

Best regards
Lukasz


* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-27  9:22 PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout Łukasz Majczak
@ 2021-01-27 10:04 ` Mike Rapoport
  2021-01-27 10:08   ` Łukasz Majczak
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Rapoport @ 2021-01-27 10:04 UTC (permalink / raw)
  To: Łukasz Majczak
  Cc: Andrew Morton, linux-mm, linux-kernel, Radosław Biernacki,
	Marcin Wojtas, Alex Levin, Guenter Roeck, Jesse Barnes,
	Chris Wilson, Sarvela, Tomi P

Hi Lukasz,

On Wed, Jan 27, 2021 at 10:22:29AM +0100, Łukasz Majczak wrote:
> Crash after mm: fix initialization of struct page for holes in memory layout
> 
> Hi,
> I was trying to run v5.11-rc5 on my Samsung Chromebook Pro (Caroline),
> but I've noticed it has crashed - unfortunately it seems to happen at
> a very early stage - no output to the console nor to the screen, so I
> have started a bisect (between 5.11-rc4 - which works just fine - and
> 5.11-rc5).
> The bisect result points to:
> 
> d3921cb8be29 mm: fix initialization of struct page for holes in memory layout
> 
> Reproduction is just to build and load the kernel.
> 
> In case it helps, I am attaching:
> - /proc/cpuinfo (from healthy system):
> https://gist.github.com/semihalf-majczak-lukasz/3517867bf39f07377c1a785b64a97066
> - my .config file (for a broken system):
> https://gist.github.com/semihalf-majczak-lukasz/584b329f1bf3e43b53efe8e18b5da33c
> 
> If there is anything I could add/do/test to help fix this please let me know.

Chris Wilson also reported boot failures on several Chromebooks:

https://lore.kernel.org/lkml/161160687463.28991.354987542182281928@build.alporthouse.com

I presume serial console is not an option, so if you could boot with
earlyprintk=vga and see if there is anything useful printed on the screen
it would be really helpful.

> Best regards
> Lukasz

-- 
Sincerely yours,
Mike.


* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-27 10:04 ` Mike Rapoport
@ 2021-01-27 10:08   ` Łukasz Majczak
  2021-01-27 11:18     ` Mike Rapoport
  0 siblings, 1 reply; 10+ messages in thread
From: Łukasz Majczak @ 2021-01-27 10:08 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, linux-mm, linux-kernel, Radosław Biernacki,
	Marcin Wojtas, Alex Levin, Guenter Roeck, Jesse Barnes,
	Chris Wilson, Sarvela, Tomi P

Hi Mike,

Actually I have a serial console attached (via servo device), but
there is no output :( and also the reboot/crash is very fast/immediate
after power on.

Best regards
Lukasz


* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-27 10:08   ` Łukasz Majczak
@ 2021-01-27 11:18     ` Mike Rapoport
  2021-01-27 12:15       ` Łukasz Majczak
  0 siblings, 1 reply; 10+ messages in thread
From: Mike Rapoport @ 2021-01-27 11:18 UTC (permalink / raw)
  To: Łukasz Majczak
  Cc: Andrew Morton, linux-mm, linux-kernel, Radosław Biernacki,
	Marcin Wojtas, Alex Levin, Guenter Roeck, Jesse Barnes,
	Chris Wilson, Sarvela, Tomi P

On Wed, Jan 27, 2021 at 11:08:17AM +0100, Łukasz Majczak wrote:
> Hi Mike,
> 
> Actually I have a serial console attached (via servo device), but
> there is no output :( and also the reboot/crash is very fast/immediate
> after power on.
 
If you boot with earlyprintk=serial are there any messages?


-- 
Sincerely yours,
Mike.


* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-27 11:18     ` Mike Rapoport
@ 2021-01-27 12:15       ` Łukasz Majczak
  2021-01-27 13:15         ` Łukasz Majczak
  0 siblings, 1 reply; 10+ messages in thread
From: Łukasz Majczak @ 2021-01-27 12:15 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, linux-mm, linux-kernel, Radosław Biernacki,
	Marcin Wojtas, Alex Levin, Guenter Roeck, Jesse Barnes,
	Chris Wilson, Sarvela, Tomi P

Unfortunately nothing :( my current kernel command line contains:
console=ttyS0,115200n8 debug earlyprintk=serial loglevel=7

I was thinking about using earlycon, but it seems to be blocked.
(I think the lack of earlycon might be related to Chromebook HW
security design. There is an EC controller which is a part of AP ->
serial chain as kernel messages are considered sensitive from a
security standpoint.)

Best regards,
Lukasz


* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-27 12:15       ` Łukasz Majczak
@ 2021-01-27 13:15         ` Łukasz Majczak
  2021-01-27 18:26           ` Mike Rapoport
  0 siblings, 1 reply; 10+ messages in thread
From: Łukasz Majczak @ 2021-01-27 13:15 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, linux-mm, linux-kernel, Radosław Biernacki,
	Marcin Wojtas, Alex Levin, Guenter Roeck, Jesse Barnes,
	Chris Wilson, Sarvela, Tomi P

Hi Mike,

I have started bisecting your patch and I have figured out that there
might be something wrong with the clamping - with these lines commented
out it started to work.
The full log (with logs from the patch below) can be found here:
https://gist.github.com/semihalf-majczak-lukasz/3cecbab0ddc59a6c3ce11ddc29645725
It's fresh - I haven't analyzed it yet, just sharing in the hope it will help.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index eed54ce26ad1..9f4468c413a1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7093,9 +7093,11 @@ static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn,
        zone_spfn = arch_zone_lowest_possible_pfn[zone];
        zone_epfn = arch_zone_highest_possible_pfn[zone];

-       spfn = clamp(spfn, zone_spfn, zone_epfn);
-       epfn = clamp(epfn, zone_spfn, zone_epfn);
-
+       //spfn = clamp(spfn, zone_spfn, zone_epfn);
+       //epfn = clamp(epfn, zone_spfn, zone_epfn);
+       pr_info("LMA DBG: zone_spfn: %llx, zone_epfn %llx\n", zone_spfn, zone_epfn);
+       pr_info("LMA DBG: spfn: %llx, epfn %llx\n", spfn, epfn);
+       pr_info("LMA DBG: clamp_spfn: %llx, clamp_epfn %llx\n", clamp(spfn, zone_spfn, zone_epfn), clamp(epfn, zone_spfn, zone_epfn));
        for (pfn = spfn; pfn < epfn; pfn++) {
                if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) {
                        pfn = ALIGN_DOWN(pfn, pageblock_nr_pages)

Best regards,
Lukasz



* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-27 13:15         ` Łukasz Majczak
@ 2021-01-27 18:26           ` Mike Rapoport
  2021-01-27 19:18             ` Łukasz Majczak
  2021-01-28  2:45             ` Baoquan He
  0 siblings, 2 replies; 10+ messages in thread
From: Mike Rapoport @ 2021-01-27 18:26 UTC (permalink / raw)
  To: Łukasz Majczak
  Cc: Andrew Morton, linux-mm, linux-kernel, Radosław Biernacki,
	Marcin Wojtas, Alex Levin, Guenter Roeck, Jesse Barnes,
	Chris Wilson, Sarvela, Tomi P

Hi Lukasz,

On Wed, Jan 27, 2021 at 02:15:53PM +0100, Łukasz Majczak wrote:
> Hi Mike,
> 
> I have started bisecting your patch and I have figured out that there
> might be something wrong with the clamping - with these lines commented
> out it started to work.
> The full log (with logs from the patch below) can be found here:
> https://gist.github.com/semihalf-majczak-lukasz/3cecbab0ddc59a6c3ce11ddc29645725
> It's fresh - I haven't analyzed it yet, just sharing in the hope it will help.

Thanks, that helps!

The first page is never considered by the kernel as memory and so
arch_zone_lowest_possible_pfn[ZONE_DMA] is set to 0x1000. As a result,
init_unavailable_mem() skips pfn 0 and then __SetPageReserved(page) in
reserve_bootmem_region() panics because the struct page for pfn 0 remains
poisoned.
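
The effect described above can be reproduced in a small user-space model.
This is only a sketch under stated assumptions, not kernel code: struct
fake_page, POISONED, clamp_ul(), init_hole() and pfn0_left_poisoned() are
hypothetical stand-ins, the memory map has just 8 pfns, and the zone's
lower bound is simply assumed to be a non-zero pfn (1 here).

```c
#include <assert.h>

#define NR_PFNS  8UL      /* hypothetical: a tiny memory map of 8 pfns */
#define POISONED (~0UL)   /* hypothetical marker for "never initialized" */

/* Stand-in for the kernel's struct page; only a flags word is modelled. */
struct fake_page {
	unsigned long flags;
};

/* Plain clamp of v into [lo, hi], mirroring what clamp() does in the patch. */
static unsigned long clamp_ul(unsigned long v, unsigned long lo, unsigned long hi)
{
	assert(lo <= hi);
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Initialize the pages of a hole [spfn, epfn) after clamping to the zone. */
static unsigned long init_hole(struct fake_page *map,
			       unsigned long spfn, unsigned long epfn,
			       unsigned long zone_spfn, unsigned long zone_epfn)
{
	unsigned long pfn, count = 0;

	spfn = clamp_ul(spfn, zone_spfn, zone_epfn);
	epfn = clamp_ul(epfn, zone_spfn, zone_epfn);
	for (pfn = spfn; pfn < epfn; pfn++) {
		map[pfn].flags = 0;	/* page is now initialized */
		count++;
	}
	return count;
}

/* Returns 1 if pfn 0 is still poisoned after handling the hole [0, 1). */
static int pfn0_left_poisoned(unsigned long zone_spfn)
{
	struct fake_page map[NR_PFNS];
	unsigned long i;

	for (i = 0; i < NR_PFNS; i++)
		map[i].flags = POISONED;

	init_hole(map, 0, 1, zone_spfn, NR_PFNS);
	return map[0].flags == POISONED;
}
```

With a lower bound of 1, both ends of the hole [0, 1) clamp to 1, the loop
body never runs, and pfn 0 keeps its poison value; with a lower bound of 0
the same hole is initialized.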

Can you please try the below patch on top of v5.11-rc5?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 783913e41f65..3ce9ef238dfc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7083,10 +7083,11 @@ void __init free_area_init_memoryless_node(int nid)
 static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn,
 					 int zone, int nid)
 {
-	unsigned long pfn, zone_spfn, zone_epfn;
+	unsigned long pfn, zone_spfn = 0, zone_epfn;
 	u64 pgcnt = 0;
 
-	zone_spfn = arch_zone_lowest_possible_pfn[zone];
+	if (zone > 0)
+		zone_spfn = arch_zone_highest_possible_pfn[zone - 1];
 	zone_epfn = arch_zone_highest_possible_pfn[zone];
 
 	spfn = clamp(spfn, zone_spfn, zone_epfn);
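
The change in the lower bound can be illustrated with a stand-alone
sketch. It is not kernel code: the three zones and their pfn limits are
made-up values, and zone_start_old(), zone_start_new() and
uncovered_pfns() are hypothetical helpers that only mirror the two ways
of computing zone_spfn shown in the diff.

```c
#include <assert.h>

#define NR_ZONES 3

/* Hypothetical per-zone limits in pfns; zone 0 starts at 1 because pfn 0
 * is assumed not to be reported as memory. */
static const unsigned long lowest_pfn[NR_ZONES]  = { 1, 16, 64 };
static const unsigned long highest_pfn[NR_ZONES] = { 16, 64, 256 };

/* Lower bound as computed before the patch. */
static unsigned long zone_start_old(int zone)
{
	assert(zone >= 0 && zone < NR_ZONES);
	return lowest_pfn[zone];
}

/* Lower bound as computed by the patch: 0 for the first zone, otherwise
 * the end of the previous zone. */
static unsigned long zone_start_new(int zone)
{
	unsigned long zone_spfn = 0;

	assert(zone >= 0 && zone < NR_ZONES);
	if (zone > 0)
		zone_spfn = highest_pfn[zone - 1];
	return zone_spfn;
}

/* Count pfns below the last zone's end that fall inside no zone window. */
static unsigned long uncovered_pfns(unsigned long (*zone_start)(int))
{
	unsigned long pfn, missing = 0;
	int zone, covered;

	for (pfn = 0; pfn < highest_pfn[NR_ZONES - 1]; pfn++) {
		covered = 0;
		for (zone = 0; zone < NR_ZONES; zone++)
			if (pfn >= zone_start(zone) && pfn < highest_pfn[zone])
				covered = 1;
		if (!covered)
			missing++;
	}
	return missing;
}
```

With these made-up limits the old bounds leave exactly one pfn (pfn 0)
outside every zone window, while the new bounds tile the whole range
starting from 0.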

 

-- 
Sincerely yours,
Mike.


* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-27 18:26           ` Mike Rapoport
@ 2021-01-27 19:18             ` Łukasz Majczak
  2021-01-28  2:45             ` Baoquan He
  1 sibling, 0 replies; 10+ messages in thread
From: Łukasz Majczak @ 2021-01-27 19:18 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, linux-mm, linux-kernel, Radosław Biernacki,
	Marcin Wojtas, Alex Levin, Guenter Roeck, Jesse Barnes,
	Chris Wilson, Sarvela, Tomi P, Łukasz Bartosik

Hi Mike,

Great! It seems to work well - I have built a vanilla v5.11-rc5 kernel
with your patch and it boots properly.
Full log available here:
https://gist.github.com/semihalf-majczak-lukasz/ea89bf52f6fad7907a18d1870e7ce9bd

Best regards,
Lukasz


* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-27 18:26           ` Mike Rapoport
  2021-01-27 19:18             ` Łukasz Majczak
@ 2021-01-28  2:45             ` Baoquan He
  2021-01-28  9:31               ` Mike Rapoport
  1 sibling, 1 reply; 10+ messages in thread
From: Baoquan He @ 2021-01-28  2:45 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Łukasz Majczak, Andrew Morton, linux-mm, linux-kernel,
	Radosław Biernacki, Marcin Wojtas, Alex Levin,
	Guenter Roeck, Jesse Barnes, Chris Wilson, Sarvela, Tomi P

On 01/27/21 at 08:26pm, Mike Rapoport wrote:
> Hi Lukasz,
> 
> On Wed, Jan 27, 2021 at 02:15:53PM +0100, Łukasz Majczak wrote:
> > Hi Mike,
> > 
> > I have started bisecting your patch and I have figured out that there
> > might be something wrong with the clamping - with these lines commented
> > out it started to work.
> > The full log (with logs from the patch below) can be found here:
> > https://gist.github.com/semihalf-majczak-lukasz/3cecbab0ddc59a6c3ce11ddc29645725
> > It's fresh - I haven't analyzed it yet, just sharing in the hope it will help.
> 
> Thanks, that helps!
> 
> The first page is never considered by the kernel as memory and so
> arch_zone_lowest_possible_pfn[ZONE_DMA] is set to 0x1000. As a result,
> init_unavailable_mem() skips pfn 0 and then __SetPageReserved(page) in
> reserve_bootmem_region() panics because the struct page for pfn 0 remains
> poisoned.

It's a great finding and a quick fix. Previously I tested my cleanup
patches, based on Mike's commit 9ebeee59af4cdd4d ("mm: fix initialization
of struct page for holes in memory layout"), on a hardware system and
didn't hit this crash. But this crash seems to be always reproducible,
so I wonder why I didn't reproduce it.

> 
> Can you please try the below patch on top of v5.11-rc5?
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 783913e41f65..3ce9ef238dfc 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7083,10 +7083,11 @@ void __init free_area_init_memoryless_node(int nid)
>  static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn,
>  					 int zone, int nid)
>  {
> -	unsigned long pfn, zone_spfn, zone_epfn;
> +	unsigned long pfn, zone_spfn = 0, zone_epfn;
>  	u64 pgcnt = 0;
>  
> -	zone_spfn = arch_zone_lowest_possible_pfn[zone];
> +	if (zone > 0)
> +		zone_spfn = arch_zone_highest_possible_pfn[zone - 1];
>  	zone_epfn = arch_zone_highest_possible_pfn[zone];
>  
>  	spfn = clamp(spfn, zone_spfn, zone_epfn);
> 
>  
> > > > > >
> > > > > > Hi Lukasz,
> > > > > >
> > > > > > On Wed, Jan 27, 2021 at 10:22:29AM +0100, Łukasz Majczak wrote:
> > > > > > > Crash after mm: fix initialization of struct page for holes in memory layout
> > > > > > >
> > > > > > > Hi,
> > > > > > > I was trying to run v5.11-rc5 on my Samsung Chromebook Pro (Caroline),
> > > > > > > but I've noticed it has crashed - unfortunately it seems to happen at
> > > > > > > a very early stage - No output to the console nor to the screen, so I
> > > > > > > > have started a bisect (between 5.11-rc4 - which works just fine - and
> > > > > > > > 5.11-rc5),
> > > > > > > > the bisect result points to:
> > > > > > >
> > > > > > > d3921cb8be29 mm: fix initialization of struct page for holes in memory layout
> > > > > > >
> > > > > > > Reproduction is just to build and load the kernel.
> > > > > > >
> > > > > > > If it will help any how I am attaching:
> > > > > > > - /proc/cpuinfo (from healthy system):
> > > > > > > https://gist.github.com/semihalf-majczak-lukasz/3517867bf39f07377c1a785b64a97066
> > > > > > > - my .config file (for a broken system):
> > > > > > > https://gist.github.com/semihalf-majczak-lukasz/584b329f1bf3e43b53efe8e18b5da33c
> > > > > > >
> > > > > > > If there is anything I could add/do/test to help fix this please let me know.
> > > > > >
> > > > > > Chris Wilson also reported boot failures on several Chromebooks:
> > > > > >
> > > > > > https://lore.kernel.org/lkml/161160687463.28991.354987542182281928@build.alporthouse.com
> > > > > >
> > > > > > I presume serial console is not an option, so if you could boot with
> > > > > > earlyprintk=vga and see if there is anything useful printed on the screen
> > > > > > it would be really helpful.
> > > > > >
> > > > > > > Best regards
> > > > > > > Lukasz
> > > > > >
> > > > > > --
> > > > > > Sincerely yours,
> > > > > > Mike.
> > > >
> > > > --
> > > > Sincerely yours,
> > > > Mike.
> 
> -- 
> Sincerely yours,
> Mike.
> 



* Re: PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout
  2021-01-28  2:45             ` Baoquan He
@ 2021-01-28  9:31               ` Mike Rapoport
  0 siblings, 0 replies; 10+ messages in thread
From: Mike Rapoport @ 2021-01-28  9:31 UTC (permalink / raw)
  To: Baoquan He
  Cc: Łukasz Majczak, Andrew Morton, linux-mm, linux-kernel,
	Radosław Biernacki, Marcin Wojtas, Alex Levin,
	Guenter Roeck, Jesse Barnes, Chris Wilson, Sarvela, Tomi P

On Thu, Jan 28, 2021 at 10:45:49AM +0800, Baoquan He wrote:
> On 01/27/21 at 08:26pm, Mike Rapoport wrote:
> > Hi Lukasz,
> > 
> > On Wed, Jan 27, 2021 at 02:15:53PM +0100, Łukasz Majczak wrote:
> > > Hi Mike,
> > > 
> > > I have started bisecting your patch and I have figured out that there
> > > might be something wrong with the clamping - with these lines commented
> > > out it started to work.
> > > The full log (with logs from the patch below) can be found here:
> > > https://gist.github.com/semihalf-majczak-lukasz/3cecbab0ddc59a6c3ce11ddc29645725
> > > It's fresh - I haven't analyzed it yet, just sharing in the hope that it
> > > helps.
> > 
> > Thanks, that helps!
> > 
> > The first page is never considered by the kernel as memory and so
> > arch_zone_lowest_possible_pfn[ZONE_DMA] is set to pfn 1 (physical address
> > 0x1000). As a result, init_unavailable_mem() skips pfn 0 and then
> > __SetPageReserved(page) in reserve_bootmem_region() panics because the
> > struct page for pfn 0 remains poisoned.
> 
> It's a great finding and a quick fix.

Unfortunately it's only a partial fix, as it does not address the problem
of pfn 0 being outside any zone. The struct page for pfn 0 gets linked to
ZONE_DMA in init_unavailable_mem(), but zones[ZONE_DMA]->zone_start_pfn
is 1, so the zone's span does not cover it.
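
A simplified model of this remaining problem, as a user-space sketch rather
than kernel code: struct zone_model and zone_model_spans_pfn() are
hypothetical names, loosely modelled on a zone span check, and the zone size
used in the note below is an assumed value.

```c
#include <stdbool.h>

/* Hypothetical, stripped-down model of a zone's pfn span. */
struct zone_model {
	unsigned long zone_start_pfn;	/* first pfn covered by the zone */
	unsigned long spanned_pages;	/* pfns covered, including holes */
};

/* True if pfn lies inside [zone_start_pfn, zone_start_pfn + spanned_pages). */
static bool zone_model_spans_pfn(const struct zone_model *zone,
				 unsigned long pfn)
{
	return pfn >= zone->zone_start_pfn &&
	       pfn < zone->zone_start_pfn + zone->spanned_pages;
}
```

For a zone with zone_start_pfn = 1 and an assumed spanned_pages = 4095, the
check is false for pfn 0 and true for pfn 1, even though the struct page for
pfn 0 carries a link to that zone.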

I'm now looking into how to fix this as well; hopefully I'll have a patch
Real Soon (tm) :)

>  Previously I tested my cleanup patches, based on Mike's commit
>  9ebeee59af4cdd4d ("mm: fix initialization of struct page for holes in
>  memory layout"), on a hardware system and did not hit this crash. But
>  the crash seems to be always reproducible, so I wonder why I did not
>  reproduce it.

This crash is reproducible on systems that do not report pfn 0 as usable,
e.g. on the Chromebook Lukasz is using, it is 'type 16':

[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] type 16
[    0.000000] BIOS-e820: [mem 0x0000000000001000-0x000000000009ffff] usable

On my laptop and on a bunch of other systems I have, it is usable:

[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009cfff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009d000-0x000000000009ffff] reserved
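
To make the difference between the two maps concrete, here is a small
user-space sketch (not kernel code) of the clamping done in
init_unavailable_range(). clamp_pfn() and pfns_after_clamp() are hypothetical
stand-ins, and the upper bound of 4096 used in the note below assumes a
16 MiB ZONE_DMA with 4 KiB pages.

```c
#include <assert.h>

/* Stand-in for the kernel's clamp() macro, restricted to unsigned long. */
static unsigned long clamp_pfn(unsigned long val, unsigned long lo,
			       unsigned long hi)
{
	assert(lo <= hi);
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

/*
 * Number of pfns from the hole [spfn, epfn) that are still visited after
 * both ends are clamped to the zone bounds [zone_spfn, zone_epfn].
 */
static unsigned long pfns_after_clamp(unsigned long spfn, unsigned long epfn,
				      unsigned long zone_spfn,
				      unsigned long zone_epfn)
{
	spfn = clamp_pfn(spfn, zone_spfn, zone_epfn);
	epfn = clamp_pfn(epfn, zone_spfn, zone_epfn);

	return epfn > spfn ? epfn - spfn : 0;
}
```

With the Chromebook map the hole is pfn 0 only, i.e. [0, 1).
pfns_after_clamp(0, 1, 1, 4096) gives 0 because both ends are clamped to 1,
so pfn 0 is never initialized. With the lower bound at 0, as in the patch
quoted below, pfns_after_clamp(0, 1, 0, 4096) gives 1.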

 
> > 
> > Can you please try the below patch on top of v5.11-rc5?
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 783913e41f65..3ce9ef238dfc 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7083,10 +7083,11 @@ void __init free_area_init_memoryless_node(int nid)
> >  static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn,
> >  					 int zone, int nid)
> >  {
> > -	unsigned long pfn, zone_spfn, zone_epfn;
> > +	unsigned long pfn, zone_spfn = 0, zone_epfn;
> >  	u64 pgcnt = 0;
> >  
> > -	zone_spfn = arch_zone_lowest_possible_pfn[zone];
> > +	if (zone > 0)
> > +		zone_spfn = arch_zone_highest_possible_pfn[zone - 1];
> >  	zone_epfn = arch_zone_highest_possible_pfn[zone];
> >  
> >  	spfn = clamp(spfn, zone_spfn, zone_epfn);
> > 
> >  
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index eed54ce26ad1..9f4468c413a1 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -7093,9 +7093,11 @@ static u64 __init
> > > init_unavailable_range(unsigned long spfn, unsigned long epfn,
> > >         zone_spfn = arch_zone_lowest_possible_pfn[zone];
> > >         zone_epfn = arch_zone_highest_possible_pfn[zone];
> > > 
> > > -       spfn = clamp(spfn, zone_spfn, zone_epfn);
> > > -       epfn = clamp(epfn, zone_spfn, zone_epfn);
> > > -
> > > +       //spfn = clamp(spfn, zone_spfn, zone_epfn);
> > > +       //epfn = clamp(epfn, zone_spfn, zone_epfn);
> > > +       pr_info("LMA DBG: zone_spfn: %llx, zone_epfn %llx\n",
> > > zone_spfn, zone_epfn);
> > > +       pr_info("LMA DBG: spfn: %llx, epfn %llx\n", spfn, epfn);
> > > +       pr_info("LMA DBG: clamp_spfn: %llx, clamp_epfn %llx\n",
> > > clamp(spfn, zone_spfn, zone_epfn), clamp(epfn, zone_spfn, zone_epfn));
> > >         for (pfn = spfn; pfn < epfn; pfn++) {
> > >                 if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) {
> > >                         pfn = ALIGN_DOWN(pfn, pageblock_nr_pages)
> > > 
> > > Best regards,
> > > Lukasz
> > > 
> > > 
> > > śr., 27 sty 2021 o 13:15 Łukasz Majczak <lma@semihalf.com> napisał(a):
> > > >
> > > > Unfortunately nothing :( my current kernel command line contains:
> > > > console=ttyS0,115200n8 debug earlyprintk=serial loglevel=7
> > > >
> > > > I was thinking about using earlycon, but it seems to be blocked.
> > > > (I think the lack of earlycon might be related to Chromebook HW
> > > > security design. There is an EC controller which is a part of AP ->
> > > > serial chain as kernel messages are considered sensitive from a
> > > > security standpoint.)
> > > >
> > > > Best regards,
> > > > Lukasz
> > > >
> > > > śr., 27 sty 2021 o 12:19 Mike Rapoport <rppt@linux.ibm.com> napisał(a):
> > > > >
> > > > > On Wed, Jan 27, 2021 at 11:08:17AM +0100, Łukasz Majczak wrote:
> > > > > > Hi Mike,
> > > > > >
> > > > > > Actually I have a serial console attached (via servo device), but
> > > > > > there is no output :( and also the reboot/crash is very fast/immediate
> > > > > > after power on.
> > > > >
> > > > > If you boot with earlyprintk=serial are there any messages?
> > > > >
> > > > > > Best regards
> > > > > > Lukasz
> > > > > >
> > > > > > śr., 27 sty 2021 o 11:05 Mike Rapoport <rppt@linux.ibm.com> napisał(a):
> > > > > > >
> > > > > > > Hi Lukasz,
> > > > > > >
> > > > > > > On Wed, Jan 27, 2021 at 10:22:29AM +0100, Łukasz Majczak wrote:
> > > > > > > > Crash after mm: fix initialization of struct page for holes in memory layout
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > > I was trying to run v5.11-rc5 on my Samsung Chromebook Pro (Caroline),
> > > > > > > > but I've noticed it has crashed - unfortunately it seems to happen at
> > > > > > > > a very early stage - No output to the console nor to the screen, so I
> > > > > > > > > have started a bisect (between 5.11-rc4 - which works just fine - and
> > > > > > > > > 5.11-rc5),
> > > > > > > > > the bisect result points to:
> > > > > > > >
> > > > > > > > d3921cb8be29 mm: fix initialization of struct page for holes in memory layout
> > > > > > > >
> > > > > > > > Reproduction is just to build and load the kernel.
> > > > > > > >
> > > > > > > > If it will help any how I am attaching:
> > > > > > > > - /proc/cpuinfo (from healthy system):
> > > > > > > > https://gist.github.com/semihalf-majczak-lukasz/3517867bf39f07377c1a785b64a97066
> > > > > > > > - my .config file (for a broken system):
> > > > > > > > https://gist.github.com/semihalf-majczak-lukasz/584b329f1bf3e43b53efe8e18b5da33c
> > > > > > > >
> > > > > > > > If there is anything I could add/do/test to help fix this please let me know.
> > > > > > >
> > > > > > > Chris Wilson also reported boot failures on several Chromebooks:
> > > > > > >
> > > > > > > https://lore.kernel.org/lkml/161160687463.28991.354987542182281928@build.alporthouse.com
> > > > > > >
> > > > > > > I presume serial console is not an option, so if you could boot with
> > > > > > > earlyprintk=vga and see if there is anything useful printed on the screen
> > > > > > > it would be really helpful.
> > > > > > >
> > > > > > > > Best regards
> > > > > > > > Lukasz
> > > > > > >
> > > > > > > --
> > > > > > > Sincerely yours,
> > > > > > > Mike.
> > > > >
> > > > > --
> > > > > Sincerely yours,
> > > > > Mike.
> > 
> > -- 
> > Sincerely yours,
> > Mike.
> > 
> 

-- 
Sincerely yours,
Mike.


end of thread, other threads:[~2021-01-28  9:36 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-27  9:22 PROBLEM: Crash after mm: fix initialization of struct page for holes in memory layout Łukasz Majczak
2021-01-27 10:04 ` Mike Rapoport
2021-01-27 10:08   ` Łukasz Majczak
2021-01-27 11:18     ` Mike Rapoport
2021-01-27 12:15       ` Łukasz Majczak
2021-01-27 13:15         ` Łukasz Majczak
2021-01-27 18:26           ` Mike Rapoport
2021-01-27 19:18             ` Łukasz Majczak
2021-01-28  2:45             ` Baoquan He
2021-01-28  9:31               ` Mike Rapoport
