* linux-next: powerpcle qemu boot failure after merge of the akpm tree
@ 2019-01-31  5:38 ` Stephen Rothwell
  0 siblings, 0 replies; 17+ messages in thread
From: Stephen Rothwell @ 2019-01-31  5:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Mike Rapoport, Michael Ellerman, Benjamin Herrenschmidt, PowerPC

Hi all,

[I am guessing that it is something in Andrew's tree that has caused
this.]

My qemu boot of the powerpc pseries_le_defconfig config failed like this:

htab_hash_mask    = 0x1ffff
-----------------------------------------------------
numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff
CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc4 #2
Call Trace:
[c00000000105bbd0] [c000000000b1345c] dump_stack+0xb0/0xf4 (unreliable)
[c00000000105bc10] [c000000000111120] panic+0x168/0x3b8
[c00000000105bcb0] [c000000000e701c8] sparse_init_nid+0x178/0x550
[c00000000105bd70] [c000000000e709b4] sparse_init+0x210/0x238
[c00000000105bdb0] [c000000000e468f4] initmem_init+0x1e0/0x260
[c00000000105be80] [c000000000e3b9b0] setup_arch+0x354/0x3d4
[c00000000105bef0] [c000000000e33afc] start_kernel+0x98/0x648
[c00000000105bf90] [c00000000000b270] start_here_common+0x1c/0x52c

-- 
Cheers,
Stephen Rothwell

* Re: linux-next: powerpc le qemu boot failure after merge of the akpm tree
  2019-01-31  5:38 ` Stephen Rothwell
@ 2019-01-31  6:06   ` Stephen Rothwell
  -1 siblings, 0 replies; 17+ messages in thread
From: Stephen Rothwell @ 2019-01-31  6:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Mike Rapoport, Michael Ellerman, Benjamin Herrenschmidt, PowerPC,
	Christophe Leroy

Hi all,

On Thu, 31 Jan 2019 16:38:54 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> [I am guessing that it is something in Andrew's tree that has caused
> this.]
> 
> My qemu boot of the powerpc pseries_le_defconfig config failed like this:
> 
> htab_hash_mask    = 0x1ffff
> -----------------------------------------------------
> numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
> Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc4 #2
> Call Trace:
> [c00000000105bbd0] [c000000000b1345c] dump_stack+0xb0/0xf4 (unreliable)
> [c00000000105bc10] [c000000000111120] panic+0x168/0x3b8
> [c00000000105bcb0] [c000000000e701c8] sparse_init_nid+0x178/0x550
> [c00000000105bd70] [c000000000e709b4] sparse_init+0x210/0x238
> [c00000000105bdb0] [c000000000e468f4] initmem_init+0x1e0/0x260
> [c00000000105be80] [c000000000e3b9b0] setup_arch+0x354/0x3d4
> [c00000000105bef0] [c000000000e33afc] start_kernel+0x98/0x648
> [c00000000105bf90] [c00000000000b270] start_here_common+0x1c/0x52c

A quick bisect leads to this:

1c3c9328cde027eb875ba4692f0a5d66b0afe862 is the first bad commit
commit 1c3c9328cde027eb875ba4692f0a5d66b0afe862
Author: Mike Rapoport <rppt@linux.ibm.com>
Date:   Thu Jan 31 10:51:32 2019 +1100

    treewide: add checks for the return value of memblock_alloc*()
    
    Add check for the return value of memblock_alloc*() functions and call
    panic() in case of error.  The panic message repeats the one used by
    panicing memblock allocators with adjustment of parameters to include only
    relevant ones.
    
    The replacement was mostly automated with semantic patches like the one
    below with manual massaging of format strings.
    
    @@
    expression ptr, size, align;
    @@
    ptr = memblock_alloc(size, align);
    + if (!ptr)
    +       panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__,
    size, align);
    
    Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
    Reviewed-by: Guo Ren <ren_guo@c-sky.com>                [c-sky]
    Acked-by: Paul Burton <paul.burton@mips.com>            [MIPS]
    Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>    [s390]
    Reviewed-by: Juergen Gross <jgross@suse.com>            [Xen]
    Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>  [m68k]
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Christophe Leroy <christophe.leroy@c-s.fr>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Dennis Zhou <dennis@kernel.org>
    Cc: Greentime Hu <green.hu@gmail.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Guan Xuetao <gxt@pku.edu.cn>
    Cc: Guo Ren <guoren@kernel.org>
    Cc: Mark Salter <msalter@redhat.com>
    Cc: Matt Turner <mattst88@gmail.com>
    Cc: Max Filippov <jcmvbkbc@gmail.com>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Michal Simek <monstr@monstr.eu>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Richard Weinberger <richard@nod.at>
    Cc: Rich Felker <dalias@libc.org>
    Cc: Rob Herring <robh+dt@kernel.org>
    Cc: Rob Herring <robh@kernel.org>
    Cc: Russell King <linux@armlinux.org.uk>
    Cc: Stafford Horne <shorne@gmail.com>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: Vineet Gupta <vgupta@synopsys.com>
    Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Which is just adding the panic we hit.  So, presumably, the bug is in a
preceding patch :-(

I have left the kernel not booting for today.
-- 
Cheers,
Stephen Rothwell

* Re: linux-next: powerpc le qemu boot failure after merge of the akpm tree
  2019-01-31  6:06   ` Stephen Rothwell
@ 2019-01-31  6:15     ` Christophe Leroy
  -1 siblings, 0 replies; 17+ messages in thread
From: Christophe Leroy @ 2019-01-31  6:15 UTC (permalink / raw)
  To: Stephen Rothwell, Andrew Morton, Mike Rapoport
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Michael Ellerman, Benjamin Herrenschmidt, PowerPC



On 31/01/2019 at 07:06, Stephen Rothwell wrote:
> Hi all,
> 
> On Thu, 31 Jan 2019 16:38:54 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>>
>> [I am guessing that it is something in Andrew's tree that has caused
>> this.]
>>
>> My qemu boot of the powerpc pseries_le_defconfig config failed like this:
>>
>> htab_hash_mask    = 0x1ffff
>> -----------------------------------------------------
>> numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
>> Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff
>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc4 #2
>> Call Trace:
>> [c00000000105bbd0] [c000000000b1345c] dump_stack+0xb0/0xf4 (unreliable)
>> [c00000000105bc10] [c000000000111120] panic+0x168/0x3b8
>> [c00000000105bcb0] [c000000000e701c8] sparse_init_nid+0x178/0x550
>> [c00000000105bd70] [c000000000e709b4] sparse_init+0x210/0x238
>> [c00000000105bdb0] [c000000000e468f4] initmem_init+0x1e0/0x260
>> [c00000000105be80] [c000000000e3b9b0] setup_arch+0x354/0x3d4
>> [c00000000105bef0] [c000000000e33afc] start_kernel+0x98/0x648
>> [c00000000105bf90] [c00000000000b270] start_here_common+0x1c/0x52c
> 
> A quick bisect leads to this:
> 
> 1c3c9328cde027eb875ba4692f0a5d66b0afe862 is the first bad commit
> commit 1c3c9328cde027eb875ba4692f0a5d66b0afe862
> Author: Mike Rapoport <rppt@linux.ibm.com>
> Date:   Thu Jan 31 10:51:32 2019 +1100
> 
>      treewide: add checks for the return value of memblock_alloc*()
>      
>      Add check for the return value of memblock_alloc*() functions and call
>      panic() in case of error.  The panic message repeats the one used by
>      panicing memblock allocators with adjustment of parameters to include only
>      relevant ones.
>      
>      The replacement was mostly automated with semantic patches like the one
>      below with manual massaging of format strings.
>      
>      @@
>      expression ptr, size, align;
>      @@
>      ptr = memblock_alloc(size, align);
>      + if (!ptr)
>      +       panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__,
>      size, align);
>      
>      Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
>      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>      Reviewed-by: Guo Ren <ren_guo@c-sky.com>                [c-sky]
>      Acked-by: Paul Burton <paul.burton@mips.com>            [MIPS]
>      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>    [s390]
>      Reviewed-by: Juergen Gross <jgross@suse.com>            [Xen]
>      Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>  [m68k]
>      Cc: Catalin Marinas <catalin.marinas@arm.com>
>      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
>      Cc: Christoph Hellwig <hch@lst.de>
>      Cc: "David S. Miller" <davem@davemloft.net>
>      Cc: Dennis Zhou <dennis@kernel.org>
>      Cc: Greentime Hu <green.hu@gmail.com>
>      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>      Cc: Guan Xuetao <gxt@pku.edu.cn>
>      Cc: Guo Ren <guoren@kernel.org>
>      Cc: Mark Salter <msalter@redhat.com>
>      Cc: Matt Turner <mattst88@gmail.com>
>      Cc: Max Filippov <jcmvbkbc@gmail.com>
>      Cc: Michael Ellerman <mpe@ellerman.id.au>
>      Cc: Michal Simek <monstr@monstr.eu>
>      Cc: Petr Mladek <pmladek@suse.com>
>      Cc: Richard Weinberger <richard@nod.at>
>      Cc: Rich Felker <dalias@libc.org>
>      Cc: Rob Herring <robh+dt@kernel.org>
>      Cc: Rob Herring <robh@kernel.org>
>      Cc: Russell King <linux@armlinux.org.uk>
>      Cc: Stafford Horne <shorne@gmail.com>
>      Cc: Tony Luck <tony.luck@intel.com>
>      Cc: Vineet Gupta <vgupta@synopsys.com>
>      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
>      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> 
> Which is just adding the panic we hit.  So, presumably, the bug is in a
> preceding patch :-(
> 
> I have left the kernel not booting for today.
> 

No, I think the error really is in that patch; see my other mail.

See https://elixir.bootlin.com/linux/v5.0-rc4/source/mm/memblock.c#L1455:
memblock_alloc_try_nid_raw() is not supposed to panic, so the last hunk
of this patch should be reverted.

In total, I found three problematic hunks in that patch:

@@ -48,6 +53,11 @@ static phys_addr_t __init kasan_alloc_raw_page(int node)
  	void *p = memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
  						__pa(MAX_DMA_ADDRESS),
  						MEMBLOCK_ALLOC_KASAN, node);
+	if (!p)
+		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%llx\n",
+		      __func__, PAGE_SIZE, PAGE_SIZE, node,
+		      __pa(MAX_DMA_ADDRESS));
+
  	return __pa(p);
  }

@@ -211,6 +211,9 @@ static int __init iob_init(struct device_node *dn)
  	iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
  					MEMBLOCK_LOW_LIMIT, 0x80000000,
  					NUMA_NO_NODE);
+	if (!iob_l2_base)
+		panic("%s: Failed to allocate %lu bytes align=0x%lx max_addr=%x\n",
+		      __func__, 1UL << 21, 1UL << 21, 0x80000000);

  	pr_info("IOBMAP L2 allocated at: %p\n", iob_l2_base);


@@ -425,6 +436,10 @@ static void __init sparse_buffer_init(unsigned long size, int nid)
  		memblock_alloc_try_nid_raw(size, PAGE_SIZE,
  						__pa(MAX_DMA_ADDRESS),
  						MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+	if (!sparsemap_buf)
+		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
+		      __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
+
  	sparsemap_buf_end = sparsemap_buf + size;
  }
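
The contract at issue in these hunks can be sketched in userspace C. This is only an illustration under stated assumptions: try_alloc_raw(), alloc_or_die() and the limit parameter are hypothetical names, and malloc() stands in for memblock. A "try" style allocator reports failure by returning NULL and leaves the decision to its caller, while a must-succeed wrapper is for callers that cannot continue.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical "try" allocator: failure is reported as NULL, never fatal.
 * The limit parameter only exists to make failure easy to trigger here.
 */
static void *try_alloc_raw(size_t size, size_t limit)
{
	if (size == 0 || size > limit)
		return NULL;
	return malloc(size);
}

/* Hypothetical must-succeed wrapper for callers that cannot continue. */
static void *alloc_or_die(size_t size, size_t limit)
{
	void *p = try_alloc_raw(size, limit);

	if (!p) {
		fprintf(stderr, "%s: Failed to allocate %zu bytes\n",
			__func__, size);
		abort();
	}
	return p;
}

/* Usage: the raw variant leaves the decision to the caller. */
static void demo_try_vs_die(void)
{
	void *p = try_alloc_raw(2048, 1024);	/* over the limit */

	assert(p == NULL);			/* caller may still recover */
	p = alloc_or_die(16, 1024);		/* small request succeeds */
	assert(p != NULL);
	free(p);
}
```

Adding a panic after every call to the raw variant effectively turns it into the must-succeed wrapper, which removes the option for the caller to recover.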



Christophe

* Re: linux-next: powerpc le qemu boot failure after merge of the akpm tree
  2019-01-31  6:15     ` Christophe Leroy
@ 2019-01-31  6:39       ` Mike Rapoport
  -1 siblings, 0 replies; 17+ messages in thread
From: Mike Rapoport @ 2019-01-31  6:39 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Stephen Rothwell, Andrew Morton, Linux Next Mailing List,
	Linux Kernel Mailing List, Michael Ellerman,
	Benjamin Herrenschmidt, PowerPC

On Thu, Jan 31, 2019 at 07:15:26AM +0100, Christophe Leroy wrote:
> 
> 
> On 31/01/2019 at 07:06, Stephen Rothwell wrote:
> >Hi all,
> >
> >On Thu, 31 Jan 2019 16:38:54 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >>
> >>[I am guessing that it is something in Andrew's tree that has caused
> >>this.]
> >>
> >>My qemu boot of the powerpc pseries_le_defconfig config failed like this:
> >>
> >>htab_hash_mask    = 0x1ffff
> >>-----------------------------------------------------
> >>numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
> >>Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff

This means that sparse_buffer_init tries to allocate 2G for the sparsemap_buf...

Stephen, how much memory do you give to your VM?
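
For reference, the numbers in the log can be checked with a few lines of C. The sketch below uses only values printed in the panic message and the NODE_DATA line above; the helper names gib() and kib() are made up for the example. It does not by itself say how much RAM the VM has, only that the request is exactly 2 GiB and that the NODE_DATA range ends just below that mark.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the boot log quoted above. */
static const uint64_t requested_bytes = 2147483648ULL;	/* "Failed to allocate" */
static const uint64_t requested_align = 0x10000ULL;	/* "align=" */
static const uint64_t node_data_end   = 0x7ffebfffULL;	/* end of NODE_DATA range */

/* Hypothetical helpers for readable unit conversions. */
static uint64_t gib(uint64_t n) { return n << 30; }
static uint64_t kib(uint64_t n) { return n << 10; }

static void check_log_numbers(void)
{
	assert(requested_bytes == gib(2));		/* exactly 2 GiB */
	assert(requested_bytes == 0x80000000ULL);	/* same value in hex */
	assert(requested_align == kib(64));		/* 64 KiB alignment */
	assert(node_data_end < requested_bytes);	/* NODE_DATA sits below 2 GiB */
}
```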

> >>CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc4 #2
> >>Call Trace:
> >>[c00000000105bbd0] [c000000000b1345c] dump_stack+0xb0/0xf4 (unreliable)
> >>[c00000000105bc10] [c000000000111120] panic+0x168/0x3b8
> >>[c00000000105bcb0] [c000000000e701c8] sparse_init_nid+0x178/0x550
> >>[c00000000105bd70] [c000000000e709b4] sparse_init+0x210/0x238
> >>[c00000000105bdb0] [c000000000e468f4] initmem_init+0x1e0/0x260
> >>[c00000000105be80] [c000000000e3b9b0] setup_arch+0x354/0x3d4
> >>[c00000000105bef0] [c000000000e33afc] start_kernel+0x98/0x648
> >>[c00000000105bf90] [c00000000000b270] start_here_common+0x1c/0x52c
> >
> >A quick bisect leads to this:
> >
> >1c3c9328cde027eb875ba4692f0a5d66b0afe862 is the first bad commit
> >commit 1c3c9328cde027eb875ba4692f0a5d66b0afe862
> >Author: Mike Rapoport <rppt@linux.ibm.com>
> >Date:   Thu Jan 31 10:51:32 2019 +1100
> >
> >     treewide: add checks for the return value of memblock_alloc*()
> >     Add check for the return value of memblock_alloc*() functions and call
> >     panic() in case of error.  The panic message repeats the one used by
> >     panicing memblock allocators with adjustment of parameters to include only
> >     relevant ones.
> >     The replacement was mostly automated with semantic patches like the one
> >     below with manual massaging of format strings.
> >     @@
> >     expression ptr, size, align;
> >     @@
> >     ptr = memblock_alloc(size, align);
> >     + if (!ptr)
> >     +       panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__,
> >     size, align);
> >     Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
> >     Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> >     Reviewed-by: Guo Ren <ren_guo@c-sky.com>                [c-sky]
> >     Acked-by: Paul Burton <paul.burton@mips.com>            [MIPS]
> >     Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>    [s390]
> >     Reviewed-by: Juergen Gross <jgross@suse.com>            [Xen]
> >     Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>  [m68k]
> >     Cc: Catalin Marinas <catalin.marinas@arm.com>
> >     Cc: Christophe Leroy <christophe.leroy@c-s.fr>
> >     Cc: Christoph Hellwig <hch@lst.de>
> >     Cc: "David S. Miller" <davem@davemloft.net>
> >     Cc: Dennis Zhou <dennis@kernel.org>
> >     Cc: Greentime Hu <green.hu@gmail.com>
> >     Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >     Cc: Guan Xuetao <gxt@pku.edu.cn>
> >     Cc: Guo Ren <guoren@kernel.org>
> >     Cc: Mark Salter <msalter@redhat.com>
> >     Cc: Matt Turner <mattst88@gmail.com>
> >     Cc: Max Filippov <jcmvbkbc@gmail.com>
> >     Cc: Michael Ellerman <mpe@ellerman.id.au>
> >     Cc: Michal Simek <monstr@monstr.eu>
> >     Cc: Petr Mladek <pmladek@suse.com>
> >     Cc: Richard Weinberger <richard@nod.at>
> >     Cc: Rich Felker <dalias@libc.org>
> >     Cc: Rob Herring <robh+dt@kernel.org>
> >     Cc: Rob Herring <robh@kernel.org>
> >     Cc: Russell King <linux@armlinux.org.uk>
> >     Cc: Stafford Horne <shorne@gmail.com>
> >     Cc: Tony Luck <tony.luck@intel.com>
> >     Cc: Vineet Gupta <vgupta@synopsys.com>
> >     Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> >     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >
> >Which is just adding the panic we hit.  So, presumably, the bug is in a
> >preceding patch :-(
> >
> >I have left the kernel not booting for today.
> >
> 
> No I think the error is really in that patch, see my other mail.
> 
> See https://elixir.bootlin.com/linux/v5.0-rc4/source/mm/memblock.c#L1455,
> memblock_alloc_try_nid_raw() is not supposed to panic, so the last hunk of
> this patch should be reverted.

It is not supposed to panic, but it can still fail, so simply ignoring its
return value seems a bit odd at least.
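
One middle ground between ignoring the result and panicking is to check it and fall back to smaller allocations. A minimal userspace sketch of that pattern follows, assuming malloc() in place of memblock; struct bulk_buf, bulk_buf_init() and chunk_alloc() are hypothetical names and this is not the code in mm/sparse.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct bulk_buf {
	char *base;	/* start of the big allocation, or NULL if it failed */
	char *next;	/* next unused byte */
	char *end;	/* one past the last byte */
};

/* Try to reserve one large buffer; on failure leave the buffer empty. */
static void bulk_buf_init(struct bulk_buf *b, size_t size, size_t limit)
{
	b->base = (size > 0 && size <= limit) ? malloc(size) : NULL;
	b->next = b->base;
	b->end = b->base ? b->base + size : NULL;
}

/* Carve from the bulk buffer if possible, else allocate individually. */
static void *chunk_alloc(struct bulk_buf *b, size_t size, int *from_bulk)
{
	if (b->base && (size_t)(b->end - b->next) >= size) {
		void *p = b->next;

		b->next += size;
		*from_bulk = 1;
		return p;
	}
	*from_bulk = 0;
	return malloc(size);
}

static void demo_fallback(void)
{
	struct bulk_buf b;
	int from_bulk;
	void *p;

	bulk_buf_init(&b, 4096, 1024);		/* too big: bulk allocation fails */
	p = chunk_alloc(&b, 64, &from_bulk);
	assert(p != NULL && from_bulk == 0);	/* served by the fallback */
	free(p);

	bulk_buf_init(&b, 512, 1024);		/* fits under the limit */
	p = chunk_alloc(&b, 64, &from_bulk);
	assert(p == b.base && from_bulk == 1);	/* served from the bulk buffer */
	free(b.base);
}
```

Here the return value is checked, the end pointer is only computed when the base pointer is valid, and a failed bulk allocation costs performance rather than the boot.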
 
> Found in total three problematic hunks in that patch:
> 
> @@ -48,6 +53,11 @@ static phys_addr_t __init kasan_alloc_raw_page(int node)
>  	void *p = memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
>  						__pa(MAX_DMA_ADDRESS),
>  						MEMBLOCK_ALLOC_KASAN, node);
> +	if (!p)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%llx\n",
> +		      __func__, PAGE_SIZE, PAGE_SIZE, node,
> +		      __pa(MAX_DMA_ADDRESS));
> +
>  	return __pa(p);
>  }
> 
> @@ -211,6 +211,9 @@ static int __init iob_init(struct device_node *dn)
>  	iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
>  					MEMBLOCK_LOW_LIMIT, 0x80000000,
>  					NUMA_NO_NODE);
> +	if (!iob_l2_base)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx max_addr=%x\n",
> +		      __func__, 1UL << 21, 1UL << 21, 0x80000000);
> 
>  	pr_info("IOBMAP L2 allocated at: %p\n", iob_l2_base);
> 
> 
> @@ -425,6 +436,10 @@ static void __init sparse_buffer_init(unsigned long size, int nid)
>  		memblock_alloc_try_nid_raw(size, PAGE_SIZE,
>  						__pa(MAX_DMA_ADDRESS),
>  						MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> +	if (!sparsemap_buf)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
> +		      __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
> +
>  	sparsemap_buf_end = sparsemap_buf + size;
>  }
> 
> 
> 
> Christophe
> 

-- 
Sincerely yours,
Mike.


> >     Cc: Matt Turner <mattst88@gmail.com>
> >     Cc: Max Filippov <jcmvbkbc@gmail.com>
> >     Cc: Michael Ellerman <mpe@ellerman.id.au>
> >     Cc: Michal Simek <monstr@monstr.eu>
> >     Cc: Petr Mladek <pmladek@suse.com>
> >     Cc: Richard Weinberger <richard@nod.at>
> >     Cc: Rich Felker <dalias@libc.org>
> >     Cc: Rob Herring <robh+dt@kernel.org>
> >     Cc: Rob Herring <robh@kernel.org>
> >     Cc: Russell King <linux@armlinux.org.uk>
> >     Cc: Stafford Horne <shorne@gmail.com>
> >     Cc: Tony Luck <tony.luck@intel.com>
> >     Cc: Vineet Gupta <vgupta@synopsys.com>
> >     Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
> >     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> >
> >Which is just adding the panic we hit.  So, presumably, the bug is in a
> >preceding patch :-(
> >
> >I have left the kernel not booting for today.
> >
> 
> No I think the error is really in that patch, see my other mail.
> 
> See https://elixir.bootlin.com/linux/v5.0-rc4/source/mm/memblock.c#L1455,
> memblock_alloc_try_nid_raw() is not supposed to panic, so the last hunk of
> this patch should be reverted.

It is not supposed to panic, but it can still fail, so simply ignoring its
return value seems a bit odd, to say the least.
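
To illustrate the point, here is a small userspace sketch. The names
struct pool, pool_alloc_raw() and pool_get_buffer() are made up for this
example and are not kernel APIs. It only shows that an allocator which
never panics can still return NULL, so the caller has to look at the
result before using it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an early boot memory pool; not a kernel API. */
struct pool {
	unsigned char *base;	/* start of the backing storage */
	size_t size;		/* total bytes in the pool */
	size_t used;		/* bytes already handed out */
};

/*
 * Behaves like a "raw" allocator: it never panics and returns NULL when
 * the request does not fit. Alignment handling is left out to keep the
 * sketch short.
 */
static void *pool_alloc_raw(struct pool *p, size_t size)
{
	assert(p && p->used <= p->size);

	if (size > p->size - p->used)
		return NULL;

	void *ptr = p->base + p->used;
	p->used += size;
	return ptr;
}

/*
 * A caller that checks the result: returns 0 on success and -1 if the
 * allocation failed, leaving *out untouched in that case.
 */
static int pool_get_buffer(struct pool *p, size_t size, void **out)
{
	void *buf = pool_alloc_raw(p, size);

	if (!buf)
		return -1;	/* the caller decides: fall back, warn or panic */

	*out = buf;
	return 0;
}
```

Whether the right reaction to -1 is a fallback, a warning or a panic
depends on the call site, which is what the three hunks below are about.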
 
> Found in total three problematic hunks in that patch:
> 
> @@ -48,6 +53,11 @@ static phys_addr_t __init kasan_alloc_raw_page(int node)
>  	void *p = memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
>  						__pa(MAX_DMA_ADDRESS),
>  						MEMBLOCK_ALLOC_KASAN, node);
> +	if (!p)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%llx\n",
> +		      __func__, PAGE_SIZE, PAGE_SIZE, node,
> +		      __pa(MAX_DMA_ADDRESS));
> +
>  	return __pa(p);
>  }
> 
> @@ -211,6 +211,9 @@ static int __init iob_init(struct device_node *dn)
>  	iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
>  					MEMBLOCK_LOW_LIMIT, 0x80000000,
>  					NUMA_NO_NODE);
> +	if (!iob_l2_base)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx max_addr=%x\n",
> +		      __func__, 1UL << 21, 1UL << 21, 0x80000000);
> 
>  	pr_info("IOBMAP L2 allocated at: %p\n", iob_l2_base);
> 
> 
> @@ -425,6 +436,10 @@ static void __init sparse_buffer_init(unsigned long
> size, int nid)
>  		memblock_alloc_try_nid_raw(size, PAGE_SIZE,
>  						__pa(MAX_DMA_ADDRESS),
>  						MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> +	if (!sparsemap_buf)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
> +		      __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
> +
>  	sparsemap_buf_end = sparsemap_buf + size;
>  }
> 
> 
> 
> Christophe
> 

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: linux-next: powerpc le qemu boot failure after merge of the akpm tree
  2019-01-31  6:39       ` Mike Rapoport
@ 2019-01-31  7:13         ` Stephen Rothwell
  -1 siblings, 0 replies; 17+ messages in thread
From: Stephen Rothwell @ 2019-01-31  7:13 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Christophe Leroy, Andrew Morton, Linux Next Mailing List,
	Linux Kernel Mailing List, Michael Ellerman,
	Benjamin Herrenschmidt, PowerPC

[-- Attachment #1: Type: text/plain, Size: 1660 bytes --]

Hi Mike,

On Thu, 31 Jan 2019 08:39:30 +0200 Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> On Thu, Jan 31, 2019 at 07:15:26AM +0100, Christophe Leroy wrote:
> > 
> > Le 31/01/2019 à 07:06, Stephen Rothwell a écrit :  
> > >>My qemu boot of the powerpc pseries_le_defconfig config failed like this:
> > >>
> > >>htab_hash_mask    = 0x1ffff
> > >>-----------------------------------------------------
> > >>numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
> > >>Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff  
> 
> This means that sparse_buffer_init tries to allocate 2G for the sparsemap_buf...
> 
> Stephen, how many memory do you give to your VM?

Exactly 2G.

     qemu-system-ppc64 -M pseries -m 2G ....
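
As a quick check of the numbers (a standalone sketch; the macro and
helper names are made up for illustration): the 2147483648 bytes in the
panic message are exactly the 2G given to the guest, so a single request
of that size cannot fit as soon as any memory at all is already in use.

```c
#include <assert.h>
#include <stdint.h>

/* Size reported in the panic message, in bytes. */
#define REQUESTED_BYTES 2147483648ULL
/* Guest RAM given to qemu with "-m 2G", in bytes. */
#define GUEST_RAM_BYTES (2ULL << 30)

/*
 * Returns 1 if one request of 'req' bytes could fit into 'ram' bytes of
 * memory of which 'reserved' bytes are already in use, 0 otherwise.
 */
static int request_can_fit(uint64_t req, uint64_t ram, uint64_t reserved)
{
	assert(reserved <= ram);
	return req <= ram - reserved;
}
```

With reserved set to anything other than zero, request_can_fit() returns
0 for these two values.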

The boot normally continues like this:

    rfi-flush: fallback displacement flush available
    count-cache-flush: software flush disabled.
    stf-barrier: hwsync barrier available
    PCI host bridge /pci@800000020000000  ranges:
      IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
     MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000
     MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000
    PPC64 nvram contains 65536 bytes
    barrier-nospec: using ORI speculation barrier
    Zone ranges:
      Normal   [mem 0x0000000000000000-0x000000007fffffff]
    Movable zone start for each node
    Early memory node ranges
      node   0: [mem 0x0000000000000000-0x000000007fffffff]
    Initmem setup node 0 [mem 0x0000000000000000-0x000000007fffffff]

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: linux-next: powerpc le qemu boot failure after merge of the akpm tree
  2019-01-31  6:15     ` Christophe Leroy
  (?)
@ 2019-01-31  7:40       ` Mike Rapoport
  -1 siblings, 0 replies; 17+ messages in thread
From: Mike Rapoport @ 2019-01-31  7:40 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Stephen Rothwell, Andrew Morton, Linux Next Mailing List,
	Linux Kernel Mailing List, Michael Ellerman,
	Benjamin Herrenschmidt, PowerPC, Andrey Konovalov

(added Andrey Konovalov)

On Thu, Jan 31, 2019 at 07:15:26AM +0100, Christophe Leroy wrote:
> 
> Le 31/01/2019 à 07:06, Stephen Rothwell a écrit :
> >Hi all,
> >
> >On Thu, 31 Jan 2019 16:38:54 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> >>
> >>[I am guessing that is is something in Andrew's tree that has caused
> >>this.]
> >>
> >>My qemu boot of the powerpc pseries_le_defconfig config failed like this:
> >>
> >>htab_hash_mask    = 0x1ffff
> >>-----------------------------------------------------
> >>numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
> >>Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff
> >>CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc4 #2
> >>Call Trace:
> >>[c00000000105bbd0] [c000000000b1345c] dump_stack+0xb0/0xf4 (unreliable)
> >>[c00000000105bc10] [c000000000111120] panic+0x168/0x3b8
> >>[c00000000105bcb0] [c000000000e701c8] sparse_init_nid+0x178/0x550
> >>[c00000000105bd70] [c000000000e709b4] sparse_init+0x210/0x238
> >>[c00000000105bdb0] [c000000000e468f4] initmem_init+0x1e0/0x260
> >>[c00000000105be80] [c000000000e3b9b0] setup_arch+0x354/0x3d4
> >>[c00000000105bef0] [c000000000e33afc] start_kernel+0x98/0x648
> >>[c00000000105bf90] [c00000000000b270] start_here_common+0x1c/0x52c
> >
> >A quick bisect leads to this:
> >
> >1c3c9328cde027eb875ba4692f0a5d66b0afe862 is the first bad commit
> >commit 1c3c9328cde027eb875ba4692f0a5d66b0afe862
> >Author: Mike Rapoport <rppt@linux.ibm.com>
> >Date:   Thu Jan 31 10:51:32 2019 +1100
> >
> >     treewide: add checks for the return value of memblock_alloc*()
> >     Add check for the return value of memblock_alloc*() functions and call
> >     panic() in case of error.  The panic message repeats the one used by
> >     panicing memblock allocators with adjustment of parameters to include only
> >     relevant ones.
> >
> >Which is just adding the panic we hit.  So, presumably, the bug is in a
> >preceding patch :-(
> >
> >I have left the kernel not booting for today.
> >
> 
> No I think the error is really in that patch, see my other mail.
> 
> See https://elixir.bootlin.com/linux/v5.0-rc4/source/mm/memblock.c#L1455,
> memblock_alloc_try_nid_raw() is not supposed to panic, so the last hunk of
> this patch should be reverted.
> 
> Found in total three problematic hunks in that patch:
> 
> @@ -48,6 +53,11 @@ static phys_addr_t __init kasan_alloc_raw_page(int node)
>  	void *p = memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
>  						__pa(MAX_DMA_ADDRESS),
>  						MEMBLOCK_ALLOC_KASAN, node);
> +	if (!p)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%llx\n",
> +		      __func__, PAGE_SIZE, PAGE_SIZE, node,
> +		      __pa(MAX_DMA_ADDRESS));
> +
>  	return __pa(p);
>  }
 
I've looked more closely at the code that uses this function, and it does
not seem to handle allocation errors.
I can replace the panic() with WARN(), but I think that panic() here is
appropriate.

Andrey, can you comment?


> @@ -211,6 +211,9 @@ static int __init iob_init(struct device_node *dn)
>  	iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
>  					MEMBLOCK_LOW_LIMIT, 0x80000000,
>  					NUMA_NO_NODE);
> +	if (!iob_l2_base)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx max_addr=%x\n",
> +		      __func__, 1UL << 21, 1UL << 21, 0x80000000);
> 
>  	pr_info("IOBMAP L2 allocated at: %p\n", iob_l2_base);
 
This one actually fixes my own mistake from one of the previous patches,
which converted memblock_alloc_base() to memblock_alloc_try_nid_raw() without
adding the panic() (commit 47e382eb08cfa0199c4ea9f9cc73f1b48a3a4b1d
"powerpc: prefer memblock APIs returning virtual address").
 
> @@ -425,6 +436,10 @@ static void __init sparse_buffer_init(unsigned long
> size, int nid)
>  		memblock_alloc_try_nid_raw(size, PAGE_SIZE,
>  						__pa(MAX_DMA_ADDRESS),
>  						MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> +	if (!sparsemap_buf)
> +		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
> +		      __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
> +
>  	sparsemap_buf_end = sparsemap_buf + size;
>  }
 
This hunk was not needed as sparse can deal with this allocation failure.
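
A rough userspace sketch of that kind of fallback follows. All names in
it (struct bump_buf, bump_init(), bump_alloc(), get_map()) are invented
for illustration; this is not the mm/sparse.c code. The idea is that
when the one big buffer cannot be allocated, each consumer simply gets
NULL from the buffer and allocates its own piece instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bump buffer; 'cur' stays NULL if the big allocation failed. */
struct bump_buf {
	unsigned char *cur;
	unsigned char *end;
};

/* Set up the buffer from 'mem'. A NULL 'mem' leaves the buffer empty. */
static void bump_init(struct bump_buf *b, void *mem, size_t size)
{
	b->cur = mem;
	b->end = mem ? (unsigned char *)mem + size : NULL;
}

/* Carve 'size' bytes out of the buffer, or return NULL if that fails. */
static void *bump_alloc(struct bump_buf *b, size_t size)
{
	if (!b->cur || size > (size_t)(b->end - b->cur))
		return NULL;

	void *ptr = b->cur;
	b->cur += size;
	return ptr;
}

/*
 * Consumer: prefer the shared buffer, fall back to a separate allocation.
 * '*from_buf' reports which path was taken, so the caller knows whether
 * the result has to be released with free().
 */
static void *get_map(struct bump_buf *b, size_t size, int *from_buf)
{
	void *ptr = bump_alloc(b, size);

	*from_buf = ptr != NULL;
	return ptr ? ptr : malloc(size);
}
```

In this sketch a failed big allocation costs only the batching, not
correctness, which is why a panic at the buffer setup step is not needed.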

Andrew, can you please add the patch below as a fixup to "treewide: add
checks for the return value of memblock_alloc*()"?
 
From 854f54b9d4fe52f477765b905a4b2c421d30f46e Mon Sep 17 00:00:00 2001
From: Mike Rapoport <rppt@linux.ibm.com>
Date: Thu, 31 Jan 2019 09:18:50 +0200
Subject: [PATCH] mm/sparse: don't panic if the allocation in
 sparse_buffer_init fails

Adding a panic() when the memblock_alloc_try_nid_raw() call in
sparse_buffer_init() fails was overenthusiastic, as the system is perfectly
capable of dealing with that allocation failure.
Remove the panic().

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 mm/sparse.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/mm/sparse.c b/mm/sparse.c
index 1471f06..c11aba0 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -434,10 +434,6 @@ static void __init sparse_buffer_init(unsigned long size, int nid)
 		memblock_alloc_try_nid_raw(size, PAGE_SIZE,
 						__pa(MAX_DMA_ADDRESS),
 						MEMBLOCK_ALLOC_ACCESSIBLE, nid);
-	if (!sparsemap_buf)
-		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
-		      __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
-
 	sparsemap_buf_end = sparsemap_buf + size;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: linux-next: powerpc le qemu boot failure after merge of the akpm tree
  2019-01-31  7:40       ` Mike Rapoport
@ 2019-01-31  8:35         ` Stephen Rothwell
  -1 siblings, 0 replies; 17+ messages in thread
From: Stephen Rothwell @ 2019-01-31  8:35 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Christophe Leroy, Andrew Morton, Linux Next Mailing List,
	Linux Kernel Mailing List, Michael Ellerman,
	Benjamin Herrenschmidt, PowerPC, Andrey Konovalov

[-- Attachment #1: Type: text/plain, Size: 638 bytes --]

Hi Mike,

On Thu, 31 Jan 2019 09:40:18 +0200 Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> Andrew, can you please add the below patch to as a fixup to "treewide: add
> checks for the return value of memblock_alloc*()"?

I have added that to linux-next for tomorrow (in case Andrew doesn't
get to it).

> From 854f54b9d4fe52f477765b905a4b2c421d30f46e Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <rppt@linux.ibm.com>
> Date: Thu, 31 Jan 2019 09:18:50 +0200
> Subject: [PATCH] mm/sparse: don't panic if the allocation in
>  sparse_buffer_init fails

Thanks all for the quick resolution.

-- 
Cheers,
Stephen Rothwell

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: linux-next: powerpc le qemu boot failure after merge of the akpm tree
  2019-01-31  7:40       ` Mike Rapoport
@ 2019-01-31 13:50         ` Andrey Konovalov
  -1 siblings, 0 replies; 17+ messages in thread
From: Andrey Konovalov @ 2019-01-31 13:50 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Christophe Leroy, Stephen Rothwell, Andrew Morton,
	Linux Next Mailing List, Linux Kernel Mailing List,
	Michael Ellerman, Benjamin Herrenschmidt, PowerPC,
	Andrey Ryabinin

On Thu, Jan 31, 2019 at 8:40 AM Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> (added Andrey Konovalov)
>
> On Thu, Jan 31, 2019 at 07:15:26AM +0100, Christophe Leroy wrote:
> >
> > Le 31/01/2019 à 07:06, Stephen Rothwell a écrit :
> > >Hi all,
> > >
> > >On Thu, 31 Jan 2019 16:38:54 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> > >>
> > >>[I am guessing that is is something in Andrew's tree that has caused
> > >>this.]
> > >>
> > >>My qemu boot of the powerpc pseries_le_defconfig config failed like this:
> > >>
> > >>htab_hash_mask    = 0x1ffff
> > >>-----------------------------------------------------
> > >>numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
> > >>Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff
> > >>CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc4 #2
> > >>Call Trace:
> > >>[c00000000105bbd0] [c000000000b1345c] dump_stack+0xb0/0xf4 (unreliable)
> > >>[c00000000105bc10] [c000000000111120] panic+0x168/0x3b8
> > >>[c00000000105bcb0] [c000000000e701c8] sparse_init_nid+0x178/0x550
> > >>[c00000000105bd70] [c000000000e709b4] sparse_init+0x210/0x238
> > >>[c00000000105bdb0] [c000000000e468f4] initmem_init+0x1e0/0x260
> > >>[c00000000105be80] [c000000000e3b9b0] setup_arch+0x354/0x3d4
> > >>[c00000000105bef0] [c000000000e33afc] start_kernel+0x98/0x648
> > >>[c00000000105bf90] [c00000000000b270] start_here_common+0x1c/0x52c
> > >
> > >A quick bisect leads to this:
> > >
> > >1c3c9328cde027eb875ba4692f0a5d66b0afe862 is the first bad commit
> > >commit 1c3c9328cde027eb875ba4692f0a5d66b0afe862
> > >Author: Mike Rapoport <rppt@linux.ibm.com>
> > >Date:   Thu Jan 31 10:51:32 2019 +1100
> > >
> > >     treewide: add checks for the return value of memblock_alloc*()
> > >     Add check for the return value of memblock_alloc*() functions and call
> > >     panic() in case of error.  The panic message repeats the one used by
> > >     panicing memblock allocators with adjustment of parameters to include only
> > >     relevant ones.
> > >
> > >Which is just adding the panic we hit.  So, presumably, the bug is in a
> > >preceding patch :-(
> > >
> > >I have left the kernel not booting for today.
> > >
> >
> > No, I think the error really is in that patch; see my other mail.
> >
> > See https://elixir.bootlin.com/linux/v5.0-rc4/source/mm/memblock.c#L1455,
> > memblock_alloc_try_nid_raw() is not supposed to panic, so the last hunk of
> > this patch should be reverted.
> >
> > I found three problematic hunks in total in that patch:
> >
> > @@ -48,6 +53,11 @@ static phys_addr_t __init kasan_alloc_raw_page(int node)
> >       void *p = memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
> >                                               __pa(MAX_DMA_ADDRESS),
> >                                               MEMBLOCK_ALLOC_KASAN, node);
> > +     if (!p)
> > +             panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%llx\n",
> > +                   __func__, PAGE_SIZE, PAGE_SIZE, node,
> > +                   __pa(MAX_DMA_ADDRESS));
> > +
> >       return __pa(p);
> >  }
>
> I've looked more closely at the code that uses this function, and it does
> not seem to handle allocation errors.
> I can replace the panic with WARN(), but I think that panic() here is
> appropriate.
>
> Andrey, can you comment?

+ Andrey Ryabinin

I think panic() there looks appropriate. Added Andrey Ryabinin to take a look.
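
For illustration, here is a minimal user-space sketch of why an unchecked NULL
is a problem for a function that ends in "return __pa(p)". It assumes a plain
linear mapping where the physical address is the virtual address minus a fixed
offset; MODEL_PAGE_OFFSET, model_pa() and model_pa_is_sane() are hypothetical
names made up for this model and are not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical linear-map offset, chosen only for this model. */
#define MODEL_PAGE_OFFSET 0xc000000000000000ULL

/* Simplified stand-in for __pa(): virtual address minus a fixed offset. */
uint64_t model_pa(const void *virt)
{
	return (uint64_t)(uintptr_t)virt - MODEL_PAGE_OFFSET;
}

/* Nonzero if the translated address falls inside ram_bytes of memory. */
int model_pa_is_sane(const void *virt, uint64_t ram_bytes)
{
	return model_pa(virt) < ram_bytes;
}
```

In this model a NULL pointer wraps around to a large bogus "physical" address
instead of failing visibly, so a caller that cannot handle the error would
carry on with garbage. Stopping at the allocation site avoids that.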

>
>
> > @@ -211,6 +211,9 @@ static int __init iob_init(struct device_node *dn)
> >       iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
> >                                       MEMBLOCK_LOW_LIMIT, 0x80000000,
> >                                       NUMA_NO_NODE);
> > +     if (!iob_l2_base)
> > +             panic("%s: Failed to allocate %lu bytes align=0x%lx max_addr=%x\n",
> > +                   __func__, 1UL << 21, 1UL << 21, 0x80000000);
> >
> >       pr_info("IOBMAP L2 allocated at: %p\n", iob_l2_base);
>
> This one actually fixes my own mistake from one of the previous patches
> that converted memblock_alloc_base() to memblock_alloc_try_nid_raw() without
> adding the panic() (commit 47e382eb08cfa0199c4ea9f9cc73f1b48a3a4b1d
> "powerpc: prefer memblock APIs returning virtual address")
>
> > @@ -425,6 +436,10 @@ static void __init sparse_buffer_init(unsigned long size, int nid)
> >               memblock_alloc_try_nid_raw(size, PAGE_SIZE,
> >                                               __pa(MAX_DMA_ADDRESS),
> >                                               MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> > +     if (!sparsemap_buf)
> > +             panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
> > +                   __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
> > +
> >       sparsemap_buf_end = sparsemap_buf + size;
> >  }
>
> This hunk was not needed as sparse can deal with this allocation failure.
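
A rough user-space model of the "can deal with this allocation failure"
pattern, for readers who want to see the shape of it. This is not the code in
mm/sparse.c: the model_* names are hypothetical, malloc() stands in for the
memblock allocator, and the assumption is that whoever consumes the buffer
falls back to a separate allocation when the buffer is missing or exhausted.

```c
#include <stddef.h>
#include <stdlib.h>

static char *model_buf;     /* current position in the preallocated buffer, or NULL */
static char *model_buf_end; /* one past the end of the buffer, or NULL */

/* Try to preallocate; a failed allocation simply leaves the buffer empty. */
void model_buffer_init(size_t size, int simulate_failure)
{
	model_buf = simulate_failure ? NULL : malloc(size);
	model_buf_end = model_buf ? model_buf + size : NULL;
}

/* Carve a chunk out of the buffer; NULL means "unavailable or exhausted". */
void *model_buffer_alloc(size_t size)
{
	void *ptr = NULL;

	if (model_buf && size <= (size_t)(model_buf_end - model_buf)) {
		ptr = model_buf;
		model_buf += size;
	}
	return ptr;
}

/* Caller side: use the buffer when possible, otherwise allocate separately. */
void *model_alloc_section(size_t size, int *used_fallback)
{
	void *ptr = model_buffer_alloc(size);

	*used_fallback = (ptr == NULL);
	return ptr ? ptr : malloc(size);
}
```

With model_buffer_init(4096, 1) the preallocation "fails", yet
model_alloc_section(64, &fb) still returns memory and sets fb to 1. A panic at
init time would turn that recoverable case into a boot failure.
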
>
> Andrew, can you please add the patch below as a fixup to "treewide: add
> checks for the return value of memblock_alloc*()"?
>
> From 854f54b9d4fe52f477765b905a4b2c421d30f46e Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <rppt@linux.ibm.com>
> Date: Thu, 31 Jan 2019 09:18:50 +0200
> Subject: [PATCH] mm/sparse: don't panic if the allocation in
>  sparse_buffer_init fails
>
> Addition of panic if memblock_alloc_try_nid_raw() call in
> sparse_buffer_init() fails was over enthusiastic as the system is perfectly
> capable to deal with that allocation failure.
> Remove the panic().
>
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
> ---
>  mm/sparse.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 1471f06..c11aba0 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -434,10 +434,6 @@ static void __init sparse_buffer_init(unsigned long size, int nid)
>                 memblock_alloc_try_nid_raw(size, PAGE_SIZE,
>                                                 __pa(MAX_DMA_ADDRESS),
>                                                 MEMBLOCK_ALLOC_ACCESSIBLE, nid);
> -       if (!sparsemap_buf)
> -               panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
> -                     __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
> -
>         sparsemap_buf_end = sparsemap_buf + size;
>  }
>
> --
> 2.7.4
>

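As a side note on the numbers in the quoted panic line, the small sketch below
decodes them. The constants are copied from the log; the helper names are
hypothetical. The only inference, and it is an assumption the log does not
prove, is that NODE_DATA sits near the top of the node's memory, which would
put the guest at roughly 2 GiB of RAM.

```c
#include <stdint.h>

/* Values copied from the panic line and the preceding "numa:" line. */
#define REQUEST_BYTES 2147483648ULL /* "Failed to allocate 2147483648 bytes" */
#define REQUEST_ALIGN 0x10000ULL    /* "align=0x10000" */
#define NODE_DATA_END 0x7ffebfffULL /* last byte of the NODE_DATA range */

/* Convert a byte count to whole KiB. */
uint64_t bytes_to_kib(uint64_t bytes)
{
	return bytes >> 10;
}

/* Convert a byte count to whole GiB. */
uint64_t bytes_to_gib(uint64_t bytes)
{
	return bytes >> 30;
}

/* Nonzero if the request is larger than all memory up to last_byte. */
int request_exceeds(uint64_t request, uint64_t last_byte)
{
	return request > last_byte + 1;
}
```

The request is 2 GiB with 64 KiB alignment, which is more than all memory up to
the end of the NODE_DATA range. Under the assumption above, such a request could
not succeed on this guest, which fits the view that the allocation was expected
to be allowed to fail.
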


Thread overview: 17+ messages
2019-01-31  5:38 linux-next: powerpcle qemu boot failure after merge of the akpm tree Stephen Rothwell
2019-01-31  6:06 ` linux-next: powerpc le " Stephen Rothwell
2019-01-31  6:15   ` Christophe Leroy
2019-01-31  6:39     ` Mike Rapoport
2019-01-31  7:13       ` Stephen Rothwell
2019-01-31  7:40     ` Mike Rapoport
2019-01-31  8:35       ` Stephen Rothwell
2019-01-31 13:50       ` Andrey Konovalov
