linuxppc-dev.lists.ozlabs.org archive mirror
* Re: [Bug 204789] New: Boot failure with more than 256G of memory
       [not found] <bug-204789-27@https.bugzilla.kernel.org/>
@ 2019-09-11 14:31 ` Andrew Morton
  2019-09-11 15:34   ` Cameron Berkenpas
  2019-09-13  4:53   ` Aneesh Kumar K.V
  0 siblings, 2 replies; 9+ messages in thread
From: Andrew Morton @ 2019-09-11 14:31 UTC (permalink / raw)
  To: cam; +Cc: bugzilla-daemon, linuxppc-dev

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sun, 08 Sep 2019 00:04:26 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=204789
> 
>             Bug ID: 204789
>            Summary: Boot failure with more than 256G of memory
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 5.2.x
>           Hardware: PPC-64
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Other
>           Assignee: akpm@linux-foundation.org
>           Reporter: cam@neo-zeon.de
>         Regression: No

"Yes" :)

> Kernel series 5.2.x will not boot on my Talos II workstation with dual POWER9
> 18 core processors and 512G of physical memory with disable_radix=yes and 4k
> pages.
> 
> 5.3-rc6 did not work either.
> 
> 5.1 and earlier boot fine. 

Thanks.  It's probably best to report this on the powerpc list, cc'ed here.

> I can get the system to boot IF I leave the Radix MMU enabled or if I boot a
> kernel with 64k pages. I haven't yet tested enabling the Radix MMU with 64k
> pages at the same time, but I suspect this would work. This is a system I
> cannot take down TOO frequently.
> 
> The system will also boot with the Radix MMU disabled and 4k pages with 256G or
> less memory. Setting mem on the kernel CLI to 256G or less results in a
> successful boot. With mem=257G or higher, the Radix MMU disabled, and 4k pages,
> the kernel will not boot.
> 
> Petitboot comes up, but the system fails VERY early in boot in the serial
> console with:
> SIGTERM received, booting...
> [   23.838858] kexec_core: Starting new kernel
> 
> Early printk is enabled, and it never progresses any further.
> 
> 5.1 boots just fine with the Radix MMU disabled and 4k pages.
> 
> Unfortunately, I currently need 4k pages for bcache to work, and Radix MMU
> disabled in order for FreeBSD 12.x to work under KVM so I'm sticking with
> 5.1.21 for now.
> 
> I have been unable to reproduce this issue in KVM.
> 
> Here are my PCIe peripherals:
> 1. Microsemi/Adaptec HBA 1100-4i SAS controller
> 2. Megaraid 9316-16i SAS RAID controller.
> 
> I've only tried little endian as this is a little endian install.
> 
> -- 
> You are receiving this mail because:
> You are the assignee for the bug.
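
For readers who want to relate the mem=256G and mem=257G values in the report
to byte counts, here is a minimal sketch in C. It is loosely modeled on what
the kernel's memparse() helper does with size suffixes, but it is not the
kernel's code; the function name parse_mem_size is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Illustrative only: convert a size string such as "256G" into bytes.
 * Recognizes the binary suffixes K, M, G and T (either case).
 */
static inline uint64_t parse_mem_size(const char *s)
{
    char *end;
    uint64_t value = strtoull(s, &end, 0);

    switch (*end) {
    case 'T': case 't':
        value <<= 10; /* fall through */
    case 'G': case 'g':
        value <<= 10; /* fall through */
    case 'M': case 'm':
        value <<= 10; /* fall through */
    case 'K': case 'k':
        value <<= 10;
        break;
    default:
        break;
    }
    return value;
}
```

Under this sketch, "256G" comes out as 2^38 bytes, and "257G" is 2^30 bytes
more than that.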


* Re: [Bug 204789] New: Boot failure with more than 256G of memory
  2019-09-11 14:31 ` [Bug 204789] New: Boot failure with more than 256G of memory Andrew Morton
@ 2019-09-11 15:34   ` Cameron Berkenpas
  2019-09-13  4:53   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 9+ messages in thread
From: Cameron Berkenpas @ 2019-09-11 15:34 UTC (permalink / raw)
  To: Andrew Morton; +Cc: bugzilla-daemon, linuxppc-dev

Hello,

Regression set to "yes". Not sure how I missed that. :)

Will report future PPC issues that I come across to this list as well.

Thanks!

-Cameron


* Re: [Bug 204789] New: Boot failure with more than 256G of memory
  2019-09-11 14:31 ` [Bug 204789] New: Boot failure with more than 256G of memory Andrew Morton
  2019-09-11 15:34   ` Cameron Berkenpas
@ 2019-09-13  4:53   ` Aneesh Kumar K.V
  2019-09-13 14:21     ` Aneesh Kumar K.V
  1 sibling, 1 reply; 9+ messages in thread
From: Aneesh Kumar K.V @ 2019-09-13  4:53 UTC (permalink / raw)
  To: Andrew Morton, cam; +Cc: bugzilla-daemon, linuxppc-dev

Will you be able to bisect this? I tried 4K PAGESIZE on P8 with the upstream
kernel and I can't recreate the issue.

[root@ltc ~]# free -g
              total        used        free      shared  buff/cache   available
Mem:            495           0         494           0           0         493
Swap:             0           0           0
[root@ltc ~]# getconf PAGESIZE
4096
[root@ltc ~]# grep Hash /proc/cpuinfo 
MMU             : Hash

I will see if I can get a P9 system with largemem

-aneesh
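
The two checks in the shell session above (page size and MMU type) can also be
done from a program. The following is a minimal sketch in C, assuming a
/proc/cpuinfo-style text buffer that contains an "MMU" line like the one shown
above; the function names runtime_page_size and cpuinfo_mmu_type are
hypothetical, and only sysconf(_SC_PAGESIZE) is a standard POSIX call.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Runtime page size in bytes, the value `getconf PAGESIZE` reports. */
static inline long runtime_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/*
 * Copy the value of the first line starting with "MMU" in a
 * /proc/cpuinfo-style buffer into out (for example "Hash").
 * Returns 0 on success, -1 if no such line is found.
 */
static inline int cpuinfo_mmu_type(const char *text, char *out, size_t outlen)
{
    const char *line = text;

    if (outlen == 0)
        return -1;

    while (line && *line) {
        const char *eol = strchr(line, '\n');
        size_t linelen = eol ? (size_t)(eol - line) : strlen(line);

        if (linelen >= 3 && strncmp(line, "MMU", 3) == 0) {
            const char *colon = memchr(line, ':', linelen);

            if (colon) {
                const char *val = colon + 1;
                const char *end = line + linelen;
                size_t n;

                /* Trim blanks around the value. */
                while (val < end && (*val == ' ' || *val == '\t'))
                    val++;
                while (end > val && (end[-1] == ' ' || end[-1] == '\t' ||
                                     end[-1] == '\r'))
                    end--;

                n = (size_t)(end - val);
                if (n >= outlen)
                    n = outlen - 1;
                memcpy(out, val, n);
                out[n] = '\0';
                return 0;
            }
        }
        line = eol ? eol + 1 : NULL;
    }
    return -1;
}
```

A caller would read /proc/cpuinfo into a buffer, pass it to cpuinfo_mmu_type(),
and compare the result against "Hash" to confirm the hash MMU is in use.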


* Re: [Bug 204789] New: Boot failure with more than 256G of memory
  2019-09-13  4:53   ` Aneesh Kumar K.V
@ 2019-09-13 14:21     ` Aneesh Kumar K.V
  2019-09-13 15:05       ` Cameron Berkenpas
  0 siblings, 1 reply; 9+ messages in thread
From: Aneesh Kumar K.V @ 2019-09-13 14:21 UTC (permalink / raw)
  To: Andrew Morton, cam; +Cc: bugzilla-daemon, linuxppc-dev

I was able to recreate this on a system that has memory above the 16TB
address. I guess your P9 system's memory layout is also like that.

Can you try this patch? It doesn't really fix the issue, in that it doesn't map
the full 512GB of memory, but it does prevent the kernel crash.

commit ebd05100344765fc3c030f0c257c2f9236fcd1ec
Author: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Date:   Fri Sep 13 19:26:25 2019 +0530

    powerpc/book3s64/hash/4k: 4k supports only 16TB linear mapping
    
    With commit: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the
    same 0xc range"), we now split the 64TB address range into 4 contexts each of
    16TB. That implies we can do only 16TB linear mapping. Make sure we don't
    add physical memory above 16TB if that is present in the system.
    
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index bb3deb76c951..86cce8189240 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -35,12 +35,16 @@ extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
  * memory requirements with large number of sections.
  * 51 bits is the max physical real address on POWER9
  */
-#if defined(CONFIG_SPARSEMEM_VMEMMAP) && defined(CONFIG_SPARSEMEM_EXTREME) &&  \
-	defined(CONFIG_PPC_64K_PAGES)
+
+#if defined(CONFIG_PPC_64K_PAGES)
+#if defined(CONFIG_SPARSEMEM_VMEMMAP) && defined(CONFIG_SPARSEMEM_EXTREME)
 #define MAX_PHYSMEM_BITS 51
 #else
 #define MAX_PHYSMEM_BITS 46
 #endif
+#else /* CONFIG_PPC_64K_PAGES */
+#define MAX_PHYSMEM_BITS 44
+#endif
 
 /* 64-bit classic hash table MMU */
 #include <asm/book3s/64/mmu-hash.h>
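
To make the arithmetic behind the MAX_PHYSMEM_BITS values in the patch above
concrete, here is a small sketch in C. It is not kernel code: the enum and
function names are made up for illustration, and only the bit counts 51, 46
and 44 are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit counts from the patch; the names are illustrative, not kernel symbols. */
enum {
    PHYSMEM_BITS_64K_VMEMMAP = 51, /* 64K pages, SPARSEMEM_VMEMMAP + EXTREME */
    PHYSMEM_BITS_64K_OTHER   = 46, /* 64K pages, other sparsemem configs */
    PHYSMEM_BITS_4K          = 44  /* 4K pages after the patch */
};

/* Number of bytes addressable with the given number of physical bits. */
static inline uint64_t physmem_limit(unsigned int bits)
{
    return UINT64_C(1) << bits;
}

/* True if a physical address lies below the limit for that bit count. */
static inline bool phys_addr_fits(uint64_t addr, unsigned int bits)
{
    return addr < physmem_limit(bits);
}
```

With 44 bits the limit is 2^44 bytes, i.e. 16TB, which is one quarter of the
64TB (2^46 byte) range. That matches the "4 contexts each of 16TB" split
described in the commit message, so any physical address at or above 16TB
falls outside the limit when 4K pages are used.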



* Re: [Bug 204789] New: Boot failure with more than 256G of memory
  2019-09-13 14:21     ` Aneesh Kumar K.V
@ 2019-09-13 15:05       ` Cameron Berkenpas
  2019-09-13 16:13         ` Aneesh Kumar K.V
  0 siblings, 1 reply; 9+ messages in thread
From: Cameron Berkenpas @ 2019-09-13 15:05 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Andrew Morton; +Cc: bugzilla-daemon, linuxppc-dev

Yep, the box comes up now, but with 256G memory as expected.

I'll get back to you on when I'll be able to bisect.

Thanks!


* Re: [Bug 204789] New: Boot failure with more than 256G of memory
  2019-09-13 15:05       ` Cameron Berkenpas
@ 2019-09-13 16:13         ` Aneesh Kumar K.V
  2019-09-13 17:28           ` Cameron Berkenpas
  0 siblings, 1 reply; 9+ messages in thread
From: Aneesh Kumar K.V @ 2019-09-13 16:13 UTC (permalink / raw)
  To: Cameron Berkenpas, Andrew Morton; +Cc: bugzilla-daemon, linuxppc-dev

On 9/13/19 8:35 PM, Cameron Berkenpas wrote:
> Yep, the box comes up now, but with 256G memory as expected.
> 
> I'll get back to you on when I'll be able to bisect.
> 
> Thanks!

I am sure this is due to commit 0034d395f89d ("powerpc/mm/hash64: Map all the
kernel regions in the same 0xc range"). We reduced the linear map range for 4K
page size to 16TB there.


-aneesh


* Re: [Bug 204789] New: Boot failure with more than 256G of memory
  2019-09-13 16:13         ` Aneesh Kumar K.V
@ 2019-09-13 17:28           ` Cameron Berkenpas
  2019-09-18  3:15             ` Aneesh Kumar K.V
  0 siblings, 1 reply; 9+ messages in thread
From: Cameron Berkenpas @ 2019-09-13 17:28 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Andrew Morton; +Cc: bugzilla-daemon, linuxppc-dev

With the kernel I built at 0034d395f89d, the problem is still there.

However, with the kernel I built at the previous commit, a35a3c6f6065, the
system boots.

So it is confirmed that this is due to 0034d395f89d.

Thanks!


* Re: [Bug 204789] New: Boot failure with more than 256G of memory
  2019-09-13 17:28           ` Cameron Berkenpas
@ 2019-09-18  3:15             ` Aneesh Kumar K.V
  2019-09-18 15:38               ` Cameron Berkenpas
  0 siblings, 1 reply; 9+ messages in thread
From: Aneesh Kumar K.V @ 2019-09-18  3:15 UTC (permalink / raw)
  To: Cameron Berkenpas, Andrew Morton; +Cc: bugzilla-daemon, linuxppc-dev

On 9/13/19 10:58 PM, Cameron Berkenpas wrote:
> Running against the kernel I built against 0034d395f89d and the problem 
> is still there.
> 
> However, running against the kernel I built against the previous commit, 
> a35a3c6f6065, and the system boots.
> 
> This being due to 0034d395f89d confirmed.


https://lore.kernel.org/linuxppc-dev/20190917145702.9214-1-aneesh.kumar@linux.ibm.com 


This series should help you.

-aneesh



* Re: [Bug 204789] New: Boot failure with more than 256G of memory
  2019-09-18  3:15             ` Aneesh Kumar K.V
@ 2019-09-18 15:38               ` Cameron Berkenpas
  0 siblings, 0 replies; 9+ messages in thread
From: Cameron Berkenpas @ 2019-09-18 15:38 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Andrew Morton; +Cc: bugzilla-daemon, linuxppc-dev

Hello,

Unfortunately, this patch set has made things quite a bit worse for me.
Appending mem=256G doesn't fix it either. In all cases, the system at
least gets past early boot, and then I will probably get a panic and an
eventual reboot, or occasionally it just locks up entirely.

Here's my very first attempt at booting the kernel where I didn't even 
get a panic:
https://pastebin.com/a3TVZcVB

Here's another attempt where I get a panic:
https://pastebin.com/QsJjyC2v

Finally here's an attempt with mem=256G:
https://pastebin.com/swgLYie9

I don't know that these results are substantially different from each 
other, but perhaps there's something helpful.

Sometimes (but not in any of the above), the host gets to the point where
systemd starts up, but ultimately it seems I get the same stack trace.

At one point, I ended up with a CPU guarded out, but it was simple to 
recover.

-Cameron

