* x86: early boot crash: "alloc_low_page: ran out of memory" (bisected)
@ 2013-08-09 23:18 Dave Hansen
  2013-08-09 23:23 ` Yinghai Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2013-08-09 23:18 UTC (permalink / raw)
  To: Yinghai Lu, x86, LKML, H. Peter Anvin

I'm getting a 100% reproducible panic early in boot:

> [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory

I'm not sure why I didn't run into this until now.  I think there are a
couple of config options that need to get set just right to trigger it,
but CONFIG_DEBUG_PAGEALLOC seems to be the main one.  Full config is here:

	http://sr71.net/~dave/intel/foo/config-bigbox-crash-20130809.txt

I bisected it back to this commit (which I seem to remember causing some
other problems):

> commit 8170e6bed465b4b0c7687f93e9948aca4358a33b
> Author: H. Peter Anvin <hpa@zytor.com>
> Date:   Thu Jan 24 12:19:52 2013 -0800
> 
>     x86, 64bit: Use a #PF handler to materialize early mappings on demand

I need somewhere between 500G and 600G of memory to trigger it, but it
can be triggered using qemu with much less _actual_ RAM than that.  From
looking at the dmesg diffs, I suspect that the delta in memory use
between using 1G and 4k ptes for the identity mapping (DEBUG_PAGEALLOC
forces 4k pages) is the proximate trigger.
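
As a rough illustration of that delta, the sketch below counts the
page-table pages needed below the top-level PGD to map a physical range
starting at 0 with a given leaf size.  It assumes 4-level x86-64 paging
with 512 entries per table.  The helpers div_round_up() and pgt_pages()
and the 600G figure are assumptions chosen for illustration, not kernel
code.

```c
#include <stdint.h>

#define PAGE_SHIFT   12  /* 4k leaf pages */
#define PUD_SHIFT    30  /* 1G leaf pages */
#define PGDIR_SHIFT  39  /* one PGD entry covers 512G */

/* Hypothetical helper: ceil(a / b) for b > 0. */
uint64_t div_round_up(uint64_t a, uint64_t b)
{
	return (a + b - 1) / b;
}

/*
 * Hypothetical helper: number of page-table pages below the PGD needed to
 * map [0, bytes) with leaf entries of size 1 << leaf_shift.  A table whose
 * entries each cover 1 << shift bytes spans 1 << (shift + 9) bytes.
 */
uint64_t pgt_pages(uint64_t bytes, unsigned int leaf_shift)
{
	uint64_t pages = 0;
	unsigned int shift;

	for (shift = leaf_shift; shift < PGDIR_SHIFT; shift += 9)
		pages += div_round_up(bytes, 1ULL << (shift + 9));
	return pages;
}
```

Under these assumptions, pgt_pages(600ULL << 30, PAGE_SHIFT) is 307802
pages (roughly 1.2G of page tables), while
pgt_pages(600ULL << 30, PUD_SHIFT) is 2.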

I also suspect that alloc_low_pages() is buggy in the way it manipulates
min/max_pfn_mapped.  I'm quite baffled how 'max_pfn_mapped' is supposed
to get set up correctly.  Current code says:

	max_pfn_mapped = 0; /* will get exact value next */

but I certainly don't see it getting set later on in that function, or
_ever_ as adding some printk()'s shows:

> +[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> +[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
> +[    0.000000] BRK [0x02086000, 0x02086fff] PGTABLE
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
> +[    0.000000] BRK [0x02087000, 0x02087fff] PGTABLE
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
> +[    0.000000] BRK [0x02088000, 0x02088fff] PGTABLE
> +[    0.000000] init_memory_mapping: [mem 0xf07fe00000-0xf07fffffff]
> +[    0.000000]  [mem 0xf07fe00000-0xf07fffffff] page 4k
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
> +[    0.000000] BRK [0x02089000, 0x02089fff] PGTABLE
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
> +[    0.000000] BRK [0x0208a000, 0x0208afff] PGTABLE
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
> +[    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
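
The min_pfn_mapped value printed above is a page frame number, not an
address.  A minimal sketch of the conversion, assuming 4k base pages
(pfn_to_phys() here is a hypothetical userspace helper, not the kernel's
own macro):

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumes 4k base pages */

/* Hypothetical helper: convert a page frame number to a physical address. */
uint64_t pfn_to_phys(uint64_t pfn)
{
	return pfn << PAGE_SHIFT;
}
```

pfn_to_phys(252182528) is 0xf080000000, which is the first byte past the
range [mem 0xf07fe00000-0xf07fffffff] shown in the log.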

I'll take a closer look at it next week, but figured I'd report it first.

Full dmesg:

> early console in setup code
> [    0.000000] Initializing cgroup subsys cpuset
> [    0.000000] Initializing cgroup subsys cpu
> [    0.000000] Linux version 3.8.0-rc5-00059-g8170e6b (davehans@viggo.jf.intel.com) (gcc version 4.6.3 20120306 (Red Hat 4.6.3-2) (GCC) ) #29 SMP Fri Aug 9 15:56:12 PDT 2013
> [    0.000000] Command line: root=/dev/sda1 console=ttyS0,115200 earlyprintk=ttyS0,115200 debug
> [    0.000000] e820: BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009f3ff] usable
> [    0.000000] BIOS-e820: [mem 0x000000000009f400-0x000000000009ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000dfffbfff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000dfffc000-0x00000000dfffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000929bffffff] usable
> [    0.000000] bootconsole [earlyser0] enabled
> [    0.000000] NX (Execute Disable) protection: active
> [    0.000000] SMBIOS 2.4 present.
> [    0.000000] DMI: Bochs Bochs, BIOS Bochs 01/01/2007
> [    0.000000] e820: update [mem 0x00000000-0x0000ffff] usable ==> reserved
> [    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
> [    0.000000] No AGP bridge found
> [    0.000000] e820: last_pfn = 0x929c000 max_arch_pfn = 0x400000000
> [    0.000000] MTRR default type: write-back
> [    0.000000] MTRR fixed ranges enabled:
> [    0.000000]   00000-9FFFF write-back
> [    0.000000]   A0000-BFFFF uncachable
> [    0.000000]   C0000-FFFFF write-protect
> [    0.000000] MTRR variable ranges enabled:
> [    0.000000]   0 base 00E0000000 mask FFE0000000 uncachable
> [    0.000000]   1 disabled
> [    0.000000]   2 disabled
> [    0.000000]   3 disabled
> [    0.000000]   4 disabled
> [    0.000000]   5 disabled
> [    0.000000]   6 disabled
> [    0.000000]   7 disabled
> [    0.000000] PAT not supported by CPU.
> [    0.000000] e820: last_pfn = 0xdfffc max_arch_pfn = 0x400000000
> [    0.000000] found SMP MP-table at [mem 0x000fdb00-0x000fdb0f] mapped at [ffff8800000fdb00]
> [    0.000000] initial memory mapped: [mem 0x00000000-0xffffffffffffffff]
> [    0.000000] Base memory trampoline at [ffff880000099000] 99000 size 24576
> [    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> [    0.000000]  [mem 0x00000000-0x000fffff] page 4k
> [    0.000000] BRK [0x0205a000, 0x0205afff] PGTABLE
> [    0.000000] BRK [0x0205b000, 0x0205bfff] PGTABLE
> [    0.000000] BRK [0x0205c000, 0x0205cfff] PGTABLE
> [    0.000000] init_memory_mapping: [mem 0x929be00000-0x929bffffff]
> [    0.000000]  [mem 0x929be00000-0x929bffffff] page 4k
> [    0.000000] BRK [0x0205d000, 0x0205dfff] PGTABLE
> [    0.000000] BRK [0x0205e000, 0x0205efff] PGTABLE
> [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
> [    0.000000] Pid: 0, comm: swapper Not tainted 3.8.0-rc5-00059-g8170e6b #29
> [    0.000000] Call Trace:
> [    0.000000]  [<ffffffff81639b47>] panic+0xbb/0x1cb
> [    0.000000]  [<ffffffff816257aa>] alloc_low_pages+0x15a/0x160
> [    0.000000]  [<ffffffff81634d46>] phys_pmd_init+0x1f1/0x290
> [    0.000000]  [<ffffffff81634fb7>] phys_pud_init+0x1d2/0x24f
> [    0.000000]  [<ffffffff81635132>] kernel_physical_mapping_init+0xfe/0x16e
> [    0.000000]  [<ffffffff81625993>] init_memory_mapping+0x1e3/0x350
> [    0.000000]  [<ffffffff81cf5c5d>] init_range_memory_mapping+0xc2/0x10b
> [    0.000000]  [<ffffffff81cf5dd9>] init_mem_mapping+0x133/0x1c8
> [    0.000000]  [<ffffffff81ce77ad>] setup_arch+0x6ef/0xbe4
> [    0.000000]  [<ffffffff81639ca4>] ? printk+0x4d/0x4f
> [    0.000000]  [<ffffffff81ce3b4d>] start_kernel+0xce/0x3b3
> [    0.000000]  [<ffffffff81ce3592>] x86_64_start_reservations+0x91/0x95
> [    0.000000]  [<ffffffff81ce3681>] x86_64_start_kernel+0xeb/0xf2


* Re: x86: early boot crash: "alloc_low_page: ran out of memory" (bisected)
  2013-08-09 23:18 x86: early boot crash: "alloc_low_page: ran out of memory" (bisected) Dave Hansen
@ 2013-08-09 23:23 ` Yinghai Lu
  2013-08-10  1:19   ` Dave Hansen
  0 siblings, 1 reply; 11+ messages in thread
From: Yinghai Lu @ 2013-08-09 23:23 UTC (permalink / raw)
  To: Dave Hansen; +Cc: the arch/x86 maintainers, LKML, H. Peter Anvin

On Fri, Aug 9, 2013 at 4:18 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> I'm getting a 100% reproducible panic early in boot:
>
>> [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
>
> I'm not sure why I didn't run into this until now.  I think there are a
> couple of config options that need to get set just right to trigger it,
> but CONFIG_DEBUG_PAGEALLOC seems to be the main one.  Full config is here:
>
>         http://sr71.net/~dave/intel/foo/config-bigbox-crash-20130809.txt
>
> I bisected it back to this commit (which I seem to remember causing some
> other problems):
>
>> commit 8170e6bed465b4b0c7687f93e9948aca4358a33b
>> Author: H. Peter Anvin <hpa@zytor.com>
>> Date:   Thu Jan 24 12:19:52 2013 -0800
>>
>>     x86, 64bit: Use a #PF handler to materialize early mappings on demand
>
> I need somewhere between 500G and 600G of memory to trigger it, but it
> can be triggered using qemu with much less _actual_ RAM than that.  From
> looking at the dmesg diffs, I suspect that the delta in memory use
> between using 1G and 4k ptes for the identity mapping (DEBUG_PAGEALLOC
> forces 4k pages) is the proximate trigger.
>
> I also suspect that alloc_low_pages() is buggy in the way it manipulates
> min/max_pfn_mapped.  I'm quite baffled how 'max_pfn_mapped' is supposed
> to get set up correctly.  Current code says:
>
>         max_pfn_mapped = 0; /* will get exact value next */
>
> but I certainly don't see it getting set later on in that function, or
> _ever_ as adding some printk()'s shows:
>
>> +[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
>> +[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
>> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
>> +[    0.000000] BRK [0x02086000, 0x02086fff] PGTABLE
>> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
>> +[    0.000000] BRK [0x02087000, 0x02087fff] PGTABLE
>> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
>> +[    0.000000] BRK [0x02088000, 0x02088fff] PGTABLE
>> +[    0.000000] init_memory_mapping: [mem 0xf07fe00000-0xf07fffffff]
>> +[    0.000000]  [mem 0xf07fe00000-0xf07fffffff] page 4k
>> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
>> +[    0.000000] BRK [0x02089000, 0x02089fff] PGTABLE
>> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
>> +[    0.000000] BRK [0x0208a000, 0x0208afff] PGTABLE
>> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
>> +[    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
>
> I'll take a closer look at it next week, but figured I'd report it first.
>
> Full dmesg:
>
>> early console in setup code
>> [    0.000000] Initializing cgroup subsys cpuset
>> [    0.000000] Initializing cgroup subsys cpu
>> [    0.000000] Linux version 3.8.0-rc5-00059-g8170e6b

so how about v3.10?

We should have some fixes in 3.10 already.

Thanks

Yinghai


* Re: x86: early boot crash: "alloc_low_page: ran out of memory" (bisected)
  2013-08-09 23:23 ` Yinghai Lu
@ 2013-08-10  1:19   ` Dave Hansen
  2013-08-10  2:10     ` Yinghai Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2013-08-10  1:19 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: the arch/x86 maintainers, LKML, H. Peter Anvin

On 08/09/2013 04:23 PM, Yinghai Lu wrote:
> On Fri, Aug 9, 2013 at 4:18 PM, Dave Hansen <dave.hansen@intel.com> wrote:
>> I'm getting a 100% reproducible panic early in boot:
...
>>> early console in setup code
>>> [    0.000000] Initializing cgroup subsys cpuset
>>> [    0.000000] Initializing cgroup subsys cpu
>>> [    0.000000] Linux version 3.8.0-rc5-00059-g8170e6b
> 
> so how about v3.10?
> 
> We should have some fixes in 3.10 already.

I was hitting it on Linus's current tree today (3.11-rcwhatever).  I
pasted the panic() from your patch's commit specifically, but the same
behavior is happening on current kernels, and it looked consistent as I
bisected between the 3.11-rc's and your commit.


* Re: x86: early boot crash: "alloc_low_page: ran out of memory" (bisected)
  2013-08-10  1:19   ` Dave Hansen
@ 2013-08-10  2:10     ` Yinghai Lu
  2013-08-10  2:21       ` Yinghai Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Yinghai Lu @ 2013-08-10  2:10 UTC (permalink / raw)
  To: Dave Hansen; +Cc: the arch/x86 maintainers, LKML, H. Peter Anvin

On Fri, Aug 9, 2013 at 6:19 PM, Dave Hansen <dave@sr71.net> wrote:
> On 08/09/2013 04:23 PM, Yinghai Lu wrote:
>> On Fri, Aug 9, 2013 at 4:18 PM, Dave Hansen <dave.hansen@intel.com> wrote:
>>> I'm getting a 100% reproducible panic early in boot:
> ...
>>>> early console in setup code
>>>> [    0.000000] Initializing cgroup subsys cpuset
>>>> [    0.000000] Initializing cgroup subsys cpu
>>>> [    0.000000] Linux version 3.8.0-rc5-00059-g8170e6b
>>
>> so how about v3.10?
>>
>> We should have some fixes in 3.10 already.
>
> I was hitting it on Linus's current tree today (3.11-rcwhatever).  I
> pasted the panic() from your patch's commit specifically, but the same
> behavior is happening on current kernels, and it looked consistent as I
> bisected between the 3.11-rc's and your commit.

Can you post the boot log from 3.11-rc booted with "debug ignore_loglevel memblock=debug"?

Thanks

Yinghai


* Re: x86: early boot crash: "alloc_low_page: ran out of memory" (bisected)
  2013-08-10  2:10     ` Yinghai Lu
@ 2013-08-10  2:21       ` Yinghai Lu
  2013-08-12 16:27         ` Dave Hansen
  0 siblings, 1 reply; 11+ messages in thread
From: Yinghai Lu @ 2013-08-10  2:21 UTC (permalink / raw)
  To: Dave Hansen; +Cc: the arch/x86 maintainers, LKML, H. Peter Anvin

[-- Attachment #1: Type: text/plain, Size: 1004 bytes --]

On Fri, Aug 9, 2013 at 7:10 PM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Fri, Aug 9, 2013 at 6:19 PM, Dave Hansen <dave@sr71.net> wrote:
>> On 08/09/2013 04:23 PM, Yinghai Lu wrote:
>>> On Fri, Aug 9, 2013 at 4:18 PM, Dave Hansen <dave.hansen@intel.com> wrote:
>>>> I'm getting a 100% reproducible panic early in boot:
>> ...
>>>>> early console in setup code
>>>>> [    0.000000] Initializing cgroup subsys cpuset
>>>>> [    0.000000] Initializing cgroup subsys cpu
>>>>> [    0.000000] Linux version 3.8.0-rc5-00059-g8170e6b
>>>
>>> so how about v3.10?
>>>
>>> We should have some fixes in 3.10 already.
>>
>> I was hitting it on Linus's current tree today (3.11-rcwhatever).  I
>> pasted the panic() from your patch's commit specifically, but the same
>> behavior is happening on current kernels, and it looked consistent as I
>> bisected between the 3.11-rc's and your commit.
>
> Can you post 3.11-rc with "debug ignore_loglevel memblock=debug" ?
>

Can you try the attached patch?

Thanks

Yinghai

[-- Attachment #2: fix_dave_machine.patch --]
[-- Type: application/octet-stream, Size: 546 bytes --]

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 2ec29ac..f9eec80 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -78,8 +78,8 @@ __ref void *alloc_low_pages(unsigned int num)
 	return __va(pfn << PAGE_SHIFT);
 }
 
-/* need 4 4k for initial PMD_SIZE, 4k for 0-ISA_END_ADDRESS */
-#define INIT_PGT_BUF_SIZE	(5 * PAGE_SIZE)
+/* need 4 4k for initial PMD_SIZE,  3 4k for 0-ISA_END_ADDRESS */
+#define INIT_PGT_BUF_SIZE	(7 * PAGE_SIZE)
 RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE);
 void  __init early_alloc_pgt_buf(void)
 {


* Re: x86: early boot crash: "alloc_low_page: ran out of memory" (bisected)
  2013-08-10  2:21       ` Yinghai Lu
@ 2013-08-12 16:27         ` Dave Hansen
  2013-08-12 23:43           ` [PATCH] x86: Fix booting with DEBUG_PAGE_ALLOC with more than 512G RAM Yinghai Lu
  2013-08-12 23:47           ` x86: early boot crash: "alloc_low_page: ran out of memory" (bisected) Yinghai Lu
  0 siblings, 2 replies; 11+ messages in thread
From: Dave Hansen @ 2013-08-12 16:27 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: the arch/x86 maintainers, LKML, H. Peter Anvin

On 08/09/2013 07:21 PM, Yinghai Lu wrote:
> -/* need 4 4k for initial PMD_SIZE, 4k for 0-ISA_END_ADDRESS */
> -#define INIT_PGT_BUF_SIZE	(5 * PAGE_SIZE)
> +/* need 4 4k for initial PMD_SIZE,  3 4k for 0-ISA_END_ADDRESS */
> +#define INIT_PGT_BUF_SIZE	(7 * PAGE_SIZE)
>  RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE);
>  void  __init early_alloc_pgt_buf(void)

That patch allows me to boot again.  I've also attached a boot with the
debug options that you asked for.

I'm really curious to see the full changelog for why this patch helps
and why it's only triggered for large memory sizes. :)

> [    0.000000] Initializing cgroup subsys cpuset
> [    0.000000] Initializing cgroup subsys cpu
> [    0.000000] Initializing cgroup subsys cpuacct
> [    0.000000] Linux version 3.11.0-rc4-00153-g14e9419-dirty (davehans@viggo.jf.intel.com) (gcc version 4.6.3 20120306 (Red Hat 4.6.3-2) (GCC) ) #33 SMP Mon Aug 12 09:22:46 PDT 2013
> [    0.000000] Command line: root=/dev/sda1 console=ttyS0,115200 earlyprintk=ttyS0,115200 debug debug ignore_loglevel memblock=debug
> [    0.000000] e820: BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009f3ff] usable
> [    0.000000] BIOS-e820: [mem 0x000000000009f400-0x000000000009ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000dfffbfff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000dfffc000-0x00000000dfffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000e80effffff] usable
> [    0.000000] bootconsole [earlyser0] enabled
> [    0.000000] debug: ignoring loglevel setting.
> [    0.000000] NX (Execute Disable) protection: active
> [    0.000000] SMBIOS 2.4 present.
> [    0.000000] DMI: Bochs Bochs, BIOS Bochs 01/01/2007
> [    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
> [    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
> [    0.000000] No AGP bridge found
> [    0.000000] e820: last_pfn = 0xe80f000 max_arch_pfn = 0x400000000
> [    0.000000] MTRR default type: write-back
> [    0.000000] MTRR fixed ranges enabled:
> [    0.000000]   00000-9FFFF write-back
> [    0.000000]   A0000-BFFFF uncachable
> [    0.000000]   C0000-FFFFF write-protect
> [    0.000000] MTRR variable ranges enabled:
> [    0.000000]   0 base 00E0000000 mask FFE0000000 uncachable
> [    0.000000]   1 disabled
> [    0.000000]   2 disabled
> [    0.000000]   3 disabled
> [    0.000000]   4 disabled
> [    0.000000]   5 disabled
> [    0.000000]   6 disabled
> [    0.000000]   7 disabled
> [    0.000000] PAT not supported by CPU.
> [    0.000000] e820: last_pfn = 0xdfffc max_arch_pfn = 0x400000000
> [    0.000000] found SMP MP-table at [mem 0x000fdb00-0x000fdb0f] mapped at [ffff8800000fdb00]
> [    0.000000] memblock_reserve: [0x000000000fdb00-0x000000000fdb10] smp_scan_config+0x101/0x135
> [    0.000000] memblock_reserve: [0x000000000fdb10-0x000000000fdbf0] smp_scan_config+0x11a/0x135
> [    0.000000] memblock_reserve: [0x00000002085000-0x0000000208b000] setup_arch+0x5ed/0xc79
> [    0.000000] MEMBLOCK configuration:
> [    0.000000]  memory size = 0xe7eef9a400 reserved size = 0x10ec000
> [    0.000000]  memory.cnt  = 0x3
> [    0.000000]  memory[0x0]	[0x00000000001000-0x0000000009efff], 0x9e000 bytes
> [    0.000000]  memory[0x1]	[0x00000000100000-0x000000dfffbfff], 0xdfefc000 bytes
> [    0.000000]  memory[0x2]	[0x00000100000000-0x0000e80effffff], 0xe70f000000 bytes
> [    0.000000]  reserved.cnt  = 0x2
> [    0.000000]  reserved[0x0]	[0x0000000009f000-0x000000000fffff], 0x61000 bytes
> [    0.000000]  reserved[0x1]	[0x00000001000000-0x0000000208afff], 0x108b000 bytes
> [    0.000000] memblock_reserve: [0x00000000099000-0x0000000009f000] reserve_real_mode+0x61/0x87
> [    0.000000] Base memory trampoline at [ffff880000099000] 99000 size 24576
> [    0.000000] memblock_reserve: [0x00000000000000-0x00000000010000] setup_arch+0x6cf/0xc79
> [    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> [    0.000000]  [mem 0x00000000-0x000fffff] page 4k
> [    0.000000] BRK [0x02086000, 0x02086fff] PGTABLE
> [    0.000000] BRK [0x02087000, 0x02087fff] PGTABLE
> [    0.000000] BRK [0x02088000, 0x02088fff] PGTABLE
> [    0.000000] init_memory_mapping: [mem 0xe80ee00000-0xe80effffff]
> [    0.000000]  [mem 0xe80ee00000-0xe80effffff] page 4k
> [    0.000000] BRK [0x02089000, 0x02089fff] PGTABLE
> [    0.000000] BRK [0x0208a000, 0x0208afff] PGTABLE
> [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.11.0-rc4-00153-g14e9419-dirty #33
> [    0.000000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [    0.000000]  0000000000000001 ffffffff81c01ad8 ffffffff8165b550 000000000000019b
> [    0.000000]  ffffffff819ee0b8 ffffffff81c01b58 ffffffff81658600 0000000000000000
> [    0.000000]  0000000000000008 ffffffff81c01b68 ffffffff81c01b08 0000000000000006
> [    0.000000] Call Trace:
> [    0.000000]  [<ffffffff8165b550>] dump_stack+0x55/0x76
> [    0.000000]  [<ffffffff81658600>] panic+0xbb/0x1cb
> [    0.000000]  [<ffffffff8164ee0a>] alloc_low_pages+0x15a/0x160
> [    0.000000]  [<ffffffff816533bb>] phys_pmd_init+0x1e2/0x28d
> [    0.000000]  [<ffffffff81658c46>] ? printk+0x4d/0x4f
> [    0.000000]  [<ffffffff81653638>] phys_pud_init+0x1d2/0x25c
> [    0.000000]  [<ffffffff81653fc7>] kernel_physical_mapping_init+0x14c/0x1ea
> [    0.000000]  [<ffffffff8164eff3>] init_memory_mapping+0x1e3/0x350
> [    0.000000]  [<ffffffff81d09d89>] init_range_memory_mapping+0xc2/0x10b
> [    0.000000]  [<ffffffff81d09f05>] init_mem_mapping+0x133/0x1e2
> [    0.000000]  [<ffffffff81cfaf11>] setup_arch+0x6d4/0xc79
> [    0.000000]  [<ffffffff81658c46>] ? printk+0x4d/0x4f
> [    0.000000]  [<ffffffff81cf4b59>] start_kernel+0xc9/0x3f3
> [    0.000000]  [<ffffffff81cf45a6>] x86_64_start_reservations+0x2a/0x2c
> [    0.000000]  [<ffffffff81cf4694>] x86_64_start_kernel+0xec/0xf3
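
The log above shows five BRK page-table allocations followed by the
panic, which is consistent with a buffer of 5 * PAGE_SIZE being
exhausted.  A minimal userspace sketch of that behaviour follows.  The
struct and function names are hypothetical, and the model deliberately
ignores the rest of alloc_low_pages(); it only shows a fixed buffer
handing out pages until it runs dry.

```c
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical model of a fixed early page-table buffer. */
struct pgt_buf {
	unsigned long next_pfn;  /* next free page frame */
	unsigned long top_pfn;   /* one past the last usable page frame */
};

void pgt_buf_init(struct pgt_buf *buf, unsigned long base, unsigned long size)
{
	buf->next_pfn = base >> PAGE_SHIFT;
	buf->top_pfn = (base + size) >> PAGE_SHIFT;
}

/* Returns the physical address of the first page, or 0 when exhausted. */
unsigned long pgt_buf_alloc(struct pgt_buf *buf, unsigned int num)
{
	unsigned long pfn;

	if (buf->next_pfn + num > buf->top_pfn)
		return 0;  /* in the log above, this is where the panic hits */
	pfn = buf->next_pfn;
	buf->next_pfn += num;
	return pfn << PAGE_SHIFT;
}
```

With a base of 0x02086000 and a size of 5 * PAGE_SIZE, five single-page
calls return 0x02086000 through 0x0208a000 and the sixth returns 0,
matching the five BRK lines before the panic.  With 6 * PAGE_SIZE the
sixth call succeeds.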



* [PATCH] x86: Fix booting with DEBUG_PAGE_ALLOC with more than 512G RAM
  2013-08-12 16:27         ` Dave Hansen
@ 2013-08-12 23:43           ` Yinghai Lu
  2013-08-12 23:50             ` Dave Hansen
  2013-08-20  8:22             ` [tip:x86/urgent] x86/mm: Fix boot crash with DEBUG_PAGE_ALLOC= y and " tip-bot for Yinghai Lu
  2013-08-12 23:47           ` x86: early boot crash: "alloc_low_page: ran out of memory" (bisected) Yinghai Lu
  1 sibling, 2 replies; 11+ messages in thread
From: Yinghai Lu @ 2013-08-12 23:43 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Ingo Molnar, linux-kernel, Dave Hansen, Yinghai Lu, stable

Dave reported that a system crashes early in boot if DEBUG_PAGEALLOC is
selected and the system has between 500G and 600G of RAM.

> [    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> [    0.000000]  [mem 0x00000000-0x000fffff] page 4k
> [    0.000000] BRK [0x02086000, 0x02086fff] PGTABLE
> [    0.000000] BRK [0x02087000, 0x02087fff] PGTABLE
> [    0.000000] BRK [0x02088000, 0x02088fff] PGTABLE
> [    0.000000] init_memory_mapping: [mem 0xe80ee00000-0xe80effffff]
> [    0.000000]  [mem 0xe80ee00000-0xe80effffff] page 4k
> [    0.000000] BRK [0x02089000, 0x02089fff] PGTABLE
> [    0.000000] BRK [0x0208a000, 0x0208afff] PGTABLE
> [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory

It turns out that we missed increasing the number of BRK pages needed to
map the initial 2M and [0,1M) when we switched to using the #PF handler to
set up memory mappings:

> commit 8170e6bed465b4b0c7687f93e9948aca4358a33b
> Author: H. Peter Anvin <hpa@zytor.com>
> Date:   Thu Jan 24 12:19:52 2013 -0800
>
>     x86, 64bit: Use a #PF handler to materialize early mappings on demand

Before that, we had the mapping for [0,512M) in head_64.S, and we could
spare two pages for [0,1M).  After that change, we cannot reuse those pages
anymore.

When we have more than 512G of RAM, we need an extra page for the pgd entry
covering [512G, 1024G).

Increase the number of BRK pages reserved for page tables to fix the boot
crash.

Reported-by: Dave Hansen <dave.hansen@intel.com>
Bisected-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@vger.kernel.org> # 3.9+

---
 arch/x86/mm/init.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6/arch/x86/mm/init.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init.c
+++ linux-2.6/arch/x86/mm/init.c
@@ -78,8 +78,8 @@ __ref void *alloc_low_pages(unsigned int
 	return __va(pfn << PAGE_SHIFT);
 }
 
-/* need 4 4k for initial PMD_SIZE, 4k for 0-ISA_END_ADDRESS */
-#define INIT_PGT_BUF_SIZE	(5 * PAGE_SIZE)
+/* need 3 4k for initial PMD_SIZE,  3 4k for 0-ISA_END_ADDRESS */
+#define INIT_PGT_BUF_SIZE	(6 * PAGE_SIZE)
 RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE);
 void  __init early_alloc_pgt_buf(void)
 {


* Re: x86: early boot crash: "alloc_low_page: ran out of memory" (bisected)
  2013-08-12 16:27         ` Dave Hansen
  2013-08-12 23:43           ` [PATCH] x86: Fix booting with DEBUG_PAGE_ALLOC with more than 512G RAM Yinghai Lu
@ 2013-08-12 23:47           ` Yinghai Lu
  1 sibling, 0 replies; 11+ messages in thread
From: Yinghai Lu @ 2013-08-12 23:47 UTC (permalink / raw)
  To: Dave Hansen; +Cc: the arch/x86 maintainers, LKML, H. Peter Anvin

On Mon, Aug 12, 2013 at 9:27 AM, Dave Hansen <dave@sr71.net> wrote:
> On 08/09/2013 07:21 PM, Yinghai Lu wrote:
>> -/* need 4 4k for initial PMD_SIZE, 4k for 0-ISA_END_ADDRESS */
>> -#define INIT_PGT_BUF_SIZE    (5 * PAGE_SIZE)
>> +/* need 4 4k for initial PMD_SIZE,  3 4k for 0-ISA_END_ADDRESS */
>> +#define INIT_PGT_BUF_SIZE    (7 * PAGE_SIZE)
>>  RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE);
>>  void  __init early_alloc_pgt_buf(void)
>
> That patch allows me to boot again.  I've also attached a boot with the
> debug options that you asked for.
>
> I'm really curious to see the full changelog for why this patch helps
> and why it's only triggered for large memory sizes. :)

Thanks, please check changelog at

https://patchwork.kernel.org/patch/2843321/

Yinghai


* Re: [PATCH] x86: Fix booting with DEBUG_PAGE_ALLOC with more than 512G RAM
  2013-08-12 23:43           ` [PATCH] x86: Fix booting with DEBUG_PAGE_ALLOC with more than 512G RAM Yinghai Lu
@ 2013-08-12 23:50             ` Dave Hansen
  2013-08-12 23:59               ` Yinghai Lu
  2013-08-20  8:22             ` [tip:x86/urgent] x86/mm: Fix boot crash with DEBUG_PAGE_ALLOC= y and " tip-bot for Yinghai Lu
  1 sibling, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2013-08-12 23:50 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Ingo Molnar, linux-kernel, stable

On 08/12/2013 04:43 PM, Yinghai Lu wrote:
>> > commit 8170e6bed465b4b0c7687f93e9948aca4358a33b
>> > Author: H. Peter Anvin <hpa@zytor.com>
>> > Date:   Thu Jan 24 12:19:52 2013 -0800
>> >
>> >     x86, 64bit: Use a #PF handler to materialize early mappings on demand
> Before that, we had the mapping for [0,512M) in head_64.S, and we could
> spare two pages for [0,1M).  After that change, we cannot reuse those pages anymore.
> 
> When we have more than 512G of RAM, we need an extra page for the pgd entry
> covering [512G, 1024G).
> 
> Increase pages in BRK for page table to solve the booting problem.

So how much does this get us up to?  1TB?  That's actually _fairly_
small today.  I've got a fairly old machine with that much in it, and
it's only half full of DIMMs.

It's also a bit worrying that this is completely disconnected from the
other code in the kernel that is concerned with the amount of total
address space in the system: MAX_PHYSADDR_BITS.




* Re: [PATCH] x86: Fix booting with DEBUG_PAGE_ALLOC with more than 512G RAM
  2013-08-12 23:50             ` Dave Hansen
@ 2013-08-12 23:59               ` Yinghai Lu
  0 siblings, 0 replies; 11+ messages in thread
From: Yinghai Lu @ 2013-08-12 23:59 UTC (permalink / raw)
  To: Dave Hansen
  Cc: H. Peter Anvin, Ingo Molnar, Linux Kernel Mailing List, stable

On Mon, Aug 12, 2013 at 4:50 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> On 08/12/2013 04:43 PM, Yinghai Lu wrote:
>>> > commit 8170e6bed465b4b0c7687f93e9948aca4358a33b
>>> > Author: H. Peter Anvin <hpa@zytor.com>
>>> > Date:   Thu Jan 24 12:19:52 2013 -0800
>>> >
>>> >     x86, 64bit: Use a #PF handler to materialize early mappings on demand
>> Before that, we had the mapping for [0,512M) in head_64.S, and we could
>> spare two pages for [0,1M).  After that change, we cannot reuse those pages anymore.
>>
>> When we have more than 512G of RAM, we need an extra page for the pgd entry
>> covering [512G, 1024G).
>>
>> Increase pages in BRK for page table to solve the booting problem.
>
> So how much does this get us up to?  1TB?  That's actually _fairly_
> small today.  I've got a fairly old machine with that much in it, and
> it's only half full of DIMMs.
>
> It's also a bit worrying that this is completely disconnected from the
> other code in the kernel that is concerned with the amount of total
> address space in the system: MAX_PHYSADDR_BITS.

3 pages for [0,1M) and
3 pages for the initial 2M (it is 2M-aligned)
are enough.

Each range needs one page for the table under its PGD entry (covers 512G),
one page for the table under its PUD entry (covers 1G), and one page for
the table under its PMD entry (covers 2M).

After the initial 2M is mapped, we use that mapped 2M as the page table
buffer for the other memory ranges.

Thanks

Yinghai
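
The 3 + 3 count above can be checked with a small userspace sketch.  It
counts how many distinct 512G, 1G and 2M regions a set of physical
ranges touches, on the assumptions that each touched region needs one
new table page, that no lower-level tables exist yet, and that the
direct mapping base is 512G-aligned so physical 512G boundaries line up
with PGD entries.  The names regions_touched() and
pgt_pages_for_ranges() are hypothetical.

```c
#include <stdint.h>

#define PMD_SHIFT    21  /* one PTE table maps 2M */
#define PUD_SHIFT    30  /* one PMD table maps 1G */
#define PGDIR_SHIFT  39  /* one PUD table maps 512G */

struct phys_range {
	uint64_t start;  /* inclusive */
	uint64_t end;    /* exclusive, must be > start */
};

/*
 * Hypothetical helper: count the distinct (1 << shift)-sized regions touched
 * by the ranges, which must be sorted by address and non-overlapping.
 */
unsigned int regions_touched(const struct phys_range *r, unsigned int n,
			     unsigned int shift)
{
	unsigned int i, count = 0;
	uint64_t prev_last = 0;

	for (i = 0; i < n; i++) {
		uint64_t first = r[i].start >> shift;
		uint64_t last = (r[i].end - 1) >> shift;

		/* Skip a region already counted for the previous range. */
		if (i > 0 && first == prev_last)
			first++;
		if (first <= last)
			count += (unsigned int)(last - first + 1);
		prev_last = last;
	}
	return count;
}

/* New PUD + PMD + PTE table pages needed to map the ranges with 4k pages. */
unsigned int pgt_pages_for_ranges(const struct phys_range *r, unsigned int n)
{
	return regions_touched(r, n, PGDIR_SHIFT) +
	       regions_touched(r, n, PUD_SHIFT) +
	       regions_touched(r, n, PMD_SHIFT);
}
```

For [0, 0x100000) plus [0x929be00000, 0x929c000000), the top range from
Dave's first dmesg, this gives 6.  If the top 2M sits below 512G, for
example [0x3fffe00000, 0x4000000000), both ranges share one 512G region
and the result is 5, which is why the old 5-page buffer was enough on
smaller machines.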


* [tip:x86/urgent] x86/mm: Fix boot crash with DEBUG_PAGE_ALLOC=y and more than 512G RAM
  2013-08-12 23:43           ` [PATCH] x86: Fix booting with DEBUG_PAGE_ALLOC with more than 512G RAM Yinghai Lu
  2013-08-12 23:50             ` Dave Hansen
@ 2013-08-20  8:22             ` tip-bot for Yinghai Lu
  1 sibling, 0 replies; 11+ messages in thread
From: tip-bot for Yinghai Lu @ 2013-08-20  8:22 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, yinghai, dave.hansen, tglx

Commit-ID:  527bf129f9a780e11b251cf2467dc30118a57d16
Gitweb:     http://git.kernel.org/tip/527bf129f9a780e11b251cf2467dc30118a57d16
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Mon, 12 Aug 2013 16:43:24 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 20 Aug 2013 10:06:50 +0200

x86/mm: Fix boot crash with DEBUG_PAGE_ALLOC=y and more than 512G RAM

Dave Hansen reported that systems between 500G and 600G RAM
crash early if DEBUG_PAGEALLOC is selected.

 > [    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
 > [    0.000000]  [mem 0x00000000-0x000fffff] page 4k
 > [    0.000000] BRK [0x02086000, 0x02086fff] PGTABLE
 > [    0.000000] BRK [0x02087000, 0x02087fff] PGTABLE
 > [    0.000000] BRK [0x02088000, 0x02088fff] PGTABLE
 > [    0.000000] init_memory_mapping: [mem 0xe80ee00000-0xe80effffff]
 > [    0.000000]  [mem 0xe80ee00000-0xe80effffff] page 4k
 > [    0.000000] BRK [0x02089000, 0x02089fff] PGTABLE
 > [    0.000000] BRK [0x0208a000, 0x0208afff] PGTABLE
 > [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory

It turns out that we missed increasing the number of BRK pages
needed to map the initial 2M and [0,1M) when we switched to using
the #PF handler to set memory mappings:

 > commit 8170e6bed465b4b0c7687f93e9948aca4358a33b
 > Author: H. Peter Anvin <hpa@zytor.com>
 > Date:   Thu Jan 24 12:19:52 2013 -0800
 >
 >     x86, 64bit: Use a #PF handler to materialize early mappings on demand

Before that, we had the mapping for [0,512M) in head_64.S, and we
could spare two pages for [0,1M).  After that change, we cannot reuse
those pages anymore.

When we have more than 512G of RAM, we need an extra page for the pgd
entry covering [512G, 1024G).

Increase pages in BRK for page table to solve the boot crash.

Reported-by: Dave Hansen <dave.hansen@intel.com>
Bisected-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@vger.kernel.org> # v3.9 and later
Link: http://lkml.kernel.org/r/1376351004-4015-1-git-send-email-yinghai@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/mm/init.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 2ec29ac..04664cd 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -78,8 +78,8 @@ __ref void *alloc_low_pages(unsigned int num)
 	return __va(pfn << PAGE_SHIFT);
 }
 
-/* need 4 4k for initial PMD_SIZE, 4k for 0-ISA_END_ADDRESS */
-#define INIT_PGT_BUF_SIZE	(5 * PAGE_SIZE)
+/* need 3 4k for initial PMD_SIZE,  3 4k for 0-ISA_END_ADDRESS */
+#define INIT_PGT_BUF_SIZE	(6 * PAGE_SIZE)
 RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE);
 void  __init early_alloc_pgt_buf(void)
 {
