* x86: early boot crash: "alloc_low_page: ran out of memory" (bisected)
From: Dave Hansen @ 2013-08-09 23:18 UTC
  To: Yinghai Lu, x86, LKML, H. Peter Anvin

I'm getting a 100% reproducible panic early in boot:

> [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory

I'm not sure why I didn't run into this until now.  I think there are a
couple of config options that need to be set just right to trigger it,
but CONFIG_DEBUG_PAGEALLOC seems to be the main one.  Full config is here:

	http://sr71.net/~dave/intel/foo/config-bigbox-crash-20130809.txt

I bisected it back to this commit (which I seem to remember causing some
other problems):

> commit 8170e6bed465b4b0c7687f93e9948aca4358a33b
> Author: H. Peter Anvin <hpa@zytor.com>
> Date:   Thu Jan 24 12:19:52 2013 -0800
> 
>     x86, 64bit: Use a #PF handler to materialize early mappings on demand

I need somewhere between 500G and 600G of memory to trigger it, but it
can be triggered using qemu with much less _actual_ RAM than that.  From
looking at the dmesg diffs, I suspect that the delta in memory use
between using 1G and 4k ptes for the identity mapping (DEBUG_PAGEALLOC
forces 4k pages) is the proximate trigger.
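
To put a rough number on that delta, here is a back-of-the-envelope
sketch. It assumes standard 4-level x86-64 paging (512 entries per
4 KiB table page) and one contiguous range starting at address 0. The
600 GiB figure and the helper names are illustrative, not taken from
the kernel source.

```python
# Rough count of page-table pages needed to map a range with a given
# leaf page size. Assumes 4-level x86-64 paging and a contiguous,
# aligned range starting at 0 (both are simplifying assumptions).
GiB = 1 << 30
PAGE_SIZE = 4096

# Bytes of address space covered by ONE table page at each level.
COVERAGE = {"pte": 2 << 20, "pmd": 1 << 30, "pud": 512 << 30}

# Which table levels exist below the PGD for each leaf page size.
LEVELS_FOR_LEAF = {
    "4k": ("pte", "pmd", "pud"),
    "2M": ("pmd", "pud"),
    "1G": ("pud",),
}


def ceil_div(a, b):
    return -(-a // b)


def table_pages(mapped_bytes, leaf):
    """Return {level: number of table pages} for the given leaf size."""
    return {lvl: ceil_div(mapped_bytes, COVERAGE[lvl])
            for lvl in LEVELS_FOR_LEAF[leaf]}


def table_bytes(mapped_bytes, leaf):
    return sum(table_pages(mapped_bytes, leaf).values()) * PAGE_SIZE


if __name__ == "__main__":
    for leaf in ("4k", "1G"):
        pages = table_pages(600 * GiB, leaf)
        print(leaf, pages, table_bytes(600 * GiB, leaf), "bytes")
```

Under these assumptions, 600 GiB mapped with 4k pages needs a little
over 1 GiB of page tables, while 1G pages need only two PUD pages.
Note that anything above 512 GiB needs a second PUD page either way.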

I also suspect that alloc_low_pages() is buggy in the way it manipulates
min/max_pfn_mapped.  I'm quite baffled how 'max_pfn_mapped' is supposed
to get set up correctly.  Current code says:

	max_pfn_mapped = 0; /* will get exact value next */

but I certainly don't see it getting set later on in that function, or
_ever_, as adding some printk()s shows:

> +[    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> +[    0.000000]  [mem 0x00000000-0x000fffff] page 4k
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
> +[    0.000000] BRK [0x02086000, 0x02086fff] PGTABLE
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
> +[    0.000000] BRK [0x02087000, 0x02087fff] PGTABLE
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 0 max_pfn_mapped: 0
> +[    0.000000] BRK [0x02088000, 0x02088fff] PGTABLE
> +[    0.000000] init_memory_mapping: [mem 0xf07fe00000-0xf07fffffff]
> +[    0.000000]  [mem 0xf07fe00000-0xf07fffffff] page 4k
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
> +[    0.000000] BRK [0x02089000, 0x02089fff] PGTABLE
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
> +[    0.000000] BRK [0x0208a000, 0x0208afff] PGTABLE
> +[    0.000000] alloc_low_pages(1) min_pfn_mapped: 252182528 max_pfn_mapped: 0
> +[    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
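
The pfn values in that output are easier to read as physical
addresses. The sketch below converts them, assuming 4 KiB pages
(PAGE_SHIFT of 12). It also assumes, which this report does not
confirm, that the fallback allocation searches the window
[min_pfn_mapped << PAGE_SHIFT, max_pfn_mapped << PAGE_SHIFT). The
function names are made up for illustration.

```python
# Decode the min/max_pfn_mapped values printed in the debug output.
PAGE_SHIFT = 12  # 4 KiB pages


def pfn_to_phys(pfn):
    """Convert a page frame number to a physical byte address."""
    return pfn << PAGE_SHIFT


def window_is_empty(min_pfn, max_pfn):
    """True if [min_pfn, max_pfn) contains no pages at all.

    ASSUMPTION: the allocator searches this window once the BRK
    page-table buffer is used up. This is a guess for illustration.
    """
    return pfn_to_phys(min_pfn) >= pfn_to_phys(max_pfn)


# Values copied from the log lines just before the panic.
min_pfn_mapped = 252182528
max_pfn_mapped = 0
mapped_range_end = 0xf07fffffff  # end of the second init_memory_mapping range

if __name__ == "__main__":
    print(hex(pfn_to_phys(min_pfn_mapped)))
    print(pfn_to_phys(min_pfn_mapped) == mapped_range_end + 1)
    print(window_is_empty(min_pfn_mapped, max_pfn_mapped))
```

Read this way, min_pfn_mapped sits exactly one byte past the end of
the range being mapped while max_pfn_mapped is still 0, so under the
stated assumption there is nothing to search once the five BRK pages
shown in the log have been handed out.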

I'll take a closer look at it next week, but figured I'd report it first.

Full dmesg:

> early console in setup code
> [    0.000000] Initializing cgroup subsys cpuset
> [    0.000000] Initializing cgroup subsys cpu
> [    0.000000] Linux version 3.8.0-rc5-00059-g8170e6b (davehans@viggo.jf.intel.com) (gcc version 4.6.3 20120306 (Red Hat 4.6.3-2) (GCC) ) #29 SMP Fri Aug 9 15:56:12 PDT 2013
> [    0.000000] Command line: root=/dev/sda1 console=ttyS0,115200 earlyprintk=ttyS0,115200 debug
> [    0.000000] e820: BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009f3ff] usable
> [    0.000000] BIOS-e820: [mem 0x000000000009f400-0x000000000009ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000dfffbfff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000dfffc000-0x00000000dfffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000929bffffff] usable
> [    0.000000] bootconsole [earlyser0] enabled
> [    0.000000] NX (Execute Disable) protection: active
> [    0.000000] SMBIOS 2.4 present.
> [    0.000000] DMI: Bochs Bochs, BIOS Bochs 01/01/2007
> [    0.000000] e820: update [mem 0x00000000-0x0000ffff] usable ==> reserved
> [    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
> [    0.000000] No AGP bridge found
> [    0.000000] e820: last_pfn = 0x929c000 max_arch_pfn = 0x400000000
> [    0.000000] MTRR default type: write-back
> [    0.000000] MTRR fixed ranges enabled:
> [    0.000000]   00000-9FFFF write-back
> [    0.000000]   A0000-BFFFF uncachable
> [    0.000000]   C0000-FFFFF write-protect
> [    0.000000] MTRR variable ranges enabled:
> [    0.000000]   0 base 00E0000000 mask FFE0000000 uncachable
> [    0.000000]   1 disabled
> [    0.000000]   2 disabled
> [    0.000000]   3 disabled
> [    0.000000]   4 disabled
> [    0.000000]   5 disabled
> [    0.000000]   6 disabled
> [    0.000000]   7 disabled
> [    0.000000] PAT not supported by CPU.
> [    0.000000] e820: last_pfn = 0xdfffc max_arch_pfn = 0x400000000
> [    0.000000] found SMP MP-table at [mem 0x000fdb00-0x000fdb0f] mapped at [ffff8800000fdb00]
> [    0.000000] initial memory mapped: [mem 0x00000000-0xffffffffffffffff]
> [    0.000000] Base memory trampoline at [ffff880000099000] 99000 size 24576
> [    0.000000] init_memory_mapping: [mem 0x00000000-0x000fffff]
> [    0.000000]  [mem 0x00000000-0x000fffff] page 4k
> [    0.000000] BRK [0x0205a000, 0x0205afff] PGTABLE
> [    0.000000] BRK [0x0205b000, 0x0205bfff] PGTABLE
> [    0.000000] BRK [0x0205c000, 0x0205cfff] PGTABLE
> [    0.000000] init_memory_mapping: [mem 0x929be00000-0x929bffffff]
> [    0.000000]  [mem 0x929be00000-0x929bffffff] page 4k
> [    0.000000] BRK [0x0205d000, 0x0205dfff] PGTABLE
> [    0.000000] BRK [0x0205e000, 0x0205efff] PGTABLE
> [    0.000000] Kernel panic - not syncing: alloc_low_page: ran out of memory
> [    0.000000] Pid: 0, comm: swapper Not tainted 3.8.0-rc5-00059-g8170e6b #29
> [    0.000000] Call Trace:
> [    0.000000]  [<ffffffff81639b47>] panic+0xbb/0x1cb
> [    0.000000]  [<ffffffff816257aa>] alloc_low_pages+0x15a/0x160
> [    0.000000]  [<ffffffff81634d46>] phys_pmd_init+0x1f1/0x290
> [    0.000000]  [<ffffffff81634fb7>] phys_pud_init+0x1d2/0x24f
> [    0.000000]  [<ffffffff81635132>] kernel_physical_mapping_init+0xfe/0x16e
> [    0.000000]  [<ffffffff81625993>] init_memory_mapping+0x1e3/0x350
> [    0.000000]  [<ffffffff81cf5c5d>] init_range_memory_mapping+0xc2/0x10b
> [    0.000000]  [<ffffffff81cf5dd9>] init_mem_mapping+0x133/0x1c8
> [    0.000000]  [<ffffffff81ce77ad>] setup_arch+0x6ef/0xbe4
> [    0.000000]  [<ffffffff81639ca4>] ? printk+0x4d/0x4f
> [    0.000000]  [<ffffffff81ce3b4d>] start_kernel+0xce/0x3b3
> [    0.000000]  [<ffffffff81ce3592>] x86_64_start_reservations+0x91/0x95
> [    0.000000]  [<ffffffff81ce3681>] x86_64_start_kernel+0xeb/0xf2
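
As a cross-check on the "between 500G and 600G" figure, the usable
e820 ranges in the dmesg above can be summed with a short script. This
is only a sketch: the regular expression and function name are made up
for illustration, and the input lines are copied from the log with the
timestamp prefix dropped.

```python
import re

# The three "usable" ranges from the BIOS-provided e820 map above,
# plus one reserved range to show that it is skipped.
E820_LINES = """\
BIOS-e820: [mem 0x0000000000000000-0x000000000009f3ff] usable
BIOS-e820: [mem 0x000000000009f400-0x000000000009ffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x00000000dfffbfff] usable
BIOS-e820: [mem 0x0000000100000000-0x000000929bffffff] usable
"""

USABLE_RE = re.compile(r"\[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\] usable")


def usable_bytes(text):
    """Sum the sizes of all 'usable' ranges (end addresses are inclusive)."""
    total = 0
    for start, end in USABLE_RE.findall(text):
        total += int(end, 16) - int(start, 16) + 1
    return total


if __name__ == "__main__":
    total = usable_bytes(E820_LINES)
    print(total, "bytes =", round(total / (1 << 30), 2), "GiB")
    # last_pfn from the log, converted to a byte address (4 KiB pages).
    print(hex(0x929c000 << 12))
```

The last_pfn value of 0x929c000 corresponds to the byte just past the
final usable range, so the two views of the memory map agree.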

Thread overview: 11+ messages
2013-08-09 23:18 x86: early boot crash: "alloc_low_page: ran out of memory" (bisected) Dave Hansen
2013-08-09 23:23 ` Yinghai Lu
2013-08-10  1:19   ` Dave Hansen
2013-08-10  2:10     ` Yinghai Lu
2013-08-10  2:21       ` Yinghai Lu
2013-08-12 16:27         ` Dave Hansen
2013-08-12 23:43           ` [PATCH] x86: Fix booting with DEBUG_PAGE_ALLOC with more than 512G RAM Yinghai Lu
2013-08-12 23:50             ` Dave Hansen
2013-08-12 23:59               ` Yinghai Lu
2013-08-20  8:22             ` [tip:x86/urgent] x86/mm: Fix boot crash with DEBUG_PAGE_ALLOC= y and " tip-bot for Yinghai Lu
2013-08-12 23:47           ` x86: early boot crash: "alloc_low_page: ran out of memory" (bisected) Yinghai Lu
