* [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM @ 2018-08-30 9:03 Feng Tang 2018-08-30 10:44 ` Peter Zijlstra 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-08-30 9:03 UTC (permalink / raw) To: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Peter Zijlstra, Dave Hansen, Andi Kleen Cc: Feng Tang We hit a kernel panic when enabling earlycon for a platform, the call trace is: panic+0xd2/0x220 __alloc_bootmem+0x31/0x34 spp_getpage+0x60/0x8a fill_pte+0x71/0x130 __set_pte_vaddr+0x1d/0x50 set_pte_vaddr+0x3c/0x60 __native_set_fixmap+0x23/0x30 native_set_fixmap+0x30/0x40 setup_earlycon+0x1e0/0x32f param_setup_earlycon+0x13/0x22 do_early_param+0x5b/0x90 parse_args+0x1f7/0x300 parse_early_options+0x24/0x28 parse_early_param+0x65/0x73 setup_arch+0x31e/0x9f1 start_kernel+0x58/0x44e The root cause is that when CONFIG_NO_BOOTMEM=y, before e820__memblock_setup() is called there is no memory for bootmem to allocate, and any alloc_bootmem() in this time window will trigger a panic. Seems this is a known issue as already mentioned in arch/x86/kernel/setup.c: " /* after early param, so could get panic from serial */ memblock_x86_reserve_range_setup_data(); " By reserving some memory in the linker script, small memory request could be fulfilled by bootmem allocator. And this memory region is not wasted, but usable after kernel boot. Signed-off-by: Feng Tang <feng.tang@intel.com> --- arch/x86/include/asm/sections.h | 5 +++++ arch/x86/kernel/setup.c | 12 +++++++++++- arch/x86/kernel/vmlinux.lds.S | 9 +++++++++ 3 files changed, 25 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/sections.h b/arch/x86/include/asm/sections.h index 4a911a382ade..d568829510e4 100644 --- a/arch/x86/include/asm/sections.h +++ b/arch/x86/include/asm/sections.h @@ -9,6 +9,11 @@ extern char __brk_base[], __brk_limit[]; extern struct exception_table_entry __stop___ex_table[]; extern char __end_rodata_aligned[]; +#ifdef CONFIG_NO_BOOTMEM +extern char __bootmem_start[], __bootmem_end[]; +#endif + + #if defined(CONFIG_X86_64) extern char __end_rodata_hpage_align[]; extern char __entry_trampoline_start[], __entry_trampoline_end[]; diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index b4866badb235..a4e3d9ec1a83 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -829,6 +829,17 @@ void __init setup_arch(char **cmdline_p) */ memblock_reserve(0, PAGE_SIZE); +#ifdef CONFIG_NO_BOOTMEM + /* + * There is small time window till e820__memblock_setup() is + * called, in which __bootmem_alloc() has no available memory + * to allocate and will trigger panic. Adding this revered bootmem + * can alleviate the situation. + */ + memblock_add(__pa_symbol(__bootmem_start), + __bootmem_end - __bootmem_start); +#endif + early_reserve_initrd(); /* @@ -989,7 +1000,6 @@ void __init setup_arch(char **cmdline_p) x86_report_nx(); - /* after early param, so could get panic from serial */ memblock_x86_reserve_range_setup_data(); if (acpi_mps_check()) { diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index 8bde0a419f86..fec1965d668f 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -367,6 +367,15 @@ SECTIONS __brk_limit = .; } +#ifdef CONFIG_NO_BOOTMEM + . = ALIGN(PAGE_SIZE); + .bootmem : AT(ADDR(.bootmem) - LOAD_OFFSET) { + __bootmem_start = .; + . += 128 * 1024; /* 128 KB for early boot mem */ + __bootmem_end = .; + } +#endif + . = ALIGN(PAGE_SIZE); /* keep VO_INIT_SIZE page aligned */ _end = .; -- 2.14.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 9:03 [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM Feng Tang @ 2018-08-30 10:44 ` Peter Zijlstra 2018-08-30 11:12 ` Michal Hocko 2018-08-30 12:43 ` Feng Tang 0 siblings, 2 replies; 29+ messages in thread From: Peter Zijlstra @ 2018-08-30 10:44 UTC (permalink / raw) To: Feng Tang Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > We hit a kernel panic when enabling earlycon for a platform, the > call trace is: > > panic+0xd2/0x220 > __alloc_bootmem+0x31/0x34 > spp_getpage+0x60/0x8a > fill_pte+0x71/0x130 > __set_pte_vaddr+0x1d/0x50 > set_pte_vaddr+0x3c/0x60 > __native_set_fixmap+0x23/0x30 > native_set_fixmap+0x30/0x40 > setup_earlycon+0x1e0/0x32f > param_setup_earlycon+0x13/0x22 > do_early_param+0x5b/0x90 > parse_args+0x1f7/0x300 > parse_early_options+0x24/0x28 > parse_early_param+0x65/0x73 > setup_arch+0x31e/0x9f1 > start_kernel+0x58/0x44e > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > e820__memblock_setup() is called there is no memory for bootmem > to allocate, Which you bloody well asked for by using NO_BOOTMEM=y. Going down this route; adding hacks for every little thing that does want bootmem, completely defeats the purpose. If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also solves your problem. No earlycon, no panic. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 10:44 ` Peter Zijlstra @ 2018-08-30 11:12 ` Michal Hocko 2018-08-30 11:54 ` Thomas Gleixner 2018-08-30 12:09 ` Peter Zijlstra 2018-08-30 12:43 ` Feng Tang 1 sibling, 2 replies; 29+ messages in thread From: Michal Hocko @ 2018-08-30 11:12 UTC (permalink / raw) To: Peter Zijlstra Cc: Feng Tang, linux-kernel, x86, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu 30-08-18 12:44:02, Peter Zijlstra wrote: > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > We hit a kernel panic when enabling earlycon for a platform, the > > call trace is: > > > > panic+0xd2/0x220 > > __alloc_bootmem+0x31/0x34 > > spp_getpage+0x60/0x8a > > fill_pte+0x71/0x130 > > __set_pte_vaddr+0x1d/0x50 > > set_pte_vaddr+0x3c/0x60 > > __native_set_fixmap+0x23/0x30 > > native_set_fixmap+0x30/0x40 > > setup_earlycon+0x1e0/0x32f > > param_setup_earlycon+0x13/0x22 > > do_early_param+0x5b/0x90 > > parse_args+0x1f7/0x300 > > parse_early_options+0x24/0x28 > > parse_early_param+0x65/0x73 > > setup_arch+0x31e/0x9f1 > > start_kernel+0x58/0x44e > > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > e820__memblock_setup() is called there is no memory for bootmem > > to allocate, > > Which you bloody well asked for by using NO_BOOTMEM=y. > > Going down this route; adding hacks for every little thing that does > want bootmem, completely defeats the purpose. > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > solves your problem. No earlycon, no panic. Well, there is endeavor to remove bootmem allocator altogether. So making earlycon depend on NO_BOOTMEM=n doesn't sound like a good fit to me. I am not familiar with this code path but why cannot we postpone the allocation to later or use a statically allocated storage? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 11:12 ` Michal Hocko @ 2018-08-30 11:54 ` Thomas Gleixner 2018-08-30 12:49 ` Michal Hocko 2018-08-30 12:09 ` Peter Zijlstra 1 sibling, 1 reply; 29+ messages in thread From: Thomas Gleixner @ 2018-08-30 11:54 UTC (permalink / raw) To: Michal Hocko Cc: Peter Zijlstra, Feng Tang, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu, 30 Aug 2018, Michal Hocko wrote: > On Thu 30-08-18 12:44:02, Peter Zijlstra wrote: > > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > > e820__memblock_setup() is called there is no memory for bootmem > > > to allocate, > > > > Which you bloody well asked for by using NO_BOOTMEM=y. > > > > Going down this route; adding hacks for every little thing that does > > want bootmem, completely defeats the purpose. > > > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > > solves your problem. No earlycon, no panic. > > Well, there is endeavor to remove bootmem allocator altogether. So > making earlycon depend on NO_BOOTMEM=n doesn't sound like a good fit to If we want to remove bootmem, then reintroducing it with a static bootmem section doesn't make any sense at all. Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 11:54 ` Thomas Gleixner @ 2018-08-30 12:49 ` Michal Hocko 2018-08-30 12:59 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Michal Hocko @ 2018-08-30 12:49 UTC (permalink / raw) To: Thomas Gleixner Cc: Peter Zijlstra, Feng Tang, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu 30-08-18 13:54:19, Thomas Gleixner wrote: > On Thu, 30 Aug 2018, Michal Hocko wrote: > > On Thu 30-08-18 12:44:02, Peter Zijlstra wrote: > > > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > > > e820__memblock_setup() is called there is no memory for bootmem > > > > to allocate, > > > > > > Which you bloody well asked for by using NO_BOOTMEM=y. > > > > > > Going down this route; adding hacks for every little thing that does > > > want bootmem, completely defeats the purpose. > > > > > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > > > solves your problem. No earlycon, no panic. > > > > Well, there is endeavor to remove bootmem allocator altogether. So > > making earlycon depend on NO_BOOTMEM=n doesn't sound like a good fit to > > If we want to remove bootmem, then reintroducing it with a static bootmem > section doesn't make any sense at all. I have actually checked the code now and see what you mean. I thought it would be a single allocation that is needed but that is not the case so the static buffer is not going to fly. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 12:49 ` Michal Hocko @ 2018-08-30 12:59 ` Feng Tang 2018-08-30 13:05 ` Thomas Gleixner 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-08-30 12:59 UTC (permalink / raw) To: Michal Hocko Cc: Thomas Gleixner, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu, Aug 30, 2018 at 02:49:15PM +0200, Michal Hocko wrote: > On Thu 30-08-18 13:54:19, Thomas Gleixner wrote: > > On Thu, 30 Aug 2018, Michal Hocko wrote: > > > On Thu 30-08-18 12:44:02, Peter Zijlstra wrote: > > > > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > > > > e820__memblock_setup() is called there is no memory for bootmem > > > > > to allocate, > > > > > > > > Which you bloody well asked for by using NO_BOOTMEM=y. > > > > > > > > Going down this route; adding hacks for every little thing that does > > > > want bootmem, completely defeats the purpose. > > > > > > > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > > > > solves your problem. No earlycon, no panic. > > > > > > Well, there is endeavor to remove bootmem allocator altogether. So > > > making earlycon depend on NO_BOOTMEM=n doesn't sound like a good fit to > > > > If we want to remove bootmem, then reintroducing it with a static bootmem > > section doesn't make any sense at all. > > I have actually checked the code now and see what you mean. I thought it > would be a single allocation that is needed but that is not the case so > the static buffer is not going to fly. Exactly! I tried several ways for the static allocation, like in data, in bss section, but they all failed, as in the very start of setup_arch(): memblock_reserve(__pa_symbol(_text), (unsigned long)__bss_stop - (unsigned long)_text); Those [_text, __bss_stop] is not able to be used by alloc_bootmem(). And I only got this patch, and really appreciate any good suggestions. Thanks, Feng ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 12:59 ` Feng Tang @ 2018-08-30 13:05 ` Thomas Gleixner 2018-08-30 13:19 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Thomas Gleixner @ 2018-08-30 13:05 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu, 30 Aug 2018, Feng Tang wrote: > On Thu, Aug 30, 2018 at 02:49:15PM +0200, Michal Hocko wrote: > > On Thu 30-08-18 13:54:19, Thomas Gleixner wrote: > > > On Thu, 30 Aug 2018, Michal Hocko wrote: > > > > On Thu 30-08-18 12:44:02, Peter Zijlstra wrote: > > > > > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > > > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > > > > > e820__memblock_setup() is called there is no memory for bootmem > > > > > > to allocate, > > > > > > > > > > Which you bloody well asked for by using NO_BOOTMEM=y. > > > > > > > > > > Going down this route; adding hacks for every little thing that does > > > > > want bootmem, completely defeats the purpose. > > > > > > > > > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > > > > > solves your problem. No earlycon, no panic. > > > > > > > > Well, there is endeavor to remove bootmem allocator altogether. So > > > > making earlycon depend on NO_BOOTMEM=n doesn't sound like a good fit to > > > > > > If we want to remove bootmem, then reintroducing it with a static bootmem > > > section doesn't make any sense at all. > > > > I have actually checked the code now and see what you mean. I thought it > > would be a single allocation that is needed but that is not the case so > > the static buffer is not going to fly. > > Exactly! I tried several ways for the static allocation, like in data, in bss > section, but they all failed, as in the very start of setup_arch(): > > memblock_reserve(__pa_symbol(_text), > (unsigned long)__bss_stop - (unsigned long)_text); > > Those [_text, __bss_stop] is not able to be used by alloc_bootmem(). And I > only got this patch, and really appreciate any good suggestions. And why do you want to use bootmem in the first place? If boot mem is going away, why are you not fixing the early console crap to NOT use bootmem at all? Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 13:05 ` Thomas Gleixner @ 2018-08-30 13:19 ` Feng Tang 2018-08-30 13:25 ` Thomas Gleixner 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-08-30 13:19 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen Hi Thomas, On Thu, Aug 30, 2018 at 03:05:31PM +0200, Thomas Gleixner wrote: > On Thu, 30 Aug 2018, Feng Tang wrote: > > On Thu, Aug 30, 2018 at 02:49:15PM +0200, Michal Hocko wrote: > > > On Thu 30-08-18 13:54:19, Thomas Gleixner wrote: > > > > On Thu, 30 Aug 2018, Michal Hocko wrote: > > > > > On Thu 30-08-18 12:44:02, Peter Zijlstra wrote: > > > > > > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > > > > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > > > > > > e820__memblock_setup() is called there is no memory for bootmem > > > > > > > to allocate, > > > > > > > > > > > > Which you bloody well asked for by using NO_BOOTMEM=y. > > > > > > > > > > > > Going down this route; adding hacks for every little thing that does > > > > > > want bootmem, completely defeats the purpose. > > > > > > > > > > > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > > > > > > solves your problem. No earlycon, no panic. > > > > > > > > > > Well, there is endeavor to remove bootmem allocator altogether. So > > > > > making earlycon depend on NO_BOOTMEM=n doesn't sound like a good fit to > > > > > > > > If we want to remove bootmem, then reintroducing it with a static bootmem > > > > section doesn't make any sense at all. > > > > > > I have actually checked the code now and see what you mean. I thought it > > > would be a single allocation that is needed but that is not the case so > > > the static buffer is not going to fly. > > > > Exactly! I tried several ways for the static allocation, like in data, in bss > > section, but they all failed, as in the very start of setup_arch(): > > > > memblock_reserve(__pa_symbol(_text), > > (unsigned long)__bss_stop - (unsigned long)_text); > > > > Those [_text, __bss_stop] is not able to be used by alloc_bootmem(). And I > > only got this patch, and really appreciate any good suggestions. > > And why do you want to use bootmem in the first place? If boot mem is going > away, why are you not fixing the early console crap to NOT use bootmem at > all? The earlycon need use fixmap to map the mmio register, while fixmap will need a new page for page table if the pmd/pte is not already their, as shown in the commit log: panic+0xd2/0x220 __alloc_bootmem+0x31/0x34 spp_getpage+0x60/0x8a fill_pte+0x71/0x130 __set_pte_vaddr+0x1d/0x50 set_pte_vaddr+0x3c/0x60 __native_set_fixmap+0x23/0x30 native_set_fixmap+0x30/0x40 setup_earlycon+0x1e0/0x32f param_setup_earlycon+0x13/0x22 do_early_param+0x5b/0x90 parse_args+0x1f7/0x300 parse_early_options+0x24/0x28 parse_early_param+0x65/0x73 setup_arch+0x31e/0x9f1 start_kernel+0x58/0x44e inside the spp_getpage(): if (after_bootmem) ptr = (void *) get_zeroed_page(GFP_ATOMIC); else ptr = alloc_bootmem_pages(PAGE_SIZE); Thanks, Feng ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 13:19 ` Feng Tang @ 2018-08-30 13:25 ` Thomas Gleixner 2018-08-30 13:55 ` Feng Tang 2018-08-31 6:15 ` Feng Tang 0 siblings, 2 replies; 29+ messages in thread From: Thomas Gleixner @ 2018-08-30 13:25 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu, 30 Aug 2018, Feng Tang wrote: > On Thu, Aug 30, 2018 at 03:05:31PM +0200, Thomas Gleixner wrote: > > > Those [_text, __bss_stop] is not able to be used by alloc_bootmem(). And I > > > only got this patch, and really appreciate any good suggestions. > > > > And why do you want to use bootmem in the first place? If boot mem is going > > away, why are you not fixing the early console crap to NOT use bootmem at > > all? > > The earlycon need use fixmap to map the mmio register, while fixmap will > need a new page for page table if the pmd/pte is not already their, as shown > in the commit log: > > panic+0xd2/0x220 > __alloc_bootmem+0x31/0x34 > spp_getpage+0x60/0x8a > fill_pte+0x71/0x130 > __set_pte_vaddr+0x1d/0x50 > set_pte_vaddr+0x3c/0x60 > __native_set_fixmap+0x23/0x30 > native_set_fixmap+0x30/0x40 > setup_earlycon+0x1e0/0x32f > param_setup_earlycon+0x13/0x22 > do_early_param+0x5b/0x90 > parse_args+0x1f7/0x300 > parse_early_options+0x24/0x28 > parse_early_param+0x65/0x73 > setup_arch+0x31e/0x9f1 > start_kernel+0x58/0x44e > > inside the spp_getpage(): > > if (after_bootmem) > ptr = (void *) get_zeroed_page(GFP_ATOMIC); > else > ptr = alloc_bootmem_pages(PAGE_SIZE); And where is the problem? We know that we need the fixmap during early boot anyway, so allocating the PTE page statically and setting it up early enough is not really rocket science. No allocator required. Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 13:25 ` Thomas Gleixner @ 2018-08-30 13:55 ` Feng Tang 2018-08-31 6:15 ` Feng Tang 1 sibling, 0 replies; 29+ messages in thread From: Feng Tang @ 2018-08-30 13:55 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu, Aug 30, 2018 at 03:25:42PM +0200, Thomas Gleixner wrote: > On Thu, 30 Aug 2018, Feng Tang wrote: > > On Thu, Aug 30, 2018 at 03:05:31PM +0200, Thomas Gleixner wrote: > > > > Those [_text, __bss_stop] is not able to be used by alloc_bootmem(). And I > > > > only got this patch, and really appreciate any good suggestions. > > > > > > And why do you want to use bootmem in the first place? If boot mem is going > > > away, why are you not fixing the early console crap to NOT use bootmem at > > > all? > > > > The earlycon need use fixmap to map the mmio register, while fixmap will > > need a new page for page table if the pmd/pte is not already their, as shown > > in the commit log: > > > > panic+0xd2/0x220 > > __alloc_bootmem+0x31/0x34 > > spp_getpage+0x60/0x8a > > fill_pte+0x71/0x130 > > __set_pte_vaddr+0x1d/0x50 > > set_pte_vaddr+0x3c/0x60 > > __native_set_fixmap+0x23/0x30 > > native_set_fixmap+0x30/0x40 > > setup_earlycon+0x1e0/0x32f > > param_setup_earlycon+0x13/0x22 > > do_early_param+0x5b/0x90 > > parse_args+0x1f7/0x300 > > parse_early_options+0x24/0x28 > > parse_early_param+0x65/0x73 > > setup_arch+0x31e/0x9f1 > > start_kernel+0x58/0x44e > > > > inside the spp_getpage(): > > > > if (after_bootmem) > > ptr = (void *) get_zeroed_page(GFP_ATOMIC); > > else > > ptr = alloc_bootmem_pages(PAGE_SIZE); > > And where is the problem? We know that we need the fixmap during early boot > anyway, so allocating the PTE page statically and setting it up early > enough is not really rocket science. No allocator required. Yes, if the bootmem itslef is going way, setting up the page table for fixmap is a cleaner solution. Thanks, Feng ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 13:25 ` Thomas Gleixner 2018-08-30 13:55 ` Feng Tang @ 2018-08-31 6:15 ` Feng Tang 2018-08-31 11:33 ` Thomas Gleixner 1 sibling, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-08-31 6:15 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu, Aug 30, 2018 at 03:25:42PM +0200, Thomas Gleixner wrote: > And where is the problem? We know that we need the fixmap during early boot > anyway, so allocating the PTE page statically and setting it up early > enough is not really rocket science. No allocator required. Further check shows the page table for fixmap is already setup in [-12M, -10M], but fixmap's address space breaks the expection under certain kernel config. So here is another fix, please review, thanks! ---------------------------------------------------------------- From f63e4e5c6166a2f8d13ad3142b7a49b2fd1f8edf Mon Sep 17 00:00:00 2001 From: Feng Tang <feng.tang@intel.com> Date: Fri, 31 Aug 2018 13:48:21 +0800 Subject: [PATCH] x86, mm: Make the real fixmap address consistent We hit a kernel panic when enabling earlycon for a platform, the call trace is: panic+0xd2/0x220 __alloc_bootmem+0x31/0x34 spp_getpage+0x60/0x8a fill_pte+0x71/0x130 __set_pte_vaddr+0x1d/0x50 set_pte_vaddr+0x3c/0x60 __native_set_fixmap+0x23/0x30 native_set_fixmap+0x30/0x40 setup_earlycon+0x1e0/0x32f param_setup_earlycon+0x13/0x22 do_early_param+0x5b/0x90 parse_args+0x1f7/0x300 parse_early_options+0x24/0x28 parse_early_param+0x65/0x73 setup_arch+0x31e/0x9f1 start_kernel+0x58/0x44e This panic happens as the earlycon's fixmap address has no pmd/pte ready, and __set_fixmap will try to allocate memory to setup the page table, and trigger panic due to no memory. x86 kernel actually prepares the page table for fixmap in head_64.S: NEXT_PAGE(level2_fixmap_pgt) .fill 506,8,0 .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ .fill 5,8,0 and it expects the fixmap address is in [-12M, -10M] range, but current code in fixmap.h will break the expectation when X86_VSYSCALL_EMULATION=n #ifdef CONFIG_X86_VSYSCALL_EMULATION VSYSCALL_PAGE = (FIXADDR_TOP - VSYSCALL_ADDR) >> PAGE_SHIFT, #endif So removing the "#ifdef" will make the fixmap address space stable in [-12M, -10M] and fix the issue. Signed-off-by: Feng Tang <feng.tang@intel.com> --- arch/x86/include/asm/fixmap.h | 2 -- 1 file changed, 2 deletions(-) diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index e203169931c7..fcf27171f493 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -67,9 +67,7 @@ enum fixed_addresses { #ifdef CONFIG_X86_32 FIX_HOLE, #else -#ifdef CONFIG_X86_VSYSCALL_EMULATION VSYSCALL_PAGE = (FIXADDR_TOP - VSYSCALL_ADDR) >> PAGE_SHIFT, -#endif #endif FIX_DBGP_BASE, FIX_EARLYCON_MEM_BASE, -- 2.14.1 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-31 6:15 ` Feng Tang @ 2018-08-31 11:33 ` Thomas Gleixner 2018-08-31 13:36 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Thomas Gleixner @ 2018-08-31 11:33 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Fri, 31 Aug 2018, Feng Tang wrote: > On Thu, Aug 30, 2018 at 03:25:42PM +0200, Thomas Gleixner wrote: > This panic happens as the earlycon's fixmap address has no > pmd/pte ready, and __set_fixmap will try to allocate memory to > setup the page table, and trigger panic due to no memory. > > x86 kernel actually prepares the page table for fixmap in head_64.S: > > NEXT_PAGE(level2_fixmap_pgt) > .fill 506,8,0 > .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC > /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ > .fill 5,8,0 > > and it expects the fixmap address is in [-12M, -10M] range, but > current code in fixmap.h will break the expectation when > X86_VSYSCALL_EMULATION=n > > #ifdef CONFIG_X86_VSYSCALL_EMULATION > VSYSCALL_PAGE = (FIXADDR_TOP - VSYSCALL_ADDR) >> PAGE_SHIFT, > #endif > > So removing the "#ifdef" will make the fixmap address space stable in > [-12M, -10M] and fix the issue. Why on earth are you not fixing the damned PTE setup which is the obvious and correct thing to do? Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-31 11:33 ` Thomas Gleixner @ 2018-08-31 13:36 ` Feng Tang 2018-09-07 8:17 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-08-31 13:36 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Fri, Aug 31, 2018 at 01:33:05PM +0200, Thomas Gleixner wrote: > On Fri, 31 Aug 2018, Feng Tang wrote: > > On Thu, Aug 30, 2018 at 03:25:42PM +0200, Thomas Gleixner wrote: > > This panic happens as the earlycon's fixmap address has no > > pmd/pte ready, and __set_fixmap will try to allocate memory to > > setup the page table, and trigger panic due to no memory. > > > > x86 kernel actually prepares the page table for fixmap in head_64.S: > > > > NEXT_PAGE(level2_fixmap_pgt) > > .fill 506,8,0 > > .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC > > /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ > > .fill 5,8,0 > > > > and it expects the fixmap address is in [-12M, -10M] range, but > > current code in fixmap.h will break the expectation when > > X86_VSYSCALL_EMULATION=n > > > > #ifdef CONFIG_X86_VSYSCALL_EMULATION > > VSYSCALL_PAGE = (FIXADDR_TOP - VSYSCALL_ADDR) >> PAGE_SHIFT, > > #endif > > > > So removing the "#ifdef" will make the fixmap address space stable in > > [-12M, -10M] and fix the issue. > > Why on earth are you not fixing the damned PTE setup which is the obvious > and correct thing to do? Any sugestion? I can only have patch like this: --- diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 15ebc2fc166e..8cdb27ccc3a3 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -446,11 +446,15 @@ NEXT_PAGE(level2_kernel_pgt) NEXT_PAGE(level2_fixmap_pgt) .fill 506,8,0 - .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC + .quad level1_fixmap_pgt0 - __START_KERNEL_map + _PAGE_TABLE_NOENC + .quad level1_fixmap_pgt1 - __START_KERNEL_map + _PAGE_TABLE_NOENC /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ - .fill 5,8,0 + .fill 4,8,0 -NEXT_PAGE(level1_fixmap_pgt) +NEXT_PAGE(level1_fixmap_pgt0) + .fill 512,8,0 + +NEXT_PAGE(level1_fixmap_pgt1) .fill 512,8,0 #undef PMDS Thanks, Feng ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-31 13:36 ` Feng Tang @ 2018-09-07 8:17 ` Feng Tang 2018-09-07 10:52 ` Thomas Gleixner 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-09-07 8:17 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen Hi Thomas, On Fri, Aug 31, 2018 at 09:36:59PM +0800, Feng Tang wrote: > On Fri, Aug 31, 2018 at 01:33:05PM +0200, Thomas Gleixner wrote: > > On Fri, 31 Aug 2018, Feng Tang wrote: > > > On Thu, Aug 30, 2018 at 03:25:42PM +0200, Thomas Gleixner wrote: > > > This panic happens as the earlycon's fixmap address has no > > > pmd/pte ready, and __set_fixmap will try to allocate memory to > > > setup the page table, and trigger panic due to no memory. > > > > > > x86 kernel actually prepares the page table for fixmap in head_64.S: > > > > > > NEXT_PAGE(level2_fixmap_pgt) > > > .fill 506,8,0 > > > .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC > > > /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ > > > .fill 5,8,0 > > > > > > and it expects the fixmap address is in [-12M, -10M] range, but > > > current code in fixmap.h will break the expectation when > > > X86_VSYSCALL_EMULATION=n > > > > > > #ifdef CONFIG_X86_VSYSCALL_EMULATION > > > VSYSCALL_PAGE = (FIXADDR_TOP - VSYSCALL_ADDR) >> PAGE_SHIFT, > > > #endif > > > > > > So removing the "#ifdef" will make the fixmap address space stable in > > > [-12M, -10M] and fix the issue. > > > > Why on earth are you not fixing the damned PTE setup which is the obvious > > and correct thing to do? > > Any sugestion? I can only have patch like this: Could you review this patch, as at this time window there is no usable memory block or other memory allocators that I know, so I follow the exisitng static fixmap page table code and add one more. Thanks, Feng > > --- > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S > index 15ebc2fc166e..8cdb27ccc3a3 100644 > --- a/arch/x86/kernel/head_64.S > +++ b/arch/x86/kernel/head_64.S > @@ -446,11 +446,15 @@ NEXT_PAGE(level2_kernel_pgt) > > NEXT_PAGE(level2_fixmap_pgt) > .fill 506,8,0 > - .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC > + .quad level1_fixmap_pgt0 - __START_KERNEL_map + _PAGE_TABLE_NOENC > + .quad level1_fixmap_pgt1 - __START_KERNEL_map + _PAGE_TABLE_NOENC > /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ > - .fill 5,8,0 > + .fill 4,8,0 > > -NEXT_PAGE(level1_fixmap_pgt) > +NEXT_PAGE(level1_fixmap_pgt0) > + .fill 512,8,0 > + > +NEXT_PAGE(level1_fixmap_pgt1) > .fill 512,8,0 > > #undef PMDS > > > Thanks, > Feng > ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-07 8:17 ` Feng Tang @ 2018-09-07 10:52 ` Thomas Gleixner 2018-09-10 9:39 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Thomas Gleixner @ 2018-09-07 10:52 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Fri, 7 Sep 2018, Feng Tang wrote: > On Fri, Aug 31, 2018 at 09:36:59PM +0800, Feng Tang wrote: > > Any sugestion? I can only have patch like this: > > Could you review this patch, as at this time window there is no usable memory > block or other memory allocators that I know, so I follow the exisitng static > fixmap page table code and add one more. You really want to add a proper sanity check for this. The static stuff has to cover the fixmap. Just making it work for this particular case is sloppy at best as the next larger change of the fixmap will just bring the problem back. Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-07 10:52 ` Thomas Gleixner @ 2018-09-10 9:39 ` Feng Tang 2018-09-10 9:53 ` Thomas Gleixner 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-09-10 9:39 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Fri, Sep 07, 2018 at 12:52:17PM +0200, Thomas Gleixner wrote: > On Fri, 7 Sep 2018, Feng Tang wrote: > > On Fri, Aug 31, 2018 at 09:36:59PM +0800, Feng Tang wrote: > > > Any sugestion? I can only have patch like this: > > > > Could you review this patch, as at this time window there is no usable memory > > block or other memory allocators that I know, so I follow the exisitng static > > fixmap page table code and add one more. > > You really want to add a proper sanity check for this. The static stuff has > to cover the fixmap. Just making it work for this particular case is sloppy > at best as the next larger change of the fixmap will just bring the problem > back. Yes, agree. Since the issue happens very early before any serial console, I can only think of a build time check. Thanks, Feng --- arch/x86/kernel/head_64.S | 6 +++++- arch/x86/mm/pgtable.c | 13 +++++++++++++ 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 15ebc2fc166e..3a1d33b43c29 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -448,11 +448,15 @@ NEXT_PAGE(level2_fixmap_pgt) .fill 506,8,0 .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ - .fill 5,8,0 + .quad level1_fixmap_pgt1 - __START_KERNEL_map + _PAGE_TABLE_NOENC + .fill 4,8,0 NEXT_PAGE(level1_fixmap_pgt) .fill 512,8,0 +NEXT_PAGE(level1_fixmap_pgt1) + .fill 512,8,0 + #undef PMDS .data diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index e848a4811785..fc172551048a 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -637,6 +637,19 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte) { unsigned long address = __fix_to_virt(idx); +#ifdef CONFIG_X86_64 + /* + * In arch/x86/kernel/head_64.S, the static page table + * has been setup for 4M space [-12M, -8M] + * 0xFFFFFFFFFF400000 ~ 0xFFFFFFFFFF7FFFFF + * Add a sanity check whether fixed_address crosses + * the boundary. + */ + #define FIXMAP_STATIC_PGTABLE_START 0xFFFFFFFFFF400000 + BUILD_BUG_ON(__fix_to_virt(__end_of_permanent_fixed_addresses) + < FIXMAP_STATIC_PGTABLE_START); +#endif + if (idx >= __end_of_fixed_addresses) { BUG(); return; -- 2.14.1 > > Thanks, > > tglx ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-10 9:39 ` Feng Tang @ 2018-09-10 9:53 ` Thomas Gleixner 2018-09-11 6:14 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Thomas Gleixner @ 2018-09-10 9:53 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Mon, 10 Sep 2018, Feng Tang wrote: > On Fri, Sep 07, 2018 at 12:52:17PM +0200, Thomas Gleixner wrote: > > You really want to add a proper sanity check for this. The static stuff has > > to cover the fixmap. Just making it work for this particular case is sloppy > > at best as the next larger change of the fixmap will just bring the problem > > back. > Yes, agree. > > Since the issue happens very early before any serial console, I can only think > of a build time check. Of course. That's obvious. The size _IS_ known at build time. > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c > index e848a4811785..fc172551048a 100644 > --- a/arch/x86/mm/pgtable.c > +++ b/arch/x86/mm/pgtable.c > @@ -637,6 +637,19 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte) > { > unsigned long address = __fix_to_virt(idx); > > +#ifdef CONFIG_X86_64 > + /* > + * In arch/x86/kernel/head_64.S, the static page table > + * has been setup for 4M space [-12M, -8M] > + * 0xFFFFFFFFFF400000 ~ 0xFFFFFFFFFF7FFFFF > + * Add a sanity check whether fixed_address crosses > + * the boundary. > + */ > + #define FIXMAP_STATIC_PGTABLE_START 0xFFFFFFFFFF400000 Errm what? This is beyond hillarious, really. What helps that if I remove that second FIXMAP entry in head_64.S? The size of the fixmap is known at compile time and it is known when building head_64.S. So the obvious check is to check right there where the static pagetable is set up and where you know how many entries you set up whether it's covering the full size or not and err out there. Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-10 9:53 ` Thomas Gleixner @ 2018-09-11 6:14 ` Feng Tang 2018-09-15 10:29 ` Thomas Gleixner 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-09-11 6:14 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen Hi Thomas, On Mon, Sep 10, 2018 at 11:53:39AM +0200, Thomas Gleixner wrote: > On Mon, 10 Sep 2018, Feng Tang wrote: > > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c > > index e848a4811785..fc172551048a 100644 > > --- a/arch/x86/mm/pgtable.c > > +++ b/arch/x86/mm/pgtable.c > > @@ -637,6 +637,19 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte) > > { > > unsigned long address = __fix_to_virt(idx); > > > > +#ifdef CONFIG_X86_64 > > + /* > > + * In arch/x86/kernel/head_64.S, the static page table > > + * has been setup for 4M space [-12M, -8M] > > + * 0xFFFFFFFFFF400000 ~ 0xFFFFFFFFFF7FFFFF > > + * Add a sanity check whether fixed_address crosses > > + * the boundary. > > + */ > > + #define FIXMAP_STATIC_PGTABLE_START 0xFFFFFFFFFF400000 > > Errm what? This is beyond hillarious, really. What helps that if I remove > that second FIXMAP entry in head_64.S? > > The size of the fixmap is known at compile time and it is known when > building head_64.S. So the obvious check is to check right there where the > static pagetable is set up and where you know how many entries you set up > whether it's covering the full size or not and err out there. Thanks for the suggestion, and I have 2 patches: one adds a build warning, the other prepares fixmap page table on demand and doesn't need warning. But I met a problem, that the "__end_of_permanent_fixed_addresses" is defined in fixmap.h, which is protected by #ifndef __ASSEMBLY__, also fixmap.h reference many other header file, which makes it harder to extract the definition out. Any suggestion on this? thanks! - Feng patch1: --- diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 3a1d33b43c29..5863623890af 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -451,12 +451,20 @@ NEXT_PAGE(level2_fixmap_pgt) .quad level1_fixmap_pgt1 - __START_KERNEL_map + _PAGE_TABLE_NOENC .fill 4,8,0 +/* + * Next are static page table for 4M fixmap space: + * [-12M, -10M], [-10M, -8M] + */ NEXT_PAGE(level1_fixmap_pgt) .fill 512,8,0 NEXT_PAGE(level1_fixmap_pgt1) .fill 512,8,0 +#if __end_of_permanent_fixed_addresses > 1024 +.error "Total fixmap space exceeds the static page table capacity, please expand the page table!" +#endif + #undef PMDS .data Patch2: --- diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 15ebc2fc166e..2b98ce234686 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -24,6 +24,7 @@ #include "../entry/calling.h" #include <asm/export.h> #include <asm/nospec-branch.h> +#include <asm/fixmap.h> #ifdef CONFIG_PARAVIRT #include <asm/asm-offsets.h> @@ -444,14 +445,21 @@ NEXT_PAGE(level2_kernel_pgt) PMDS(0, __PAGE_KERNEL_LARGE_EXEC, KERNEL_IMAGE_SIZE/PMD_SIZE) +#define FIXMAP_PMD_NUM ((__end_of_permanent_fixed_addresses + 511) / 512) NEXT_PAGE(level2_fixmap_pgt) - .fill 506,8,0 - .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC + .fill (512 - 4 - FIXMAP_PMD_NUM),8,0 + pgtno = 0 + .rept (FIXMAP_PMD_NUM) + .quad level1_fixmap_pgt + (pgtno << PAGE_SHIFT) - __START_KERNEL_map + _PAGE_TABLE_NOENC + pgtno = pgtno + 1 + .endr /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ - .fill 5,8,0 + .fill 4,8,0 NEXT_PAGE(level1_fixmap_pgt) + .rept (FIXMAP_PMD_NUM) .fill 512,8,0 + .endr #undef PMDS ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-11 6:14 ` Feng Tang @ 2018-09-15 10:29 ` Thomas Gleixner 2018-09-15 16:47 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Thomas Gleixner @ 2018-09-15 10:29 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Tue, 11 Sep 2018, Feng Tang wrote: > Thanks for the suggestion, and I have 2 patches: one adds a build warning, > the other prepares fixmap page table on demand and doesn't need warning. The latter please. > But I met a problem, that the "__end_of_permanent_fixed_addresses" is > defined in fixmap.h, which is protected by #ifndef __ASSEMBLY__, also > fixmap.h reference many other header file, which makes it harder to > extract the definition out. Any suggestion on this? thanks! What prevents you from moving the enum out of the __ASSEMBLY__ protected section aside of a bit of careful work? Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-15 10:29 ` Thomas Gleixner @ 2018-09-15 16:47 ` Feng Tang 2018-09-15 17:28 ` Thomas Gleixner 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-09-15 16:47 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen Hi Thomas, On Sat, Sep 15, 2018 at 12:29:50PM +0200, Thomas Gleixner wrote: > On Tue, 11 Sep 2018, Feng Tang wrote: > > Thanks for the suggestion, and I have 2 patches: one adds a build warning, > > the other prepares fixmap page table on demand and doesn't need warning. > > The latter please. Okay. > > > But I met a problem, that the "__end_of_permanent_fixed_addresses" is > > defined in fixmap.h, which is protected by #ifndef __ASSEMBLY__, also > > fixmap.h reference many other header file, which makes it harder to > > extract the definition out. Any suggestion on this? thanks! > > What prevents you from moving the enum out of the __ASSEMBLY__ protected > section aside of a bit of careful work? I have tried to change some header files incluing fixmap.h/apicdef.h/ vsyscall.h, and most of the .c files compile fine now, but I can not use the "__end_of_permanent_fixed_addresses" in head_64.S as it is a enum type, and could not be recognized by assembly code. Thanks, Feng > > Thanks, > > tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-15 16:47 ` Feng Tang @ 2018-09-15 17:28 ` Thomas Gleixner 2018-09-16 14:35 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Thomas Gleixner @ 2018-09-15 17:28 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Sun, 16 Sep 2018, Feng Tang wrote: > I have tried to change some header files incluing fixmap.h/apicdef.h/ > vsyscall.h, and most of the .c files compile fine now, but I can not > use the "__end_of_permanent_fixed_addresses" in head_64.S as it is a > enum type, and could not be recognized by assembly code. Hrmm. I did not think about the enum. So we have two possibilities: 1) Have some preprocessing which provides the info for the assembler 2) Use a constant for the number of PMDs which is defined in a header and then compile time checked against the size of the fixmap in a C-file. #1 would be preferred, but for a quick fix #2 is okay as well. Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-15 17:28 ` Thomas Gleixner @ 2018-09-16 14:35 ` Feng Tang 2018-09-16 14:43 ` Thomas Gleixner 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-09-16 14:35 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, linux-kernel, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Sat, Sep 15, 2018 at 07:28:03PM +0200, Thomas Gleixner wrote: > On Sun, 16 Sep 2018, Feng Tang wrote: > > I have tried to change some header files incluing fixmap.h/apicdef.h/ > > vsyscall.h, and most of the .c files compile fine now, but I can not > > use the "__end_of_permanent_fixed_addresses" in head_64.S as it is a > > enum type, and could not be recognized by assembly code. > > Hrmm. I did not think about the enum. So we have two possibilities: > > 1) Have some preprocessing which provides the info for the assembler > > 2) Use a constant for the number of PMDs which is defined in a header and > then compile time checked against the size of the fixmap in a C-file. As there is no fixmap.c, I put the chck into __native_set_fixmap temply, please review, thanks - Feng diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index e203169931c7..ffaa0fa31044 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -14,6 +14,9 @@ #ifndef _ASM_X86_FIXMAP_H #define _ASM_X86_FIXMAP_H +/* Put here to be accessble for assembly code like head_64.S */ +#define FIXMAP_PMD_NUM 2 + #ifndef __ASSEMBLY__ #include <linux/kernel.h> #include <asm/acpi.h> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h index f773d5e6c8cc..d6820d4c6e47 100644 --- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -14,6 +14,7 @@ #include <asm/processor.h> #include <linux/bitops.h> #include <linux/threads.h> +#include <asm/fixmap.h> extern p4d_t level4_kernel_pgt[512]; extern p4d_t level4_ident_pgt[512]; @@ -22,7 +23,7 @@ extern pud_t level3_ident_pgt[512]; extern pmd_t level2_kernel_pgt[512]; extern pmd_t level2_fixmap_pgt[512]; extern pmd_t level2_ident_pgt[512]; -extern pte_t level1_fixmap_pgt[512]; +extern pte_t level1_fixmap_pgt[512 * FIXMAP_PMD_NUM]; extern pgd_t init_top_pgt[]; #define swapper_pg_dir init_top_pgt diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 15ebc2fc166e..985eaff70792 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -24,6 +24,7 @@ #include "../entry/calling.h" #include <asm/export.h> #include <asm/nospec-branch.h> +#include <asm/fixmap.h> #ifdef CONFIG_PARAVIRT #include <asm/asm-offsets.h> @@ -445,13 +446,19 @@ NEXT_PAGE(level2_kernel_pgt) KERNEL_IMAGE_SIZE/PMD_SIZE) NEXT_PAGE(level2_fixmap_pgt) - .fill 506,8,0 - .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC - /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ - .fill 5,8,0 + .fill (512 - 4 - FIXMAP_PMD_NUM),8,0 + pgtno = 0 + .rept (FIXMAP_PMD_NUM) + .quad level1_fixmap_pgt + (pgtno << PAGE_SHIFT) - __START_KERNEL_map + _PAGE_TABLE_NOENC + pgtno = pgtno + 1 + .endr + /* 6 MB reserved space + a 2MB hole */ + .fill 4,8,0 NEXT_PAGE(level1_fixmap_pgt) + .rept (FIXMAP_PMD_NUM) .fill 512,8,0 + .endr #undef PMDS diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index e848a4811785..a927f5f39bee 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -637,6 +637,16 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte) { unsigned long address = __fix_to_virt(idx); +#ifdef CONFIG_X86_64 + /* + * In arch/x86/kernel/head_64.S, the static page table has + * been setup for fixmap. Add a sanity compiling check + * whether fixmap space has exceeded the capacity. + */ + BUILD_BUG_ON((__end_of_permanent_fixed_addresses + 511)/512 + > FIXMAP_PMD_NUM); +#endif + if (idx >= __end_of_fixed_addresses) { BUG(); return; diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index 45b700ac5fe7..b0aa2364a3c6 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -1953,7 +1953,10 @@ void __init xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn) set_page_prot(level2_ident_pgt, PAGE_KERNEL_RO); set_page_prot(level2_kernel_pgt, PAGE_KERNEL_RO); set_page_prot(level2_fixmap_pgt, PAGE_KERNEL_RO); - set_page_prot(level1_fixmap_pgt, PAGE_KERNEL_RO); + + for (i = 0; i < FIXMAP_PMD_NUM; i++) + set_page_prot(level1_fixmap_pgt + i * 512, + PAGE_KERNEL_RO); /* Pin down new L4 */ pin_pagetable_pfn(MMUEXT_PIN_L4_TABLE, > > #1 would be preferred, but for a quick fix #2 is okay as well. > > Thanks, > > tglx > ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-16 14:35 ` Feng Tang @ 2018-09-16 14:43 ` Thomas Gleixner 2018-09-16 15:06 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Thomas Gleixner @ 2018-09-16 14:43 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, LKML, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Sun, 16 Sep 2018, Feng Tang wrote: > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c > index e848a4811785..a927f5f39bee 100644 > --- a/arch/x86/mm/pgtable.c > +++ b/arch/x86/mm/pgtable.c > @@ -637,6 +637,16 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte) > { > unsigned long address = __fix_to_virt(idx); > > +#ifdef CONFIG_X86_64 > + /* > + * In arch/x86/kernel/head_64.S, the static page table has > + * been setup for fixmap. Add a sanity compiling check > + * whether fixmap space has exceeded the capacity. > + */ > + BUILD_BUG_ON((__end_of_permanent_fixed_addresses + 511)/512 > + > FIXMAP_PMD_NUM); There exist macros for rounding up and doing this in the header file is perfectly fine. > +#endif > + > if (idx >= __end_of_fixed_addresses) { > BUG(); > return; > diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c > index 45b700ac5fe7..b0aa2364a3c6 100644 > --- a/arch/x86/xen/mmu_pv.c > +++ b/arch/x86/xen/mmu_pv.c > @@ -1953,7 +1953,10 @@ void __init xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn) > set_page_prot(level2_ident_pgt, PAGE_KERNEL_RO); > set_page_prot(level2_kernel_pgt, PAGE_KERNEL_RO); > set_page_prot(level2_fixmap_pgt, PAGE_KERNEL_RO); > - set_page_prot(level1_fixmap_pgt, PAGE_KERNEL_RO); > + > + for (i = 0; i < FIXMAP_PMD_NUM; i++) > + set_page_prot(level1_fixmap_pgt + i * 512, > + PAGE_KERNEL_RO); The line break is there because? Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-16 14:43 ` Thomas Gleixner @ 2018-09-16 15:06 ` Feng Tang 2018-09-17 7:01 ` Feng Tang 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-09-16 15:06 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, LKML, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen Hi Thomas, On Sun, Sep 16, 2018 at 04:43:55PM +0200, Thomas Gleixner wrote: > On Sun, 16 Sep 2018, Feng Tang wrote: > > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c > > index e848a4811785..a927f5f39bee 100644 > > --- a/arch/x86/mm/pgtable.c > > +++ b/arch/x86/mm/pgtable.c > > @@ -637,6 +637,16 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte) > > { > > unsigned long address = __fix_to_virt(idx); > > > > +#ifdef CONFIG_X86_64 > > + /* > > + * In arch/x86/kernel/head_64.S, the static page table has > > + * been setup for fixmap. Add a sanity compiling check > > + * whether fixmap space has exceeded the capacity. > > + */ > > + BUILD_BUG_ON((__end_of_permanent_fixed_addresses + 511)/512 > > + > FIXMAP_PMD_NUM); > > There exist macros for rounding up and doing this in the header file is > perfectly fine. I'll find the sample for the round up. And for the header file check, I initially put this check directly in the fixmap.h like: #if ((__end_of_permanent_fixed_addresses + 511)/512 > FIXMAP_PMD_NUM) #erro "some warning...." #endif But failed to compile, I got the warning warning: "__end_of_permanent_fixed_addresses" is not defined I guess the header file preprocessing can't handle it. > > > +#endif > > + > > if (idx >= __end_of_fixed_addresses) { > > BUG(); > > return; > > diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c > > index 45b700ac5fe7..b0aa2364a3c6 100644 > > --- a/arch/x86/xen/mmu_pv.c > > +++ b/arch/x86/xen/mmu_pv.c > > @@ -1953,7 +1953,10 @@ void __init xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn) > > set_page_prot(level2_ident_pgt, PAGE_KERNEL_RO); > > set_page_prot(level2_kernel_pgt, PAGE_KERNEL_RO); > > set_page_prot(level2_fixmap_pgt, PAGE_KERNEL_RO); > > - set_page_prot(level1_fixmap_pgt, PAGE_KERNEL_RO); > > + > > + for (i = 0; i < FIXMAP_PMD_NUM; i++) > > + set_page_prot(level1_fixmap_pgt + i * 512, > > + PAGE_KERNEL_RO); > > The line break is there because? Will fix it. I used a MACRO before changing to 512 and forgot to remove the line break :) Thanks, Feng ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-16 15:06 ` Feng Tang @ 2018-09-17 7:01 ` Feng Tang 2018-09-17 7:01 ` Thomas Gleixner 0 siblings, 1 reply; 29+ messages in thread From: Feng Tang @ 2018-09-17 7:01 UTC (permalink / raw) To: Thomas Gleixner Cc: Michal Hocko, Peter Zijlstra, LKML, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen Hi Thomas, On Sun, Sep 16, 2018 at 11:06:41PM +0800, Feng Tang wrote: > Hi Thomas, > > On Sun, Sep 16, 2018 at 04:43:55PM +0200, Thomas Gleixner wrote: > > On Sun, 16 Sep 2018, Feng Tang wrote: > > > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c > > > index e848a4811785..a927f5f39bee 100644 > > > --- a/arch/x86/mm/pgtable.c > > > +++ b/arch/x86/mm/pgtable.c > > > @@ -637,6 +637,16 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte) > > > { > > > unsigned long address = __fix_to_virt(idx); > > > > > > +#ifdef CONFIG_X86_64 > > > + /* > > > + * In arch/x86/kernel/head_64.S, the static page table has > > > + * been setup for fixmap. Add a sanity compiling check > > > + * whether fixmap space has exceeded the capacity. > > > + */ > > > + BUILD_BUG_ON((__end_of_permanent_fixed_addresses + 511)/512 > > > + > FIXMAP_PMD_NUM); > > > > There exist macros for rounding up and doing this in the header file is > > perfectly fine. > > I'll find the sample for the round up. > > And for the header file check, I initially put this check directly in > the fixmap.h like: > > #if ((__end_of_permanent_fixed_addresses + 511)/512 > FIXMAP_PMD_NUM) > #erro "some warning...." > #endif > > But failed to compile, I got the warning > warning: "__end_of_permanent_fixed_addresses" is not defined > > I guess the header file preprocessing can't handle it. Below is a patch against 4.19-rc4, verified with earlycon working fine, please review, thanks! - Feng --- From 2710b48eb29a86c8ee568d00d5b04820c3fe1f3a Mon Sep 17 00:00:00 2001 From: Feng Tang <feng.tang@intel.com> Date: Mon, 17 Sep 2018 12:12:08 +0800 Subject: [PATCH] x86/mm: Expand static page table for fixmap space We met a kernel panic when enabling earlycon, which is due to the fixmap address of earlycon is not statically setup. Currently the static fixmap setup in head_64.S only covers 2M virtual address space, while it acutually could be in 4M space with different kernel configurations. So increase the static space to 4M for now by defining FIXMAP_PMD_NUM to 2, and add a build time check for future possible overflow so that the macro could be increased accordingly. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Feng Tang <feng.tang@intel.com> --- arch/x86/include/asm/fixmap.h | 3 +++ arch/x86/include/asm/pgtable_64.h | 3 ++- arch/x86/kernel/head_64.S | 15 +++++++++++---- arch/x86/mm/pgtable.c | 10 ++++++++++ arch/x86/xen/mmu_pv.c | 4 +++- 5 files changed, 29 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index e203169..ffaa0fa 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -14,6 +14,9 @@ #ifndef _ASM_X86_FIXMAP_H #define _ASM_X86_FIXMAP_H +/* Put here to be accessble for assembly code like head_64.S */ +#define FIXMAP_PMD_NUM 2 + #ifndef __ASSEMBLY__ #include <linux/kernel.h> #include <asm/acpi.h> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h index ce2b590..9c85b54 100644 --- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -14,6 +14,7 @@ #include <asm/processor.h> #include <linux/bitops.h> #include <linux/threads.h> +#include <asm/fixmap.h> extern p4d_t level4_kernel_pgt[512]; extern p4d_t level4_ident_pgt[512]; @@ -22,7 +23,7 @@ extern pud_t level3_ident_pgt[512]; extern pmd_t level2_kernel_pgt[512]; extern pmd_t level2_fixmap_pgt[512]; extern pmd_t level2_ident_pgt[512]; -extern pte_t level1_fixmap_pgt[512]; +extern pte_t level1_fixmap_pgt[512 * FIXMAP_PMD_NUM]; extern pgd_t init_top_pgt[]; #define swapper_pg_dir init_top_pgt diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index 15ebc2f..985eaff 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -24,6 +24,7 @@ #include "../entry/calling.h" #include <asm/export.h> #include <asm/nospec-branch.h> +#include <asm/fixmap.h> #ifdef CONFIG_PARAVIRT #include <asm/asm-offsets.h> @@ -445,13 +446,19 @@ NEXT_PAGE(level2_kernel_pgt) KERNEL_IMAGE_SIZE/PMD_SIZE) NEXT_PAGE(level2_fixmap_pgt) - .fill 506,8,0 - .quad level1_fixmap_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC - /* 8MB reserved for vsyscalls + a 2MB hole = 4 + 1 entries */ - .fill 5,8,0 + .fill (512 - 4 - FIXMAP_PMD_NUM),8,0 + pgtno = 0 + .rept (FIXMAP_PMD_NUM) + .quad level1_fixmap_pgt + (pgtno << PAGE_SHIFT) - __START_KERNEL_map + _PAGE_TABLE_NOENC + pgtno = pgtno + 1 + .endr + /* 6 MB reserved space + a 2MB hole */ + .fill 4,8,0 NEXT_PAGE(level1_fixmap_pgt) + .rept (FIXMAP_PMD_NUM) .fill 512,8,0 + .endr #undef PMDS diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index ae39455..c9d94d0 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -637,6 +637,16 @@ void __native_set_fixmap(enum fixed_addresses idx, pte_t pte) { unsigned long address = __fix_to_virt(idx); +#ifdef CONFIG_X86_64 + /* + * In arch/x86/kernel/head_64.S, the static page table has + * been setup for fixmap. Add a sanity compiling check + * whether fixmap space has exceeded the capacity. + */ + BUILD_BUG_ON(__end_of_permanent_fixed_addresses + > (FIXMAP_PMD_NUM * PTRS_PER_PMD)); +#endif + if (idx >= __end_of_fixed_addresses) { BUG(); return; diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index 2fe5c9b..19077f9 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -1952,7 +1952,9 @@ void __init xen_setup_kernel_pagetable(pgd_t *pgd, unsigned long max_pfn) set_page_prot(level2_ident_pgt, PAGE_KERNEL_RO); set_page_prot(level2_kernel_pgt, PAGE_KERNEL_RO); set_page_prot(level2_fixmap_pgt, PAGE_KERNEL_RO); - set_page_prot(level1_fixmap_pgt, PAGE_KERNEL_RO); + + for (i = 0; i < FIXMAP_PMD_NUM; i++) + set_page_prot(level1_fixmap_pgt + i * 512, PAGE_KERNEL_RO); /* Pin down new L4 */ pin_pagetable_pfn(MMUEXT_PIN_L4_TABLE, -- 2.7.4 ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-09-17 7:01 ` Feng Tang @ 2018-09-17 7:01 ` Thomas Gleixner 0 siblings, 0 replies; 29+ messages in thread From: Thomas Gleixner @ 2018-09-17 7:01 UTC (permalink / raw) To: Feng Tang Cc: Michal Hocko, Peter Zijlstra, LKML, x86, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Mon, 17 Sep 2018, Feng Tang wrote: > > Below is a patch against 4.19-rc4, verified with earlycon > working fine, please review, thanks! Aside of the whitespace damage this looks about right. Care to send a proper patch? Thanks, tglx ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 11:12 ` Michal Hocko 2018-08-30 11:54 ` Thomas Gleixner @ 2018-08-30 12:09 ` Peter Zijlstra 2018-08-30 12:39 ` Michal Hocko 1 sibling, 1 reply; 29+ messages in thread From: Peter Zijlstra @ 2018-08-30 12:09 UTC (permalink / raw) To: Michal Hocko Cc: Feng Tang, linux-kernel, x86, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu, Aug 30, 2018 at 01:12:07PM +0200, Michal Hocko wrote: > On Thu 30-08-18 12:44:02, Peter Zijlstra wrote: > > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > > We hit a kernel panic when enabling earlycon for a platform, the > > > call trace is: > > > > > > panic+0xd2/0x220 > > > __alloc_bootmem+0x31/0x34 > > > spp_getpage+0x60/0x8a > > > fill_pte+0x71/0x130 > > > __set_pte_vaddr+0x1d/0x50 > > > set_pte_vaddr+0x3c/0x60 > > > __native_set_fixmap+0x23/0x30 > > > native_set_fixmap+0x30/0x40 > > > setup_earlycon+0x1e0/0x32f > > > param_setup_earlycon+0x13/0x22 > > > do_early_param+0x5b/0x90 > > > parse_args+0x1f7/0x300 > > > parse_early_options+0x24/0x28 > > > parse_early_param+0x65/0x73 > > > setup_arch+0x31e/0x9f1 > > > start_kernel+0x58/0x44e > > > > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > > e820__memblock_setup() is called there is no memory for bootmem > > > to allocate, > > > > Which you bloody well asked for by using NO_BOOTMEM=y. > > > > Going down this route; adding hacks for every little thing that does > > want bootmem, completely defeats the purpose. > > > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > > solves your problem. No earlycon, no panic. > > Well, there is endeavor to remove bootmem allocator altogether. So wasn't aware of that. why? > making earlycon depend on NO_BOOTMEM=n doesn't sound like a good fit to > me. I am not familiar with this code path but why cannot we postpone the > allocation to later or use a statically allocated storage? because 'early'... I suppose. Youu really want the early con up and running asap. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 12:09 ` Peter Zijlstra @ 2018-08-30 12:39 ` Michal Hocko 0 siblings, 0 replies; 29+ messages in thread From: Michal Hocko @ 2018-08-30 12:39 UTC (permalink / raw) To: Peter Zijlstra Cc: Feng Tang, linux-kernel, x86, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen On Thu 30-08-18 14:09:26, Peter Zijlstra wrote: > On Thu, Aug 30, 2018 at 01:12:07PM +0200, Michal Hocko wrote: > > On Thu 30-08-18 12:44:02, Peter Zijlstra wrote: > > > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > > > We hit a kernel panic when enabling earlycon for a platform, the > > > > call trace is: > > > > > > > > panic+0xd2/0x220 > > > > __alloc_bootmem+0x31/0x34 > > > > spp_getpage+0x60/0x8a > > > > fill_pte+0x71/0x130 > > > > __set_pte_vaddr+0x1d/0x50 > > > > set_pte_vaddr+0x3c/0x60 > > > > __native_set_fixmap+0x23/0x30 > > > > native_set_fixmap+0x30/0x40 > > > > setup_earlycon+0x1e0/0x32f > > > > param_setup_earlycon+0x13/0x22 > > > > do_early_param+0x5b/0x90 > > > > parse_args+0x1f7/0x300 > > > > parse_early_options+0x24/0x28 > > > > parse_early_param+0x65/0x73 > > > > setup_arch+0x31e/0x9f1 > > > > start_kernel+0x58/0x44e > > > > > > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > > > e820__memblock_setup() is called there is no memory for bootmem > > > > to allocate, > > > > > > Which you bloody well asked for by using NO_BOOTMEM=y. > > > > > > Going down this route; adding hacks for every little thing that does > > > want bootmem, completely defeats the purpose. > > > > > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > > > solves your problem. No earlycon, no panic. > > > > Well, there is endeavor to remove bootmem allocator altogether. So > > wasn't aware of that. why? Because it is yet another allocator that we have to maintain and that is not really needed. > > making earlycon depend on NO_BOOTMEM=n doesn't sound like a good fit to > > me. I am not familiar with this code path but why cannot we postpone the > > allocation to later or use a statically allocated storage? > > because 'early'... I suppose. Youu really want the early con up and > running asap. Well, the memblock allocator is initialized quite early as well... If that is not early enough then we have to play tricks. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM 2018-08-30 10:44 ` Peter Zijlstra 2018-08-30 11:12 ` Michal Hocko @ 2018-08-30 12:43 ` Feng Tang 1 sibling, 0 replies; 29+ messages in thread From: Feng Tang @ 2018-08-30 12:43 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-kernel, x86, Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Yinghai Lu, Dave Hansen, Andi Kleen Hi Peter, On Thu, Aug 30, 2018 at 12:44:02PM +0200, Peter Zijlstra wrote: > On Thu, Aug 30, 2018 at 05:03:19PM +0800, Feng Tang wrote: > > We hit a kernel panic when enabling earlycon for a platform, the > > call trace is: > > > > panic+0xd2/0x220 > > __alloc_bootmem+0x31/0x34 > > spp_getpage+0x60/0x8a > > fill_pte+0x71/0x130 > > __set_pte_vaddr+0x1d/0x50 > > set_pte_vaddr+0x3c/0x60 > > __native_set_fixmap+0x23/0x30 > > native_set_fixmap+0x30/0x40 > > setup_earlycon+0x1e0/0x32f > > param_setup_earlycon+0x13/0x22 > > do_early_param+0x5b/0x90 > > parse_args+0x1f7/0x300 > > parse_early_options+0x24/0x28 > > parse_early_param+0x65/0x73 > > setup_arch+0x31e/0x9f1 > > start_kernel+0x58/0x44e > > > > The root cause is that when CONFIG_NO_BOOTMEM=y, before > > e820__memblock_setup() is called there is no memory for bootmem > > to allocate, > > Which you bloody well asked for by using NO_BOOTMEM=y. > > Going down this route; adding hacks for every little thing that does > want bootmem, completely defeats the purpose. > > If anything, make the earlycon thing depend on NO_BOOTMEM=n. That also > solves your problem. No earlycon, no panic. In current x86, the NO_BOOTMEM is "y" by default, and almost all the servers/desktops I could access have CONFIG_BOOTMEM=y, while earlycon is important for enabling new platforms, as well as kernel debugging. IMHO, it's better to keep earlycon capability. Also any alloc_bootmem() call in that time window will trigger panic. Thanks, Feng ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2018-09-17 7:01 UTC | newest] Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-08-30 9:03 [PATCH] x86, mm: Reserver some memory for bootmem allocator for NO_BOOTMEM Feng Tang 2018-08-30 10:44 ` Peter Zijlstra 2018-08-30 11:12 ` Michal Hocko 2018-08-30 11:54 ` Thomas Gleixner 2018-08-30 12:49 ` Michal Hocko 2018-08-30 12:59 ` Feng Tang 2018-08-30 13:05 ` Thomas Gleixner 2018-08-30 13:19 ` Feng Tang 2018-08-30 13:25 ` Thomas Gleixner 2018-08-30 13:55 ` Feng Tang 2018-08-31 6:15 ` Feng Tang 2018-08-31 11:33 ` Thomas Gleixner 2018-08-31 13:36 ` Feng Tang 2018-09-07 8:17 ` Feng Tang 2018-09-07 10:52 ` Thomas Gleixner 2018-09-10 9:39 ` Feng Tang 2018-09-10 9:53 ` Thomas Gleixner 2018-09-11 6:14 ` Feng Tang 2018-09-15 10:29 ` Thomas Gleixner 2018-09-15 16:47 ` Feng Tang 2018-09-15 17:28 ` Thomas Gleixner 2018-09-16 14:35 ` Feng Tang 2018-09-16 14:43 ` Thomas Gleixner 2018-09-16 15:06 ` Feng Tang 2018-09-17 7:01 ` Feng Tang 2018-09-17 7:01 ` Thomas Gleixner 2018-08-30 12:09 ` Peter Zijlstra 2018-08-30 12:39 ` Michal Hocko 2018-08-30 12:43 ` Feng Tang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).