* [PATCH 0/3] Split lowmem and kernel mappings
@ 2021-05-17 14:57 UTC (permalink / raw)
From: Linus Walleij
To: linux-arm-kernel, Russell King, Arnd Bergmann
Cc: Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Linus Walleij

This series of three patches is the beginning of an idea from Arnd: to move
the kernel image (executable and non-executable sections) into the VMALLOC
area. To get there we first need to pry apart the section mappings of
lowmem and the kernel. This series:

- Creates a specific KERNEL_OFFSET and removes the implied assumption
  that the kernel will always start at PAGE_OFFSET.

- Saves the kernel physical section start and end addresses from the
  early assembly-coded section mappings so we can reuse them later
  without having to rely on virtual-to-physical translations.

- Splits the map_lowmem() function in two: map_lowmem(), which maps the
  lowmem proper, and map_kernel(), which maps the kernel. The split is
  nice (IMO) because the functions become cleaner and easier to read.
  It is possible that the refactoring of map_lowmem() also avoids a few
  pitfalls in the earlier code.

I have a prototype patch on top of this that actually moves the kernel
into the VMALLOC area at 0xf1000000, but I want to keep things apart to
avoid too much discussion and confusion. One thing at a time. I think
these patches are fine in their own right, but since I wrote them I need
others to think the same.
If you want to see the prototype patch(es) it is here: https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator.git/log/?h=kernel-in-vmalloc-v5.13-rc1 Linus Walleij (3): ARM: Split KERNEL_OFFSET from PAGE_OFFSET ARM: Define kernel physical section start and end ARM: Map the lowmem and kernel separately arch/arm/include/asm/memory.h | 15 +++- arch/arm/kernel/head.S | 30 +++++-- arch/arm/kernel/vmlinux.lds.S | 2 +- arch/arm/mm/mmu.c | 144 ++++++++++++++++++++++++++-------- 4 files changed, 151 insertions(+), 40 deletions(-) -- 2.31.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] ARM: Split KERNEL_OFFSET from PAGE_OFFSET 2021-05-17 14:57 [PATCH 0/3] Split lowmem and kernel mappings Linus Walleij @ 2021-05-17 14:57 ` Linus Walleij 2021-05-17 14:57 ` [PATCH 2/3] ARM: Define kernel physical section start and end Linus Walleij 2021-05-17 14:57 ` [PATCH 3/3] ARM: Map the lowmem and kernel separately Linus Walleij 2 siblings, 0 replies; 12+ messages in thread From: Linus Walleij @ 2021-05-17 14:57 UTC (permalink / raw) To: linux-arm-kernel, Russell King, Arnd Bergmann Cc: Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Linus Walleij We want to be able to compile the kernel into an address different from PAGE_OFFSET (start of lowmem) + TEXT_OFFSET, so start to pry apart the address of where the kernel is located from the address where the lowmem is located by defining and using KERNEL_OFFSET in a few key places. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> --- arch/arm/include/asm/memory.h | 8 +++++++- arch/arm/kernel/head.S | 3 +-- arch/arm/kernel/vmlinux.lds.S | 2 +- 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index a711322d9f40..a5f1d1826a98 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -20,8 +20,14 @@ #endif #include <asm/kasan_def.h> -/* PAGE_OFFSET - the virtual address of the start of the kernel image */ +/* + * PAGE_OFFSET: the virtual address of the start of lowmem, memory above + * the virtual address range for userspace. + * KERNEL_OFFSET: the virtual address of the start of the kernel image. + * we may further offset this with TEXT_OFFSET in practice. 
+ */ #define PAGE_OFFSET UL(CONFIG_PAGE_OFFSET) +#define KERNEL_OFFSET (PAGE_OFFSET) #ifdef CONFIG_MMU diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S index 7f62c5eccdf3..4e2daaa7636a 100644 --- a/arch/arm/kernel/head.S +++ b/arch/arm/kernel/head.S @@ -23,7 +23,6 @@ #if defined(CONFIG_DEBUG_LL) && !defined(CONFIG_DEBUG_SEMIHOSTING) #include CONFIG_DEBUG_LL_INCLUDE #endif - /* * swapper_pg_dir is the virtual address of the initial page table. * We place the page tables 16K below KERNEL_RAM_VADDR. Therefore, we must @@ -31,7 +30,7 @@ * the least significant 16 bits to be 0x8000, but we could probably * relax this restriction to KERNEL_RAM_VADDR >= PAGE_OFFSET + 0x4000. */ -#define KERNEL_RAM_VADDR (PAGE_OFFSET + TEXT_OFFSET) +#define KERNEL_RAM_VADDR (KERNEL_OFFSET + TEXT_OFFSET) #if (KERNEL_RAM_VADDR & 0xffff) != 0x8000 #error KERNEL_RAM_VADDR must start at 0xXXXX8000 #endif diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S index f7f4620d59c3..20c4f6d20c7a 100644 --- a/arch/arm/kernel/vmlinux.lds.S +++ b/arch/arm/kernel/vmlinux.lds.S @@ -47,7 +47,7 @@ SECTIONS #endif } - . = PAGE_OFFSET + TEXT_OFFSET; + . = KERNEL_OFFSET + TEXT_OFFSET; .head.text : { _text = .; HEAD_TEXT -- 2.31.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/3] ARM: Define kernel physical section start and end 2021-05-17 14:57 [PATCH 0/3] Split lowmem and kernel mappings Linus Walleij 2021-05-17 14:57 ` [PATCH 1/3] ARM: Split KERNEL_OFFSET from PAGE_OFFSET Linus Walleij @ 2021-05-17 14:57 ` Linus Walleij 2021-05-17 14:57 ` [PATCH 3/3] ARM: Map the lowmem and kernel separately Linus Walleij 2 siblings, 0 replies; 12+ messages in thread From: Linus Walleij @ 2021-05-17 14:57 UTC (permalink / raw) To: linux-arm-kernel, Russell King, Arnd Bergmann Cc: Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Linus Walleij When we are mapping the initial sections in head.S we know very well where the start and end of the kernel image in physical memory is placed. Later on it gets hard to determine this. Save the information into two variables named kernel_sec_start and kernel_sec_end for convenience for later work involving the physical start and end of the kernel. These variables are section-aligned corresponding to the early section mappings set up in head.S. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> --- arch/arm/include/asm/memory.h | 7 +++++++ arch/arm/kernel/head.S | 27 ++++++++++++++++++++++++--- 2 files changed, 31 insertions(+), 3 deletions(-) diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index a5f1d1826a98..ca213d7d095f 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -158,6 +158,13 @@ extern unsigned long vectors_base; #ifndef __ASSEMBLY__ +/* + * Physical start and end address of the kernel sections. These addresses are + * 2MB-aligned to match the section mappings placed over the kernel. + */ +extern phys_addr_t kernel_sec_start; +extern phys_addr_t kernel_sec_end; + /* * Physical vs virtual RAM address space conversion. 
These are * private definitions which should NOT be used outside memory.h diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S index 4e2daaa7636a..9eb0b4dbcc12 100644 --- a/arch/arm/kernel/head.S +++ b/arch/arm/kernel/head.S @@ -47,6 +47,20 @@ .globl swapper_pg_dir .equ swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE + /* + * This needs to be assigned at runtime when the linker symbols are + * resolved. + */ + .pushsection .data + .align 2 + .globl kernel_sec_start + .globl kernel_sec_end +kernel_sec_start: + .long 0 +kernel_sec_end: + .long 0 + .popsection + .macro pgtbl, rd, phys add \rd, \phys, #TEXT_OFFSET sub \rd, \rd, #PG_DIR_SIZE @@ -229,16 +243,23 @@ __create_page_tables: blo 1b /* - * Map our RAM from the start to the end of the kernel .bss section. + * The main matter: map in the kernel using section mappings, and + * set two variables to indicate the physical start and end of the + * kernel. */ - add r0, r4, #PAGE_OFFSET >> (SECTION_SHIFT - PMD_ORDER) + add r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER) ldr r6, =(_end - 1) - orr r3, r8, r7 + adr_l r5, kernel_sec_start @ _pa(kernel_sec_start) + str r8, [r5] @ Save physical start of kernel + orr r3, r8, r7 @ Add the MMU flags add r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER) 1: str r3, [r0], #1 << PMD_ORDER add r3, r3, #1 << SECTION_SHIFT cmp r0, r6 bls 1b + eor r3, r3, r7 @ Remove the MMU flags + adr_l r5, kernel_sec_end @ _pa(kernel_sec_end) + str r3, [r5] @ Save physical end of kernel #ifdef CONFIG_XIP_KERNEL /* -- 2.31.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 12+ messages in thread
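The way kernel_sec_start and kernel_sec_end fall out of the mapping loop
above can be sketched in plain C. This is only a model of the head.S
control flow, not kernel code; the flag value, physical base, and image
size below are made up for the example:

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SIZE 0x00100000UL   /* 1 MiB sections (non-LPAE case) */
#define MMU_FLAGS    0x00000C0EUL   /* placeholder PMD flags (low bits) */

static uint32_t kernel_sec_start, kernel_sec_end;
static uint32_t pgtable[4096];      /* one entry per 1 MiB of VA */

/* Models the head.S loop: one section entry per 1 MiB of the kernel
 * image; the physical start is recorded before the loop, the exclusive
 * section-aligned physical end after it (flags stripped). */
static void map_kernel_sections(uint32_t phys_start, uint32_t image_size)
{
    uint32_t entry = phys_start | MMU_FLAGS;   /* orr r3, r8, r7 */
    uint32_t last  = phys_start + image_size - 1;
    unsigned idx = 0;

    kernel_sec_start = phys_start;             /* str r8, [r5] */
    for (uint32_t p = phys_start; p <= last; p += SECTION_SIZE) {
        pgtable[idx++] = entry;                /* str r3, [r0], ... */
        entry += SECTION_SIZE;
    }
    kernel_sec_end = entry & ~MMU_FLAGS;       /* eor r3, r3, r7 */
}
```

For an 8 MiB + 4 KiB image starting at 0x80000000 this records a
section-aligned span of nine 1 MiB sections, 0x80000000..0x80900000,
matching the "2MB-aligned" note in the header (2 MiB in the LPAE case,
1 MiB here).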
* [PATCH 3/3] ARM: Map the lowmem and kernel separately
@ 2021-05-17 14:57 UTC (permalink / raw)
From: Linus Walleij
To: linux-arm-kernel, Russell King, Arnd Bergmann
Cc: Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Linus Walleij

Using our knowledge of where the physical kernel sections start and end,
we can split the mapping of lowmem and the kernel apart. This is helpful
when you want to place the kernel independently of lowmem and not be
limited to putting it into lowmem only, but also into places such as the
VMALLOC area.

We extensively rewrite the lowmem mapping code to account for all the
ways the kernel image can overlap with the lowmem ranges. This helps
handle situations that occur when the kernel is loaded in different
places and makes it possible to place the kernel more freely, as is done
with e.g. KASLR. We sprinkle some comments with illustrations and
pr_debug() over it so it is also very evident to readers what is
happening.

We now use kernel_sec_start and kernel_sec_end instead of relying on
__pa() (phys_to_virt) to provide this. This is helpful if we want to
resolve physical-to-virtual and virtual-to-physical mappings at runtime
rather than at compile time, especially if we are not using phys-to-virt
patching.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org> --- arch/arm/mm/mmu.c | 144 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 111 insertions(+), 33 deletions(-) diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c index c1e12aab67b8..ac37959507b7 100644 --- a/arch/arm/mm/mmu.c +++ b/arch/arm/mm/mmu.c @@ -1457,8 +1457,6 @@ static void __init kmap_init(void) static void __init map_lowmem(void) { - phys_addr_t kernel_x_start = round_down(__pa(KERNEL_START), SECTION_SIZE); - phys_addr_t kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE); phys_addr_t start, end; u64 i; @@ -1466,55 +1464,126 @@ static void __init map_lowmem(void) for_each_mem_range(i, &start, &end) { struct map_desc map; + pr_debug("map lowmem start: 0x%08llx, end: 0x%08llx\n", + (long long)start, (long long)end); if (end > arm_lowmem_limit) end = arm_lowmem_limit; if (start >= end) break; - if (end < kernel_x_start) { - map.pfn = __phys_to_pfn(start); - map.virtual = __phys_to_virt(start); - map.length = end - start; - map.type = MT_MEMORY_RWX; + /* + * If our kernel image is in the VMALLOC area we need to remove + * the kernel physical memory from lowmem since the kernel will + * be mapped separately. + * + * The kernel will typically be at the very start of lowmem, + * but any placement relative to memory ranges is possible. + * + * If the memblock contains the kernel, we have to chisel out + * the kernel memory from it and map each part separately. 
We + * get 6 different theoretical cases: + * + * +--------+ +--------+ + * +-- start --+ +--------+ | Kernel | | Kernel | + * | | | Kernel | | case 2 | | case 5 | + * | | | case 1 | +--------+ | | +--------+ + * | Memory | +--------+ | | | Kernel | + * | range | +--------+ | | | case 6 | + * | | | Kernel | +--------+ | | +--------+ + * | | | case 3 | | Kernel | | | + * +-- end ----+ +--------+ | case 4 | | | + * +--------+ +--------+ + */ - create_mapping(&map); - } else if (start >= kernel_x_end) { - map.pfn = __phys_to_pfn(start); - map.virtual = __phys_to_virt(start); - map.length = end - start; - map.type = MT_MEMORY_RW; + /* Case 5: kernel covers range, don't map anything, should be rare */ + if ((start > kernel_sec_start) && (end < kernel_sec_end)) + break; - create_mapping(&map); - } else { - /* This better cover the entire kernel */ - if (start < kernel_x_start) { + /* Cases where the kernel is starting inside the range */ + if ((kernel_sec_start >= start) && (kernel_sec_start <= end)) { + /* Case 6: kernel is embedded in the range, we need two mappings */ + if ((start < kernel_sec_start) && (end > kernel_sec_end)) { + /* Map memory below the kernel */ map.pfn = __phys_to_pfn(start); map.virtual = __phys_to_virt(start); - map.length = kernel_x_start - start; + map.length = kernel_sec_start - start; map.type = MT_MEMORY_RW; - create_mapping(&map); - } - - map.pfn = __phys_to_pfn(kernel_x_start); - map.virtual = __phys_to_virt(kernel_x_start); - map.length = kernel_x_end - kernel_x_start; - map.type = MT_MEMORY_RWX; - - create_mapping(&map); - - if (kernel_x_end < end) { - map.pfn = __phys_to_pfn(kernel_x_end); - map.virtual = __phys_to_virt(kernel_x_end); - map.length = end - kernel_x_end; + /* Map memory above the kernel */ + map.pfn = __phys_to_pfn(kernel_sec_end); + map.virtual = __phys_to_virt(kernel_sec_end); + map.length = end - kernel_sec_end; map.type = MT_MEMORY_RW; - create_mapping(&map); + break; } + /* Case 1: kernel and range start at the same 
address, should be common */ + if (kernel_sec_start == start) + start = kernel_sec_end; + /* Case 3: kernel and range end at the same address, should be rare */ + if (kernel_sec_end == end) + end = kernel_sec_start; + } else if ((kernel_sec_start < start) && (kernel_sec_end > start) && (kernel_sec_end < end)) { + /* Case 2: kernel ends inside range, starts below it */ + start = kernel_sec_end; + } else if ((kernel_sec_start > start) && (kernel_sec_start < end) && (kernel_sec_end > end)) { + /* Case 4: kernel starts inside range, ends above it */ + end = kernel_sec_start; } + map.pfn = __phys_to_pfn(start); + map.virtual = __phys_to_virt(start); + map.length = end - start; + map.type = MT_MEMORY_RW; + create_mapping(&map); } } +static void __init map_kernel(void) +{ + /* + * We use the well known kernel section start and end and split the area in the + * middle like this: + * . . + * | RW memory | + * +----------------+ kernel_x_start + * | Executable | + * | kernel memory | + * +----------------+ kernel_x_end / kernel_nx_start + * | Non-executable | + * | kernel memory | + * +----------------+ kernel_nx_end + * | RW memory | + * . . + * + * Notice that we are dealing with section sized mappings here so all of this + * will be bumped to the closest section boundary. This means that some of the + * non-executable part of the kernel memory is actually mapped as executable. + * This will only persist until we turn on proper memory management later on + * and we remap the whole kernel with page granularity. 
+ */ + phys_addr_t kernel_x_start = kernel_sec_start; + phys_addr_t kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE); + phys_addr_t kernel_nx_start = kernel_x_end; + phys_addr_t kernel_nx_end = kernel_sec_end; + struct map_desc map; + + map.pfn = __phys_to_pfn(kernel_x_start); + map.virtual = __phys_to_virt(kernel_x_start); + map.length = kernel_x_end - kernel_x_start; + map.type = MT_MEMORY_RWX; + create_mapping(&map); + + /* If the nx part is small it may end up covered by the tail of the RWX section */ + if (kernel_x_end == kernel_nx_end) + return; + + map.pfn = __phys_to_pfn(kernel_nx_start); + map.virtual = __phys_to_virt(kernel_nx_start); + map.length = kernel_nx_end - kernel_nx_start; + map.type = MT_MEMORY_RW; + create_mapping(&map); +} + #ifdef CONFIG_ARM_PV_FIXUP typedef void pgtables_remap(long long offset, unsigned long pgd); pgtables_remap lpae_pgtables_remap_asm; @@ -1645,9 +1714,18 @@ void __init paging_init(const struct machine_desc *mdesc) { void *zero_page; + pr_debug("physical kernel sections: 0x%08x-0x%08x\n", + kernel_sec_start, kernel_sec_end); + prepare_page_table(); map_lowmem(); memblock_set_current_limit(arm_lowmem_limit); + pr_debug("lowmem limit is %08llx\n", (long long)arm_lowmem_limit); + /* + * After this point early_alloc(), i.e. the memblock allocator, can + * be used + */ + map_kernel(); dma_contiguous_remap(); early_fixmap_shutdown(); devicemaps_init(mdesc); -- 2.31.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 12+ messages in thread
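The six overlap cases illustrated in the map_lowmem() comment collapse
into a small amount of interval arithmetic. A condensed stand-alone model
(it only computes which pieces of a memblock remain to be mapped RW once
the kernel span is carved out; nothing here is the kernel's own code, and
the function name is made up):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pa_t;

/* Given a memory range [start, end) and the kernel section span
 * [ks, ke), return how many RW pieces remain and fill them in.
 * Cases 1-6 from the comment all fall out of the "piece below" and
 * "piece above" checks: case 5 (kernel covers the range) yields 0
 * pieces, case 6 (kernel embedded in the range) yields 2. */
static int carve_out_kernel(pa_t start, pa_t end, pa_t ks, pa_t ke,
                            pa_t piece[2][2])
{
    int n = 0;

    if (ke <= start || ks >= end) {     /* kernel not in this range */
        piece[0][0] = start;
        piece[0][1] = end;
        return 1;
    }
    if (start < ks) {                   /* RW memory below the kernel */
        piece[n][0] = start;
        piece[n][1] = ks;
        n++;
    }
    if (end > ke) {                     /* RW memory above the kernel */
        piece[n][0] = ke;
        piece[n][1] = end;
        n++;
    }
    return n;
}
```

The patch spells the cases out one by one instead, which trades
compactness for comments and pr_debug() visibility; the resulting RW
mappings are the same.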
* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
@ 2021-07-30 20:22 UTC (permalink / raw)
From: Nishanth Menon
To: Linus Walleij
Cc: Florian Fainelli, Arnd Bergmann, Geert Uytterhoeven,
    santosh.shilimkar, Russell King, Ard Biesheuvel, linux-arm-kernel

On 16:57-20210517, Linus Walleij wrote:
> Using our knowledge of where the physical kernel sections start
> and end we can split mapping of lowmem and kernel apart.
>
[...]
>
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

We noticed a regression on k2g platforms with this patch:
commit 6e121df14ccd ("ARM: 9090/1: Map the lowmem and kernel separately")

Boot log: https://pastebin.ubuntu.com/p/Sf3r28D8MR/
Bisect log: https://pastebin.ubuntu.com/p/6PTYpNVFDy/

-- 
Regards,
Nishanth Menon
Key (0xDDB5849D1736249D) / Fingerprint: F8A2 8693 54EB 8232 17A3 1A34 DDB5 849D 1736 249D
* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
@ 2021-07-30 22:40 UTC (permalink / raw)
From: Linus Walleij
To: Nishanth Menon, santosh.shilimkar, Grygorii Strashko, Santosh Shilimkar
Cc: Florian Fainelli, Arnd Bergmann, Geert Uytterhoeven, Russell King,
    Ard Biesheuvel, Linux ARM

On Fri, Jul 30, 2021 at 10:23 PM Nishanth Menon <nm@ti.com> wrote:
> On 16:57-20210517, Linus Walleij wrote:
> > Using our knowledge of where the physical kernel sections start
> > and end we can split mapping of lowmem and kernel apart.
> >
> [...]
> >
> > Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
>
> We noticed a regression in k2g platforms with this patch.
> commit 6e121df14ccd ("ARM: 9090/1: Map the lowmem and kernel
> separately")
>
> Boot log: https://pastebin.ubuntu.com/p/Sf3r28D8MR/
> Bisect log: https://pastebin.ubuntu.com/p/6PTYpNVFDy/

Given how invasive the patch is, I'm surprised that k2g is the only thing
complaining about it so far. Let's fix this!

I've had similar reports, and in those cases the boot loader did not turn
off the caches before jumping into the Linux kernel.

So first, can you check that the cache is off?

Next, can you turn on DEBUG_LL and enable the debug messages in
map_lowmem() in arch/arm/mm/mmu.c in some way that works for you? For
example I just add a patch like this:
https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator.git/patch/?id=d10c46631a6eaef1339e376b878de985143b27ff
(Beware of ugly debug habits!)

Next, is this platform using the Keystone 2 pv fixup, i.e.
mdesc->pv_fixup in early_paging_init()? I added a special print there to
see if that gets called. This code is really scary, but I suppose Santosh
can explain what is supposed to happen here and what I may have stepped
on. (I see Santosh works at Oracle now? Interesting.)

Yours,
Linus Walleij
* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately 2021-07-30 22:40 ` Linus Walleij @ 2021-08-03 10:27 ` Grygorii Strashko 2021-08-03 10:49 ` Russell King (Oracle) 0 siblings, 1 reply; 12+ messages in thread From: Grygorii Strashko @ 2021-08-03 10:27 UTC (permalink / raw) To: Linus Walleij, Nishanth Menon, santosh.shilimkar, Santosh Shilimkar Cc: Florian Fainelli, Arnd Bergmann, Geert Uytterhoeven, Russell King, Tero Kristo, Ard Biesheuvel, Linux ARM On 31/07/2021 01:40, Linus Walleij wrote: > On Fri, Jul 30, 2021 at 10:23 PM Nishanth Menon <nm@ti.com> wrote: >> On 16:57-20210517, Linus Walleij wrote: >>> Using our knowledge of where the physical kernel sections start >>> and end we can split mapping of lowmem and kernel apart. >>> >> [...] >>> >>> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> >> >> >> We noticed a regression in k2g platforms with this patch. >> commit 6e121df14ccd ("ARM: 9090/1: Map the lowmem and kernel >> separately") >> >> Boot log: https://pastebin.ubuntu.com/p/Sf3r28D8MR/ >> Bisect log: https://pastebin.ubuntu.com/p/6PTYpNVFDy/ > > Given how invasive the patch is I'm surprised that k2g is the > only thing so far complaining about it. Let's fix this! > > I've had similar reports and in those cases it has been that the boot > loader did not turn off the caches before jumping to execute the > Linux kernel. > > So first can you check that cache is off? > > Next can you turn on DEBUG_LL and enable the debug messages > in arch/arm/mmu.c in map_lowmem() in some way that works for > you, for example I just add a patch like this: > https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator.git/patch/?id=d10c46631a6eaef1339e376b878de985143b27ff > (Beware of ugly debug habits!) > > Next is this using the keystone 2 pv fixup thing mdesc->pv_fixup > mdesc->pv_fixup in early_paging_init()? > > I added a special print > there to see if that gets called. 
> This code is really scary but I suppose Santosh can explain what is
> supposed to happen here and what I may have stepped on. (I see Santosh
> work at Oracle now? Interesting.)

It should be called on all Keystone 2 platforms by default (LPAE is the
default mode for K2).

Huh. Below is how I remember it:
- K2 starts out using the aliased memory window at 00 8000 0000 (no IO
  coherency is supported there)
- early at boot, K2 switches to the LPAE memory window at 08 0000 0000
  (where IO coherency is supported), so the physical addresses of all
  already-mapped memory have to be fixed up

-- 
Best regards,
grygorii
* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately 2021-08-03 10:27 ` Grygorii Strashko @ 2021-08-03 10:49 ` Russell King (Oracle) 2021-08-03 11:34 ` Linus Walleij 2021-08-04 10:05 ` Linus Walleij 0 siblings, 2 replies; 12+ messages in thread From: Russell King (Oracle) @ 2021-08-03 10:49 UTC (permalink / raw) To: Grygorii Strashko Cc: Nishanth Menon, Florian Fainelli, Geert Uytterhoeven, Arnd Bergmann, Linus Walleij, santosh.shilimkar, Tero Kristo, Santosh Shilimkar, Ard Biesheuvel, Linux ARM On Tue, Aug 03, 2021 at 01:27:18PM +0300, Grygorii Strashko wrote: > > > On 31/07/2021 01:40, Linus Walleij wrote: > > On Fri, Jul 30, 2021 at 10:23 PM Nishanth Menon <nm@ti.com> wrote: > > > On 16:57-20210517, Linus Walleij wrote: > > > > Using our knowledge of where the physical kernel sections start > > > > and end we can split mapping of lowmem and kernel apart. > > > > > > > [...] > > > > > > > > Signed-off-by: Linus Walleij <linus.walleij@linaro.org> > > > > > > > > > We noticed a regression in k2g platforms with this patch. > > > commit 6e121df14ccd ("ARM: 9090/1: Map the lowmem and kernel > > > separately") > > > > > > Boot log: https://pastebin.ubuntu.com/p/Sf3r28D8MR/ > > > Bisect log: https://pastebin.ubuntu.com/p/6PTYpNVFDy/ > > > > Given how invasive the patch is I'm surprised that k2g is the > > only thing so far complaining about it. Let's fix this! > > > > I've had similar reports and in those cases it has been that the boot > > loader did not turn off the caches before jumping to execute the > > Linux kernel. > > > > So first can you check that cache is off? > > > > Next can you turn on DEBUG_LL and enable the debug messages > > in arch/arm/mmu.c in map_lowmem() in some way that works for > > you, for example I just add a patch like this: > > https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator.git/patch/?id=d10c46631a6eaef1339e376b878de985143b27ff > > (Beware of ugly debug habits!) 
> > Next is this using the keystone 2 pv fixup thing mdesc->pv_fixup
> > in early_paging_init()?
> >
> > I added a special print there to see if that gets called. This code
> > is really scary but I suppose Santosh can explain what is supposed
> > to happen here and what I may have stepped on. (I see Santosh work
> > at Oracle now? Interesting.)
>
> It should be called on all keystone 2 platforms by default (LPAE is
> the default mode for K2).
>
> Huh. Below as I remember it:
> - K2 starts using memory window (aliased memory) at 00 8000 0000
>   (No IO coherency is supported)
> - K2, early at boot, is switching to LPAE memory window 08 0000 0000
>   (IO coherency is supported) so all, already mapped memory, has to
>   be fixed - phys addresses.

If I remember correctly, the code that fixes that up assumes that:
(a) the kernel is mapped using section mappings in the lowmem mapping
    at PAGE_OFFSET.._end-1
(b) no other RAM is mapped (iow, it's called before the kernel starts
    building the real page mappings in paging_init()).

It looks to me like Linus missed the code in arch/arm/mm/pv-fixup-asm.S,
and as the kernel is no longer mapped in the lowmem mapping, this likely
writes a load of entries in the page tables that are random...

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
@ 2021-08-03 11:34 UTC (permalink / raw)
From: Linus Walleij
To: Russell King (Oracle)
Cc: Nishanth Menon, Grygorii Strashko, Geert Uytterhoeven, Arnd Bergmann,
    Tero Kristo, santosh.shilimkar, Florian Fainelli, Santosh Shilimkar,
    Ard Biesheuvel, Linux ARM

On Tue, Aug 3, 2021 at 12:49 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:

> It looks to me like Linus missed the code in arch/arm/mm/pv-fixup-asm.S
> and as the kernel is no longer mapped in the lowmem mapping, this
> likely writes a load of entries in the page tables that are random...

Oops. I'll do my homework. I'll read this code and see if I can fix this.

Yours,
Linus Walleij
* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately 2021-08-03 10:49 ` Russell King (Oracle) 2021-08-03 11:34 ` Linus Walleij @ 2021-08-04 10:05 ` Linus Walleij 2021-08-04 10:36 ` Russell King (Oracle) 1 sibling, 1 reply; 12+ messages in thread From: Linus Walleij @ 2021-08-04 10:05 UTC (permalink / raw) To: Russell King (Oracle) Cc: Grygorii Strashko, Nishanth Menon, santosh.shilimkar, Santosh Shilimkar, Linux ARM, Arnd Bergmann, Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Suman Anna, Tero Kristo On Tue, Aug 3, 2021 at 12:49 PM Russell King (Oracle) <linux@armlinux.org.uk> wrote: > > It should be called on all keystone 2 platforms by default (LAPE is default mode for K2). > > > > Huh. Below as I remember it: > > - K2 starts using memory window (aliased memory) at 00 8000 0000 (No IO coherency is supported) > > - K2, early at boot, is switching to LPAE memory window 08 0000 0000 (IO coherency is supported) > > so all, already mapped memory, has to be fixed - phys addresses. > > If I remember correctly, the code that fixes that up assumes that > (a) the kernel is mapped using section mappings in the lowmem mapping > at PAGE_OFFSET.._end-1 > (b) that no other RAM is mapped (iow, it's called before the kernel > starts building the real page mappings in paging_init()). > > It looks to me like Linus missed the code in arch/arm/mm/pv-fixup-asm.S > and as the kernel is no longer mapped in the lowmem mapping, this > likely writes a load of entries in the page tables that are random... I looked into this! Indeed early_mm_init() is called before paging_init() and early_mm_init() calls lpae_pgtables_remap_asm() in pv-fixup-asm.S to add the offset to all the section mappings over the kernel. The section mappings we have at this point are coming from head.S, so those get modified with the offset. What I don't understand is what map_lowmem() was doing with the kernel mappings on the Keystone 2 before my patch. 
If it actually overwrites the kernel mapping with a new one, it will undo
what lpae_pgtables_remap_asm() has done without applying the offset.

So I suppose map_lowmem(), due to the memory ranges passed, would just
skip over the kernel mappings and leave them as-is.

Nishanth: what happens if you simply comment out map_kernel() like this?

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 7583bda5ea7d..25aa7d05a931 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1727,7 +1727,7 @@ void __init paging_init(const struct machine_desc *mdesc)
 	 * After this point early_alloc(), i.e. the memblock allocator, can
 	 * be used
 	 */
-	map_kernel();
+	// map_kernel();
 	dma_contiguous_remap();
 	early_fixmap_shutdown();
 	devicemaps_init(mdesc);

Yours,
Linus Walleij
* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately 2021-08-04 10:05 ` Linus Walleij @ 2021-08-04 10:36 ` Russell King (Oracle) 2021-08-04 13:12 ` Linus Walleij 0 siblings, 1 reply; 12+ messages in thread From: Russell King (Oracle) @ 2021-08-04 10:36 UTC (permalink / raw) To: Linus Walleij Cc: Grygorii Strashko, Nishanth Menon, santosh.shilimkar, Santosh Shilimkar, Linux ARM, Arnd Bergmann, Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Suman Anna, Tero Kristo On Wed, Aug 04, 2021 at 12:05:54PM +0200, Linus Walleij wrote: > On Tue, Aug 3, 2021 at 12:49 PM Russell King (Oracle) > <linux@armlinux.org.uk> wrote: > > > > It should be called on all keystone 2 platforms by default (LAPE is default mode for K2). > > > > > > Huh. Below as I remember it: > > > - K2 starts using memory window (aliased memory) at 00 8000 0000 (No IO coherency is supported) > > > - K2, early at boot, is switching to LPAE memory window 08 0000 0000 (IO coherency is supported) > > > so all, already mapped memory, has to be fixed - phys addresses. > > > > If I remember correctly, the code that fixes that up assumes that > > (a) the kernel is mapped using section mappings in the lowmem mapping > > at PAGE_OFFSET.._end-1 > > (b) that no other RAM is mapped (iow, it's called before the kernel > > starts building the real page mappings in paging_init()). > > > > It looks to me like Linus missed the code in arch/arm/mm/pv-fixup-asm.S > > and as the kernel is no longer mapped in the lowmem mapping, this > > likely writes a load of entries in the page tables that are random... > > I looked into this! > > Indeed early_mm_init() is called before paging_init() and early_mm_init() > calls lpae_pgtables_remap_asm() in pv-fixup-asm.S to add the offset > to all the section mappings over the kernel. > > The section mappings we have at this point are coming from head.S, > so those get modified with the offset. 
>
> What I don't understand is what map_lowmem() was doing with the
> kernel mappings on Keystone 2 before my patch.

Essentially, what happened with Keystone 2 is that the head.S code does
its usual thing: it maps the kernel with section mappings at the
physical address the early code is running at, and then switches the
MMU on and runs from the virtual address space. At this point, the
phys<->virt conversions work for this physical address space (which,
on Keystone 2, is the low physical space that isn't coherent with
devices).

early_mm_init() gets called, we spot that we're on Keystone 2, and we
initiate the change. The MMU needs a break-before-make for these
mappings, and as we're using these mappings to execute our code, we
have to turn the MMU off to do the update. We update the phys<->virt
conversion, and then fix up the page tables.

lpae_pgtables_remap() is passed the offset between the low physical
address and the high physical address, as well as the low physical
address of the page tables (since once we turn the MMU off, all we
have access to is the low physical mapping).

The structure of the LPAE page tables is:
- First 4k: L1 entries.
- Following 32k: L2 entries mapped as sections.

In pv-fixup-asm.S:

	ldr	r6, =(_end - 1)
	add	r7, r2, #0x1000
	add	r6, r7, r6, lsr #SECTION_SHIFT - L2_ORDER
	add	r7, r7, #PAGE_OFFSET >> (SECTION_SHIFT - L2_ORDER)

r7 ends up as the low physical address of the first L2 entry covering
PAGE_OFFSET, and r6 as the last (inclusive) entry. Essentially, all we
do is add "offset" to every section mapping corresponding to addresses
between PAGE_OFFSET and _end.

We also do that for the boot data (two section mappings) at
FDT_FIXED_BASE, the four L1 entries, and the two TTBR physical address
register values.

> If it actually overwrites the kernel mapping with a new one, it will
> undo what lpae_pgtables_remap_asm() has done without applying
> the offset.
>
> So I suppose map_lowmem(), due to the memory ranges passed,
> would just bail out on the kernel mappings and leave them as-is.

By the time map_lowmem() gets called, the phys<->virt conversions will
have been modified throughout the kernel, so that virtual addresses in
the direct mapping translate to the high physical addresses. So,
something like __pa(__init_end) in map_kernel() will translate
__init_end to the high physical address rather than the low physical
address.

My guess would be that the problem is that kernel_sec_start and
kernel_sec_end are not being adjusted, and are still pointing at the
low physical addresses rather than the high ones. Consequently,
kernel_x_start and kernel_nx_end point at the low phys space, but
kernel_x_end and kernel_nx_start point to the high phys space.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-08-04 10:36             ` Russell King (Oracle)
@ 2021-08-04 13:12               ` Linus Walleij
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2021-08-04 13:12 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Grygorii Strashko, Nishanth Menon, santosh.shilimkar,
	Santosh Shilimkar, Linux ARM, Arnd Bergmann, Florian Fainelli,
	Geert Uytterhoeven, Ard Biesheuvel, Suman Anna, Tero Kristo

Thanks so much for explaining this, Russell!

On Wed, Aug 4, 2021 at 12:37 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:

> My guess would be that the problem is that kernel_sec_start and
> kernel_sec_end are not being adjusted, and are still pointing at
> the low physical addresses rather than the high ones.
> Consequently, kernel_x_start and kernel_nx_end point at the low phys
> space, but kernel_x_end and kernel_nx_start point to the high phys
> space.

I bet this is right; I will make a patch for Nishanth to test!
Thanks a *lot* for your elaborate explanation.

Yours,
Linus Walleij
end of thread, other threads:[~2021-08-04 13:15 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-17 14:57 [PATCH 0/3] Split lowmem and kernel mappings Linus Walleij
2021-05-17 14:57 ` [PATCH 1/3] ARM: Split KERNEL_OFFSET from PAGE_OFFSET Linus Walleij
2021-05-17 14:57 ` [PATCH 2/3] ARM: Define kernel physical section start and end Linus Walleij
2021-05-17 14:57 ` [PATCH 3/3] ARM: Map the lowmem and kernel separately Linus Walleij
2021-07-30 20:22   ` Nishanth Menon
2021-07-30 22:40     ` Linus Walleij
2021-08-03 10:27       ` Grygorii Strashko
2021-08-03 10:49         ` Russell King (Oracle)
2021-08-03 11:34           ` Linus Walleij
2021-08-04 10:05             ` Linus Walleij
2021-08-04 10:36               ` Russell King (Oracle)
2021-08-04 13:12                 ` Linus Walleij