linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/3] Split lowmem and kernel mappings
@ 2021-05-17 14:57 Linus Walleij
  2021-05-17 14:57 ` [PATCH 1/3] ARM: Split KERNEL_OFFSET from PAGE_OFFSET Linus Walleij
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Linus Walleij @ 2021-05-17 14:57 UTC (permalink / raw)
  To: linux-arm-kernel, Russell King, Arnd Bergmann
  Cc: Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Linus Walleij

This set of three patches is the beginning of an idea from
Arnd: to move the kernel image (both its executable and
non-executable sections) into the VMALLOC area.

To get there we first need to pry apart the section
mappings of lowmem and the kernel.

This series:

- Creates a specific KERNEL_OFFSET and removes the implied
  assumption that the kernel will always start at PAGE_OFFSET.

- Saves the kernel physical section start and end addresses
  from the early assembly-coded section mappings so we can
  reuse these later without having to rely on
  virtual-to-physical mappings.

- Splits the map_lowmem() function into two: map_lowmem()
  which maps the lowmem proper and map_kernel() that maps
  the kernel. The split is nice (IMO) because the functions
  become cleaner and easier to read. It is possible that
  the refactoring of map_lowmem() also avoids a few
  pitfalls from the earlier code.

I have a prototype patch on top of this that actually moves
the kernel into the VMALLOC area at 0xf1000000, but I want to
keep things apart to avoid too much discussion and confusion.
One thing at a time. I think these patches are fine in their
own right, but since I wrote them I need others to think the
same.

If you want to see the prototype patches, they are here:
https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator.git/log/?h=kernel-in-vmalloc-v5.13-rc1

Linus Walleij (3):
  ARM: Split KERNEL_OFFSET from PAGE_OFFSET
  ARM: Define kernel physical section start and end
  ARM: Map the lowmem and kernel separately

 arch/arm/include/asm/memory.h |  15 +++-
 arch/arm/kernel/head.S        |  30 +++++--
 arch/arm/kernel/vmlinux.lds.S |   2 +-
 arch/arm/mm/mmu.c             | 144 ++++++++++++++++++++++++++--------
 4 files changed, 151 insertions(+), 40 deletions(-)

-- 
2.31.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] ARM: Split KERNEL_OFFSET from PAGE_OFFSET
  2021-05-17 14:57 [PATCH 0/3] Split lowmem and kernel mappings Linus Walleij
@ 2021-05-17 14:57 ` Linus Walleij
  2021-05-17 14:57 ` [PATCH 2/3] ARM: Define kernel physical section start and end Linus Walleij
  2021-05-17 14:57 ` [PATCH 3/3] ARM: Map the lowmem and kernel separately Linus Walleij
  2 siblings, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2021-05-17 14:57 UTC (permalink / raw)
  To: linux-arm-kernel, Russell King, Arnd Bergmann
  Cc: Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Linus Walleij

We want to be able to link the kernel at an address different
from PAGE_OFFSET (the start of lowmem) + TEXT_OFFSET, so start
to pry apart the address where the kernel is located from the
address where lowmem is located, by defining and using
KERNEL_OFFSET in a few key places.
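As an illustrative sketch (not part of the patch): this is how KERNEL_OFFSET could later diverge from PAGE_OFFSET. The KERNEL_IN_VMALLOC switch and the 0xf1000000 address are hypothetical, borrowed from the prototype mentioned in the cover letter:

```c
/* Illustrative sketch only: how KERNEL_OFFSET can diverge from
 * PAGE_OFFSET once the kernel image moves into the VMALLOC area.
 * KERNEL_IN_VMALLOC is a made-up switch; this patch by itself keeps
 * KERNEL_OFFSET == PAGE_OFFSET. */
#define PAGE_OFFSET	0xc0000000UL	/* start of lowmem (3G/1G split) */
#ifdef KERNEL_IN_VMALLOC
#define KERNEL_OFFSET	0xf1000000UL	/* kernel inside the VMALLOC area */
#else
#define KERNEL_OFFSET	PAGE_OFFSET	/* today: kernel at lowmem start */
#endif
#define TEXT_OFFSET	0x00008000UL
/* head.S requires the low 16 bits of this to be 0x8000 */
#define KERNEL_RAM_VADDR	(KERNEL_OFFSET + TEXT_OFFSET)
```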

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 arch/arm/include/asm/memory.h | 8 +++++++-
 arch/arm/kernel/head.S        | 3 +--
 arch/arm/kernel/vmlinux.lds.S | 2 +-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index a711322d9f40..a5f1d1826a98 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -20,8 +20,14 @@
 #endif
 #include <asm/kasan_def.h>
 
-/* PAGE_OFFSET - the virtual address of the start of the kernel image */
+/*
+ * PAGE_OFFSET: the virtual address of the start of lowmem, memory above
+ *   the virtual address range for userspace.
+ * KERNEL_OFFSET: the virtual address of the start of the kernel image.
+ *   we may further offset this with TEXT_OFFSET in practice.
+ */
 #define PAGE_OFFSET		UL(CONFIG_PAGE_OFFSET)
+#define KERNEL_OFFSET		(PAGE_OFFSET)
 
 #ifdef CONFIG_MMU
 
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 7f62c5eccdf3..4e2daaa7636a 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -23,7 +23,6 @@
 #if defined(CONFIG_DEBUG_LL) && !defined(CONFIG_DEBUG_SEMIHOSTING)
 #include CONFIG_DEBUG_LL_INCLUDE
 #endif
-
 /*
  * swapper_pg_dir is the virtual address of the initial page table.
  * We place the page tables 16K below KERNEL_RAM_VADDR.  Therefore, we must
@@ -31,7 +30,7 @@
  * the least significant 16 bits to be 0x8000, but we could probably
  * relax this restriction to KERNEL_RAM_VADDR >= PAGE_OFFSET + 0x4000.
  */
-#define KERNEL_RAM_VADDR	(PAGE_OFFSET + TEXT_OFFSET)
+#define KERNEL_RAM_VADDR	(KERNEL_OFFSET + TEXT_OFFSET)
 #if (KERNEL_RAM_VADDR & 0xffff) != 0x8000
 #error KERNEL_RAM_VADDR must start at 0xXXXX8000
 #endif
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index f7f4620d59c3..20c4f6d20c7a 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -47,7 +47,7 @@ SECTIONS
 #endif
 	}
 
-	. = PAGE_OFFSET + TEXT_OFFSET;
+	. = KERNEL_OFFSET + TEXT_OFFSET;
 	.head.text : {
 		_text = .;
 		HEAD_TEXT
-- 
2.31.1



* [PATCH 2/3] ARM: Define kernel physical section start and end
  2021-05-17 14:57 [PATCH 0/3] Split lowmem and kernel mappings Linus Walleij
  2021-05-17 14:57 ` [PATCH 1/3] ARM: Split KERNEL_OFFSET from PAGE_OFFSET Linus Walleij
@ 2021-05-17 14:57 ` Linus Walleij
  2021-05-17 14:57 ` [PATCH 3/3] ARM: Map the lowmem and kernel separately Linus Walleij
  2 siblings, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2021-05-17 14:57 UTC (permalink / raw)
  To: linux-arm-kernel, Russell King, Arnd Bergmann
  Cc: Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Linus Walleij

When we map the initial sections in head.S we know very well
where the start and end of the kernel image are placed in
physical memory. Later on this becomes hard to determine.

Save this information into two variables, kernel_sec_start and
kernel_sec_end, for convenience in later work involving the
physical start and end of the kernel. These variables are
section-aligned, corresponding to the early section mappings
set up in head.S.
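As a hedged sketch of the alignment involved (the SECTION_SIZE value and helpers below are illustrative, not the kernel's own definitions; classic 2-level tables use 1 MiB sections, while the comment in memory.h speaks of the 2 MiB granularity of the paired entries written by head.S):

```c
#include <stdint.h>

/* Sketch of what "section-aligned" means for kernel_sec_start/end.
 * SECTION_SIZE here is the 1 MiB non-LPAE example; not kernel code. */
typedef uint64_t phys_addr_t;

#define SECTION_SIZE	0x100000ULL	/* 1 MiB section */

static phys_addr_t section_round_down(phys_addr_t addr)
{
	return addr & ~(SECTION_SIZE - 1);
}

static phys_addr_t section_round_up(phys_addr_t addr)
{
	return (addr + SECTION_SIZE - 1) & ~(SECTION_SIZE - 1);
}
```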

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 arch/arm/include/asm/memory.h |  7 +++++++
 arch/arm/kernel/head.S        | 27 ++++++++++++++++++++++++---
 2 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index a5f1d1826a98..ca213d7d095f 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -158,6 +158,13 @@ extern unsigned long vectors_base;
 
 #ifndef __ASSEMBLY__
 
+/*
+ * Physical start and end address of the kernel sections. These addresses are
+ * 2MB-aligned to match the section mappings placed over the kernel.
+ */
+extern phys_addr_t kernel_sec_start;
+extern phys_addr_t kernel_sec_end;
+
 /*
  * Physical vs virtual RAM address space conversion.  These are
  * private definitions which should NOT be used outside memory.h
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 4e2daaa7636a..9eb0b4dbcc12 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -47,6 +47,20 @@
 	.globl	swapper_pg_dir
 	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - PG_DIR_SIZE
 
+	/*
+	 * This needs to be assigned at runtime when the linker symbols are
+	 * resolved.
+	 */
+	.pushsection .data
+	.align	2
+	.globl	kernel_sec_start
+	.globl	kernel_sec_end
+kernel_sec_start:
+	.long	0
+kernel_sec_end:
+	.long	0
+	.popsection
+
 	.macro	pgtbl, rd, phys
 	add	\rd, \phys, #TEXT_OFFSET
 	sub	\rd, \rd, #PG_DIR_SIZE
@@ -229,16 +243,23 @@ __create_page_tables:
 	blo	1b
 
 	/*
-	 * Map our RAM from the start to the end of the kernel .bss section.
+	 * The main matter: map in the kernel using section mappings, and
+	 * set two variables to indicate the physical start and end of the
+	 * kernel.
 	 */
-	add	r0, r4, #PAGE_OFFSET >> (SECTION_SHIFT - PMD_ORDER)
+	add	r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER)
 	ldr	r6, =(_end - 1)
-	orr	r3, r8, r7
+	adr_l	r5, kernel_sec_start		@ _pa(kernel_sec_start)
+	str	r8, [r5]			@ Save physical start of kernel
+	orr	r3, r8, r7			@ Add the MMU flags
 	add	r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
 1:	str	r3, [r0], #1 << PMD_ORDER
 	add	r3, r3, #1 << SECTION_SHIFT
 	cmp	r0, r6
 	bls	1b
+	eor	r3, r3, r7			@ Remove the MMU flags
+	adr_l	r5, kernel_sec_end		@ _pa(kernel_sec_end)
+	str	r3, [r5]			@ Save physical end of kernel
 
 #ifdef CONFIG_XIP_KERNEL
 	/*
-- 
2.31.1



* [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-05-17 14:57 [PATCH 0/3] Split lowmem and kernel mappings Linus Walleij
  2021-05-17 14:57 ` [PATCH 1/3] ARM: Split KERNEL_OFFSET from PAGE_OFFSET Linus Walleij
  2021-05-17 14:57 ` [PATCH 2/3] ARM: Define kernel physical section start and end Linus Walleij
@ 2021-05-17 14:57 ` Linus Walleij
  2021-07-30 20:22   ` Nishanth Menon
  2 siblings, 1 reply; 12+ messages in thread
From: Linus Walleij @ 2021-05-17 14:57 UTC (permalink / raw)
  To: linux-arm-kernel, Russell King, Arnd Bergmann
  Cc: Florian Fainelli, Geert Uytterhoeven, Ard Biesheuvel, Linus Walleij

Using our knowledge of where the physical kernel sections start
and end we can split mapping of lowmem and kernel apart.

This is helpful when you want to place the kernel independently
of lowmem, rather than being limited to lowmem only: for example
in the VMALLOC area.

We extensively rewrite the lowmem mapping code to account for
all the cases where the kernel image overlaps with lowmem in
different ways. This helps handle situations which occur when
the kernel is loaded in different places, and makes it possible
to place the kernel more randomly, as is done with e.g. KASLR.

We sprinkle comments with illustrations and pr_debug() calls
over the code so it is evident to readers what is happening.

We now use kernel_sec_start and kernel_sec_end instead of
relying on __pa() (virt_to_phys) to provide this. This is
helpful if we want to resolve physical-to-virtual and
virtual-to-physical mappings at runtime rather than at compile
time, especially if we are not using runtime phys-to-virt
patching (CONFIG_ARM_PATCH_PHYS_VIRT).
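For illustration, the overlap cases can be expressed as a standalone classifier. The numbering follows the diagram in the patch below, but this helper is a sketch, not kernel code:

```c
/* Classify how a memory range [start, end) overlaps the section-mapped
 * kernel [ks, ke). Case numbers follow the diagram in the patch;
 * 0 means no overlap. Standalone sketch, not kernel code. */
static int overlap_case(unsigned long start, unsigned long end,
			unsigned long ks, unsigned long ke)
{
	if (ke <= start || ks >= end)
		return 0;	/* no overlap: map the range as-is */
	if (ks <= start && ke >= end)
		return 5;	/* case 5: kernel covers the whole range */
	if (ks > start && ke < end)
		return 6;	/* case 6: kernel embedded, map two pieces */
	if (ks == start)
		return 1;	/* case 1: same start, map the tail */
	if (ke == end)
		return 3;	/* case 3: same end, map the head */
	if (ks < start)
		return 2;	/* case 2: kernel ends inside the range */
	return 4;		/* case 4: kernel starts inside the range */
}
```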

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 arch/arm/mm/mmu.c | 144 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 111 insertions(+), 33 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index c1e12aab67b8..ac37959507b7 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1457,8 +1457,6 @@ static void __init kmap_init(void)
 
 static void __init map_lowmem(void)
 {
-	phys_addr_t kernel_x_start = round_down(__pa(KERNEL_START), SECTION_SIZE);
-	phys_addr_t kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE);
 	phys_addr_t start, end;
 	u64 i;
 
@@ -1466,55 +1464,126 @@ static void __init map_lowmem(void)
 	for_each_mem_range(i, &start, &end) {
 		struct map_desc map;
 
+		pr_debug("map lowmem start: 0x%08llx, end: 0x%08llx\n",
+			 (long long)start, (long long)end);
 		if (end > arm_lowmem_limit)
 			end = arm_lowmem_limit;
 		if (start >= end)
 			break;
 
-		if (end < kernel_x_start) {
-			map.pfn = __phys_to_pfn(start);
-			map.virtual = __phys_to_virt(start);
-			map.length = end - start;
-			map.type = MT_MEMORY_RWX;
+		/*
+		 * If our kernel image is in the VMALLOC area we need to remove
+		 * the kernel physical memory from lowmem since the kernel will
+		 * be mapped separately.
+		 *
+		 * The kernel will typically be at the very start of lowmem,
+		 * but any placement relative to memory ranges is possible.
+		 *
+		 * If the memblock contains the kernel, we have to chisel out
+		 * the kernel memory from it and map each part separately. We
+		 * get 6 different theoretical cases:
+		 *
+		 *                            +--------+ +--------+
+		 *  +-- start --+  +--------+ | Kernel | | Kernel |
+		 *  |           |  | Kernel | | case 2 | | case 5 |
+		 *  |           |  | case 1 | +--------+ |        | +--------+
+		 *  |  Memory   |  +--------+            |        | | Kernel |
+		 *  |  range    |  +--------+            |        | | case 6 |
+		 *  |           |  | Kernel | +--------+ |        | +--------+
+		 *  |           |  | case 3 | | Kernel | |        |
+		 *  +-- end ----+  +--------+ | case 4 | |        |
+		 *                            +--------+ +--------+
+		 */
 
-			create_mapping(&map);
-		} else if (start >= kernel_x_end) {
-			map.pfn = __phys_to_pfn(start);
-			map.virtual = __phys_to_virt(start);
-			map.length = end - start;
-			map.type = MT_MEMORY_RW;
+		/* Case 5: kernel covers range, don't map anything, should be rare */
+		if ((start > kernel_sec_start) && (end < kernel_sec_end))
+			break;
 
-			create_mapping(&map);
-		} else {
-			/* This better cover the entire kernel */
-			if (start < kernel_x_start) {
+		/* Cases where the kernel is starting inside the range */
+		if ((kernel_sec_start >= start) && (kernel_sec_start <= end)) {
+			/* Case 6: kernel is embedded in the range, we need two mappings */
+			if ((start < kernel_sec_start) && (end > kernel_sec_end)) {
+				/* Map memory below the kernel */
 				map.pfn = __phys_to_pfn(start);
 				map.virtual = __phys_to_virt(start);
-				map.length = kernel_x_start - start;
+				map.length = kernel_sec_start - start;
 				map.type = MT_MEMORY_RW;
-
 				create_mapping(&map);
-			}
-
-			map.pfn = __phys_to_pfn(kernel_x_start);
-			map.virtual = __phys_to_virt(kernel_x_start);
-			map.length = kernel_x_end - kernel_x_start;
-			map.type = MT_MEMORY_RWX;
-
-			create_mapping(&map);
-
-			if (kernel_x_end < end) {
-				map.pfn = __phys_to_pfn(kernel_x_end);
-				map.virtual = __phys_to_virt(kernel_x_end);
-				map.length = end - kernel_x_end;
+				/* Map memory above the kernel */
+				map.pfn = __phys_to_pfn(kernel_sec_end);
+				map.virtual = __phys_to_virt(kernel_sec_end);
+				map.length = end - kernel_sec_end;
 				map.type = MT_MEMORY_RW;
-
 				create_mapping(&map);
+				break;
 			}
+			/* Case 1: kernel and range start at the same address, should be common */
+			if (kernel_sec_start == start)
+				start = kernel_sec_end;
+			/* Case 3: kernel and range end at the same address, should be rare */
+			if (kernel_sec_end == end)
+				end = kernel_sec_start;
+		} else if ((kernel_sec_start < start) && (kernel_sec_end > start) && (kernel_sec_end < end)) {
+			/* Case 2: kernel ends inside range, starts below it */
+			start = kernel_sec_end;
+		} else if ((kernel_sec_start > start) && (kernel_sec_start < end) && (kernel_sec_end > end)) {
+			/* Case 4: kernel starts inside range, ends above it */
+			end = kernel_sec_start;
 		}
+		map.pfn = __phys_to_pfn(start);
+		map.virtual = __phys_to_virt(start);
+		map.length = end - start;
+		map.type = MT_MEMORY_RW;
+		create_mapping(&map);
 	}
 }
 
+static void __init map_kernel(void)
+{
+	/*
+	 * We use the well known kernel section start and end and split the area in the
+	 * middle like this:
+	 *  .                .
+	 *  | RW memory      |
+	 *  +----------------+ kernel_x_start
+	 *  | Executable     |
+	 *  | kernel memory  |
+	 *  +----------------+ kernel_x_end / kernel_nx_start
+	 *  | Non-executable |
+	 *  | kernel memory  |
+	 *  +----------------+ kernel_nx_end
+	 *  | RW memory      |
+	 *  .                .
+	 *
+	 * Notice that we are dealing with section sized mappings here so all of this
+	 * will be bumped to the closest section boundary. This means that some of the
+	 * non-executable part of the kernel memory is actually mapped as executable.
+	 * This will only persist until we turn on proper memory management later on
+	 * and we remap the whole kernel with page granularity.
+	 */
+	phys_addr_t kernel_x_start = kernel_sec_start;
+	phys_addr_t kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE);
+	phys_addr_t kernel_nx_start = kernel_x_end;
+	phys_addr_t kernel_nx_end = kernel_sec_end;
+	struct map_desc map;
+
+	map.pfn = __phys_to_pfn(kernel_x_start);
+	map.virtual = __phys_to_virt(kernel_x_start);
+	map.length = kernel_x_end - kernel_x_start;
+	map.type = MT_MEMORY_RWX;
+	create_mapping(&map);
+
+	/* If the nx part is small it may end up covered by the tail of the RWX section */
+	if (kernel_x_end == kernel_nx_end)
+		return;
+
+	map.pfn = __phys_to_pfn(kernel_nx_start);
+	map.virtual = __phys_to_virt(kernel_nx_start);
+	map.length = kernel_nx_end - kernel_nx_start;
+	map.type = MT_MEMORY_RW;
+	create_mapping(&map);
+}
+
 #ifdef CONFIG_ARM_PV_FIXUP
 typedef void pgtables_remap(long long offset, unsigned long pgd);
 pgtables_remap lpae_pgtables_remap_asm;
@@ -1645,9 +1714,18 @@ void __init paging_init(const struct machine_desc *mdesc)
 {
 	void *zero_page;
 
+	pr_debug("physical kernel sections: 0x%08x-0x%08x\n",
+		 kernel_sec_start, kernel_sec_end);
+
 	prepare_page_table();
 	map_lowmem();
 	memblock_set_current_limit(arm_lowmem_limit);
+	pr_debug("lowmem limit is %08llx\n", (long long)arm_lowmem_limit);
+	/*
+	 * After this point early_alloc(), i.e. the memblock allocator, can
+	 * be used
+	 */
+	map_kernel();
 	dma_contiguous_remap();
 	early_fixmap_shutdown();
 	devicemaps_init(mdesc);
-- 
2.31.1



* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-05-17 14:57 ` [PATCH 3/3] ARM: Map the lowmem and kernel separately Linus Walleij
@ 2021-07-30 20:22   ` Nishanth Menon
  2021-07-30 22:40     ` Linus Walleij
  0 siblings, 1 reply; 12+ messages in thread
From: Nishanth Menon @ 2021-07-30 20:22 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Florian Fainelli, Arnd Bergmann, Geert Uytterhoeven,
	santosh.shilimkar, Russell King, Ard Biesheuvel,
	linux-arm-kernel

On 16:57-20210517, Linus Walleij wrote:
> Using our knowledge of where the physical kernel sections start
> and end we can split mapping of lowmem and kernel apart.
> 
[...]
> 
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>


We noticed a regression in k2g platforms with this patch.
commit 6e121df14ccd ("ARM: 9090/1: Map the lowmem and kernel
separately")

Boot log: https://pastebin.ubuntu.com/p/Sf3r28D8MR/
Bisect log: https://pastebin.ubuntu.com/p/6PTYpNVFDy/

-- 
Regards,
Nishanth Menon
Key (0xDDB5849D1736249D) / Fingerprint: F8A2 8693 54EB 8232 17A3  1A34 DDB5 849D 1736 249D


* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-07-30 20:22   ` Nishanth Menon
@ 2021-07-30 22:40     ` Linus Walleij
  2021-08-03 10:27       ` Grygorii Strashko
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Walleij @ 2021-07-30 22:40 UTC (permalink / raw)
  To: Nishanth Menon, santosh.shilimkar, Grygorii Strashko, Santosh Shilimkar
  Cc: Florian Fainelli, Arnd Bergmann, Geert Uytterhoeven,
	Russell King, Ard Biesheuvel, Linux ARM

On Fri, Jul 30, 2021 at 10:23 PM Nishanth Menon <nm@ti.com> wrote:
> On 16:57-20210517, Linus Walleij wrote:
> > Using our knowledge of where the physical kernel sections start
> > and end we can split mapping of lowmem and kernel apart.
> >
> [...]
> >
> > Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
>
>
> We noticed a regression in k2g platforms with this patch.
> commit 6e121df14ccd ("ARM: 9090/1: Map the lowmem and kernel
> separately")
>
> Boot log: https://pastebin.ubuntu.com/p/Sf3r28D8MR/
> Bisect log: https://pastebin.ubuntu.com/p/6PTYpNVFDy/

Given how invasive the patch is I'm surprised that k2g is the
only thing so far complaining about it. Let's fix this!

I've had similar reports and in those cases it has been that the boot
loader did not turn off the caches before jumping to execute the
Linux kernel.

So first can you check that cache is off?

Next can you turn on DEBUG_LL and enable the debug messages
in arch/arm/mm/mmu.c in map_lowmem() in some way that works for
you, for example I just add a patch like this:
https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator.git/patch/?id=d10c46631a6eaef1339e376b878de985143b27ff
(Beware of ugly debug habits!)

Next: is this using the Keystone 2 pv fixup, i.e. mdesc->pv_fixup
in early_paging_init()?

I added a special print
there to see if that gets called. This code is really scary but I
suppose Santosh can explain what is supposed to happen here
and what I may have stepped on. (I see Santosh work at Oracle
now? Interesting.)

Yours,
Linus Walleij


* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-07-30 22:40     ` Linus Walleij
@ 2021-08-03 10:27       ` Grygorii Strashko
  2021-08-03 10:49         ` Russell King (Oracle)
  0 siblings, 1 reply; 12+ messages in thread
From: Grygorii Strashko @ 2021-08-03 10:27 UTC (permalink / raw)
  To: Linus Walleij, Nishanth Menon, santosh.shilimkar, Santosh Shilimkar
  Cc: Florian Fainelli, Arnd Bergmann, Geert Uytterhoeven,
	Russell King, Tero Kristo, Ard Biesheuvel, Linux ARM



On 31/07/2021 01:40, Linus Walleij wrote:
> On Fri, Jul 30, 2021 at 10:23 PM Nishanth Menon <nm@ti.com> wrote:
>> On 16:57-20210517, Linus Walleij wrote:
>>> Using our knowledge of where the physical kernel sections start
>>> and end we can split mapping of lowmem and kernel apart.
>>>
>> [...]
>>>
>>> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
>>
>>
>> We noticed a regression in k2g platforms with this patch.
>> commit 6e121df14ccd ("ARM: 9090/1: Map the lowmem and kernel
>> separately")
>>
>> Boot log: https://pastebin.ubuntu.com/p/Sf3r28D8MR/
>> Bisect log: https://pastebin.ubuntu.com/p/6PTYpNVFDy/
> 
> Given how invasive the patch is I'm surprised that k2g is the
> only thing so far complaining about it. Let's fix this!
> 
> I've had similar reports and in those cases it has been that the boot
> loader did not turn off the caches before jumping to execute the
> Linux kernel.
> 
> So first can you check that cache is off?
> 
> Next can you turn on DEBUG_LL and enable the debug messages
> in arch/arm/mm/mmu.c in map_lowmem() in some way that works for
> you, for example I just add a patch like this:
> https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator.git/patch/?id=d10c46631a6eaef1339e376b878de985143b27ff
> (Beware of ugly debug habits!)
> 
> Next: is this using the Keystone 2 pv fixup, i.e. mdesc->pv_fixup
> in early_paging_init()?
> 
> I added a special print
> there to see if that gets called. This code is really scary but I
> suppose Santosh can explain what is supposed to happen here
> and what I may have stepped on. (I see Santosh work at Oracle
> now? Interesting.)

It should be called on all Keystone 2 platforms by default (LPAE is the default mode for K2).

Huh. Here is how I remember it:
- K2 starts out using the aliased memory window at 0x0 8000 0000 (no IO coherency supported).
- Early at boot, K2 switches to the LPAE memory window at 0x8 0000 0000 (IO coherency supported),
  so the physical addresses of all already-mapped memory have to be fixed up.
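For concreteness, the two RAM windows and the resulting fixup offset can be sketched as follows (base addresses as given above; the helper is purely illustrative, not kernel code):

```c
#include <stdint.h>

/* Keystone 2 RAM aliases: a 32-bit window at 0x0_8000_0000
 * (non-coherent) and the LPAE window at 0x8_0000_0000 (IO-coherent).
 * The pv fixup adds the difference to every section mapping created
 * so far. Illustrative sketch only. */
#define K2_ALIASED_BASE	0x080000000ULL
#define K2_LPAE_BASE	0x800000000ULL

static uint64_t k2_pv_offset(void)
{
	return K2_LPAE_BASE - K2_ALIASED_BASE;
}
```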


-- 
Best regards,
grygorii


* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-08-03 10:27       ` Grygorii Strashko
@ 2021-08-03 10:49         ` Russell King (Oracle)
  2021-08-03 11:34           ` Linus Walleij
  2021-08-04 10:05           ` Linus Walleij
  0 siblings, 2 replies; 12+ messages in thread
From: Russell King (Oracle) @ 2021-08-03 10:49 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Nishanth Menon, Florian Fainelli, Geert Uytterhoeven,
	Arnd Bergmann, Linus Walleij, santosh.shilimkar, Tero Kristo,
	Santosh Shilimkar, Ard Biesheuvel, Linux ARM

On Tue, Aug 03, 2021 at 01:27:18PM +0300, Grygorii Strashko wrote:
> 
> 
> On 31/07/2021 01:40, Linus Walleij wrote:
> > On Fri, Jul 30, 2021 at 10:23 PM Nishanth Menon <nm@ti.com> wrote:
> > > On 16:57-20210517, Linus Walleij wrote:
> > > > Using our knowledge of where the physical kernel sections start
> > > > and end we can split mapping of lowmem and kernel apart.
> > > > 
> > > [...]
> > > > 
> > > > Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> > > 
> > > 
> > > We noticed a regression in k2g platforms with this patch.
> > > commit 6e121df14ccd ("ARM: 9090/1: Map the lowmem and kernel
> > > separately")
> > > 
> > > Boot log: https://pastebin.ubuntu.com/p/Sf3r28D8MR/
> > > Bisect log: https://pastebin.ubuntu.com/p/6PTYpNVFDy/
> > 
> > Given how invasive the patch is I'm surprised that k2g is the
> > only thing so far complaining about it. Let's fix this!
> > 
> > I've had similar reports and in those cases it has been that the boot
> > loader did not turn off the caches before jumping to execute the
> > Linux kernel.
> > 
> > So first can you check that cache is off?
> > 
> > Next can you turn on DEBUG_LL and enable the debug messages
> > in arch/arm/mm/mmu.c in map_lowmem() in some way that works for
> > you, for example I just add a patch like this:
> > https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-integrator.git/patch/?id=d10c46631a6eaef1339e376b878de985143b27ff
> > (Beware of ugly debug habits!)
> > 
> > Next: is this using the Keystone 2 pv fixup, i.e. mdesc->pv_fixup
> > in early_paging_init()?
> > 
> > I added a special print
> > there to see if that gets called. This code is really scary but I
> > suppose Santosh can explain what is supposed to happen here
> > and what I may have stepped on. (I see Santosh work at Oracle
> > now? Interesting.)
> 
> It should be called on all Keystone 2 platforms by default (LPAE is the default mode for K2).
> 
> Huh. Here is how I remember it:
> - K2 starts out using the aliased memory window at 0x0 8000 0000 (no IO coherency supported).
> - Early at boot, K2 switches to the LPAE memory window at 0x8 0000 0000 (IO coherency supported),
>   so the physical addresses of all already-mapped memory have to be fixed up.

If I remember correctly, the code that fixes that up assumes that
(a) the kernel is mapped using section mappings in the lowmem mapping
    at PAGE_OFFSET.._end-1
(b) that no other RAM is mapped (iow, it's called before the kernel
    starts building the real page mappings in paging_init()).
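(A rough C model of the lowmem walk assumed in (a); the real fixup is the assembly in arch/arm/mm/pv-fixup-asm.S, so this is only an illustration:)

```c
#include <stdint.h>

/* Model of assumption (a): the fixup visits the section entries
 * covering PAGE_OFFSET.._end-1 and adds the new physical offset to
 * each. If the kernel is no longer mapped there, those slots hold
 * something else and get corrupted instead. Illustrative only. */
#define SECTION_SHIFT	21	/* 2 MiB sections with LPAE */

static void pv_fixup_model(uint64_t *pmd, uint64_t va_start,
			   uint64_t va_end, uint64_t pv_offset)
{
	uint64_t va;

	for (va = va_start; va < va_end; va += 1ULL << SECTION_SHIFT)
		pmd[(va - va_start) >> SECTION_SHIFT] += pv_offset;
}
```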

It looks to me like Linus missed the code in arch/arm/mm/pv-fixup-asm.S
and as the kernel is no longer mapped in the lowmem mapping, this
likely writes a load of entries in the page tables that are random...

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-08-03 10:49         ` Russell King (Oracle)
@ 2021-08-03 11:34           ` Linus Walleij
  2021-08-04 10:05           ` Linus Walleij
  1 sibling, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2021-08-03 11:34 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Nishanth Menon, Grygorii Strashko, Geert Uytterhoeven,
	Arnd Bergmann, Tero Kristo, santosh.shilimkar, Florian Fainelli,
	Santosh Shilimkar, Ard Biesheuvel, Linux ARM

On Tue, Aug 3, 2021 at 12:49 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:

> It looks to me like Linus missed the code in arch/arm/mm/pv-fixup-asm.S
> and as the kernel is no longer mapped in the lowmem mapping, this
> likely writes a load of entries in the page tables that are random...

Oops. I'll do my homework. I'll read this code and see if I can fix this.

Yours,
Linus Walleij


* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-08-03 10:49         ` Russell King (Oracle)
  2021-08-03 11:34           ` Linus Walleij
@ 2021-08-04 10:05           ` Linus Walleij
  2021-08-04 10:36             ` Russell King (Oracle)
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Walleij @ 2021-08-04 10:05 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Grygorii Strashko, Nishanth Menon, santosh.shilimkar,
	Santosh Shilimkar, Linux ARM, Arnd Bergmann, Florian Fainelli,
	Geert Uytterhoeven, Ard Biesheuvel, Suman Anna, Tero Kristo

On Tue, Aug 3, 2021 at 12:49 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:

> > It should be called on all Keystone 2 platforms by default (LPAE is the default mode for K2).
> >
> > Huh. Here is how I remember it:
> > - K2 starts out using the aliased memory window at 0x0 8000 0000 (no IO coherency supported).
> > - Early at boot, K2 switches to the LPAE memory window at 0x8 0000 0000 (IO coherency supported),
> >   so the physical addresses of all already-mapped memory have to be fixed up.
>
> If I remember correctly, the code that fixes that up assumes that
> (a) the kernel is mapped using section mappings in the lowmem mapping
>     at PAGE_OFFSET.._end-1
> (b) that no other RAM is mapped (iow, it's called before the kernel
>     starts building the real page mappings in paging_init()).
>
> It looks to me like Linus missed the code in arch/arm/mm/pv-fixup-asm.S
> and as the kernel is no longer mapped in the lowmem mapping, this
> likely writes a load of entries in the page tables that are random...

I looked into this!

Indeed, early_mm_init() is called before paging_init(), and it
calls lpae_pgtables_remap_asm() in pv-fixup-asm.S to add the
offset to all the section mappings over the kernel.

The section mappings we have at this point are coming from head.S,
so those get modified with the offset.

What I don't understand is what map_lowmem() was doing with the
kernel mappings on the Keystone 2 before my patch.

If it actually overwrites the kernel mapping with a new one, it will
undo what lpae_pgtables_remap_asm() has done without applying
the offset.

So I suppose map_lowmem(), given the memory ranges passed,
would simply skip the kernel mappings and leave them as-is.

Nishanth: what happens if you simply comment out map_kernel
like this?

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 7583bda5ea7d..25aa7d05a931 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -1727,7 +1727,7 @@ void __init paging_init(const struct machine_desc *mdesc)
         * After this point early_alloc(), i.e. the memblock allocator, can
         * be used
         */
-       map_kernel();
+       // map_kernel();
        dma_contiguous_remap();
        early_fixmap_shutdown();
        devicemaps_init(mdesc);

Yours,
Linus Walleij

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-08-04 10:05           ` Linus Walleij
@ 2021-08-04 10:36             ` Russell King (Oracle)
  2021-08-04 13:12               ` Linus Walleij
  0 siblings, 1 reply; 12+ messages in thread
From: Russell King (Oracle) @ 2021-08-04 10:36 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Grygorii Strashko, Nishanth Menon, santosh.shilimkar,
	Santosh Shilimkar, Linux ARM, Arnd Bergmann, Florian Fainelli,
	Geert Uytterhoeven, Ard Biesheuvel, Suman Anna, Tero Kristo

On Wed, Aug 04, 2021 at 12:05:54PM +0200, Linus Walleij wrote:
> On Tue, Aug 3, 2021 at 12:49 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> 
> > > It should be called on all keystone 2 platforms by default (LPAE is the default mode for K2).
> > >
> > > Huh. Here is how I remember it:
> > > - K2 starts out using the aliased memory window at 0x0080000000 (no I/O coherency supported)
> > > - Early at boot, K2 switches to the LPAE memory window at 0x0800000000 (I/O coherency supported),
> > >   so the physical addresses in all already-mapped memory have to be fixed up.
> >
> > If I remember correctly, the code that fixes that up assumes that
> > (a) the kernel is mapped using section mappings in the lowmem mapping
> >     at PAGE_OFFSET.._end-1
> > (b) that no other RAM is mapped (iow, it's called before the kernel
> >     starts building the real page mappings in paging_init()).
> >
> > It looks to me like Linus missed the code in arch/arm/mm/pv-fixup-asm.S
> > and as the kernel is no longer mapped in the lowmem mapping, this
> > likely writes a load of entries in the page tables that are random...
> 
> I looked into this!
> 
> Indeed, early_mm_init() is called before paging_init(), and it calls
> lpae_pgtables_remap_asm() in pv-fixup-asm.S to add the offset to all
> the section mappings covering the kernel.
> 
> The section mappings we have at this point are coming from head.S,
> so those get modified with the offset.
> 
> What I don't understand is what map_lowmem() was doing with the
> kernel mappings on the Keystone 2 before my patch.

Essentially, what happened with Keystone 2 is that the head.S code
does its usual thing: it maps the kernel with section mappings at the
physical address the early code is running at, before switching the
MMU on and running from the virtual address space. At this point, the
phys<->virt conversions work for this physical address space (which,
on Keystone 2, is the low physical space that isn't coherent with
devices).

early_mm_init() gets called, we spot we're on Keystone 2, and we
initiate the change - the MMU needs a break-before-make for these
mappings, and as we're using these mappings to execute our code,
we have to turn the MMU off to do the update.

We update the phys<->virt conversion, and then fix up the page
tables. lpae_pgtables_remap() is passed the offset between the low
physical address and the high physical address, as well as the low
physical address of the page tables (since when we turn the MMU off,
all we have access to is the low physical mapping).

The structure of the LPAE page tables is:

- First 4k - L1 entries.
- Following 32k - L2 entries mapped as sections.

In pv-fixup-asm.S:

        ldr     r6, =(_end - 1)
        add     r7, r2, #0x1000
        add     r6, r7, r6, lsr #SECTION_SHIFT - L2_ORDER
        add     r7, r7, #PAGE_OFFSET >> (SECTION_SHIFT - L2_ORDER)

r7 is the (low physical) address of the first relevant L2 entry, and
r6 that of the last (inclusive) entry. Essentially, all we do is add
"offset" to every section mapping corresponding to addresses between
PAGE_OFFSET and _end.

We also do that for the boot data (two section mappings) at
FDT_FIXED_BASE, the four L1 entries, and the two TTBR physical
address register values.
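As a rough sketch, the loop that the r6/r7 arithmetic above sets up can
be modelled in C like this (the names, constants and flat table layout
here are illustrative, not the kernel's actual code):

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT 21            /* 2 MiB sections under LPAE */
#define PAGE_OFFSET   0xC0000000ULL

/*
 * pgtable points at the tables via the low physical mapping: the first
 * 4k holds the L1 entries, the following 32k the L2 section entries.
 * Add "offset" to every section entry covering PAGE_OFFSET.._end-1,
 * mirroring what pv-fixup-asm.S does with r6 and r7.
 */
static void fixup_kernel_sections(uint64_t *pgtable, uint64_t kernel_end,
                                  uint64_t offset)
{
	uint64_t *l2 = pgtable + 0x1000 / sizeof(*pgtable);
	uint64_t *p = l2 + (PAGE_OFFSET >> SECTION_SHIFT);         /* ~r7 */
	uint64_t *last = l2 + ((kernel_end - 1) >> SECTION_SHIFT); /* ~r6 */

	for (; p <= last; p++)
		*p += offset;  /* phys address field moves up by offset */
}
```

Each 64-bit descriptor's physical address field simply gets the
low-to-high delta added; everything else in the entry is untouched.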

> If it actually overwrites the kernel mapping with a new one, it will
> undo what lpae_pgtables_remap_asm() has done without applying
> the offset.
> 
> So I suppose map_lowmem(), due to the memory ranges passed,
> would just bail the kernel mappings and leave them as-is.

By the time map_lowmem() gets called, the phys<->virt mappings will
have been modified throughout the kernel so that virtual addresses
in the direct mapping will translate to the high physical addresses.
So, things like __pa(__init_end) in map_kernel() will translate
__init_end to the high physical address rather than the low physical
address.

My guess would be that the problem is that kernel_sec_start and
kernel_sec_end are not being adjusted, and are still pointing at
the low physical addresses rather than the high ones.
Consequently, kernel_x_start and kernel_nx_end point at the low phys
space, but kernel_x_end and kernel_nx_start point to the high phys
space.
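To make that suspected mismatch concrete, here is a toy model of the
direct-mapping translation with Keystone 2-like numbers (all names and
values here are illustrative): an address saved before the switch stays
in the low physical space, while anything computed via __pa() afterwards
lands in the high space, so a range built from one of each is nonsense.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000ULL

/* Direct-mapping translation: __pa(v) = v - PAGE_OFFSET + phys_base. */
static uint64_t pa(uint64_t virt, uint64_t phys_base)
{
	return virt - PAGE_OFFSET + phys_base;
}
```

With phys_base = 0x80000000 a value saved early (kernel_sec_start, say)
is ~0x80000000; recomputing the other end of the range after phys_base
becomes 0x800000000 yields ~0x8xxxxxxxx, so the two ends of the
"kernel" range no longer even lie in the same physical window.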

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: [PATCH 3/3] ARM: Map the lowmem and kernel separately
  2021-08-04 10:36             ` Russell King (Oracle)
@ 2021-08-04 13:12               ` Linus Walleij
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Walleij @ 2021-08-04 13:12 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Grygorii Strashko, Nishanth Menon, santosh.shilimkar,
	Santosh Shilimkar, Linux ARM, Arnd Bergmann, Florian Fainelli,
	Geert Uytterhoeven, Ard Biesheuvel, Suman Anna, Tero Kristo

Thanks so much for explaining this, Russell!

On Wed, Aug 4, 2021 at 12:37 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:

> My guess would be that the problem is that kernel_sec_start and
> kernel_sec_end are not being adjusted, and are still pointing at
> the low physical addresses rather than the high ones.
> Consequently, kernel_x_start and kernel_nx_end point at the low phys
> space, but kernel_x_end and kernel_nx_start point to the high phys
> space.

I bet this is right; I will make a patch for Nishanth to test!

Thanks a *lot* for your elaborate explanation.

Yours,
Linus Walleij


end of thread, other threads:[~2021-08-04 13:15 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-17 14:57 [PATCH 0/3] Split lowmem and kernel mappings Linus Walleij
2021-05-17 14:57 ` [PATCH 1/3] ARM: Split KERNEL_OFFSET from PAGE_OFFSET Linus Walleij
2021-05-17 14:57 ` [PATCH 2/3] ARM: Define kernel physical section start and end Linus Walleij
2021-05-17 14:57 ` [PATCH 3/3] ARM: Map the lowmem and kernel separately Linus Walleij
2021-07-30 20:22   ` Nishanth Menon
2021-07-30 22:40     ` Linus Walleij
2021-08-03 10:27       ` Grygorii Strashko
2021-08-03 10:49         ` Russell King (Oracle)
2021-08-03 11:34           ` Linus Walleij
2021-08-04 10:05           ` Linus Walleij
2021-08-04 10:36             ` Russell King (Oracle)
2021-08-04 13:12               ` Linus Walleij
