* [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
@ 2022-11-08 18:21 ` Ard Biesheuvel
  0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:21 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

The purpose of this series is to remove any explicit cache maintenance
for coherency during early boot that becomes unnecessary if we simply
retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
and use it to populate the ID map page tables. After setting up this
preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
TCR and SCTLR registers as before, and proceed as usual, avoiding the
need for any manipulations of memory while the MMU and caches are off.

The only properties of the firmware-provided 1:1 map we rely on are that
it does not require any explicit cache maintenance for coherency, and
that it covers the entire memory footprint of the image, including the
BSS and padding at the end - all else is under the control of the kernel
itself, as before.

Changes since v4:
- add patch to align the callers of finalise_el2()
- also clean HYP text to the PoC when booting at EL2 with the MMU on
- add a warning and a taint when doing non-EFI boot with the MMU and
  caches enabled
- rebase onto zboot changes in efi/next - this means that patches #6 and
  #7 will not apply onto arm64/for-next so a shared stable branch will
  be needed if we want to queue this up for v6.2

Changes since v3:
- drop the EFI_LOADER_CODE memory type patch that has been queued in the
  meantime
- rebased onto [partial] series that moves efi-entry.S into the libstub/
  source directory
- fixed a correctness issue in patch #2

Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>

Ard Biesheuvel (7):
  arm64: head: Move all finalise_el2 calls to after __enable_mmu
  arm64: kernel: move identity map out of .text mapping
  arm64: head: record the MMU state at primary entry
  arm64: head: avoid cache invalidation when entering with the MMU on
  arm64: head: Clean the ID map and the HYP text to the PoC if needed
  arm64: lds: reduce effective minimum image alignment to 64k
  efi: arm64: enter with MMU and caches enabled

 arch/arm64/include/asm/efi.h               |  9 +-
 arch/arm64/kernel/head.S                   | 93 +++++++++++++++-----
 arch/arm64/kernel/image-vars.h             |  5 +-
 arch/arm64/kernel/setup.c                  |  9 +-
 arch/arm64/kernel/sleep.S                  |  6 +-
 arch/arm64/kernel/vmlinux.lds.S            | 13 ++-
 arch/arm64/mm/cache.S                      |  5 +-
 arch/arm64/mm/proc.S                       |  2 -
 drivers/firmware/efi/libstub/Makefile      |  4 +-
 drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------
 drivers/firmware/efi/libstub/arm64-stub.c  | 26 ++++--
 drivers/firmware/efi/libstub/arm64.c       | 41 +++++++--
 include/linux/efi.h                        |  6 +-
 13 files changed, 159 insertions(+), 127 deletions(-)
 delete mode 100644 drivers/firmware/efi/libstub/arm64-entry.S

-- 
2.35.1


* [PATCH v5 1/7] arm64: head: Move all finalise_el2 calls to after __enable_mmu
  2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-08 18:21   ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:21 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

In the primary boot path, finalise_el2() is called much later than on
the secondary boot or resume-from-suspend paths, and this does not
appear to be intentional.

Since we aim to do as little as possible before enabling the MMU and
caches, align secondary and resume with primary boot, and defer the call
to after the MMU is turned on. This also removes the need to clean
finalise_el2() to the PoC once we enable support for booting with the
MMU on.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S  | 5 ++++-
 arch/arm64/kernel/sleep.S | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2196aad7b55bcef0..c59e0d95b44d0901 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -584,7 +584,6 @@ SYM_FUNC_START_LOCAL(secondary_startup)
 	 * Common entry point for secondary CPUs.
 	 */
 	mov	x20, x0				// preserve boot mode
-	bl	finalise_el2
 	bl	__cpu_secondary_check52bitva
 #if VA_BITS > 48
 	ldr_l	x0, vabits_actual
@@ -600,6 +599,10 @@ SYM_FUNC_END(secondary_startup)
 SYM_FUNC_START_LOCAL(__secondary_switched)
 	mov	x0, x20
 	bl	set_cpu_boot_mode_flag
+
+	mov	x0, x20
+	bl	finalise_el2
+
 	str_l	xzr, __early_cpu_boot_status, x3
 	adr_l	x5, vectors
 	msr	vbar_el1, x5
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 97c9de57725dfddb..7b7c56e048346e97 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -100,7 +100,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
 	.pushsection ".idmap.text", "awx"
 SYM_CODE_START(cpu_resume)
 	bl	init_kernel_el
-	bl	finalise_el2
+	mov	x19, x0			// preserve boot mode
 #if VA_BITS > 48
 	ldr_l	x0, vabits_actual
 #endif
@@ -116,6 +116,9 @@ SYM_CODE_END(cpu_resume)
 	.popsection
 
 SYM_FUNC_START(_cpu_resume)
+	mov	x0, x19
+	bl	finalise_el2
+
 	mrs	x1, mpidr_el1
 	adr_l	x8, mpidr_hash		// x8 = struct mpidr_hash virt address
 
-- 
2.35.1


* [PATCH v5 2/7] arm64: kernel: move identity map out of .text mapping
  2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-08 18:21   ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:21 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

Reorganize the ID map slightly so that only code that is executed with
the MMU off or via the 1:1 mapping remains. This allows us to move the
identity map out of the .text segment, as it will no longer need
executable permissions via the kernel mapping.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S        | 28 +++++++++++---------
 arch/arm64/kernel/vmlinux.lds.S |  2 +-
 arch/arm64/mm/proc.S            |  2 --
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index c59e0d95b44d0901..272877c5b4fa1203 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -540,19 +540,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	eret
 SYM_FUNC_END(init_kernel_el)
 
-/*
- * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
- * in w0. See arch/arm64/include/asm/virt.h for more info.
- */
-SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
-	adr_l	x1, __boot_cpu_mode
-	cmp	w0, #BOOT_CPU_MODE_EL2
-	b.ne	1f
-	add	x1, x1, #4
-1:	str	w0, [x1]			// Save CPU boot mode
-	ret
-SYM_FUNC_END(set_cpu_boot_mode_flag)
-
 	/*
 	 * This provides a "holding pen" for platforms to hold all secondary
 	 * cores are held until we're ready for them to initialise.
@@ -596,6 +583,7 @@ SYM_FUNC_START_LOCAL(secondary_startup)
 	br	x8
 SYM_FUNC_END(secondary_startup)
 
+	.text
 SYM_FUNC_START_LOCAL(__secondary_switched)
 	mov	x0, x20
 	bl	set_cpu_boot_mode_flag
@@ -628,6 +616,19 @@ SYM_FUNC_START_LOCAL(__secondary_too_slow)
 	b	__secondary_too_slow
 SYM_FUNC_END(__secondary_too_slow)
 
+/*
+ * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
+ * in w0. See arch/arm64/include/asm/virt.h for more info.
+ */
+SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
+	adr_l	x1, __boot_cpu_mode
+	cmp	w0, #BOOT_CPU_MODE_EL2
+	b.ne	1f
+	add	x1, x1, #4
+1:	str	w0, [x1]			// Save CPU boot mode
+	ret
+SYM_FUNC_END(set_cpu_boot_mode_flag)
+
 /*
  * The booting CPU updates the failed status @__early_cpu_boot_status,
  * with MMU turned off.
@@ -659,6 +660,7 @@ SYM_FUNC_END(__secondary_too_slow)
  * Checks if the selected granule size is supported by the CPU.
  * If it isn't, park the CPU
  */
+	.section ".idmap.text","awx"
 SYM_FUNC_START(__enable_mmu)
 	mrs	x3, ID_AA64MMFR0_EL1
 	ubfx	x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 45131e354e27f1f8..c7727a1740ce11f5 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -168,7 +168,6 @@ SECTIONS
 			LOCK_TEXT
 			KPROBES_TEXT
 			HYPERVISOR_TEXT
-			IDMAP_TEXT
 			*(.gnu.warning)
 		. = ALIGN(16);
 		*(.got)			/* Global offset table		*/
@@ -195,6 +194,7 @@ SECTIONS
 		TRAMP_TEXT
 		HIBERNATE_TEXT
 		KEXEC_TEXT
+		IDMAP_TEXT
 		. = ALIGN(PAGE_SIZE);
 	}
 
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index b9ecbbae1e1abca1..d7ca6f23fb0d1334 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -110,7 +110,6 @@ SYM_FUNC_END(cpu_do_suspend)
  *
  * x0: Address of context pointer
  */
-	.pushsection ".idmap.text", "awx"
 SYM_FUNC_START(cpu_do_resume)
 	ldp	x2, x3, [x0]
 	ldp	x4, x5, [x0, #16]
@@ -166,7 +165,6 @@ alternative_else_nop_endif
 	isb
 	ret
 SYM_FUNC_END(cpu_do_resume)
-	.popsection
 #endif
 
 	.pushsection ".idmap.text", "awx"
-- 
2.35.1


* [PATCH v5 3/7] arm64: head: record the MMU state at primary entry
  2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-08 18:22   ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

Prepare for being able to deal with primary entry with the MMU and
caches enabled, by recording whether or not we entered with the MMU on
in register x19 and in a global variable. (Note that setting this
variable to '1' does not require cache invalidation, nor is any needed
when storing the bootargs in that case, so the cache maintenance is
omitted.)
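
To illustrate, here is a minimal userspace C sketch (not kernel code; the
helper name is made up) of the predicate that record_mmu_state computes
from SCTLR_ELx: x19 and mmu_enabled_at_boot end up non-zero only when
both the MMU (M, bit 0) and the D-cache (C, bit 2) were enabled at entry.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SCTLR_ELx_M	(UINT64_C(1) << 0)	/* MMU enable */
#define SCTLR_ELx_C	(UINT64_C(1) << 2)	/* data cache enable */

/* hypothetical helper mirroring the tst/ccmp/cset sequence in the patch */
static bool booted_with_mmu_on(uint64_t sctlr)
{
	return (sctlr & SCTLR_ELx_M) && (sctlr & SCTLR_ELx_C);
}

int main(void)
{
	printf("%d\n", booted_with_mmu_on(SCTLR_ELx_M | SCTLR_ELx_C)); /* 1 */
	printf("%d\n", booted_with_mmu_on(SCTLR_ELx_M));               /* 0 */
	return 0;
}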

Since boot with the MMU enabled is not permitted by the bare metal boot
protocol, ensure that a diagnostic is emitted and a taint bit set if
the MMU was found to be enabled on a non-EFI boot. We will make an
exception for EFI boot later, which has strict requirements for the
mapping of system memory, permitting us to relax the boot protocol and
hand over from the EFI stub to the core kernel with MMU and caches left
enabled.

While at it, add 'pre_disable_mmu_workaround' macro invocations to
init_kernel_el, as its manipulation of SCTLR_ELx may amount to disabling
of the MMU after subsequent patches.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S  | 21 ++++++++++++++++++++
 arch/arm64/kernel/setup.c |  9 +++++++--
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 272877c5b4fa1203..3e654e43fa115947 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -77,6 +77,7 @@
 	 * primary lowlevel boot path:
 	 *
 	 *  Register   Scope                      Purpose
+	 *  x19        primary_entry() .. start_kernel()        whether we entered with the MMU on
 	 *  x20        primary_entry() .. __primary_switch()    CPU boot mode
 	 *  x21        primary_entry() .. start_kernel()        FDT pointer passed at boot in x0
 	 *  x22        create_idmap() .. start_kernel()         ID map VA of the DT blob
@@ -86,6 +87,7 @@
 	 *  x28        create_idmap()                           callee preserved temp register
 	 */
 SYM_CODE_START(primary_entry)
+	bl	record_mmu_state
 	bl	preserve_boot_args
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mov	x20, x0
@@ -109,6 +111,19 @@ SYM_CODE_START(primary_entry)
 	b	__primary_switch
 SYM_CODE_END(primary_entry)
 
+SYM_CODE_START_LOCAL(record_mmu_state)
+	mrs	x19, CurrentEL
+	cmp	x19, #CurrentEL_EL2
+	mrs	x19, sctlr_el1
+	b.ne	0f
+	mrs	x19, sctlr_el2
+0:	tst	x19, #SCTLR_ELx_C		// Z := (C == 0)
+	and	x19, x19, #SCTLR_ELx_M		// isolate M bit
+	ccmp	x19, xzr, #4, ne		// Z |= (M == 0)
+	cset	x19, ne				// set x19 if !Z
+	ret
+SYM_CODE_END(record_mmu_state)
+
 /*
  * Preserve the arguments passed by the bootloader in x0 .. x3
  */
@@ -119,11 +134,14 @@ SYM_CODE_START_LOCAL(preserve_boot_args)
 	stp	x21, x1, [x0]			// x0 .. x3 at kernel entry
 	stp	x2, x3, [x0, #16]
 
+	cbnz	x19, 0f				// skip cache invalidation if MMU is on
 	dmb	sy				// needed before dc ivac with
 						// MMU off
 
 	add	x1, x0, #0x20			// 4 x 8 bytes
 	b	dcache_inval_poc		// tail call
+0:	str_l   x19, mmu_enabled_at_boot, x0
+	ret
 SYM_CODE_END(preserve_boot_args)
 
 SYM_FUNC_START_LOCAL(clear_page_tables)
@@ -494,6 +512,7 @@ SYM_FUNC_START(init_kernel_el)
 
 SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
 	mov_q	x0, INIT_SCTLR_EL1_MMU_OFF
+	pre_disable_mmu_workaround
 	msr	sctlr_el1, x0
 	isb
 	mov_q	x0, INIT_PSTATE_EL1
@@ -526,11 +545,13 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	cbz	x0, 1f
 
 	/* Set a sane SCTLR_EL1, the VHE way */
+	pre_disable_mmu_workaround
 	msr_s	SYS_SCTLR_EL12, x1
 	mov	x2, #BOOT_CPU_FLAG_E2H
 	b	2f
 
 1:
+	pre_disable_mmu_workaround
 	msr	sctlr_el1, x1
 	mov	x2, xzr
 2:
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index fea3223704b6339a..11cf21afafa9f852 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -56,6 +56,7 @@ static int num_standard_resources;
 static struct resource *standard_resources;
 
 phys_addr_t __fdt_pointer __initdata;
+u64 mmu_enabled_at_boot __initdata;
 
 /*
  * Standard memory resources
@@ -328,8 +329,12 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
 	xen_early_init();
 	efi_init();
 
-	if (!efi_enabled(EFI_BOOT) && ((u64)_text % MIN_KIMG_ALIGN) != 0)
-	     pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
+	if (!efi_enabled(EFI_BOOT)) {
+		if ((u64)_text % MIN_KIMG_ALIGN)
+			pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
+		WARN_TAINT(mmu_enabled_at_boot, TAINT_FIRMWARE_WORKAROUND,
+			   FW_BUG "Booted with MMU enabled!");
+	}
 
 	arm64_memblock_init();
 
-- 
2.35.1


* [PATCH v5 4/7] arm64: head: avoid cache invalidation when entering with the MMU on
  2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-08 18:22   ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

If we enter with the MMU on, there is no need for explicit cache
invalidation for stores to memory, as they will be coherent with the
caches.

Let's take advantage of this, and create the ID map with the MMU still
enabled if that is how we entered, and avoid any cache invalidation
calls in that case.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 3e654e43fa115947..a7c84cde67c5c652 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -89,9 +89,9 @@
 SYM_CODE_START(primary_entry)
 	bl	record_mmu_state
 	bl	preserve_boot_args
+	bl	create_idmap
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mov	x20, x0
-	bl	create_idmap
 
 	/*
 	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
@@ -378,12 +378,13 @@ SYM_FUNC_START_LOCAL(create_idmap)
 	 * accesses (MMU disabled), invalidate those tables again to
 	 * remove any speculatively loaded cache lines.
 	 */
+	cbnz	x19, 0f				// skip cache invalidation if MMU is on
 	dmb	sy
 
 	adrp	x0, init_idmap_pg_dir
 	adrp	x1, init_idmap_pg_end
 	bl	dcache_inval_poc
-	ret	x28
+0:	ret	x28
 SYM_FUNC_END(create_idmap)
 
 SYM_FUNC_START_LOCAL(create_kernel_mapping)
-- 
2.35.1


* [PATCH v5 5/7] arm64: head: Clean the ID map and the HYP text to the PoC if needed
  2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-08 18:22   ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

If we enter with the MMU and caches enabled, the bootloader may not have
performed any cache maintenance to the PoC. So clean the ID mapped page
to the PoC, to ensure that instruction and data accesses with the MMU
off see the correct data. For similar reasons, clean all the HYP text to
the PoC as well when entering at EL2 with the MMU and caches enabled.
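
As an aside, a rough userspace sketch (hypothetical helper names, not the
kernel's API) of the maintenance policy this patch implements, i.e. which
ranges are cleaned to the PoC and under which entry conditions:

#include <stdbool.h>
#include <stdio.h>

static void clean_to_poc(const char *start, const char *end)
{
	printf("dcache_clean_poc(%s, %s)\n", start, end);
}

static void primary_boot_cache_maintenance(bool entered_with_mmu_on,
					   bool booted_at_el2)
{
	if (!entered_with_mmu_on)
		return;	/* loader already cleaned the image to the PoC */

	/* ID mapped boot code will be executed with the MMU off */
	clean_to_poc("__idmap_text_start", "__idmap_text_end");

	/* HYP code runs with the EL2 MMU off until KVM enables it */
	if (booted_at_el2)
		clean_to_poc("__hyp_idmap_text_start", "__hyp_text_end");
}

int main(void)
{
	primary_boot_cache_maintenance(true, true);
	return 0;
}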

Note that this means primary_entry() itself needs to be moved into the
ID map as well, as we will return from init_kernel_el() with the MMU and
caches off.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S  | 31 +++++++++++++++++---
 arch/arm64/kernel/sleep.S |  1 +
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index a7c84cde67c5c652..825f1d0549661030 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -70,7 +70,7 @@
 
 	__EFI_PE_HEADER
 
-	__INIT
+	.section ".idmap.text","awx"
 
 	/*
 	 * The following callee saved general purpose registers are used on the
@@ -90,6 +90,17 @@ SYM_CODE_START(primary_entry)
 	bl	record_mmu_state
 	bl	preserve_boot_args
 	bl	create_idmap
+
+	/*
+	 * If we entered with the MMU and caches on, clean the ID mapped part
+	 * of the primary boot code to the PoC so we can safely execute it with
+	 * the MMU off.
+	 */
+	cbz	x19, 0f
+	adrp	x0, __idmap_text_start
+	adr_l	x1, __idmap_text_end
+	bl	dcache_clean_poc
+0:	mov	x19, x0
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mov	x20, x0
 
@@ -111,6 +122,7 @@ SYM_CODE_START(primary_entry)
 	b	__primary_switch
 SYM_CODE_END(primary_entry)
 
+	__INIT
 SYM_CODE_START_LOCAL(record_mmu_state)
 	mrs	x19, CurrentEL
 	cmp	x19, #CurrentEL_EL2
@@ -505,10 +517,12 @@ SYM_FUNC_END(__primary_switched)
  * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in x0 if
  * booted in EL1 or EL2 respectively, with the top 32 bits containing
  * potential context flags. These flags are *not* stored in __boot_cpu_mode.
+ *
+ * x0: whether we are being called from the primary boot path with the MMU on
  */
 SYM_FUNC_START(init_kernel_el)
-	mrs	x0, CurrentEL
-	cmp	x0, #CurrentEL_EL2
+	mrs	x1, CurrentEL
+	cmp	x1, #CurrentEL_EL2
 	b.eq	init_el2
 
 SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
@@ -523,6 +537,14 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
 	eret
 
 SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
+	msr	elr_el2, lr
+
+	// clean all HYP code to the PoC if we booted at EL2 with the MMU on
+	cbz	x0, 0f
+	adrp	x0, __hyp_idmap_text_start
+	adr_l	x1, __hyp_text_end
+	bl	dcache_clean_poc
+0:
 	mov_q	x0, HCR_HOST_NVHE_FLAGS
 	msr	hcr_el2, x0
 	isb
@@ -556,7 +578,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	msr	sctlr_el1, x1
 	mov	x2, xzr
 2:
-	msr	elr_el2, lr
 	mov	w0, #BOOT_CPU_MODE_EL2
 	orr	x0, x0, x2
 	eret
@@ -567,6 +588,7 @@ SYM_FUNC_END(init_kernel_el)
 	 * cores are held until we're ready for them to initialise.
 	 */
 SYM_FUNC_START(secondary_holding_pen)
+	mov	x0, xzr
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mrs	x2, mpidr_el1
 	mov_q	x1, MPIDR_HWID_BITMASK
@@ -584,6 +606,7 @@ SYM_FUNC_END(secondary_holding_pen)
 	 * be used where CPUs are brought online dynamically by the kernel.
 	 */
 SYM_FUNC_START(secondary_entry)
+	mov	x0, xzr
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	b	secondary_startup
 SYM_FUNC_END(secondary_entry)
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 7b7c56e048346e97..2ae7cff1953aaf87 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -99,6 +99,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
 
 	.pushsection ".idmap.text", "awx"
 SYM_CODE_START(cpu_resume)
+	mov	x0, xzr
 	bl	init_kernel_el
 	mov	x19, x0			// preserve boot mode
 #if VA_BITS > 48
-- 
2.35.1


* [PATCH v5 6/7] arm64: lds: reduce effective minimum image alignment to 64k
  2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-08 18:22   ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

Our segment alignment is 64k for all configurations, and coincidentally,
this is the largest alignment supported by the PE/COFF executable
format used by EFI. This means that generally, there is no need to move
the image around in memory after it has been loaded by the firmware,
which can be advantageous as it also permits us to rely on the memory
attributes set by the firmware (R-X for [_text, __inittext_end] and RW-
for [__initdata_begin, _end]).

However, the minimum alignment of the image is actually 128k on 64k
pages configurations with CONFIG_VMAP_STACK=y, due to the existence of a
single 128k aligned object in the image, which is the stack of the init
task.

Let's work around this by adding some padding before the init stack
allocation, so we can round down the stack pointer to a suitably aligned
value if the image is not aligned to 128k in memory.
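
To make the arithmetic concrete, a small standalone sketch (illustrative
addresses, not kernel code): the linker script reserves THREAD_ALIGN -
SEGMENT_ALIGN bytes of padding ahead of the init stack, and the boot code
rounds the stack base down so the stack ends up 128k aligned even when
the image itself is only 64k aligned.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SZ_64K		UINT64_C(0x10000)
#define SEGMENT_ALIGN	SZ_64K			/* arm64 segment alignment */
#define THREAD_ALIGN	(2 * SZ_64K)		/* 128k: 64k pages + VMAP_STACK */

int main(void)
{
	/* image (and thus init_stack) at a 64k but not 128k aligned address */
	uint64_t init_stack = UINT64_C(0x40010000);
	uint64_t sp_base = init_stack & ~(THREAD_ALIGN - 1);

	printf("padding reserved before init_stack: %#" PRIx64 " bytes\n",
	       THREAD_ALIGN - SEGMENT_ALIGN);
	printf("init_stack at %#" PRIx64 " -> rounded-down base %#" PRIx64 "\n",
	       init_stack, sp_base);
	return 0;
}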

Note that this does not affect the boot protocol, which still requires 2
MiB alignment for bare metal boot, but is only part of the internal
contract between the EFI stub and the kernel proper.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/include/asm/efi.h    |  9 +--------
 arch/arm64/kernel/head.S        |  3 +++
 arch/arm64/kernel/vmlinux.lds.S | 11 ++++++++++-
 include/linux/efi.h             |  6 +-----
 4 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 108b115dbf5b7436..7ed7a0e621a5b0b6 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -54,13 +54,6 @@ efi_status_t __efi_rt_asm_wrapper(void *, const char *, ...);
 
 /* arch specific definitions used by the stub code */
 
-/*
- * In some configurations (e.g. VMAP_STACK && 64K pages), stacks built into the
- * kernel need greater alignment than we require the segments to be padded to.
- */
-#define EFI_KIMG_ALIGN	\
-	(SEGMENT_ALIGN > THREAD_ALIGN ? SEGMENT_ALIGN : THREAD_ALIGN)
-
 /*
  * On arm64, we have to ensure that the initrd ends up in the linear region,
  * which is a 1 GB aligned region of size '1UL << (VA_BITS_MIN - 1)' that is
@@ -88,7 +81,7 @@ static inline unsigned long efi_get_kimg_min_align(void)
 	 * 2M alignment if KASLR was explicitly disabled, even if it was not
 	 * going to be activated to begin with.
 	 */
-	return efi_nokaslr ? MIN_KIMG_ALIGN : EFI_KIMG_ALIGN;
+	return efi_nokaslr ? MIN_KIMG_ALIGN : SEGMENT_ALIGN;
 }
 
 #define EFI_ALLOC_ALIGN		SZ_64K
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 825f1d0549661030..8d7c6155da59e215 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -429,6 +429,9 @@ SYM_FUNC_END(create_kernel_mapping)
 	msr	sp_el0, \tsk
 
 	ldr	\tmp1, [\tsk, #TSK_STACK]
+#if THREAD_ALIGN > SEGMENT_ALIGN
+	bic	\tmp1, \tmp1, #THREAD_ALIGN - 1
+#endif
 	add	sp, \tmp1, #THREAD_SIZE
 	sub	sp, sp, #PT_REGS_SIZE
 
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index c7727a1740ce11f5..5002d869fa7f1767 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -274,7 +274,16 @@ SECTIONS
 
 	_data = .;
 	_sdata = .;
-	RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_ALIGN)
+#if THREAD_ALIGN > SEGMENT_ALIGN
+	/*
+	 * Add some padding for the init stack so we can fix up any potential
+	 * misalignment at runtime. In practice, this can only occur on 64k
+	 * pages configurations with CONFIG_VMAP_STACK=y.
+	 */
+	. += THREAD_ALIGN - SEGMENT_ALIGN;
+	ASSERT(. == init_stack, "init_stack not at start of RW_DATA as expected")
+#endif
+	RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, SEGMENT_ALIGN)
 
 	/*
 	 * Data written with the MMU off but read with the MMU on requires
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 16b7318957b0709f..19eda0bb4617a4cf 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -421,11 +421,7 @@ void efi_native_runtime_setup(void);
 /*
  * This GUID may be installed onto the kernel image's handle as a NULL protocol
  * to signal to the stub that the placement of the image should be respected,
- * and moving the image in physical memory is undesirable. To ensure
- * compatibility with 64k pages kernels with virtually mapped stacks, and to
- * avoid defeating physical randomization, this protocol should only be
- * installed if the image was placed at a randomized 128k aligned address in
- * memory.
+ * and moving the image in physical memory is undesirable.
  */
 #define LINUX_EFI_LOADED_IMAGE_FIXED_GUID	EFI_GUID(0xf5a37b6d, 0x3344, 0x42a5,  0xb6, 0xbb, 0x97, 0x86, 0x48, 0xc1, 0x89, 0x0a)
 
-- 
2.35.1


* [PATCH v5 7/7] efi: arm64: enter with MMU and caches enabled
  2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-08 18:22   ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

Instead of cleaning the entire loaded kernel image to the PoC and
disabling the MMU and caches before branching to the kernel's bare metal
entry point, we can leave the MMU and caches enabled, and rely on EFI's
cacheable 1:1 mapping of all of system RAM (which is mandated by the
spec) to populate the initial page tables.

This removes the need for managing coherency in software, which is
tedious and error prone.

Note that we still need to clean the executable region of the image to
the PoU if this is required for I/D coherency, but only if we actually
decided to move the image in memory, as otherwise, this will have been
taken care of by the loader.
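
For clarity, a minimal sketch (hypothetical type and helper names, not
the stub's code) of that remaining maintenance rule: only the executable
region needs cleaning, only to the PoU, and only when the stub actually
copied the image; when executing in place, the loader's own maintenance
is sufficient.

#include <stdbool.h>
#include <stdio.h>

struct image_placement {
	bool copied;			/* stub relocated the image */
	unsigned long text_start;	/* runtime address of _text */
	unsigned long inittext_end;	/* runtime address of __inittext_end */
};

/* hypothetical stand-in for caches_clean_inval_pou() */
static void clean_inval_pou(unsigned long start, unsigned long end)
{
	printf("clean+invalidate to PoU: [%#lx, %#lx)\n", start, end);
}

static void finish_placement(const struct image_placement *p)
{
	if (!p->copied)
		return;	/* loader-placed copy is already I/D coherent */

	/* freshly copied code may be shadowed by stale I-cache lines */
	clean_inval_pou(p->text_start, p->inittext_end);
}

int main(void)
{
	struct image_placement p = { true, 0x40200000UL, 0x41000000UL };

	finish_placement(&p);
	return 0;
}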

This change affects both the builtin EFI stub as well as the zboot
decompressor, which now carries the entire EFI stub along with the
decompression code and the compressed image.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/image-vars.h             |  5 +-
 arch/arm64/mm/cache.S                      |  5 +-
 drivers/firmware/efi/libstub/Makefile      |  4 +-
 drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------------
 drivers/firmware/efi/libstub/arm64-stub.c  | 26 +++++---
 drivers/firmware/efi/libstub/arm64.c       | 41 ++++++++++--
 6 files changed, 61 insertions(+), 87 deletions(-)

diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index f31130ba02331060..40ebb882d2d8c97b 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -10,7 +10,7 @@
 #error This file should only be included in vmlinux.lds.S
 #endif
 
-PROVIDE(__efistub_primary_entry_offset	= primary_entry - _text);
+PROVIDE(__efistub_primary_entry		= primary_entry);
 
 /*
  * The EFI stub has its own symbol namespace prefixed by __efistub_, to
@@ -21,10 +21,11 @@ PROVIDE(__efistub_primary_entry_offset	= primary_entry - _text);
  * linked at. The routines below are all implemented in assembler in a
  * position independent manner
  */
-PROVIDE(__efistub_dcache_clean_poc	= __pi_dcache_clean_poc);
+PROVIDE(__efistub_caches_clean_inval_pou = __pi_caches_clean_inval_pou);
 
 PROVIDE(__efistub__text			= _text);
 PROVIDE(__efistub__end			= _end);
+PROVIDE(__efistub___inittext_end       	= __inittext_end);
 PROVIDE(__efistub__edata		= _edata);
 PROVIDE(__efistub_screen_info		= screen_info);
 PROVIDE(__efistub__ctype		= _ctype);
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 081058d4e4366edb..8c3b3ee9b1d725c8 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -52,10 +52,11 @@ alternative_else_nop_endif
  *	- start   - virtual start address of region
  *	- end     - virtual end address of region
  */
-SYM_FUNC_START(caches_clean_inval_pou)
+SYM_FUNC_START(__pi_caches_clean_inval_pou)
 	caches_clean_inval_pou_macro
 	ret
-SYM_FUNC_END(caches_clean_inval_pou)
+SYM_FUNC_END(__pi_caches_clean_inval_pou)
+SYM_FUNC_ALIAS(caches_clean_inval_pou, __pi_caches_clean_inval_pou)
 
 /*
  *	caches_clean_inval_user_pou(start,end)
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 402dfb30ddc7a01e..f838ab98978f1038 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -86,7 +86,7 @@ lib-$(CONFIG_EFI_GENERIC_STUB)	+= efi-stub.o string.o intrinsics.o systable.o \
 				   screen_info.o efi-stub-entry.o
 
 lib-$(CONFIG_ARM)		+= arm32-stub.o
-lib-$(CONFIG_ARM64)		+= arm64.o arm64-stub.o arm64-entry.o
+lib-$(CONFIG_ARM64)		+= arm64.o arm64-stub.o
 lib-$(CONFIG_X86)		+= x86-stub.o
 lib-$(CONFIG_RISCV)		+= riscv.o riscv-stub.o
 lib-$(CONFIG_LOONGARCH)		+= loongarch.o loongarch-stub.o
@@ -140,7 +140,7 @@ STUBCOPY_RELOC-$(CONFIG_ARM)	:= R_ARM_ABS
 #
 STUBCOPY_FLAGS-$(CONFIG_ARM64)	+= --prefix-alloc-sections=.init \
 				   --prefix-symbols=__efistub_
-STUBCOPY_RELOC-$(CONFIG_ARM64)	:= R_AARCH64_ABS64
+STUBCOPY_RELOC-$(CONFIG_ARM64)	:= R_AARCH64_ABS
 
 # For RISC-V, we don't need anything special other than arm64. Keep all the
 # symbols in .init section and make sure that no absolute symbols references
diff --git a/drivers/firmware/efi/libstub/arm64-entry.S b/drivers/firmware/efi/libstub/arm64-entry.S
deleted file mode 100644
index b5c17e89a4fc0c21..0000000000000000
--- a/drivers/firmware/efi/libstub/arm64-entry.S
+++ /dev/null
@@ -1,67 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * EFI entry point.
- *
- * Copyright (C) 2013, 2014 Red Hat, Inc.
- * Author: Mark Salter <msalter@redhat.com>
- */
-#include <linux/linkage.h>
-#include <asm/assembler.h>
-
-	/*
-	 * The entrypoint of a arm64 bare metal image is at offset #0 of the
-	 * image, so this is a reasonable default for primary_entry_offset.
-	 * Only when the EFI stub is integrated into the core kernel, it is not
-	 * guaranteed that the PE/COFF header has been copied to memory too, so
-	 * in this case, primary_entry_offset should be overridden by the
-	 * linker and point to primary_entry() directly.
-	 */
-	.weak	primary_entry_offset
-
-SYM_CODE_START(efi_enter_kernel)
-	/*
-	 * efi_pe_entry() will have copied the kernel image if necessary and we
-	 * end up here with device tree address in x1 and the kernel entry
-	 * point stored in x0. Save those values in registers which are
-	 * callee preserved.
-	 */
-	ldr	w2, =primary_entry_offset
-	add	x19, x0, x2		// relocated Image entrypoint
-
-	mov	x0, x1			// DTB address
-	mov	x1, xzr
-	mov	x2, xzr
-	mov	x3, xzr
-
-	/*
-	 * Clean the remainder of this routine to the PoC
-	 * so that we can safely disable the MMU and caches.
-	 */
-	adr	x4, 1f
-	dc	civac, x4
-	dsb	sy
-
-	/* Turn off Dcache and MMU */
-	mrs	x4, CurrentEL
-	cmp	x4, #CurrentEL_EL2
-	mrs	x4, sctlr_el1
-	b.ne	0f
-	mrs	x4, sctlr_el2
-0:	bic	x4, x4, #SCTLR_ELx_M
-	bic	x4, x4, #SCTLR_ELx_C
-	b.eq	1f
-	b	2f
-
-	.balign	32
-1:	pre_disable_mmu_workaround
-	msr	sctlr_el2, x4
-	isb
-	br	x19		// jump to kernel entrypoint
-
-2:	pre_disable_mmu_workaround
-	msr	sctlr_el1, x4
-	isb
-	br	x19		// jump to kernel entrypoint
-
-	.org	1b + 32
-SYM_CODE_END(efi_enter_kernel)
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 7f0aab3a8ab302d6..00fb2eab6d0c74ef 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -58,7 +58,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 				 efi_handle_t image_handle)
 {
 	efi_status_t status;
-	unsigned long kernel_size, kernel_memsize = 0;
+	unsigned long kernel_size, kernel_codesize, kernel_memsize;
 	u32 phys_seed = 0;
 	u64 min_kimg_align = efi_get_kimg_min_align();
 
@@ -93,6 +93,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 			SEGMENT_ALIGN >> 10);
 
 	kernel_size = _edata - _text;
+	kernel_codesize = __inittext_end - _text;
 	kernel_memsize = kernel_size + (_end - _edata);
 	*reserve_size = kernel_memsize;
 
@@ -120,7 +121,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 			 */
 			*image_addr = (u64)_text;
 			*reserve_size = 0;
-			goto clean_image_to_poc;
+			return EFI_SUCCESS;
 		}
 
 		status = efi_allocate_pages_aligned(*reserve_size, reserve_addr,
@@ -136,14 +137,21 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 
 	*image_addr = *reserve_addr;
 	memcpy((void *)*image_addr, _text, kernel_size);
+	caches_clean_inval_pou(*image_addr, *image_addr + kernel_codesize);
 
-clean_image_to_poc:
+	return EFI_SUCCESS;
+}
+
+asmlinkage void primary_entry(void);
+
+unsigned long primary_entry_offset(void)
+{
 	/*
-	 * Clean the copied Image to the PoC, and ensure it is not shadowed by
-	 * stale icache entries from before relocation.
+	 * When built as part of the kernel, the EFI stub cannot branch to the
+	 * kernel proper via the image header, as the PE/COFF header is
+	 * strictly not part of the in-memory representation of the image, only
+	 * of the file representation. So instead, we need to jump to the
+	 * actual entrypoint in the .text region of the image.
 	 */
-	dcache_clean_poc(*image_addr, *image_addr + kernel_size);
-	asm("ic ialluis");
-
-	return EFI_SUCCESS;
+	return (char *)primary_entry - _text;
 }
diff --git a/drivers/firmware/efi/libstub/arm64.c b/drivers/firmware/efi/libstub/arm64.c
index d2e94972c5fad523..99f86ddc91cf10cf 100644
--- a/drivers/firmware/efi/libstub/arm64.c
+++ b/drivers/firmware/efi/libstub/arm64.c
@@ -41,6 +41,12 @@ efi_status_t check_platform_features(void)
 	return EFI_SUCCESS;
 }
 
+#ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
+#define DCTYPE	"civac"
+#else
+#define DCTYPE	"cvau"
+#endif
+
 void efi_cache_sync_image(unsigned long image_base,
 			  unsigned long alloc_size,
 			  unsigned long code_size)
@@ -49,13 +55,38 @@ void efi_cache_sync_image(unsigned long image_base,
 	u64 lsize = 4 << cpuid_feature_extract_unsigned_field(ctr,
 						CTR_EL0_DminLine_SHIFT);
 
-	do {
-		asm("dc civac, %0" :: "r"(image_base));
-		image_base += lsize;
-		alloc_size -= lsize;
-	} while (alloc_size >= lsize);
+	/* only perform the cache maintenance if needed for I/D coherency */
+	if (!(ctr & BIT(CTR_EL0_IDC_SHIFT))) {
+		do {
+			asm("dc " DCTYPE ", %0" :: "r"(image_base));
+			image_base += lsize;
+			code_size -= lsize;
+		} while (code_size >= lsize);
+	}
 
 	asm("ic ialluis");
 	dsb(ish);
 	isb();
 }
+
+unsigned long __weak primary_entry_offset(void)
+{
+	/*
+	 * By default, we can invoke the kernel via the branch instruction in
+	 * the image header, so offset #0. This will be overridden by the EFI
+	 * stub build that is linked into the core kernel, as in that case, the
+	 * image header may not have been loaded into memory, or may be mapped
+	 * with non-executable permissions.
+	 */
+	return 0;
+}
+
+void __noreturn efi_enter_kernel(unsigned long entrypoint,
+				 unsigned long fdt_addr,
+				 unsigned long fdt_size)
+{
+	void (* __noreturn enter_kernel)(u64, u64, u64, u64);
+
+	enter_kernel = (void *)entrypoint + primary_entry_offset();
+	enter_kernel(fdt_addr, 0, 0, 0);
+}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread
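As background for the efi_cache_sync_image() hunk in the patch above: in CTR_EL0,
the IDC bit advertises that no D-cache clean to the PoU is required for
instruction-to-data coherence, and the DIC bit advertises that no I-cache
invalidation to the PoU is required either. A minimal sketch of helpers built on
those architectural bits (the helper names are illustrative, not taken from the
patch):

#include <linux/bits.h>
#include <linux/types.h>

/* CTR_EL0 coherency hint bits, as defined by the Arm ARM */
#define CTR_EL0_IDC_SHIFT	28
#define CTR_EL0_DIC_SHIFT	29

static inline bool needs_dcache_clean_to_pou(u64 ctr)
{
	/* IDC == 1 means no D-cache clean to the PoU is needed */
	return !(ctr & BIT(CTR_EL0_IDC_SHIFT));
}

static inline bool needs_icache_inval_to_pou(u64 ctr)
{
	/* DIC == 1 means no I-cache invalidation to the PoU is needed */
	return !(ctr & BIT(CTR_EL0_DIC_SHIFT));
}

Note that the hunk above only consults IDC, to decide whether the D-cache clean
loop can be skipped; the "ic ialluis" is still issued unconditionally.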

* [PATCH v5 7/7] efi: arm64: enter with MMU and caches enabled
@ 2022-11-08 18:22   ` Ard Biesheuvel
  0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
	Catalin Marinas, Marc Zyngier, Mark Rutland

Instead of cleaning the entire loaded kernel image to the PoC and
disabling the MMU and caches before branching to the kernel's bare metal
entry point, we can leave the MMU and caches enabled, and rely on EFI's
cacheable 1:1 mapping of all of system RAM (which is mandated by the
spec) to populate the initial page tables.

This removes the need for managing coherency in software, which is
tedious and error prone.

Note that we still need to clean the executable region of the image to
the PoU if this is required for I/D coherency, but only if we actually
decided to move the image in memory, as otherwise, this will have been
taken care of by the loader.

This change affects both the builtin EFI stub and the zboot
decompressor, which now carries the entire EFI stub along with the
decompression code and the compressed image.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/image-vars.h             |  5 +-
 arch/arm64/mm/cache.S                      |  5 +-
 drivers/firmware/efi/libstub/Makefile      |  4 +-
 drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------------
 drivers/firmware/efi/libstub/arm64-stub.c  | 26 +++++---
 drivers/firmware/efi/libstub/arm64.c       | 41 ++++++++++--
 6 files changed, 61 insertions(+), 87 deletions(-)

diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index f31130ba02331060..40ebb882d2d8c97b 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -10,7 +10,7 @@
 #error This file should only be included in vmlinux.lds.S
 #endif
 
-PROVIDE(__efistub_primary_entry_offset	= primary_entry - _text);
+PROVIDE(__efistub_primary_entry		= primary_entry);
 
 /*
  * The EFI stub has its own symbol namespace prefixed by __efistub_, to
@@ -21,10 +21,11 @@ PROVIDE(__efistub_primary_entry_offset	= primary_entry - _text);
  * linked at. The routines below are all implemented in assembler in a
  * position independent manner
  */
-PROVIDE(__efistub_dcache_clean_poc	= __pi_dcache_clean_poc);
+PROVIDE(__efistub_caches_clean_inval_pou = __pi_caches_clean_inval_pou);
 
 PROVIDE(__efistub__text			= _text);
 PROVIDE(__efistub__end			= _end);
+PROVIDE(__efistub___inittext_end	= __inittext_end);
 PROVIDE(__efistub__edata		= _edata);
 PROVIDE(__efistub_screen_info		= screen_info);
 PROVIDE(__efistub__ctype		= _ctype);
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 081058d4e4366edb..8c3b3ee9b1d725c8 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -52,10 +52,11 @@ alternative_else_nop_endif
  *	- start   - virtual start address of region
  *	- end     - virtual end address of region
  */
-SYM_FUNC_START(caches_clean_inval_pou)
+SYM_FUNC_START(__pi_caches_clean_inval_pou)
 	caches_clean_inval_pou_macro
 	ret
-SYM_FUNC_END(caches_clean_inval_pou)
+SYM_FUNC_END(__pi_caches_clean_inval_pou)
+SYM_FUNC_ALIAS(caches_clean_inval_pou, __pi_caches_clean_inval_pou)
 
 /*
  *	caches_clean_inval_user_pou(start,end)
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 402dfb30ddc7a01e..f838ab98978f1038 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -86,7 +86,7 @@ lib-$(CONFIG_EFI_GENERIC_STUB)	+= efi-stub.o string.o intrinsics.o systable.o \
 				   screen_info.o efi-stub-entry.o
 
 lib-$(CONFIG_ARM)		+= arm32-stub.o
-lib-$(CONFIG_ARM64)		+= arm64.o arm64-stub.o arm64-entry.o
+lib-$(CONFIG_ARM64)		+= arm64.o arm64-stub.o
 lib-$(CONFIG_X86)		+= x86-stub.o
 lib-$(CONFIG_RISCV)		+= riscv.o riscv-stub.o
 lib-$(CONFIG_LOONGARCH)		+= loongarch.o loongarch-stub.o
@@ -140,7 +140,7 @@ STUBCOPY_RELOC-$(CONFIG_ARM)	:= R_ARM_ABS
 #
 STUBCOPY_FLAGS-$(CONFIG_ARM64)	+= --prefix-alloc-sections=.init \
 				   --prefix-symbols=__efistub_
-STUBCOPY_RELOC-$(CONFIG_ARM64)	:= R_AARCH64_ABS64
+STUBCOPY_RELOC-$(CONFIG_ARM64)	:= R_AARCH64_ABS
 
 # For RISC-V, we don't need anything special other than arm64. Keep all the
 # symbols in .init section and make sure that no absolute symbols references
diff --git a/drivers/firmware/efi/libstub/arm64-entry.S b/drivers/firmware/efi/libstub/arm64-entry.S
deleted file mode 100644
index b5c17e89a4fc0c21..0000000000000000
--- a/drivers/firmware/efi/libstub/arm64-entry.S
+++ /dev/null
@@ -1,67 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * EFI entry point.
- *
- * Copyright (C) 2013, 2014 Red Hat, Inc.
- * Author: Mark Salter <msalter@redhat.com>
- */
-#include <linux/linkage.h>
-#include <asm/assembler.h>
-
-	/*
-	 * The entrypoint of a arm64 bare metal image is at offset #0 of the
-	 * image, so this is a reasonable default for primary_entry_offset.
-	 * Only when the EFI stub is integrated into the core kernel, it is not
-	 * guaranteed that the PE/COFF header has been copied to memory too, so
-	 * in this case, primary_entry_offset should be overridden by the
-	 * linker and point to primary_entry() directly.
-	 */
-	.weak	primary_entry_offset
-
-SYM_CODE_START(efi_enter_kernel)
-	/*
-	 * efi_pe_entry() will have copied the kernel image if necessary and we
-	 * end up here with device tree address in x1 and the kernel entry
-	 * point stored in x0. Save those values in registers which are
-	 * callee preserved.
-	 */
-	ldr	w2, =primary_entry_offset
-	add	x19, x0, x2		// relocated Image entrypoint
-
-	mov	x0, x1			// DTB address
-	mov	x1, xzr
-	mov	x2, xzr
-	mov	x3, xzr
-
-	/*
-	 * Clean the remainder of this routine to the PoC
-	 * so that we can safely disable the MMU and caches.
-	 */
-	adr	x4, 1f
-	dc	civac, x4
-	dsb	sy
-
-	/* Turn off Dcache and MMU */
-	mrs	x4, CurrentEL
-	cmp	x4, #CurrentEL_EL2
-	mrs	x4, sctlr_el1
-	b.ne	0f
-	mrs	x4, sctlr_el2
-0:	bic	x4, x4, #SCTLR_ELx_M
-	bic	x4, x4, #SCTLR_ELx_C
-	b.eq	1f
-	b	2f
-
-	.balign	32
-1:	pre_disable_mmu_workaround
-	msr	sctlr_el2, x4
-	isb
-	br	x19		// jump to kernel entrypoint
-
-2:	pre_disable_mmu_workaround
-	msr	sctlr_el1, x4
-	isb
-	br	x19		// jump to kernel entrypoint
-
-	.org	1b + 32
-SYM_CODE_END(efi_enter_kernel)
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 7f0aab3a8ab302d6..00fb2eab6d0c74ef 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -58,7 +58,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 				 efi_handle_t image_handle)
 {
 	efi_status_t status;
-	unsigned long kernel_size, kernel_memsize = 0;
+	unsigned long kernel_size, kernel_codesize, kernel_memsize;
 	u32 phys_seed = 0;
 	u64 min_kimg_align = efi_get_kimg_min_align();
 
@@ -93,6 +93,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 			SEGMENT_ALIGN >> 10);
 
 	kernel_size = _edata - _text;
+	kernel_codesize = __inittext_end - _text;
 	kernel_memsize = kernel_size + (_end - _edata);
 	*reserve_size = kernel_memsize;
 
@@ -120,7 +121,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 			 */
 			*image_addr = (u64)_text;
 			*reserve_size = 0;
-			goto clean_image_to_poc;
+			return EFI_SUCCESS;
 		}
 
 		status = efi_allocate_pages_aligned(*reserve_size, reserve_addr,
@@ -136,14 +137,21 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 
 	*image_addr = *reserve_addr;
 	memcpy((void *)*image_addr, _text, kernel_size);
+	caches_clean_inval_pou(*image_addr, *image_addr + kernel_codesize);
 
-clean_image_to_poc:
+	return EFI_SUCCESS;
+}
+
+asmlinkage void primary_entry(void);
+
+unsigned long primary_entry_offset(void)
+{
 	/*
-	 * Clean the copied Image to the PoC, and ensure it is not shadowed by
-	 * stale icache entries from before relocation.
+	 * When built as part of the kernel, the EFI stub cannot branch to the
+	 * kernel proper via the image header, as the PE/COFF header is
+	 * strictly not part of the in-memory representation of the image, only
+	 * of the file representation. So instead, we need to jump to the
+	 * actual entrypoint in the .text region of the image.
 	 */
-	dcache_clean_poc(*image_addr, *image_addr + kernel_size);
-	asm("ic ialluis");
-
-	return EFI_SUCCESS;
+	return (char *)primary_entry - _text;
 }
diff --git a/drivers/firmware/efi/libstub/arm64.c b/drivers/firmware/efi/libstub/arm64.c
index d2e94972c5fad523..99f86ddc91cf10cf 100644
--- a/drivers/firmware/efi/libstub/arm64.c
+++ b/drivers/firmware/efi/libstub/arm64.c
@@ -41,6 +41,12 @@ efi_status_t check_platform_features(void)
 	return EFI_SUCCESS;
 }
 
+#ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
+#define DCTYPE	"civac"
+#else
+#define DCTYPE	"cvau"
+#endif
+
 void efi_cache_sync_image(unsigned long image_base,
 			  unsigned long alloc_size,
 			  unsigned long code_size)
@@ -49,13 +55,38 @@ void efi_cache_sync_image(unsigned long image_base,
 	u64 lsize = 4 << cpuid_feature_extract_unsigned_field(ctr,
 						CTR_EL0_DminLine_SHIFT);
 
-	do {
-		asm("dc civac, %0" :: "r"(image_base));
-		image_base += lsize;
-		alloc_size -= lsize;
-	} while (alloc_size >= lsize);
+	/* only perform the cache maintenance if needed for I/D coherency */
+	if (!(ctr & BIT(CTR_EL0_IDC_SHIFT))) {
+		do {
+			asm("dc " DCTYPE ", %0" :: "r"(image_base));
+			image_base += lsize;
+			code_size -= lsize;
+		} while (code_size >= lsize);
+	}
 
 	asm("ic ialluis");
 	dsb(ish);
 	isb();
 }
+
+unsigned long __weak primary_entry_offset(void)
+{
+	/*
+	 * By default, we can invoke the kernel via the branch instruction in
+	 * the image header, so offset #0. This will be overridden by the EFI
+	 * stub build that is linked into the core kernel, as in that case, the
+	 * image header may not have been loaded into memory, or may be mapped
+	 * with non-executable permissions.
+	 */
+	return 0;
+}
+
+void __noreturn efi_enter_kernel(unsigned long entrypoint,
+				 unsigned long fdt_addr,
+				 unsigned long fdt_size)
+{
+	void (* __noreturn enter_kernel)(u64, u64, u64, u64);
+
+	enter_kernel = (void *)entrypoint + primary_entry_offset();
+	enter_kernel(fdt_addr, 0, 0, 0);
+}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 5/7] arm64: head: Clean the ID map and the HYP text to the PoC if needed
  2022-11-08 18:22   ` Ard Biesheuvel
@ 2022-11-08 22:11     ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 22:11 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Will Deacon, Catalin Marinas, Marc Zyngier,
	Mark Rutland

On Tue, 8 Nov 2022 at 19:22, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> If we enter with the MMU and caches enabled, the bootloader may not have
> performed any cache maintenance to the PoC. So clean the ID mapped page
> to the PoC, to ensure that instruction and data accesses with the MMU
> off see the correct data. For similar reasons, clean all the HYP text to
> the PoC as well when entering at EL2 with the MMU and caches enabled.
>
> Note that this means primary_entry() itself needs to be moved into the
> ID map as well, as we will return from init_kernel_el() with the MMU and
> caches off.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/kernel/head.S  | 31 +++++++++++++++++---
>  arch/arm64/kernel/sleep.S |  1 +
>  2 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index a7c84cde67c5c652..825f1d0549661030 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -70,7 +70,7 @@
>
>         __EFI_PE_HEADER
>
> -       __INIT
> +       .section ".idmap.text","awx"
>
>         /*
>          * The following callee saved general purpose registers are used on the
> @@ -90,6 +90,17 @@ SYM_CODE_START(primary_entry)
>         bl      record_mmu_state
>         bl      preserve_boot_args
>         bl      create_idmap
> +
> +       /*
> +        * If we entered with the MMU and caches on, clean the ID mapped part
> +        * of the primary boot code to the PoC so we can safely execute it with
> +        * the MMU off.
> +        */
> +       cbz     x19, 0f
> +       adrp    x0, __idmap_text_start
> +       adr_l   x1, __idmap_text_end
> +       bl      dcache_clean_poc
> +0:     mov     x19, x0

This is wrong, it should be

mov x0, x19


>         bl      init_kernel_el                  // w0=cpu_boot_mode
>         mov     x20, x0
>
> @@ -111,6 +122,7 @@ SYM_CODE_START(primary_entry)
>         b       __primary_switch
>  SYM_CODE_END(primary_entry)
>
> +       __INIT
>  SYM_CODE_START_LOCAL(record_mmu_state)
>         mrs     x19, CurrentEL
>         cmp     x19, #CurrentEL_EL2
> @@ -505,10 +517,12 @@ SYM_FUNC_END(__primary_switched)
>   * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in x0 if
>   * booted in EL1 or EL2 respectively, with the top 32 bits containing
>   * potential context flags. These flags are *not* stored in __boot_cpu_mode.
> + *
> + * x0: whether we are being called from the primary boot path with the MMU on
>   */
>  SYM_FUNC_START(init_kernel_el)
> -       mrs     x0, CurrentEL
> -       cmp     x0, #CurrentEL_EL2
> +       mrs     x1, CurrentEL
> +       cmp     x1, #CurrentEL_EL2
>         b.eq    init_el2
>
>  SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
> @@ -523,6 +537,14 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
>         eret
>
>  SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
> +       msr     elr_el2, lr
> +
> +       // clean all HYP code to the PoC if we booted at EL2 with the MMU on
> +       cbz     x0, 0f
> +       adrp    x0, __hyp_idmap_text_start
> +       adr_l   x1, __hyp_text_end
> +       bl      dcache_clean_poc
> +0:
>         mov_q   x0, HCR_HOST_NVHE_FLAGS
>         msr     hcr_el2, x0
>         isb
> @@ -556,7 +578,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
>         msr     sctlr_el1, x1
>         mov     x2, xzr
>  2:
> -       msr     elr_el2, lr
>         mov     w0, #BOOT_CPU_MODE_EL2
>         orr     x0, x0, x2
>         eret
> @@ -567,6 +588,7 @@ SYM_FUNC_END(init_kernel_el)
>          * cores are held until we're ready for them to initialise.
>          */
>  SYM_FUNC_START(secondary_holding_pen)
> +       mov     x0, xzr
>         bl      init_kernel_el                  // w0=cpu_boot_mode
>         mrs     x2, mpidr_el1
>         mov_q   x1, MPIDR_HWID_BITMASK
> @@ -584,6 +606,7 @@ SYM_FUNC_END(secondary_holding_pen)
>          * be used where CPUs are brought online dynamically by the kernel.
>          */
>  SYM_FUNC_START(secondary_entry)
> +       mov     x0, xzr
>         bl      init_kernel_el                  // w0=cpu_boot_mode
>         b       secondary_startup
>  SYM_FUNC_END(secondary_entry)
> diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
> index 7b7c56e048346e97..2ae7cff1953aaf87 100644
> --- a/arch/arm64/kernel/sleep.S
> +++ b/arch/arm64/kernel/sleep.S
> @@ -99,6 +99,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
>
>         .pushsection ".idmap.text", "awx"
>  SYM_CODE_START(cpu_resume)
> +       mov     x0, xzr
>         bl      init_kernel_el
>         mov     x19, x0                 // preserve boot mode
>  #if VA_BITS > 48
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 5/7] arm64: head: Clean the ID map and the HYP text to the PoC if needed
@ 2022-11-08 22:11     ` Ard Biesheuvel
  0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 22:11 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-efi, keescook, Will Deacon, Catalin Marinas, Marc Zyngier,
	Mark Rutland

On Tue, 8 Nov 2022 at 19:22, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> If we enter with the MMU and caches enabled, the bootloader may not have
> performed any cache maintenance to the PoC. So clean the ID mapped page
> to the PoC, to ensure that instruction and data accesses with the MMU
> off see the correct data. For similar reasons, clean all the HYP text to
> the PoC as well when entering at EL2 with the MMU and caches enabled.
>
> Note that this means primary_entry() itself needs to be moved into the
> ID map as well, as we will return from init_kernel_el() with the MMU and
> caches off.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/kernel/head.S  | 31 +++++++++++++++++---
>  arch/arm64/kernel/sleep.S |  1 +
>  2 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index a7c84cde67c5c652..825f1d0549661030 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -70,7 +70,7 @@
>
>         __EFI_PE_HEADER
>
> -       __INIT
> +       .section ".idmap.text","awx"
>
>         /*
>          * The following callee saved general purpose registers are used on the
> @@ -90,6 +90,17 @@ SYM_CODE_START(primary_entry)
>         bl      record_mmu_state
>         bl      preserve_boot_args
>         bl      create_idmap
> +
> +       /*
> +        * If we entered with the MMU and caches on, clean the ID mapped part
> +        * of the primary boot code to the PoC so we can safely execute it with
> +        * the MMU off.
> +        */
> +       cbz     x19, 0f
> +       adrp    x0, __idmap_text_start
> +       adr_l   x1, __idmap_text_end
> +       bl      dcache_clean_poc
> +0:     mov     x19, x0

This is wrong, it should be

mov x0, x19


>         bl      init_kernel_el                  // w0=cpu_boot_mode
>         mov     x20, x0
>
> @@ -111,6 +122,7 @@ SYM_CODE_START(primary_entry)
>         b       __primary_switch
>  SYM_CODE_END(primary_entry)
>
> +       __INIT
>  SYM_CODE_START_LOCAL(record_mmu_state)
>         mrs     x19, CurrentEL
>         cmp     x19, #CurrentEL_EL2
> @@ -505,10 +517,12 @@ SYM_FUNC_END(__primary_switched)
>   * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in x0 if
>   * booted in EL1 or EL2 respectively, with the top 32 bits containing
>   * potential context flags. These flags are *not* stored in __boot_cpu_mode.
> + *
> + * x0: whether we are being called from the primary boot path with the MMU on
>   */
>  SYM_FUNC_START(init_kernel_el)
> -       mrs     x0, CurrentEL
> -       cmp     x0, #CurrentEL_EL2
> +       mrs     x1, CurrentEL
> +       cmp     x1, #CurrentEL_EL2
>         b.eq    init_el2
>
>  SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
> @@ -523,6 +537,14 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
>         eret
>
>  SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
> +       msr     elr_el2, lr
> +
> +       // clean all HYP code to the PoC if we booted at EL2 with the MMU on
> +       cbz     x0, 0f
> +       adrp    x0, __hyp_idmap_text_start
> +       adr_l   x1, __hyp_text_end
> +       bl      dcache_clean_poc
> +0:
>         mov_q   x0, HCR_HOST_NVHE_FLAGS
>         msr     hcr_el2, x0
>         isb
> @@ -556,7 +578,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
>         msr     sctlr_el1, x1
>         mov     x2, xzr
>  2:
> -       msr     elr_el2, lr
>         mov     w0, #BOOT_CPU_MODE_EL2
>         orr     x0, x0, x2
>         eret
> @@ -567,6 +588,7 @@ SYM_FUNC_END(init_kernel_el)
>          * cores are held until we're ready for them to initialise.
>          */
>  SYM_FUNC_START(secondary_holding_pen)
> +       mov     x0, xzr
>         bl      init_kernel_el                  // w0=cpu_boot_mode
>         mrs     x2, mpidr_el1
>         mov_q   x1, MPIDR_HWID_BITMASK
> @@ -584,6 +606,7 @@ SYM_FUNC_END(secondary_holding_pen)
>          * be used where CPUs are brought online dynamically by the kernel.
>          */
>  SYM_FUNC_START(secondary_entry)
> +       mov     x0, xzr
>         bl      init_kernel_el                  // w0=cpu_boot_mode
>         b       secondary_startup
>  SYM_FUNC_END(secondary_entry)
> diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
> index 7b7c56e048346e97..2ae7cff1953aaf87 100644
> --- a/arch/arm64/kernel/sleep.S
> +++ b/arch/arm64/kernel/sleep.S
> @@ -99,6 +99,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
>
>         .pushsection ".idmap.text", "awx"
>  SYM_CODE_START(cpu_resume)
> +       mov     x0, xzr
>         bl      init_kernel_el
>         mov     x19, x0                 // preserve boot mode
>  #if VA_BITS > 48
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
  2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-11 17:36   ` Mark Rutland
  -1 siblings, 0 replies; 30+ messages in thread
From: Mark Rutland @ 2022-11-11 17:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, linux-efi, keescook, Will Deacon,
	Catalin Marinas, Marc Zyngier

Hi Ard,

Sorry for the late-in-the-day reply here...

On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> The purpose of this series is to remove any explicit cache maintenance
> for coherency during early boot that becomes unnecessary if we simply
> retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> and use it to populate the ID map page tables. After setting up this
> preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> TCR and SCTLR registers as before, and proceed as usual, avoiding the
> need for any manipulations of memory while the MMU and caches are off.
> 
> The only properties of the firmware provided 1:1 map we rely on is that
> it does not require any explicit cache maintenance for coherency, and
> that it covers the entire memory footprint of the image, including the
> BSS and padding at the end - all else is under control of the kernel
> itself, as before.

As a high-level thing, I'm still very much not keen on entering the kernel with
the MMU on. Given that we have to support booting with the MMU off for !EFI
boot (including kexec when EFI is in use), I think this makes it harder to
reason about the boot code overall (e.g. due to the conditional maintenance
added to head.S), and adds more scope for error, even if it simplifies the EFI
stub itself.

I reckon that (sticking with entering with the MMU off), there's more that we
can do to split the table creation into more stages, and to minimize the early
portion of that which has to run with the MMU off. That would benefit non-EFI
boot and kexec, and retain the single boot flow that we currently have.

My rough thinking was:

1) Reduce the idmap down to a single page, such that we only need to clear
   NR_PAGETABLE_LEVELS pages to initialize this.

2) Create a small stub at a fixed TTBR1 VA which we use to create a new initial
   mapping of the kernel image (either in TTBR0 as with the currently idmap, or
   in TTBR1 directly). The stub logic could be small enough that it could be
   mapped at page granularity, and we'd only need to initialize
   NR_PAGETABLE_LEVELS pages before enabling the MMU.

   This would then bounce onto the next stage, either in TTBR0 directly, or
   bouncing through there as with the TTBR1 replacement logic.

   We could plausibly write that in C, and the early page table asm logic could
   be simplified.

Thanks,
Mark.
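
A rough size illustration of point (1) above, assuming the worst case of one
table page per translation level for a single-page ID map (the symbol and helper
names are illustrative, not from any posted patch):

#include <linux/init.h>
#include <linux/string.h>
#include <asm/page.h>

/* worst case: one zeroed table page per translation level for a 1-page ID map */
static u8 init_idmap[CONFIG_PGTABLE_LEVELS][PAGE_SIZE] __aligned(PAGE_SIZE);

static void __init clear_init_idmap(void)
{
	/* e.g. 4 pages (16 KiB) with a 4K granule and 48-bit VAs */
	memset(init_idmap, 0, sizeof(init_idmap));
}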

> Changes since v4:
> - add patch to align the callers of finalise_el2()
> - also clean HYP text to the PoC when booting at EL2 with the MMU on
> - add a warning and a taint when doing non-EFI boot with the MMU and
>   caches enabled
> - rebase onto zboot changes in efi/next - this means that patches #6 and
>   #7 will not apply onto arm64/for-next so a shared stable branch will
>   be needed if we want to queue this up for v6.2
> 
> Changes since v3:
> - drop EFI_LOADER_CODE memory type patch that has been queued in the
>   mean time
> - rebased onto [partial] series that moves efi-entry.S into the libstub/
>   source directory
> - fixed a correctness issue in patch #2
> 
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> 
> Ard Biesheuvel (7):
>   arm64: head: Move all finalise_el2 calls to after __enable_mmu
>   arm64: kernel: move identity map out of .text mapping
>   arm64: head: record the MMU state at primary entry
>   arm64: head: avoid cache invalidation when entering with the MMU on
>   arm64: head: Clean the ID map and the HYP text to the PoC if needed
>   arm64: lds: reduce effective minimum image alignment to 64k
>   efi: arm64: enter with MMU and caches enabled
> 
>  arch/arm64/include/asm/efi.h               |  9 +-
>  arch/arm64/kernel/head.S                   | 93 +++++++++++++++-----
>  arch/arm64/kernel/image-vars.h             |  5 +-
>  arch/arm64/kernel/setup.c                  |  9 +-
>  arch/arm64/kernel/sleep.S                  |  6 +-
>  arch/arm64/kernel/vmlinux.lds.S            | 13 ++-
>  arch/arm64/mm/cache.S                      |  5 +-
>  arch/arm64/mm/proc.S                       |  2 -
>  drivers/firmware/efi/libstub/Makefile      |  4 +-
>  drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------
>  drivers/firmware/efi/libstub/arm64-stub.c  | 26 ++++--
>  drivers/firmware/efi/libstub/arm64.c       | 41 +++++++--
>  include/linux/efi.h                        |  6 +-
>  13 files changed, 159 insertions(+), 127 deletions(-)
>  delete mode 100644 drivers/firmware/efi/libstub/arm64-entry.S
> 
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
@ 2022-11-11 17:36   ` Mark Rutland
  0 siblings, 0 replies; 30+ messages in thread
From: Mark Rutland @ 2022-11-11 17:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, linux-efi, keescook, Will Deacon,
	Catalin Marinas, Marc Zyngier

Hi Ard,

Sorry for the late-in-the-day reply here...

On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> The purpose of this series is to remove any explicit cache maintenance
> for coherency during early boot that becomes unnecessary if we simply
> retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> and use it to populate the ID map page tables. After setting up this
> preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> TCR and SCTLR registers as before, and proceed as usual, avoiding the
> need for any manipulations of memory while the MMU and caches are off.
> 
> The only properties of the firmware provided 1:1 map we rely on is that
> it does not require any explicit cache maintenance for coherency, and
> that it covers the entire memory footprint of the image, including the
> BSS and padding at the end - all else is under control of the kernel
> itself, as before.

As a high-level thing, I'm still very much not keen on entering the kernel with
the MMU on. Given that we have to support booting with the MMU off for !EFI
boot (including kexec when EFI is in use), I think this makes it harder to
reason about the boot code overall (e.g. due to the conditional maintenance
added to head.S), and adds more scope for error, even if it simplifies the EFI
stub itself.

I reckon that (sticking with entering with the MMU off), there's more that we
can do to split the table creation into more stages, and to minimize the early
portion of that which has to run with the MMU off. That would benefit non-EFI
boot and kexec, and retain the single boot flow that we currently have.

My rough thinking was:

1) Reduce the idmap down to a single page, such that we only need to clear
   NR_PAGETABLE_LEVELS pages to initialize this.

2) Create a small stub at a fixed TTBR1 VA which we use to create a new initial
   mapping of the kernel image (either in TTBR0 as with the current idmap, or
   in TTBR1 directly). The stub logic could be small enough that it could be
   mapped at page granularity, and we'd only need to initialize
   NR_PAGETABLE_LEVELS pages before enabling the MMU.

   This would then bounce onto the next stage, either in TTBR0 directly, or
   bouncing through there as with the TTBR1 replacement logic.

   We could plausibly write that in C, and the early page table asm logic could
   be simplified.

Thanks,
Mark.

> Changes since v4:
> - add patch to align the callers of finalise_el2()
> - also clean HYP text to the PoC when booting at EL2 with the MMU on
> - add a warning and a taint when doing non-EFI boot with the MMU and
>   caches enabled
> - rebase onto zboot changes in efi/next - this means that patches #6 and
>   #7 will not apply onto arm64/for-next so a shared stable branch will
>   be needed if we want to queue this up for v6.2
> 
> Changes since v3:
> - drop EFI_LOADER_CODE memory type patch that has been queued in the
>   mean time
> - rebased onto [partial] series that moves efi-entry.S into the libstub/
>   source directory
> - fixed a correctness issue in patch #2
> 
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> 
> Ard Biesheuvel (7):
>   arm64: head: Move all finalise_el2 calls to after __enable_mmu
>   arm64: kernel: move identity map out of .text mapping
>   arm64: head: record the MMU state at primary entry
>   arm64: head: avoid cache invalidation when entering with the MMU on
>   arm64: head: Clean the ID map and the HYP text to the PoC if needed
>   arm64: lds: reduce effective minimum image alignment to 64k
>   efi: arm64: enter with MMU and caches enabled
> 
>  arch/arm64/include/asm/efi.h               |  9 +-
>  arch/arm64/kernel/head.S                   | 93 +++++++++++++++-----
>  arch/arm64/kernel/image-vars.h             |  5 +-
>  arch/arm64/kernel/setup.c                  |  9 +-
>  arch/arm64/kernel/sleep.S                  |  6 +-
>  arch/arm64/kernel/vmlinux.lds.S            | 13 ++-
>  arch/arm64/mm/cache.S                      |  5 +-
>  arch/arm64/mm/proc.S                       |  2 -
>  drivers/firmware/efi/libstub/Makefile      |  4 +-
>  drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------
>  drivers/firmware/efi/libstub/arm64-stub.c  | 26 ++++--
>  drivers/firmware/efi/libstub/arm64.c       | 41 +++++++--
>  include/linux/efi.h                        |  6 +-
>  13 files changed, 159 insertions(+), 127 deletions(-)
>  delete mode 100644 drivers/firmware/efi/libstub/arm64-entry.S
> 
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
  2022-11-11 17:36   ` Mark Rutland
@ 2022-11-15 11:17     ` Will Deacon
  -1 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2022-11-15 11:17 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel, linux-arm-kernel, linux-efi, keescook,
	Catalin Marinas, Marc Zyngier

On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > The purpose of this series is to remove any explicit cache maintenance
> > for coherency during early boot that becomes unnecessary if we simply
> > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > and use it to populate the ID map page tables. After setting up this
> > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > need for any manipulations of memory while the MMU and caches are off.
> > 
> > The only properties of the firmware provided 1:1 map we rely on is that
> > it does not require any explicit cache maintenance for coherency, and
> > that it covers the entire memory footprint of the image, including the
> > BSS and padding at the end - all else is under control of the kernel
> > itself, as before.
> 
> As a high-level thing, I'm still very much not keen on entering the kernel with
> the MMU on. Given that we have to support booting with the MMU off for !EFI
> boot (including kexec when EFI is in use), I think this makes it harder to
> reason about the boot code overall (e.g. due to the conditional maintenance
> added to head.S), and adds more scope for error, even if it simplifies the EFI
> stub itself.

As discussed offline, two things that would help the current series are:

  (1) Some performance numbers comparing MMU off vs MMU on boot

  (2) Use of a separate entry point for the MMU on case, potentially failing
      the boot if the MMU is on and we're not using EFI

Will

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
@ 2022-11-15 11:17     ` Will Deacon
  0 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2022-11-15 11:17 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel, linux-arm-kernel, linux-efi, keescook,
	Catalin Marinas, Marc Zyngier

On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > The purpose of this series is to remove any explicit cache maintenance
> > for coherency during early boot that becomes unnecessary if we simply
> > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > and use it to populate the ID map page tables. After setting up this
> > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > need for any manipulations of memory while the MMU and caches are off.
> > 
> > The only properties of the firmware provided 1:1 map we rely on is that
> > it does not require any explicit cache maintenance for coherency, and
> > that it covers the entire memory footprint of the image, including the
> > BSS and padding at the end - all else is under control of the kernel
> > itself, as before.
> 
> As a high-level thing, I'm still very much not keen on entering the kernel with
> the MMU on. Given that we have to support booting with the MMU off for !EFI
> boot (including kexec when EFI is in use), I think this makes it harder to
> reason about the boot code overall (e.g. due to the conditional maintenance
> added to head.S), and adds more scope for error, even if it simplifies the EFI
> stub itself.

As discussed offline, two things that would help the current series are:

  (1) Some performance numbers comparing MMU off vs MMU on boot

  (2) Use of a separate entry point for the MMU on case, potentially failing
      the boot if the MMU is on and we're not using EFI

Will

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
  2022-11-15 11:17     ` Will Deacon
@ 2022-11-15 11:21       ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-15 11:21 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
	Catalin Marinas, Marc Zyngier

On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
>
> On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > The purpose of this series is to remove any explicit cache maintenance
> > > for coherency during early boot that becomes unnecessary if we simply
> > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > and use it to populate the ID map page tables. After setting up this
> > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > need for any manipulations of memory while the MMU and caches are off.
> > >
> > > The only properties of the firmware provided 1:1 map we rely on is that
> > > it does not require any explicit cache maintenance for coherency, and
> > > that it covers the entire memory footprint of the image, including the
> > > BSS and padding at the end - all else is under control of the kernel
> > > itself, as before.
> >
> > As a high-level thing, I'm still very much not keen on entering the kernel with
> > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > boot (including kexec when EFI is in use), I think this makes it harder to
> > reason about the boot code overall (e.g. due to the conditional maintenance
> > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > stub itself.
>
> As discussed offline, two things that would help the current series are:
>
>   (1) Some performance numbers comparing MMU off vs MMU on boot
>
>   (2) Use of a separate entry point for the MMU on case, potentially failing
>       the boot if the MMU is on and we're not using EFI
>

Ack.

But thinking about (2) again, failing the boot is better done at a
time when you can inform the user about it, no?

IOW, just going into a deadloop really early if you enter the bare
metal entry point with the MMU on is going to be hard to distinguish
from other issues, whereas panicking after the console is up is more
likely to help get the actual issue diagnosed.

So perhaps we should panic() instead of warn+taint when this condition
occurs, and do it from an early initcall instead of from setup_arch().
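
A minimal sketch of that idea, assuming a flag recorded by the early boot code
(the flag and function names below are illustrative, not the actual
implementation in this series):

#include <linux/efi.h>
#include <linux/init.h>
#include <linux/kernel.h>

/* assumed to be set during early boot if the MMU was found enabled */
extern bool mmu_enabled_at_boot;

static int __init enforce_boot_protocol(void)
{
	/* a non-EFI boot entered the bare metal entry point with the MMU on */
	if (mmu_enabled_at_boot && !efi_enabled(EFI_BOOT))
		panic("Booted with the MMU on outside of the EFI boot path");
	return 0;
}
early_initcall(enforce_boot_protocol);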

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
@ 2022-11-15 11:21       ` Ard Biesheuvel
  0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-15 11:21 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
	Catalin Marinas, Marc Zyngier

On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
>
> On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > The purpose of this series is to remove any explicit cache maintenance
> > > for coherency during early boot that becomes unnecessary if we simply
> > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > and use it to populate the ID map page tables. After setting up this
> > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > need for any manipulations of memory while the MMU and caches are off.
> > >
> > > The only properties of the firmware provided 1:1 map we rely on is that
> > > it does not require any explicit cache maintenance for coherency, and
> > > that it covers the entire memory footprint of the image, including the
> > > BSS and padding at the end - all else is under control of the kernel
> > > itself, as before.
> >
> > As a high-level thing, I'm still very much not keen on entering the kernel with
> > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > boot (including kexec when EFI is in use), I think this makes it harder to
> > reason about the boot code overall (e.g. due to the conditional maintenance
> > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > stub itself.
>
> As discussed offline, two things that would help the current series are:
>
>   (1) Some performance numbers comparing MMU off vs MMU on boot
>
>   (2) Use of a separate entry point for the MMU on case, potentially failing
>       the boot if the MMU is on and we're not using EFI
>

Ack.

But thinking about (2) again, failing the boot is better done at a
time when you can inform the user about it, no?

IOW, just going into a deadloop really early if you enter the bare
metal entry point with the MMU on is going to be hard to distinguish
from other issues, whereas panicking after the console is up is more
likely to help get the actual issue diagnosed.

So perhaps we should panic() instead of warn+taint when this condition
occurs, and do it from an early initcall instead of from setup_arch().

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
  2022-11-15 11:21       ` Ard Biesheuvel
@ 2022-11-15 11:31         ` Will Deacon
  -1 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2022-11-15 11:31 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
	Catalin Marinas, Marc Zyngier

On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
> >
> > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > The purpose of this series is to remove any explicit cache maintenance
> > > > for coherency during early boot that becomes unnecessary if we simply
> > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > and use it to populate the ID map page tables. After setting up this
> > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > need for any manipulations of memory while the MMU and caches are off.
> > > >
> > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > it does not require any explicit cache maintenance for coherency, and
> > > > that it covers the entire memory footprint of the image, including the
> > > > BSS and padding at the end - all else is under control of the kernel
> > > > itself, as before.
> > >
> > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > stub itself.
> >
> > As discussed offline, two things that would help the current series are:
> >
> >   (1) Some performance numbers comparing MMU off vs MMU on boot
> >
> >   (2) Use of a separate entry point for the MMU on case, potentially failing
> >       the boot if the MMU is on and we're not using EFI
> >
> 
> Ack.
> 
> But thinking about (2) again, failing the boot is better done at a
> time when you can inform the user about it, no?
> 
> IOW, just going into a deadloop really early if you enter the bare
> metal entry point with the MMU on is going to be hard to distinguish
> from other issues, whereas panicking after the console is up is more
> likely to help get the actual issue diagnosed.

Agreed.

> So perhaps we should panic() instead of warn+taint when this condition
> occurs, and do it from an early initcall instead of from setup_arch().

To be honest (and I appreciate that this is unhelpful), I'm fine with
the warn+taint and prefer that to a fatal stop.

Will

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
@ 2022-11-15 11:31         ` Will Deacon
  0 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2022-11-15 11:31 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
	Catalin Marinas, Marc Zyngier

On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
> >
> > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > The purpose of this series is to remove any explicit cache maintenance
> > > > for coherency during early boot that becomes unnecessary if we simply
> > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > and use it to populate the ID map page tables. After setting up this
> > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > need for any manipulations of memory while the MMU and caches are off.
> > > >
> > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > it does not require any explicit cache maintenance for coherency, and
> > > > that it covers the entire memory footprint of the image, including the
> > > > BSS and padding at the end - all else is under control of the kernel
> > > > itself, as before.
> > >
> > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > stub itself.
> >
> > As discussed offline, two things that would help the current series are:
> >
> >   (1) Some performance numbers comparing MMU off vs MMU on boot
> >
> >   (2) Use of a separate entry point for the MMU on case, potentially failing
> >       the boot if the MMU is on and we're not using EFI
> >
> 
> Ack.
> 
> But thinking about (2) again, failing the boot is better done at a
> time when you can inform the user about it, no?
> 
> IOW, just going into a deadloop really early if you enter the bare
> metal entry point with the MMU on is going to be hard to distinguish
> from other issues, whereas panicking after the console is up is more
> likely to help get the actual issue diagnosed.

Agreed.

> So perhaps we should panic() instead of warn+taint when this condition
> occurs, and do it from an early initcall instead of from setup_arch().

To be honest (and I appreciate that this is unhelpful), I'm fine with
the warn+taint and prefer that to a fatal stop.

Will

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
  2022-11-15 11:31         ` Will Deacon
@ 2022-11-26 14:16           ` Ard Biesheuvel
  -1 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-26 14:16 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
	Catalin Marinas, Marc Zyngier

On Tue, 15 Nov 2022 at 12:31, Will Deacon <will@kernel.org> wrote:
>
> On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> > On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
> > >
> > > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > > The purpose of this series is to remove any explicit cache maintenance
> > > > > for coherency during early boot that becomes unnecessary if we simply
> > > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > > and use it to populate the ID map page tables. After setting up this
> > > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > > need for any manipulations of memory while the MMU and caches are off.
> > > > >
> > > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > > it does not require any explicit cache maintenance for coherency, and
> > > > > that it covers the entire memory footprint of the image, including the
> > > > > BSS and padding at the end - all else is under control of the kernel
> > > > > itself, as before.
> > > >
> > > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > > stub itself.
> > >
> > > As discussed offline, two things that would help the current series are:
> > >
> > >   (1) Some performance numbers comparing MMU off vs MMU on boot
> > >

Finally got around to measuring this - I lost access to my TX2 machine
for a couple of days during the past week.

With the patch below applied to mainline, I measure ~6 ms spent
cleaning the entire image to the PoC (which is the bulk of it) and
subsequently populating the initial ID map and activating it.

This drops to about 0.6 ms with my changes applied. This is unlikely
to ever matter in practice, perhaps, but I will note that booting a VM
in EFI mode using Tianocore/EDK2 from the point where KVM clears the
counter to the point where we start user space can be done (on the
same machine) in 500-700 ms, so it is not entirely insignificant
either.

I could try and measure it on bare metal as well, but I suppose that
launch times are even less relevant there so I didn't bother.
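
A measurement like the one above is typically taken by reading the generic
timer's virtual counter (CNTVCT_EL0) around the region of interest and scaling
the delta by the counter frequency (CNTFRQ_EL0); a minimal sketch with
illustrative helper names (this is not the patch referred to above):

#include <linux/types.h>

static inline u64 boot_counter_read(void)
{
	u64 cnt;

	/* ISB so the counter read is not speculated before the timed region */
	asm volatile("isb; mrs %0, cntvct_el0" : "=r" (cnt));
	return cnt;
}

static inline u64 boot_counter_ticks_to_us(u64 ticks)
{
	u64 freq;

	asm volatile("mrs %0, cntfrq_el0" : "=r" (freq));
	return ticks * 1000000UL / freq;
}

/* usage: t0 = boot_counter_read(); <timed region>; us = boot_counter_ticks_to_us(boot_counter_read() - t0); */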

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
  2022-11-26 14:16           ` Ard Biesheuvel
@ 2022-11-26 14:17             ` Ard Biesheuvel
  0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-26 14:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
	Catalin Marinas, Marc Zyngier

On Sat, 26 Nov 2022 at 15:16, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Tue, 15 Nov 2022 at 12:31, Will Deacon <will@kernel.org> wrote:
> >
> > On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> > > On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
> > > >
> > > > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > > > The purpose of this series is to remove any explicit cache maintenance
> > > > > > for coherency during early boot that becomes unnecessary if we simply
> > > > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > > > and use it to populate the ID map page tables. After setting up this
> > > > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > > > need for any manipulations of memory while the MMU and caches are off.
> > > > > >
> > > > > > The only properties of the firmware-provided 1:1 map we rely on are that
> > > > > > it does not require any explicit cache maintenance for coherency, and
> > > > > > that it covers the entire memory footprint of the image, including the
> > > > > > BSS and padding at the end - all else is under control of the kernel
> > > > > > itself, as before.
> > > > >
> > > > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > > > stub itself.
> > > >
> > > > As discussed offline, two things that would help the current series are:
> > > >
> > > >   (1) Some performance numbers comparing MMU off vs MMU on boot
> > > >
>
> Finally got around to measuring this - I lost access to my TX2 machine
> for a couple of days during the past week.
>
> With the patch below applied to mainline, I measure ~6 ms spent
> cleaning the entire image to the PoC (which is the bulk of it) and
> subsequently populating the initial ID map and activating it.
>
> This drops to about 0.6 ms with my changes applied. This is perhaps
> unlikely to ever matter in practice, but I will note that booting a VM
> in EFI mode using Tianocore/EDK2, measured from the point where KVM
> clears the counter to the point where we start user space, takes
> 500-700 ms on the same machine, so it is not entirely insignificant
> either.
>
> I could try to measure it on bare metal as well, but I suppose that
> launch times are even less relevant there, so I didn't bother.

diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index 61a87fa1c3055e26..27f59784a1c0be2c 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -22,6 +22,7 @@ SYM_CODE_START(efi_enter_kernel)
        ldr     w2, =primary_entry_offset
        add     x19, x0, x2             // relocated Image entrypoint
        mov     x20, x1                 // DTB address
+       mrs     x27, cntvct_el0

        /*
         * Clean the copied Image to the PoC, and ensure it is not shadowed by
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2196aad7b55bcef0..068a7d111836382b 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -430,6 +430,8 @@ SYM_FUNC_START_LOCAL(__primary_switched)

        str_l   x21, __fdt_pointer, x5          // Save FDT pointer

+       str_l   x27, boot_args + 8, x5
+
        ldr_l   x4, kimage_vaddr                // Save the offset between
        sub     x4, x4, x0                      // the kernel virtual and
        str_l   x4, kimage_voffset, x5          // physical mappings
@@ -797,6 +799,10 @@ SYM_FUNC_START_LOCAL(__primary_switch)
        adrp    x1, reserved_pg_dir
        adrp    x2, init_idmap_pg_dir
        bl      __enable_mmu
+
+       mrs     x0, cntvct_el0
+       sub     x27, x0, x27
+
 #ifdef CONFIG_RELOCATABLE
        adrp    x23, KERNEL_START
        and     x23, x23, MIN_KIMG_ALIGN - 1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

end of thread

Thread overview: 30+ messages
2022-11-08 18:21 [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot Ard Biesheuvel
2022-11-08 18:21 ` [PATCH v5 1/7] arm64: head: Move all finalise_el2 calls to after __enable_mmu Ard Biesheuvel
2022-11-08 18:21 ` [PATCH v5 2/7] arm64: kernel: move identity map out of .text mapping Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 3/7] arm64: head: record the MMU state at primary entry Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 4/7] arm64: head: avoid cache invalidation when entering with the MMU on Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 5/7] arm64: head: Clean the ID map and the HYP text to the PoC if needed Ard Biesheuvel
2022-11-08 22:11   ` Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 6/7] arm64: lds: reduce effective minimum image alignment to 64k Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 7/7] efi: arm64: enter with MMU and caches enabled Ard Biesheuvel
2022-11-11 17:36 ` [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot Mark Rutland
2022-11-15 11:17   ` Will Deacon
2022-11-15 11:21     ` Ard Biesheuvel
2022-11-15 11:31       ` Will Deacon
2022-11-26 14:16         ` Ard Biesheuvel
2022-11-26 14:17           ` Ard Biesheuvel
