* [PATCH v7 0/6] arm64: Permit EFI boot with MMU and caches on
From: Ard Biesheuvel @ 2023-01-11 10:22 UTC
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

The purpose of this series is to remove any explicit cache maintenance
for coherency during early boot. Software-managed coherency is error
prone and tedious, and running with the MMU off is generally bad for
performance. Both become unnecessary if we simply retain the cacheable
1:1 mapping of all of system RAM provided by EFI, and use it to
populate the initial ID map page tables. After setting up this
preliminary ID map, we disable the MMU, drop to EL1, reprogram the
MAIR, TCR and SCTLR registers as before, and proceed as usual, avoiding
the need for any manipulation of memory while the MMU and caches are
off.
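
In C-flavoured pseudocode, the resulting primary boot path looks
roughly like this (an illustrative sketch only - the real code is
assembly in arch/arm64/kernel/head.S, and the helpers correspond to the
routines touched by the patches below):

	void primary_entry(void)
	{
		bool mmu_on = record_mmu_state();  /* SCTLR_ELx.{C,M} at entry */
		preserve_boot_args();              /* skips PoC invalidation if mmu_on */
		create_idmap();                    /* built via EFI's cacheable 1:1 map */
		if (mmu_on)                        /* about to run with the MMU off */
			dcache_clean_poc(__idmap_text_start, __idmap_text_end);
		init_kernel_el(mmu_on);            /* MMU off, drop to EL1 */
		/* reprogram MAIR/TCR/SCTLR and proceed as before */
	}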

The only properties of the firmware-provided 1:1 map we rely on are
that it does not require any explicit cache maintenance for coherency,
and that it covers the entire memory footprint of the image, including
the BSS and padding at the end - all else is under the control of the
kernel itself, as before.

The final patch updates the EFI stub code so that it no longer disables
the MMU and caches, or cleans the entire image to the PoC. Note that
some cache maintenance for I/D coherence may still be needed in the
zboot case (which decompresses and boots a compressed kernel image) or
in cases where the image is moved in memory.
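
For reference, that residual maintenance amounts to something like the
sketch below (C-flavoured and illustrative only; read_ctr_el0() and
ic_ialluis() are hypothetical stand-ins for the corresponding mrs/ic
instructions, mirroring the efi_cache_sync_image() change in the final
patch):

	/* clean the executable region to the PoU only when CTR_EL0.IDC
	 * says I/D coherency actually requires it */
	if (!(read_ctr_el0() & BIT(CTR_EL0_IDC_SHIFT)))
		caches_clean_inval_pou(image_base, image_base + code_size);
	ic_ialluis();		/* drop stale icache entries */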

Changes since v6:
- drop the 64k alignment patch, which is not strictly a prerequisite,
  and will be revisited later if needed
- add back EFI stub changes now that all dependencies are in mainline
- panic() the kernel later in the boot if we detect a non-EFI boot
  occurring with the MMU and caches enabled

Changes since v5:
- add a special entry point into the boot sequence that is to be used by
  EFI only, and only permit booting with the MMU enabled when using that
  boot path;
- omit the final patch that would need to go via the EFI tree in any
  case - adding the new EFI-specific entry point makes it conflict even
  more badly, and I'll try to revisit this during the merge window or
  simply defer the final piece to the next release;

Changes since v4:
- add patch to align the callers of finalise_el2()
- also clean HYP text to the PoC when booting at EL2 with the MMU on
- add a warning and a taint when doing non-EFI boot with the MMU and
  caches enabled
- rebase onto zboot changes in efi/next - this means that patches #6 and
  #7 will not apply onto arm64/for-next so a shared stable branch will
  be needed if we want to queue this up for v6.2

Changes since v3:
- drop the EFI_LOADER_CODE memory type patch that has been queued in
  the meantime
- rebased onto [partial] series that moves efi-entry.S into the libstub/
  source directory
- fixed a correctness issue in patch #2

Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>

Ard Biesheuvel (6):
  arm64: head: Move all finalise_el2 calls to after __enable_mmu
  arm64: kernel: move identity map out of .text mapping
  arm64: head: record the MMU state at primary entry
  arm64: head: avoid cache invalidation when entering with the MMU on
  arm64: head: Clean the ID map and the HYP text to the PoC if needed
  efi: arm64: enter with MMU and caches enabled

 arch/arm64/include/asm/efi.h               |  2 +
 arch/arm64/kernel/head.S                   | 89 +++++++++++++++-----
 arch/arm64/kernel/image-vars.h             |  5 +-
 arch/arm64/kernel/setup.c                  | 17 +++-
 arch/arm64/kernel/sleep.S                  |  6 +-
 arch/arm64/kernel/vmlinux.lds.S            |  2 +-
 arch/arm64/mm/cache.S                      |  1 +
 arch/arm64/mm/proc.S                       |  2 -
 drivers/firmware/efi/libstub/Makefile      |  4 +-
 drivers/firmware/efi/libstub/arm64-entry.S | 67 ---------------
 drivers/firmware/efi/libstub/arm64-stub.c  | 26 ++++--
 drivers/firmware/efi/libstub/arm64.c       | 41 +++++++--
 12 files changed, 151 insertions(+), 111 deletions(-)
 delete mode 100644 drivers/firmware/efi/libstub/arm64-entry.S

-- 
2.39.0


* [PATCH v7 1/6] arm64: head: Move all finalise_el2 calls to after __enable_mmu
From: Ard Biesheuvel @ 2023-01-11 10:22 UTC
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

In the primary boot path, finalise_el2() is called much later than on
the secondary boot or resume-from-suspend paths, and this does not
appear to be intentional.

Since we aim to do as little as possible before enabling the MMU and
caches, align secondary and resume with primary boot, and defer the call
to after the MMU is turned on. This also removes the need to clean
finalise_el2() to the PoC once we enable support for booting with the
MMU on.
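
Schematically, this changes the secondary/resume ordering from roughly

	init_kernel_el(); finalise_el2(); ...; __enable_mmu();

to the ordering the primary boot path already uses:

	init_kernel_el(); ...; __enable_mmu(); ...; finalise_el2();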

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S  | 5 ++++-
 arch/arm64/kernel/sleep.S | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 952e17bd1c0b4f91..c4e12d466a5f35f0 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -587,7 +587,6 @@ SYM_FUNC_START_LOCAL(secondary_startup)
 	 * Common entry point for secondary CPUs.
 	 */
 	mov	x20, x0				// preserve boot mode
-	bl	finalise_el2
 	bl	__cpu_secondary_check52bitva
 #if VA_BITS > 48
 	ldr_l	x0, vabits_actual
@@ -603,6 +602,10 @@ SYM_FUNC_END(secondary_startup)
 SYM_FUNC_START_LOCAL(__secondary_switched)
 	mov	x0, x20
 	bl	set_cpu_boot_mode_flag
+
+	mov	x0, x20
+	bl	finalise_el2
+
 	str_l	xzr, __early_cpu_boot_status, x3
 	adr_l	x5, vectors
 	msr	vbar_el1, x5
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 97c9de57725dfddb..7b7c56e048346e97 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -100,7 +100,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
 	.pushsection ".idmap.text", "awx"
 SYM_CODE_START(cpu_resume)
 	bl	init_kernel_el
-	bl	finalise_el2
+	mov	x19, x0			// preserve boot mode
 #if VA_BITS > 48
 	ldr_l	x0, vabits_actual
 #endif
@@ -116,6 +116,9 @@ SYM_CODE_END(cpu_resume)
 	.popsection
 
 SYM_FUNC_START(_cpu_resume)
+	mov	x0, x19
+	bl	finalise_el2
+
 	mrs	x1, mpidr_el1
 	adr_l	x8, mpidr_hash		// x8 = struct mpidr_hash virt address
 
-- 
2.39.0


* [PATCH v7 2/6] arm64: kernel: move identity map out of .text mapping
From: Ard Biesheuvel @ 2023-01-11 10:22 UTC
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

Reorganize the ID map slightly so that only code that is executed with
the MMU off or via the 1:1 mapping remains. This allows us to move the
identity map out of the .text segment, as it will no longer need
executable permissions via the kernel mapping.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S        | 28 +++++++++++---------
 arch/arm64/kernel/vmlinux.lds.S |  2 +-
 arch/arm64/mm/proc.S            |  2 --
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index c4e12d466a5f35f0..bec97aad092c2b43 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -543,19 +543,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	eret
 SYM_FUNC_END(init_kernel_el)
 
-/*
- * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
- * in w0. See arch/arm64/include/asm/virt.h for more info.
- */
-SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
-	adr_l	x1, __boot_cpu_mode
-	cmp	w0, #BOOT_CPU_MODE_EL2
-	b.ne	1f
-	add	x1, x1, #4
-1:	str	w0, [x1]			// Save CPU boot mode
-	ret
-SYM_FUNC_END(set_cpu_boot_mode_flag)
-
 	/*
 	 * This provides a "holding pen" for platforms to hold all secondary
 	 * cores are held until we're ready for them to initialise.
@@ -599,6 +586,7 @@ SYM_FUNC_START_LOCAL(secondary_startup)
 	br	x8
 SYM_FUNC_END(secondary_startup)
 
+	.text
 SYM_FUNC_START_LOCAL(__secondary_switched)
 	mov	x0, x20
 	bl	set_cpu_boot_mode_flag
@@ -631,6 +619,19 @@ SYM_FUNC_START_LOCAL(__secondary_too_slow)
 	b	__secondary_too_slow
 SYM_FUNC_END(__secondary_too_slow)
 
+/*
+ * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
+ * in w0. See arch/arm64/include/asm/virt.h for more info.
+ */
+SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
+	adr_l	x1, __boot_cpu_mode
+	cmp	w0, #BOOT_CPU_MODE_EL2
+	b.ne	1f
+	add	x1, x1, #4
+1:	str	w0, [x1]			// Save CPU boot mode
+	ret
+SYM_FUNC_END(set_cpu_boot_mode_flag)
+
 /*
  * The booting CPU updates the failed status @__early_cpu_boot_status,
  * with MMU turned off.
@@ -662,6 +663,7 @@ SYM_FUNC_END(__secondary_too_slow)
  * Checks if the selected granule size is supported by the CPU.
  * If it isn't, park the CPU
  */
+	.section ".idmap.text","awx"
 SYM_FUNC_START(__enable_mmu)
 	mrs	x3, ID_AA64MMFR0_EL1
 	ubfx	x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 4c13dafc98b8400f..407415a5163ab62f 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -179,7 +179,6 @@ SECTIONS
 			LOCK_TEXT
 			KPROBES_TEXT
 			HYPERVISOR_TEXT
-			IDMAP_TEXT
 			*(.gnu.warning)
 		. = ALIGN(16);
 		*(.got)			/* Global offset table		*/
@@ -206,6 +205,7 @@ SECTIONS
 		TRAMP_TEXT
 		HIBERNATE_TEXT
 		KEXEC_TEXT
+		IDMAP_TEXT
 		. = ALIGN(PAGE_SIZE);
 	}
 
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 066fa60b93d24827..91410f48809000a0 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -110,7 +110,6 @@ SYM_FUNC_END(cpu_do_suspend)
  *
  * x0: Address of context pointer
  */
-	.pushsection ".idmap.text", "awx"
 SYM_FUNC_START(cpu_do_resume)
 	ldp	x2, x3, [x0]
 	ldp	x4, x5, [x0, #16]
@@ -166,7 +165,6 @@ alternative_else_nop_endif
 	isb
 	ret
 SYM_FUNC_END(cpu_do_resume)
-	.popsection
 #endif
 
 	.pushsection ".idmap.text", "awx"
-- 
2.39.0


* [PATCH v7 3/6] arm64: head: record the MMU state at primary entry
From: Ard Biesheuvel @ 2023-01-11 10:22 UTC
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

Prepare for being able to deal with primary entry with the MMU and
caches enabled, by recording whether or not we entered with the MMU on
in register x19 and in a global variable. (Note that setting this
variable to '1' does not require cache invalidation, nor is it required
for storing the bootargs in that case, so we omit the cache
maintenance.)

Since boot with the MMU and caches enabled is not permitted by the
bare-metal boot protocol, ensure that a diagnostic is emitted and a
taint bit is set if the MMU was found to be enabled on a non-EFI boot,
and panic() once the console is likely to be up. We will make an
exception for EFI boot later, which has strict requirements for the
mapping of system memory, permitting us to relax the boot protocol and
hand over from the EFI stub to the core kernel with the MMU and caches
left enabled.
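
In C terms, the recorded state boils down to the following (an
illustrative sketch of the record_mmu_state logic in the hunk below;
current_el() and the read_sctlr_*() helpers are hypothetical stand-ins
for the corresponding mrs instructions):

	/* the MMU only counts as 'on' if SCTLR_ELx.C and SCTLR_ELx.M are both set */
	u64 sctlr = (current_el() == CurrentEL_EL2) ? read_sctlr_el2()
						    : read_sctlr_el1();
	mmu_enabled_at_boot = (sctlr & SCTLR_ELx_C) ? (sctlr & SCTLR_ELx_M) : 0;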

While at it, add 'pre_disable_mmu_workaround' macro invocations to
init_kernel_el, as its manipulation of SCTLR_ELx may amount to disabling
of the MMU after subsequent patches.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S  | 20 ++++++++++++++++++++
 arch/arm64/kernel/setup.c | 17 +++++++++++++++--
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index bec97aad092c2b43..c3b898efd3b5288d 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -77,6 +77,7 @@
 	 * primary lowlevel boot path:
 	 *
 	 *  Register   Scope                      Purpose
+	 *  x19        primary_entry() .. start_kernel()        whether we entered with the MMU on
 	 *  x20        primary_entry() .. __primary_switch()    CPU boot mode
 	 *  x21        primary_entry() .. start_kernel()        FDT pointer passed at boot in x0
 	 *  x22        create_idmap() .. start_kernel()         ID map VA of the DT blob
@@ -86,6 +87,7 @@
 	 *  x28        create_idmap()                           callee preserved temp register
 	 */
 SYM_CODE_START(primary_entry)
+	bl	record_mmu_state
 	bl	preserve_boot_args
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mov	x20, x0
@@ -109,6 +111,18 @@ SYM_CODE_START(primary_entry)
 	b	__primary_switch
 SYM_CODE_END(primary_entry)
 
+SYM_CODE_START_LOCAL(record_mmu_state)
+	mrs	x19, CurrentEL
+	cmp	x19, #CurrentEL_EL2
+	mrs	x19, sctlr_el1
+	b.ne	0f
+	mrs	x19, sctlr_el2
+0:	tst	x19, #SCTLR_ELx_C		// Z := (C == 0)
+	and	x19, x19, #SCTLR_ELx_M		// isolate M bit
+	csel	x19, xzr, x19, eq		// clear x19 if Z
+	ret
+SYM_CODE_END(record_mmu_state)
+
 /*
  * Preserve the arguments passed by the bootloader in x0 .. x3
  */
@@ -119,11 +133,14 @@ SYM_CODE_START_LOCAL(preserve_boot_args)
 	stp	x21, x1, [x0]			// x0 .. x3 at kernel entry
 	stp	x2, x3, [x0, #16]
 
+	cbnz	x19, 0f				// skip cache invalidation if MMU is on
 	dmb	sy				// needed before dc ivac with
 						// MMU off
 
 	add	x1, x0, #0x20			// 4 x 8 bytes
 	b	dcache_inval_poc		// tail call
+0:	str_l   x19, mmu_enabled_at_boot, x0
+	ret
 SYM_CODE_END(preserve_boot_args)
 
 SYM_FUNC_START_LOCAL(clear_page_tables)
@@ -497,6 +514,7 @@ SYM_FUNC_START(init_kernel_el)
 
 SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
 	mov_q	x0, INIT_SCTLR_EL1_MMU_OFF
+	pre_disable_mmu_workaround
 	msr	sctlr_el1, x0
 	isb
 	mov_q	x0, INIT_PSTATE_EL1
@@ -529,11 +547,13 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	cbz	x0, 1f
 
 	/* Set a sane SCTLR_EL1, the VHE way */
+	pre_disable_mmu_workaround
 	msr_s	SYS_SCTLR_EL12, x1
 	mov	x2, #BOOT_CPU_FLAG_E2H
 	b	2f
 
 1:
+	pre_disable_mmu_workaround
 	msr	sctlr_el1, x1
 	mov	x2, xzr
 2:
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 12cfe9d0d3fac10d..b8ec7b3ac9cbe8a8 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -58,6 +58,7 @@ static int num_standard_resources;
 static struct resource *standard_resources;
 
 phys_addr_t __fdt_pointer __initdata;
+u64 mmu_enabled_at_boot __initdata;
 
 /*
  * Standard memory resources
@@ -332,8 +333,12 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
 	xen_early_init();
 	efi_init();
 
-	if (!efi_enabled(EFI_BOOT) && ((u64)_text % MIN_KIMG_ALIGN) != 0)
-	     pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
+	if (!efi_enabled(EFI_BOOT)) {
+		if ((u64)_text % MIN_KIMG_ALIGN)
+			pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
+		WARN_TAINT(mmu_enabled_at_boot, TAINT_FIRMWARE_WORKAROUND,
+			   FW_BUG "Booted with MMU enabled!");
+	}
 
 	arm64_memblock_init();
 
@@ -442,3 +447,11 @@ static int __init register_arm64_panic_block(void)
 	return 0;
 }
 device_initcall(register_arm64_panic_block);
+
+static int __init check_mmu_enabled_at_boot(void)
+{
+	if (!efi_enabled(EFI_BOOT) && mmu_enabled_at_boot)
+		panic("Non-EFI boot detected with MMU and caches enabled");
+	return 0;
+}
+device_initcall_sync(check_mmu_enabled_at_boot);
-- 
2.39.0


* [PATCH v7 4/6] arm64: head: avoid cache invalidation when entering with the MMU on
From: Ard Biesheuvel @ 2023-01-11 10:22 UTC
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

If we enter with the MMU on, there is no need for explicit cache
invalidation for stores to memory, as they will be coherent with the
caches.

Let's take advantage of this, and create the ID map with the MMU still
enabled if that is how we entered, and avoid any cache invalidation
calls in that case.
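
Schematically (an illustrative C-flavoured sketch of the create_idmap()
change below, where the hypothetical mmu_on flag corresponds to the
state recorded in x19):

	/* the dsb and PoC invalidation of the freshly written ID map
	 * tables are only needed if they were written with the MMU off */
	if (!mmu_on)
		dcache_inval_poc(init_idmap_pg_dir, init_idmap_pg_end);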

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index c3b898efd3b5288d..d75f419206451d07 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -89,9 +89,9 @@
 SYM_CODE_START(primary_entry)
 	bl	record_mmu_state
 	bl	preserve_boot_args
+	bl	create_idmap
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mov	x20, x0
-	bl	create_idmap
 
 	/*
 	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
@@ -377,12 +377,13 @@ SYM_FUNC_START_LOCAL(create_idmap)
 	 * accesses (MMU disabled), invalidate those tables again to
 	 * remove any speculatively loaded cache lines.
 	 */
+	cbnz	x19, 0f				// skip cache invalidation if MMU is on
 	dmb	sy
 
 	adrp	x0, init_idmap_pg_dir
 	adrp	x1, init_idmap_pg_end
 	bl	dcache_inval_poc
-	ret	x28
+0:	ret	x28
 SYM_FUNC_END(create_idmap)
 
 SYM_FUNC_START_LOCAL(create_kernel_mapping)
-- 
2.39.0


* [PATCH v7 5/6] arm64: head: Clean the ID map and the HYP text to the PoC if needed
From: Ard Biesheuvel @ 2023-01-11 10:22 UTC
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

If we enter with the MMU and caches enabled, the bootloader may not
have performed any cache maintenance to the PoC. So clean the ID-mapped
page to the PoC, to ensure that instruction and data accesses with the
MMU off see the correct data. For similar reasons, clean all the HYP
text to the PoC as well when entering at EL2 with the MMU and caches
enabled.

Note that this means primary_entry() itself needs to be moved into the
ID map as well, as we will return from init_kernel_el() with the MMU and
caches off.
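
Schematically, the added maintenance amounts to (an illustrative
C-flavoured sketch; mmu_on and booted_at_el2 are hypothetical stand-ins
for the state passed around in x19 and x0 in the hunks below):

	/* the bootloader made no PoC guarantees, so clean everything we
	 * are about to execute with the MMU and caches off */
	if (mmu_on) {
		dcache_clean_poc(__idmap_text_start, __idmap_text_end);
		if (booted_at_el2)
			dcache_clean_poc(__hyp_idmap_text_start, __hyp_text_end);
	}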

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/kernel/head.S  | 31 +++++++++++++++++---
 arch/arm64/kernel/sleep.S |  1 +
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index d75f419206451d07..dc56e1d8f36eb387 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -70,7 +70,7 @@
 
 	__EFI_PE_HEADER
 
-	__INIT
+	.section ".idmap.text","awx"
 
 	/*
 	 * The following callee saved general purpose registers are used on the
@@ -90,6 +90,17 @@ SYM_CODE_START(primary_entry)
 	bl	record_mmu_state
 	bl	preserve_boot_args
 	bl	create_idmap
+
+	/*
+	 * If we entered with the MMU and caches on, clean the ID mapped part
+	 * of the primary boot code to the PoC so we can safely execute it with
+	 * the MMU off.
+	 */
+	cbz	x19, 0f
+	adrp	x0, __idmap_text_start
+	adr_l	x1, __idmap_text_end
+	bl	dcache_clean_poc
+0:	mov	x0, x19
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mov	x20, x0
 
@@ -111,6 +122,7 @@ SYM_CODE_START(primary_entry)
 	b	__primary_switch
 SYM_CODE_END(primary_entry)
 
+	__INIT
 SYM_CODE_START_LOCAL(record_mmu_state)
 	mrs	x19, CurrentEL
 	cmp	x19, #CurrentEL_EL2
@@ -507,10 +519,12 @@ SYM_FUNC_END(__primary_switched)
  * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in x0 if
  * booted in EL1 or EL2 respectively, with the top 32 bits containing
  * potential context flags. These flags are *not* stored in __boot_cpu_mode.
+ *
+ * x0: whether we are being called from the primary boot path with the MMU on
  */
 SYM_FUNC_START(init_kernel_el)
-	mrs	x0, CurrentEL
-	cmp	x0, #CurrentEL_EL2
+	mrs	x1, CurrentEL
+	cmp	x1, #CurrentEL_EL2
 	b.eq	init_el2
 
 SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
@@ -525,6 +539,14 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
 	eret
 
 SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
+	msr	elr_el2, lr
+
+	// clean all HYP code to the PoC if we booted at EL2 with the MMU on
+	cbz	x0, 0f
+	adrp	x0, __hyp_idmap_text_start
+	adr_l	x1, __hyp_text_end
+	bl	dcache_clean_poc
+0:
 	mov_q	x0, HCR_HOST_NVHE_FLAGS
 	msr	hcr_el2, x0
 	isb
@@ -558,7 +580,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
 	msr	sctlr_el1, x1
 	mov	x2, xzr
 2:
-	msr	elr_el2, lr
 	mov	w0, #BOOT_CPU_MODE_EL2
 	orr	x0, x0, x2
 	eret
@@ -569,6 +590,7 @@ SYM_FUNC_END(init_kernel_el)
 	 * cores are held until we're ready for them to initialise.
 	 */
 SYM_FUNC_START(secondary_holding_pen)
+	mov	x0, xzr
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	mrs	x2, mpidr_el1
 	mov_q	x1, MPIDR_HWID_BITMASK
@@ -586,6 +608,7 @@ SYM_FUNC_END(secondary_holding_pen)
 	 * be used where CPUs are brought online dynamically by the kernel.
 	 */
 SYM_FUNC_START(secondary_entry)
+	mov	x0, xzr
 	bl	init_kernel_el			// w0=cpu_boot_mode
 	b	secondary_startup
 SYM_FUNC_END(secondary_entry)
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 7b7c56e048346e97..2ae7cff1953aaf87 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -99,6 +99,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
 
 	.pushsection ".idmap.text", "awx"
 SYM_CODE_START(cpu_resume)
+	mov	x0, xzr
 	bl	init_kernel_el
 	mov	x19, x0			// preserve boot mode
 #if VA_BITS > 48
-- 
2.39.0


* [PATCH v7 6/6] efi: arm64: enter with MMU and caches enabled
From: Ard Biesheuvel @ 2023-01-11 10:22 UTC
  To: linux-efi
  Cc: linux-arm-kernel, Ard Biesheuvel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

Instead of cleaning the entire loaded kernel image to the PoC and
disabling the MMU and caches before branching to the kernel's
bare-metal entry point, we can leave the MMU and caches enabled, and
rely on EFI's cacheable 1:1 mapping of all of system RAM (which is
mandated by the spec) to populate the initial page tables.

This removes the need for managing coherency in software, which is
tedious and error prone.

Note that we still need to clean the executable region of the image to
the PoU if this is required for I/D coherency, but only if we actually
decided to move the image in memory; otherwise, this will have been
taken care of by the loader.
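
Schematically (an illustrative C-flavoured sketch of the
handle_kernel_image() change below; image_was_moved is a hypothetical
flag standing for the relocation decision):

	/* if the image stays in place, the loader already ensured I/D
	 * coherency; only a relocated copy needs PoU maintenance */
	if (image_was_moved) {
		memcpy((void *)new_base, _text, kernel_size);
		caches_clean_inval_pou(new_base, new_base + kernel_codesize);
	}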

This change affects both the built-in EFI stub and the zboot
decompressor, which now carries the entire EFI stub along with the
decompression code and the compressed image.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/include/asm/efi.h               |  2 +
 arch/arm64/kernel/image-vars.h             |  5 +-
 arch/arm64/mm/cache.S                      |  1 +
 drivers/firmware/efi/libstub/Makefile      |  4 +-
 drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------------
 drivers/firmware/efi/libstub/arm64-stub.c  | 26 +++++---
 drivers/firmware/efi/libstub/arm64.c       | 41 ++++++++++--
 7 files changed, 61 insertions(+), 85 deletions(-)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 31d13a6001df49c4..0f0e729b40efc9ab 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -105,6 +105,8 @@ static inline unsigned long efi_get_kimg_min_align(void)
 #define EFI_ALLOC_ALIGN		SZ_64K
 #define EFI_ALLOC_LIMIT		((1UL << 48) - 1)
 
+extern unsigned long primary_entry_offset(void);
+
 /*
  * On ARM systems, virtually remapped UEFI runtime services are set up in two
  * distinct stages:
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index d0e9bb5c91fccad6..73388b21d07d5524 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -10,7 +10,7 @@
 #error This file should only be included in vmlinux.lds.S
 #endif
 
-PROVIDE(__efistub_primary_entry_offset	= primary_entry - _text);
+PROVIDE(__efistub_primary_entry		= primary_entry);
 
 /*
  * The EFI stub has its own symbol namespace prefixed by __efistub_, to
@@ -21,10 +21,11 @@ PROVIDE(__efistub_primary_entry_offset	= primary_entry - _text);
  * linked at. The routines below are all implemented in assembler in a
  * position independent manner
  */
-PROVIDE(__efistub_dcache_clean_poc	= __pi_dcache_clean_poc);
+PROVIDE(__efistub_caches_clean_inval_pou = __pi_caches_clean_inval_pou);
 
 PROVIDE(__efistub__text			= _text);
 PROVIDE(__efistub__end			= _end);
+PROVIDE(__efistub___inittext_end       	= __inittext_end);
 PROVIDE(__efistub__edata		= _edata);
 PROVIDE(__efistub_screen_info		= screen_info);
 PROVIDE(__efistub__ctype		= _ctype);
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 081058d4e4366edb..503567c864fde05d 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -56,6 +56,7 @@ SYM_FUNC_START(caches_clean_inval_pou)
 	caches_clean_inval_pou_macro
 	ret
 SYM_FUNC_END(caches_clean_inval_pou)
+SYM_FUNC_ALIAS(__pi_caches_clean_inval_pou, caches_clean_inval_pou)
 
 /*
  *	caches_clean_inval_user_pou(start,end)
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index be8b8c6e8b40a17d..80d85a5169fb2c72 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -87,7 +87,7 @@ lib-$(CONFIG_EFI_GENERIC_STUB)	+= efi-stub.o string.o intrinsics.o systable.o \
 				   screen_info.o efi-stub-entry.o
 
 lib-$(CONFIG_ARM)		+= arm32-stub.o
-lib-$(CONFIG_ARM64)		+= arm64.o arm64-stub.o arm64-entry.o smbios.o
+lib-$(CONFIG_ARM64)		+= arm64.o arm64-stub.o smbios.o
 lib-$(CONFIG_X86)		+= x86-stub.o
 lib-$(CONFIG_RISCV)		+= riscv.o riscv-stub.o
 lib-$(CONFIG_LOONGARCH)		+= loongarch.o loongarch-stub.o
@@ -141,7 +141,7 @@ STUBCOPY_RELOC-$(CONFIG_ARM)	:= R_ARM_ABS
 #
 STUBCOPY_FLAGS-$(CONFIG_ARM64)	+= --prefix-alloc-sections=.init \
 				   --prefix-symbols=__efistub_
-STUBCOPY_RELOC-$(CONFIG_ARM64)	:= R_AARCH64_ABS64
+STUBCOPY_RELOC-$(CONFIG_ARM64)	:= R_AARCH64_ABS
 
 # For RISC-V, we don't need anything special other than arm64. Keep all the
 # symbols in .init section and make sure that no absolute symbols references
diff --git a/drivers/firmware/efi/libstub/arm64-entry.S b/drivers/firmware/efi/libstub/arm64-entry.S
deleted file mode 100644
index b5c17e89a4fc0c21..0000000000000000
--- a/drivers/firmware/efi/libstub/arm64-entry.S
+++ /dev/null
@@ -1,67 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * EFI entry point.
- *
- * Copyright (C) 2013, 2014 Red Hat, Inc.
- * Author: Mark Salter <msalter@redhat.com>
- */
-#include <linux/linkage.h>
-#include <asm/assembler.h>
-
-	/*
-	 * The entrypoint of a arm64 bare metal image is at offset #0 of the
-	 * image, so this is a reasonable default for primary_entry_offset.
-	 * Only when the EFI stub is integrated into the core kernel, it is not
-	 * guaranteed that the PE/COFF header has been copied to memory too, so
-	 * in this case, primary_entry_offset should be overridden by the
-	 * linker and point to primary_entry() directly.
-	 */
-	.weak	primary_entry_offset
-
-SYM_CODE_START(efi_enter_kernel)
-	/*
-	 * efi_pe_entry() will have copied the kernel image if necessary and we
-	 * end up here with device tree address in x1 and the kernel entry
-	 * point stored in x0. Save those values in registers which are
-	 * callee preserved.
-	 */
-	ldr	w2, =primary_entry_offset
-	add	x19, x0, x2		// relocated Image entrypoint
-
-	mov	x0, x1			// DTB address
-	mov	x1, xzr
-	mov	x2, xzr
-	mov	x3, xzr
-
-	/*
-	 * Clean the remainder of this routine to the PoC
-	 * so that we can safely disable the MMU and caches.
-	 */
-	adr	x4, 1f
-	dc	civac, x4
-	dsb	sy
-
-	/* Turn off Dcache and MMU */
-	mrs	x4, CurrentEL
-	cmp	x4, #CurrentEL_EL2
-	mrs	x4, sctlr_el1
-	b.ne	0f
-	mrs	x4, sctlr_el2
-0:	bic	x4, x4, #SCTLR_ELx_M
-	bic	x4, x4, #SCTLR_ELx_C
-	b.eq	1f
-	b	2f
-
-	.balign	32
-1:	pre_disable_mmu_workaround
-	msr	sctlr_el2, x4
-	isb
-	br	x19		// jump to kernel entrypoint
-
-2:	pre_disable_mmu_workaround
-	msr	sctlr_el1, x4
-	isb
-	br	x19		// jump to kernel entrypoint
-
-	.org	1b + 32
-SYM_CODE_END(efi_enter_kernel)
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 7327b98d8e3fe961..d4a6b12a87413024 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -58,7 +58,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 				 efi_handle_t image_handle)
 {
 	efi_status_t status;
-	unsigned long kernel_size, kernel_memsize = 0;
+	unsigned long kernel_size, kernel_codesize, kernel_memsize;
 	u32 phys_seed = 0;
 	u64 min_kimg_align = efi_get_kimg_min_align();
 
@@ -93,6 +93,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 			SEGMENT_ALIGN >> 10);
 
 	kernel_size = _edata - _text;
+	kernel_codesize = __inittext_end - _text;
 	kernel_memsize = kernel_size + (_end - _edata);
 	*reserve_size = kernel_memsize;
 
@@ -121,7 +122,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 			 */
 			*image_addr = (u64)_text;
 			*reserve_size = 0;
-			goto clean_image_to_poc;
+			return EFI_SUCCESS;
 		}
 
 		status = efi_allocate_pages_aligned(*reserve_size, reserve_addr,
@@ -137,14 +138,21 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
 
 	*image_addr = *reserve_addr;
 	memcpy((void *)*image_addr, _text, kernel_size);
+	caches_clean_inval_pou(*image_addr, *image_addr + kernel_codesize);
 
-clean_image_to_poc:
+	return EFI_SUCCESS;
+}
+
+asmlinkage void primary_entry(void);
+
+unsigned long primary_entry_offset(void)
+{
 	/*
-	 * Clean the copied Image to the PoC, and ensure it is not shadowed by
-	 * stale icache entries from before relocation.
+	 * When built as part of the kernel, the EFI stub cannot branch to the
+	 * kernel proper via the image header, as the PE/COFF header is
+	 * strictly not part of the in-memory presentation of the image, only
+	 * of the file representation. So instead, we need to jump to the
+	 * actual entrypoint in the .text region of the image.
 	 */
-	dcache_clean_poc(*image_addr, *image_addr + kernel_size);
-	asm("ic ialluis");
-
-	return EFI_SUCCESS;
+	return (char *)primary_entry - _text;
 }
diff --git a/drivers/firmware/efi/libstub/arm64.c b/drivers/firmware/efi/libstub/arm64.c
index ff2d18c42ee74979..f5da4fbccd860ab1 100644
--- a/drivers/firmware/efi/libstub/arm64.c
+++ b/drivers/firmware/efi/libstub/arm64.c
@@ -56,6 +56,12 @@ efi_status_t check_platform_features(void)
 	return EFI_SUCCESS;
 }
 
+#ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
+#define DCTYPE	"civac"
+#else
+#define DCTYPE	"cvau"
+#endif
+
 void efi_cache_sync_image(unsigned long image_base,
 			  unsigned long alloc_size,
 			  unsigned long code_size)
@@ -64,13 +70,38 @@ void efi_cache_sync_image(unsigned long image_base,
 	u64 lsize = 4 << cpuid_feature_extract_unsigned_field(ctr,
 						CTR_EL0_DminLine_SHIFT);
 
-	do {
-		asm("dc civac, %0" :: "r"(image_base));
-		image_base += lsize;
-		alloc_size -= lsize;
-	} while (alloc_size >= lsize);
+	/* only perform the cache maintenance if needed for I/D coherency */
+	if (!(ctr & BIT(CTR_EL0_IDC_SHIFT))) {
+		do {
+			asm("dc " DCTYPE ", %0" :: "r"(image_base));
+			image_base += lsize;
+			code_size -= lsize;
+		} while (code_size >= lsize);
+	}
 
 	asm("ic ialluis");
 	dsb(ish);
 	isb();
 }
+
+unsigned long __weak primary_entry_offset(void)
+{
+	/*
+	 * By default, we can invoke the kernel via the branch instruction in
+	 * the image header, so offset #0. This will be overridden by the EFI
+	 * stub build that is linked into the core kernel, as in that case, the
+	 * image header may not have been loaded into memory, or may be mapped
+	 * with non-executable permissions.
+	 */
+       return 0;
+}
+
+void __noreturn efi_enter_kernel(unsigned long entrypoint,
+				 unsigned long fdt_addr,
+				 unsigned long fdt_size)
+{
+	void (* __noreturn enter_kernel)(u64, u64, u64, u64);
+
+	enter_kernel = (void *)entrypoint + primary_entry_offset();
+	enter_kernel(fdt_addr, 0, 0, 0);
+}
-- 
2.39.0



* Re: [PATCH v7 0/6] arm64: Permit EFI boot with MMU and caches on
  2023-01-11 10:22 ` Ard Biesheuvel
@ 2023-01-11 10:26   ` Ard Biesheuvel
  -1 siblings, 0 replies; 26+ messages in thread
From: Ard Biesheuvel @ 2023-01-11 10:26 UTC (permalink / raw)
  To: linux-efi
  Cc: linux-arm-kernel, Will Deacon, Catalin Marinas, Marc Zyngier,
	Mark Rutland

On Wed, 11 Jan 2023 at 11:23, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> The purpose of this series is to remove any explicit cache maintenance
> for coherency during early boot. Software managed coherency is error
> prone and tedious, and running with the MMU off is generally bad for
> performance, and it becomes unnecessary if we simply retain the
> cacheable 1:1 mapping of all of system RAM provided by EFI, and use it
> to populate the initial ID map page tables. After setting up this
> preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> TCR and SCTLR registers as before, and proceed as usual, avoiding the
> need for any manipulations of memory while the MMU and caches are off.
>

Forgot to mention: as it happens, patches #1 and #2 in this series
also work around a problem that was reported the other day, where the
allyesconfig build fails to link [0].

I bisected this to

commit 5e5ff73c2e5863f93fc5fd78d178cd8f2af12464
Author: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Date:   Mon Oct 17 20:04:50 2022 +0530

    asm-generic/io: Add _RET_IP_ to MMIO trace for more accurate debug info

which seems entirely unrelated, but looks like it may be causing the
number of direct calls (and therefore the number of trampolines) to
increase, causing the ID map to blow up like it does.

[0] https://lore.kernel.org/all/CAMj1kXGAf7ikEU5jLoik0xrOde0xBg0yJkOo5=PtEtNXoUxMXA@mail.gmail.com/



> The only properties of the firmware provided 1:1 map we rely on is that
> it does not require any explicit cache maintenance for coherency, and
> that it covers the entire memory footprint of the image, including the
> BSS and padding at the end - all else is under control of the kernel
> itself, as before.
>
> The final patch updates the EFI stub code so that it no longer disables
> the MMU and caches or cleans the entire image to the PoC. Note that
> some cache maintenance for I/D coherence may still be needed, in the
> zboot case (which decompresses and boots a compressed kernel image) or
> in cases where the image is moved in memory.
>
> Changes since v6:
> - drop the 64k alignment patch, which is not strictly a prerequisite,
>   and will be revisited later if needed
> - add back EFI stub changes now that all dependencies are in mainline
> - panic() the kernel later in the boot if we detected a non-EFI boot
>   occurring with the MMU and caches enabled
>
> Changes since v5:
> - add a special entry point into the boot sequence that is to be used by
>   EFI only, and only permit booting with the MMU enabled when using that
>   boot path;
> - omit the final patch that would need to go via the EFI tree in any
>   case - adding the new entrypoint specific for EFI makes it conflict
>   even more badly, and I'll try to revisit this during the merge window
>   or simply defer the final piece for the next release;
>
> Changes since v4:
> - add patch to align the callers of finalise_el2()
> - also clean HYP text to the PoC when booting at EL2 with the MMU on
> - add a warning and a taint when doing non-EFI boot with the MMU and
>   caches enabled
> - rebase onto zboot changes in efi/next - this means that patches #6 and
>   #7 will not apply onto arm64/for-next so a shared stable branch will
>   be needed if we want to queue this up for v6.2
>
> Changes since v3:
> - drop EFI_LOADER_CODE memory type patch that has been queued in the
>   mean time
> - rebased onto [partial] series that moves efi-entry.S into the libstub/
>   source directory
> - fixed a correctness issue in patch #2
>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
>
> Ard Biesheuvel (6):
>   arm64: head: Move all finalise_el2 calls to after __enable_mmu
>   arm64: kernel: move identity map out of .text mapping
>   arm64: head: record the MMU state at primary entry
>   arm64: head: avoid cache invalidation when entering with the MMU on
>   arm64: head: Clean the ID map and the HYP text to the PoC if needed
>   efi: arm64: enter with MMU and caches enabled
>
>  arch/arm64/include/asm/efi.h               |  2 +
>  arch/arm64/kernel/head.S                   | 89 +++++++++++++++-----
>  arch/arm64/kernel/image-vars.h             |  5 +-
>  arch/arm64/kernel/setup.c                  | 17 +++-
>  arch/arm64/kernel/sleep.S                  |  6 +-
>  arch/arm64/kernel/vmlinux.lds.S            |  2 +-
>  arch/arm64/mm/cache.S                      |  1 +
>  arch/arm64/mm/proc.S                       |  2 -
>  drivers/firmware/efi/libstub/Makefile      |  4 +-
>  drivers/firmware/efi/libstub/arm64-entry.S | 67 ---------------
>  drivers/firmware/efi/libstub/arm64-stub.c  | 26 ++++--
>  drivers/firmware/efi/libstub/arm64.c       | 41 +++++++--
>  12 files changed, 151 insertions(+), 111 deletions(-)
>  delete mode 100644 drivers/firmware/efi/libstub/arm64-entry.S
>
> --
> 2.39.0
>


* Re: [PATCH v7 0/6] arm64: Permit EFI boot with MMU and caches on
  2023-01-11 10:22 ` Ard Biesheuvel
@ 2023-01-24 12:10   ` Catalin Marinas
  -1 siblings, 0 replies; 26+ messages in thread
From: Catalin Marinas @ 2023-01-24 12:10 UTC (permalink / raw)
  To: linux-efi, Ard Biesheuvel
  Cc: Will Deacon, linux-arm-kernel, Marc Zyngier, Mark Rutland

On Wed, 11 Jan 2023 11:22:30 +0100, Ard Biesheuvel wrote:
> The purpose of this series is to remove any explicit cache maintenance
> for coherency during early boot. Software managed coherency is error
> prone and tedious, and running with the MMU off is generally bad for
> performance, and it becomes unnecessary if we simply retain the
> cacheable 1:1 mapping of all of system RAM provided by EFI, and use it
> to populate the initial ID map page tables. After setting up this
> preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> TCR and SCTLR registers as before, and proceed as usual, avoiding the
> need for any manipulations of memory while the MMU and caches are off.
> 
> [...]

Applied to arm64 (for-next/efi-boot-mmu-on), thanks!

[1/6] arm64: head: Move all finalise_el2 calls to after __enable_mmu
      https://git.kernel.org/arm64/c/82e4958800c0
[2/6] arm64: kernel: move identity map out of .text mapping
      https://git.kernel.org/arm64/c/af7249b317e4
[3/6] arm64: head: record the MMU state at primary entry
      https://git.kernel.org/arm64/c/9d7c13e5dde3
[4/6] arm64: head: avoid cache invalidation when entering with the MMU on
      https://git.kernel.org/arm64/c/32b135a7fafe
[5/6] arm64: head: Clean the ID map and the HYP text to the PoC if needed
      https://git.kernel.org/arm64/c/3dcf60bbfd28
[6/6] efi: arm64: enter with MMU and caches enabled
      https://git.kernel.org/arm64/c/617861703830

-- 
Catalin



* Re: [PATCH v7 4/6] arm64: head: avoid cache invalidation when entering with the MMU on
  2023-01-11 10:22   ` Ard Biesheuvel
@ 2023-01-25 16:32     ` Nathan Chancellor
  -1 siblings, 0 replies; 26+ messages in thread
From: Nathan Chancellor @ 2023-01-25 16:32 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland, llvm

Hi Ard,

On Wed, Jan 11, 2023 at 11:22:34AM +0100, Ard Biesheuvel wrote:
> If we enter with the MMU on, there is no need for explicit cache
> invalidation for stores to memory, as they will be coherent with the
> caches.
> 
> Let's take advantage of this, and create the ID map with the MMU still
> enabled if that is how we entered, and avoid any cache invalidation
> calls in that case.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/kernel/head.S | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index c3b898efd3b5288d..d75f419206451d07 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -89,9 +89,9 @@
>  SYM_CODE_START(primary_entry)
>  	bl	record_mmu_state
>  	bl	preserve_boot_args
> +	bl	create_idmap
>  	bl	init_kernel_el			// w0=cpu_boot_mode
>  	mov	x20, x0
> -	bl	create_idmap
>  
>  	/*
>  	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
> @@ -377,12 +377,13 @@ SYM_FUNC_START_LOCAL(create_idmap)
>  	 * accesses (MMU disabled), invalidate those tables again to
>  	 * remove any speculatively loaded cache lines.
>  	 */
> +	cbnz	x19, 0f				// skip cache invalidation if MMU is on
>  	dmb	sy
>  
>  	adrp	x0, init_idmap_pg_dir
>  	adrp	x1, init_idmap_pg_end
>  	bl	dcache_inval_poc
> -	ret	x28
> +0:	ret	x28
>  SYM_FUNC_END(create_idmap)
>  
>  SYM_FUNC_START_LOCAL(create_kernel_mapping)
> -- 
> 2.39.0
> 

Our CI started reporting a boot failure in QEMU with defconfig +
CONFIG_CPU_BIG_ENDIAN=y after this patch landed as commit 32b135a7fafe
("arm64: head: avoid cache invalidation when entering with the MMU on")
in the arm64 tree (and now in next-20230125).

https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/4001750912/jobs/6868612292

$ timeout --foreground 3m qemu-system-aarch64 \
-cpu max,pauth-impdef=true \
-machine virt,gic-version=max,virtualization=true \
-kernel Image.gz \
-append "console=ttyAMA0 earlycon" \
-display none \
-initrd rootfs.cpio \
-m 512m \
-nodefaults \
-no-reboot \
-serial mon:stdio
qemu-system-aarch64: terminating on signal 15 from pid 389 (timeout)

defconfig is fine at the same change.

There is no output, which makes sense since this is pretty early in
boot. We are not booting via EFI, in case that matters. This does not
appear to be a toolchain problem, as I can reproduce it with the
kernel.org GCC toolchains.

If there is any more information I can provide or patches I can test, I
am more than happy to do so.

Cheers,
Nathan

# bad: [2e84eedb182e43a9113c2c83cc3373c2ae99ce19] Merge branch 'for-next/core' into for-kernelci
# good: [2241ab53cbb5cdb08a6b2d4688feb13971058f65] Linux 6.2-rc5
git bisect start '2e84eedb182e43a9113c2c83cc3373c2ae99ce19' 'v6.2-rc5'
# good: [3eb1b41fba97a1586e3ecca8c10547071f541567] kselftest/arm64: Add coverage of SME 2 and 2.1 hwcaps
git bisect good 3eb1b41fba97a1586e3ecca8c10547071f541567
# good: [daac835347a52d9d141be281e4657cc08a360e97] kselftest/arm64: Correct buffer size for SME ZA storage
git bisect good daac835347a52d9d141be281e4657cc08a360e97
# good: [baaf553d3bc330697c68a00f96cf11f4edfeac7e] arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
git bisect good baaf553d3bc330697c68a00f96cf11f4edfeac7e
# good: [4f2c9bf16a4bc209a674e7b76d8e829b917c7f84] arm64: Add compat hwcap SSBS
git bisect good 4f2c9bf16a4bc209a674e7b76d8e829b917c7f84
# good: [1abf363d085cf6133ef44900334ddd0f61dc3276] KVM: arm64: Use symbolic definition for ISR_EL1.A
git bisect good 1abf363d085cf6133ef44900334ddd0f61dc3276
# bad: [61786170383093908e9f5f8fd8c5c3ff0c3bbe03] efi: arm64: enter with MMU and caches enabled
git bisect bad 61786170383093908e9f5f8fd8c5c3ff0c3bbe03
# good: [9d7c13e5dde31270eb48a34204a2e06b1a719546] arm64: head: record the MMU state at primary entry
git bisect good 9d7c13e5dde31270eb48a34204a2e06b1a719546
# bad: [3dcf60bbfd284e5ebfa40c56172222425d10abf0] arm64: head: Clean the ID map and the HYP text to the PoC if needed
git bisect bad 3dcf60bbfd284e5ebfa40c56172222425d10abf0
# bad: [32b135a7fafebe7843abe5425159fa081ae56b7c] arm64: head: avoid cache invalidation when entering with the MMU on
git bisect bad 32b135a7fafebe7843abe5425159fa081ae56b7c
# first bad commit: [32b135a7fafebe7843abe5425159fa081ae56b7c] arm64: head: avoid cache invalidation when entering with the MMU on


* Re: [PATCH v7 4/6] arm64: head: avoid cache invalidation when entering with the MMU on
  2023-01-25 16:32     ` Nathan Chancellor
@ 2023-01-25 16:42       ` Ard Biesheuvel
  -1 siblings, 0 replies; 26+ messages in thread
From: Ard Biesheuvel @ 2023-01-25 16:42 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: linux-efi, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland, llvm

On Wed, 25 Jan 2023 at 17:32, Nathan Chancellor <nathan@kernel.org> wrote:
>
> Hi Ard,
>
> On Wed, Jan 11, 2023 at 11:22:34AM +0100, Ard Biesheuvel wrote:
> > If we enter with the MMU on, there is no need for explicit cache
> > invalidation for stores to memory, as they will be coherent with the
> > caches.
> >
> > Let's take advantage of this, and create the ID map with the MMU still
> > enabled if that is how we entered, and avoid any cache invalidation
> > calls in that case.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/arm64/kernel/head.S | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> > index c3b898efd3b5288d..d75f419206451d07 100644
> > --- a/arch/arm64/kernel/head.S
> > +++ b/arch/arm64/kernel/head.S
> > @@ -89,9 +89,9 @@
> >  SYM_CODE_START(primary_entry)
> >       bl      record_mmu_state
> >       bl      preserve_boot_args
> > +     bl      create_idmap
> >       bl      init_kernel_el                  // w0=cpu_boot_mode
> >       mov     x20, x0
> > -     bl      create_idmap
> >
> >       /*
> >        * The following calls CPU setup code, see arch/arm64/mm/proc.S for
> > @@ -377,12 +377,13 @@ SYM_FUNC_START_LOCAL(create_idmap)
> >        * accesses (MMU disabled), invalidate those tables again to
> >        * remove any speculatively loaded cache lines.
> >        */
> > +     cbnz    x19, 0f                         // skip cache invalidation if MMU is on
> >       dmb     sy
> >
> >       adrp    x0, init_idmap_pg_dir
> >       adrp    x1, init_idmap_pg_end
> >       bl      dcache_inval_poc
> > -     ret     x28
> > +0:   ret     x28
> >  SYM_FUNC_END(create_idmap)
> >
> >  SYM_FUNC_START_LOCAL(create_kernel_mapping)
> > --
> > 2.39.0
> >
>
> Our CI started reporting a boot failure in QEMU with defconfig +
> CONFIG_CPU_BIG_ENDIAN=y after this patch as commit 32b135a7fafe ("arm64:
> head: avoid cache invalidation when entering with the MMU on") in the
> arm64 tree (and now next-20230125).
>
> https://github.com/ClangBuiltLinux/continuous-integration2/actions/runs/4001750912/jobs/6868612292
>
> $ timeout --foreground 3m qemu-system-aarch64 \
> -cpu max,pauth-impdef=true \
> -machine virt,gic-version=max,virtualization=true \
> -kernel Image.gz \
> -append "console=ttyAMA0 earlycon" \
> -display none \
> -initrd rootfs.cpio
> -m 512m \
> -nodefaults \
> -no-reboot \
> -serial mon:stdio
> qemu-system-aarch64: terminating on signal 15 from pid 389 (timeout)
>
> defconfig is fine at the same change.
>
> There is no output, which makes sense since this is pretty early in
> boot. We are not booting via EFI, in case that matters. This does not
> appear to be a toolchain problem, as I can reproduce it with the
> kernel.org GCC toolchains.
>

Thanks for the report.

With this patch, the ID map is populated before the switch to BE mode,
and so the descriptors are written in the wrong byte order.

This should be easy to fix - I'll have a patch out shortly.
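
(To make the failure mode concrete: SCTLR_ELx.EE selects the endianness
of stage 1 translation table walks as well as of data accesses, so a
descriptor stored while the CPU is still running little-endian is seen
byte-reversed by the walker once EE is set. A minimal sketch, using a
made-up block descriptor value rather than anything from the kernel:)

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
    	uint64_t desc = 0x0000000040000701ULL;	/* example block descriptor */

    	/* the same bytes, as consumed by a big-endian table walk */
    	uint64_t be_view = __builtin_bswap64(desc);

    	printf("stored (LE): %#018llx\n", (unsigned long long)desc);
    	printf("walked (BE): %#018llx\n", (unsigned long long)be_view);
    	return 0;
    }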


* Re: [PATCH v7 2/6] arm64: kernel: move identity map out of .text mapping
  2023-01-11 10:22   ` Ard Biesheuvel
@ 2023-02-03 18:08     ` Nathan Chancellor
  -1 siblings, 0 replies; 26+ messages in thread
From: Nathan Chancellor @ 2023-02-03 18:08 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

Hi Ard,

On Wed, Jan 11, 2023 at 11:22:32AM +0100, Ard Biesheuvel wrote:
> Reorganize the ID map slightly so that only code that is executed with
> the MMU off or via the 1:1 mapping remains. This allows us to move the
> identity map out of the .text segment, as it will no longer need
> executable permissions via the kernel mapping.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/kernel/head.S        | 28 +++++++++++---------
>  arch/arm64/kernel/vmlinux.lds.S |  2 +-
>  arch/arm64/mm/proc.S            |  2 --
>  3 files changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index c4e12d466a5f35f0..bec97aad092c2b43 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -543,19 +543,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
>  	eret
>  SYM_FUNC_END(init_kernel_el)
>  
> -/*
> - * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
> - * in w0. See arch/arm64/include/asm/virt.h for more info.
> - */
> -SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
> -	adr_l	x1, __boot_cpu_mode
> -	cmp	w0, #BOOT_CPU_MODE_EL2
> -	b.ne	1f
> -	add	x1, x1, #4
> -1:	str	w0, [x1]			// Save CPU boot mode
> -	ret
> -SYM_FUNC_END(set_cpu_boot_mode_flag)
> -
>  	/*
>  	 * This provides a "holding pen" for platforms to hold all secondary
>  	 * cores are held until we're ready for them to initialise.
> @@ -599,6 +586,7 @@ SYM_FUNC_START_LOCAL(secondary_startup)
>  	br	x8
>  SYM_FUNC_END(secondary_startup)
>  
> +	.text
>  SYM_FUNC_START_LOCAL(__secondary_switched)
>  	mov	x0, x20
>  	bl	set_cpu_boot_mode_flag
> @@ -631,6 +619,19 @@ SYM_FUNC_START_LOCAL(__secondary_too_slow)
>  	b	__secondary_too_slow
>  SYM_FUNC_END(__secondary_too_slow)
>  
> +/*
> + * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
> + * in w0. See arch/arm64/include/asm/virt.h for more info.
> + */
> +SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
> +	adr_l	x1, __boot_cpu_mode
> +	cmp	w0, #BOOT_CPU_MODE_EL2
> +	b.ne	1f
> +	add	x1, x1, #4
> +1:	str	w0, [x1]			// Save CPU boot mode
> +	ret
> +SYM_FUNC_END(set_cpu_boot_mode_flag)
> +
>  /*
>   * The booting CPU updates the failed status @__early_cpu_boot_status,
>   * with MMU turned off.
> @@ -662,6 +663,7 @@ SYM_FUNC_END(__secondary_too_slow)
>   * Checks if the selected granule size is supported by the CPU.
>   * If it isn't, park the CPU
>   */
> +	.section ".idmap.text","awx"
>  SYM_FUNC_START(__enable_mmu)
>  	mrs	x3, ID_AA64MMFR0_EL1
>  	ubfx	x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 4c13dafc98b8400f..407415a5163ab62f 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -179,7 +179,6 @@ SECTIONS
>  			LOCK_TEXT
>  			KPROBES_TEXT
>  			HYPERVISOR_TEXT
> -			IDMAP_TEXT
>  			*(.gnu.warning)
>  		. = ALIGN(16);
>  		*(.got)			/* Global offset table		*/
> @@ -206,6 +205,7 @@ SECTIONS
>  		TRAMP_TEXT
>  		HIBERNATE_TEXT
>  		KEXEC_TEXT
> +		IDMAP_TEXT
>  		. = ALIGN(PAGE_SIZE);
>  	}
>  
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 066fa60b93d24827..91410f48809000a0 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -110,7 +110,6 @@ SYM_FUNC_END(cpu_do_suspend)
>   *
>   * x0: Address of context pointer
>   */
> -	.pushsection ".idmap.text", "awx"
>  SYM_FUNC_START(cpu_do_resume)
>  	ldp	x2, x3, [x0]
>  	ldp	x4, x5, [x0, #16]
> @@ -166,7 +165,6 @@ alternative_else_nop_endif
>  	isb
>  	ret
>  SYM_FUNC_END(cpu_do_resume)
> -	.popsection
>  #endif
>  
>  	.pushsection ".idmap.text", "awx"
> -- 
> 2.39.0
> 

Sorry you have to keep hearing from me, I am starting to feel like a
nuisance :) Apologies if this has already been reported; I did a search
of lore and did not find anything.

I have noticed the following message on my arm64 machines recently and I
had some time to bisect it down to this change in -next (log below):

  [    0.029481] kprobes: Failed to populate blacklist (error -22), kprobes not restricted, be careful using them!

I can trivially reproduce it with defconfig + CONFIG_KPROBES=y in QEMU.
If there is any other information I can provide or patches I can test, I
am more than happy to do so.

Cheers,
Nathan

# bad: [ea4dabbb4ad7eb52632a2ca0b8f89f0ea7c55dcf] Add linux-next specific files for 20230202
# good: [9f266ccaa2f5228bfe67ad58a94ca4e0109b954a] Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
git bisect start 'ea4dabbb4ad7eb52632a2ca0b8f89f0ea7c55dcf' '9f266ccaa2f5228bfe67ad58a94ca4e0109b954a'
# bad: [1212e8a2ede5d43ffa423f6bb15dc128bc442c17] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect bad 1212e8a2ede5d43ffa423f6bb15dc128bc442c17
# bad: [61a09f7913443728509eaba2b10566372e39f4fc] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping.git
git bisect bad 61a09f7913443728509eaba2b10566372e39f4fc
# good: [642eb331c5eddadf2a9b0ea171c0605788627dae] soc: document merges
git bisect good 642eb331c5eddadf2a9b0ea171c0605788627dae
# bad: [8c08bee280c147d47db13297c5f4d11fdc915fec] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap.git
git bisect bad 8c08bee280c147d47db13297c5f4d11fdc915fec
# good: [ab0a5c9e8c2649f43e6d49b65785189e208df506] Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
git bisect good ab0a5c9e8c2649f43e6d49b65785189e208df506
# bad: [706652c32b3d2a55b52f1e33ee1d2f98a0eafed0] Merge branch 'asahi-soc/for-next' of https://github.com/AsahiLinux/linux.git
git bisect bad 706652c32b3d2a55b52f1e33ee1d2f98a0eafed0
# bad: [13c0a41d2cb2265661c7178cbe986592c906ef0b] Merge branches 'for-next/sysreg', 'for-next/sme', 'for-next/kselftest', 'for-next/misc', 'for-next/sme2', 'for-next/tpidr2', 'for-next/scs', 'for-next/compat-hwcap', 'for-next/ftrace', 'for-next/efi-boot-mmu-on', 'for-next/ptrauth' and 'for-next/pseudo-nmi' into for-next/core
git bisect bad 13c0a41d2cb2265661c7178cbe986592c906ef0b
# good: [b2ab432bcf65e6fa3ec3fef6dd08796404b009d0] kselftest/arm64: Remove redundant _start labels from zt-test
git bisect good b2ab432bcf65e6fa3ec3fef6dd08796404b009d0
# good: [a7db82f18cd3d85ea8ef70fca5946b441187ed6d] kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA
git bisect good a7db82f18cd3d85ea8ef70fca5946b441187ed6d
# good: [dc4824faa265db1bc93449e8ec386a0245404fa6] arm64: avoid executing padding bytes during kexec / hibernation
git bisect good dc4824faa265db1bc93449e8ec386a0245404fa6
# good: [004fc58f917cfea5d7190139e3ed1b7a13e39c25] arm64/mm: Intercept pfn changes in set_pte_at()
git bisect good 004fc58f917cfea5d7190139e3ed1b7a13e39c25
# good: [4f2c9bf16a4bc209a674e7b76d8e829b917c7f84] arm64: Add compat hwcap SSBS
git bisect good 4f2c9bf16a4bc209a674e7b76d8e829b917c7f84
# bad: [2ced0f30a426c7301350681f838344d5aea711e3] arm64: head: Switch endianness before populating the ID map
git bisect bad 2ced0f30a426c7301350681f838344d5aea711e3
# bad: [9d7c13e5dde31270eb48a34204a2e06b1a719546] arm64: head: record the MMU state at primary entry
git bisect bad 9d7c13e5dde31270eb48a34204a2e06b1a719546
# bad: [af7249b317e4d0b3d5a0ebbb7ee7a0f336ca7bca] arm64: kernel: move identity map out of .text mapping
git bisect bad af7249b317e4d0b3d5a0ebbb7ee7a0f336ca7bca
# good: [82e4958800c01daa7662362ee9543065bd14c852] arm64: head: Move all finalise_el2 calls to after __enable_mmu
git bisect good 82e4958800c01daa7662362ee9543065bd14c852
# first bad commit: [af7249b317e4d0b3d5a0ebbb7ee7a0f336ca7bca] arm64: kernel: move identity map out of .text mapping

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 2/6] arm64: kernel: move identity map out of .text mapping
@ 2023-02-03 18:08     ` Nathan Chancellor
  0 siblings, 0 replies; 26+ messages in thread
From: Nathan Chancellor @ 2023-02-03 18:08 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-efi, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

Hi Ard,

On Wed, Jan 11, 2023 at 11:22:32AM +0100, Ard Biesheuvel wrote:
> Reorganize the ID map slightly so that only code that is executed with
> the MMU off or via the 1:1 mapping remains. This allows us to move the
> identity map out of the .text segment, as it will no longer need
> executable permissions via the kernel mapping.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/kernel/head.S        | 28 +++++++++++---------
>  arch/arm64/kernel/vmlinux.lds.S |  2 +-
>  arch/arm64/mm/proc.S            |  2 --
>  3 files changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index c4e12d466a5f35f0..bec97aad092c2b43 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -543,19 +543,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
>  	eret
>  SYM_FUNC_END(init_kernel_el)
>  
> -/*
> - * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
> - * in w0. See arch/arm64/include/asm/virt.h for more info.
> - */
> -SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
> -	adr_l	x1, __boot_cpu_mode
> -	cmp	w0, #BOOT_CPU_MODE_EL2
> -	b.ne	1f
> -	add	x1, x1, #4
> -1:	str	w0, [x1]			// Save CPU boot mode
> -	ret
> -SYM_FUNC_END(set_cpu_boot_mode_flag)
> -
>  	/*
>  	 * This provides a "holding pen" for platforms to hold all secondary
>  	 * cores are held until we're ready for them to initialise.
> @@ -599,6 +586,7 @@ SYM_FUNC_START_LOCAL(secondary_startup)
>  	br	x8
>  SYM_FUNC_END(secondary_startup)
>  
> +	.text
>  SYM_FUNC_START_LOCAL(__secondary_switched)
>  	mov	x0, x20
>  	bl	set_cpu_boot_mode_flag
> @@ -631,6 +619,19 @@ SYM_FUNC_START_LOCAL(__secondary_too_slow)
>  	b	__secondary_too_slow
>  SYM_FUNC_END(__secondary_too_slow)
>  
> +/*
> + * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
> + * in w0. See arch/arm64/include/asm/virt.h for more info.
> + */
> +SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
> +	adr_l	x1, __boot_cpu_mode
> +	cmp	w0, #BOOT_CPU_MODE_EL2
> +	b.ne	1f
> +	add	x1, x1, #4
> +1:	str	w0, [x1]			// Save CPU boot mode
> +	ret
> +SYM_FUNC_END(set_cpu_boot_mode_flag)
> +
>  /*
>   * The booting CPU updates the failed status @__early_cpu_boot_status,
>   * with MMU turned off.
> @@ -662,6 +663,7 @@ SYM_FUNC_END(__secondary_too_slow)
>   * Checks if the selected granule size is supported by the CPU.
>   * If it isn't, park the CPU
>   */
> +	.section ".idmap.text","awx"
>  SYM_FUNC_START(__enable_mmu)
>  	mrs	x3, ID_AA64MMFR0_EL1
>  	ubfx	x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 4c13dafc98b8400f..407415a5163ab62f 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -179,7 +179,6 @@ SECTIONS
>  			LOCK_TEXT
>  			KPROBES_TEXT
>  			HYPERVISOR_TEXT
> -			IDMAP_TEXT
>  			*(.gnu.warning)
>  		. = ALIGN(16);
>  		*(.got)			/* Global offset table		*/
> @@ -206,6 +205,7 @@ SECTIONS
>  		TRAMP_TEXT
>  		HIBERNATE_TEXT
>  		KEXEC_TEXT
> +		IDMAP_TEXT
>  		. = ALIGN(PAGE_SIZE);
>  	}
>  
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 066fa60b93d24827..91410f48809000a0 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -110,7 +110,6 @@ SYM_FUNC_END(cpu_do_suspend)
>   *
>   * x0: Address of context pointer
>   */
> -	.pushsection ".idmap.text", "awx"
>  SYM_FUNC_START(cpu_do_resume)
>  	ldp	x2, x3, [x0]
>  	ldp	x4, x5, [x0, #16]
> @@ -166,7 +165,6 @@ alternative_else_nop_endif
>  	isb
>  	ret
>  SYM_FUNC_END(cpu_do_resume)
> -	.popsection
>  #endif
>  
>  	.pushsection ".idmap.text", "awx"
> -- 
> 2.39.0
> 

Sorry you have to keep hearing from me, I am starting to feel like a
nuisance :) apologies if this is already been reported, I did a search
of lore and did not find anything.

I have noticed the following message on my arm64 machines recently and I
had some time to bisect it down to this change in -next (log below):

  [    0.029481] kprobes: Failed to populate blacklist (error -22), kprobes not restricted, be careful using them!

I can trivially reproduce it with defconfig + CONFIG_KPROBES=y in QEMU.
If there is any other information I can provide or patches I can test, I
am more than happy to do so.

Cheers,
Nathan

# bad: [ea4dabbb4ad7eb52632a2ca0b8f89f0ea7c55dcf] Add linux-next specific files for 20230202
# good: [9f266ccaa2f5228bfe67ad58a94ca4e0109b954a] Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
git bisect start 'ea4dabbb4ad7eb52632a2ca0b8f89f0ea7c55dcf' '9f266ccaa2f5228bfe67ad58a94ca4e0109b954a'
# bad: [1212e8a2ede5d43ffa423f6bb15dc128bc442c17] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
git bisect bad 1212e8a2ede5d43ffa423f6bb15dc128bc442c17
# bad: [61a09f7913443728509eaba2b10566372e39f4fc] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping.git
git bisect bad 61a09f7913443728509eaba2b10566372e39f4fc
# good: [642eb331c5eddadf2a9b0ea171c0605788627dae] soc: document merges
git bisect good 642eb331c5eddadf2a9b0ea171c0605788627dae
# bad: [8c08bee280c147d47db13297c5f4d11fdc915fec] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap.git
git bisect bad 8c08bee280c147d47db13297c5f4d11fdc915fec
# good: [ab0a5c9e8c2649f43e6d49b65785189e208df506] Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git
git bisect good ab0a5c9e8c2649f43e6d49b65785189e208df506
# bad: [706652c32b3d2a55b52f1e33ee1d2f98a0eafed0] Merge branch 'asahi-soc/for-next' of https://github.com/AsahiLinux/linux.git
git bisect bad 706652c32b3d2a55b52f1e33ee1d2f98a0eafed0
# bad: [13c0a41d2cb2265661c7178cbe986592c906ef0b] Merge branches 'for-next/sysreg', 'for-next/sme', 'for-next/kselftest', 'for-next/misc', 'for-next/sme2', 'for-next/tpidr2', 'for-next/scs', 'for-next/compat-hwcap', 'for-next/ftrace', 'for-next/efi-boot-mmu-on', 'for-next/ptrauth' and 'for-next/pseudo-nmi' into for-next/core
git bisect bad 13c0a41d2cb2265661c7178cbe986592c906ef0b
# good: [b2ab432bcf65e6fa3ec3fef6dd08796404b009d0] kselftest/arm64: Remove redundant _start labels from zt-test
git bisect good b2ab432bcf65e6fa3ec3fef6dd08796404b009d0
# good: [a7db82f18cd3d85ea8ef70fca5946b441187ed6d] kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA
git bisect good a7db82f18cd3d85ea8ef70fca5946b441187ed6d
# good: [dc4824faa265db1bc93449e8ec386a0245404fa6] arm64: avoid executing padding bytes during kexec / hibernation
git bisect good dc4824faa265db1bc93449e8ec386a0245404fa6
# good: [004fc58f917cfea5d7190139e3ed1b7a13e39c25] arm64/mm: Intercept pfn changes in set_pte_at()
git bisect good 004fc58f917cfea5d7190139e3ed1b7a13e39c25
# good: [4f2c9bf16a4bc209a674e7b76d8e829b917c7f84] arm64: Add compat hwcap SSBS
git bisect good 4f2c9bf16a4bc209a674e7b76d8e829b917c7f84
# bad: [2ced0f30a426c7301350681f838344d5aea711e3] arm64: head: Switch endianness before populating the ID map
git bisect bad 2ced0f30a426c7301350681f838344d5aea711e3
# bad: [9d7c13e5dde31270eb48a34204a2e06b1a719546] arm64: head: record the MMU state at primary entry
git bisect bad 9d7c13e5dde31270eb48a34204a2e06b1a719546
# bad: [af7249b317e4d0b3d5a0ebbb7ee7a0f336ca7bca] arm64: kernel: move identity map out of .text mapping
git bisect bad af7249b317e4d0b3d5a0ebbb7ee7a0f336ca7bca
# good: [82e4958800c01daa7662362ee9543065bd14c852] arm64: head: Move all finalise_el2 calls to after __enable_mmu
git bisect good 82e4958800c01daa7662362ee9543065bd14c852
# first bad commit: [af7249b317e4d0b3d5a0ebbb7ee7a0f336ca7bca] arm64: kernel: move identity map out of .text mapping
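
(Aside for readers unfamiliar with the message above: error -22 is -EINVAL.
It bubbles up from the loop that populates the kprobe blacklist, which walks
each area symbol by symbol and gives up as soon as one address fails the
kernel text lookups. A simplified sketch of that path, modelled on
kernel/kprobes.c and approximate rather than verbatim:)

/*
 * Simplified sketch of the blacklist population path
 * (modelled on kernel/kprobes.c; approximate, not verbatim).
 */
static int kprobe_add_ksym_blacklist(unsigned long entry)
{
	unsigned long offset = 0, size = 0;

	/*
	 * An address that no longer lives in kernel .text fails both
	 * lookups, so the whole area is rejected with -EINVAL (-22).
	 */
	if (!kernel_text_address(entry) ||
	    !kallsyms_lookup_size_offset(entry, &size, &offset))
		return -EINVAL;

	/* ... record the symbol's [start, start + size) range ... */
	return (int)size;
}

static int kprobe_add_area_blacklist(unsigned long start, unsigned long end)
{
	unsigned long entry;
	int ret = 0;

	for (entry = start; entry < end; entry += ret) {
		ret = kprobe_add_ksym_blacklist(entry);
		if (ret < 0)		/* surfaces as the boot-time warning */
			return ret;
		if (ret == 0)		/* alias symbol: still make progress */
			ret = 1;
	}
	return 0;
}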

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v7 2/6] arm64: kernel: move identity map out of .text mapping
  2023-02-03 18:08     ` Nathan Chancellor
@ 2023-02-03 22:41       ` Ard Biesheuvel
  -1 siblings, 0 replies; 26+ messages in thread
From: Ard Biesheuvel @ 2023-02-03 22:41 UTC (permalink / raw)
  To: Nathan Chancellor
  Cc: linux-efi, linux-arm-kernel, Will Deacon, Catalin Marinas,
	Marc Zyngier, Mark Rutland

On Fri, 3 Feb 2023 at 19:08, Nathan Chancellor <nathan@kernel.org> wrote:
>
> Hi Ard,
>
> On Wed, Jan 11, 2023 at 11:22:32AM +0100, Ard Biesheuvel wrote:
> > Reorganize the ID map slightly so that only code that is executed with
> > the MMU off or via the 1:1 mapping remains. This allows us to move the
> > identity map out of the .text segment, as it will no longer need
> > executable permissions via the kernel mapping.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/arm64/kernel/head.S        | 28 +++++++++++---------
> >  arch/arm64/kernel/vmlinux.lds.S |  2 +-
> >  arch/arm64/mm/proc.S            |  2 --
> >  3 files changed, 16 insertions(+), 16 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> > index c4e12d466a5f35f0..bec97aad092c2b43 100644
> > --- a/arch/arm64/kernel/head.S
> > +++ b/arch/arm64/kernel/head.S
> > @@ -543,19 +543,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
> >       eret
> >  SYM_FUNC_END(init_kernel_el)
> >
> > -/*
> > - * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
> > - * in w0. See arch/arm64/include/asm/virt.h for more info.
> > - */
> > -SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
> > -     adr_l   x1, __boot_cpu_mode
> > -     cmp     w0, #BOOT_CPU_MODE_EL2
> > -     b.ne    1f
> > -     add     x1, x1, #4
> > -1:   str     w0, [x1]                        // Save CPU boot mode
> > -     ret
> > -SYM_FUNC_END(set_cpu_boot_mode_flag)
> > -
> >       /*
> >        * This provides a "holding pen" for platforms to hold all secondary
> >        * cores are held until we're ready for them to initialise.
> > @@ -599,6 +586,7 @@ SYM_FUNC_START_LOCAL(secondary_startup)
> >       br      x8
> >  SYM_FUNC_END(secondary_startup)
> >
> > +     .text
> >  SYM_FUNC_START_LOCAL(__secondary_switched)
> >       mov     x0, x20
> >       bl      set_cpu_boot_mode_flag
> > @@ -631,6 +619,19 @@ SYM_FUNC_START_LOCAL(__secondary_too_slow)
> >       b       __secondary_too_slow
> >  SYM_FUNC_END(__secondary_too_slow)
> >
> > +/*
> > + * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
> > + * in w0. See arch/arm64/include/asm/virt.h for more info.
> > + */
> > +SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
> > +     adr_l   x1, __boot_cpu_mode
> > +     cmp     w0, #BOOT_CPU_MODE_EL2
> > +     b.ne    1f
> > +     add     x1, x1, #4
> > +1:   str     w0, [x1]                        // Save CPU boot mode
> > +     ret
> > +SYM_FUNC_END(set_cpu_boot_mode_flag)
> > +
> >  /*
> >   * The booting CPU updates the failed status @__early_cpu_boot_status,
> >   * with MMU turned off.
> > @@ -662,6 +663,7 @@ SYM_FUNC_END(__secondary_too_slow)
> >   * Checks if the selected granule size is supported by the CPU.
> >   * If it isn't, park the CPU
> >   */
> > +     .section ".idmap.text","awx"
> >  SYM_FUNC_START(__enable_mmu)
> >       mrs     x3, ID_AA64MMFR0_EL1
> >       ubfx    x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4
> > diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> > index 4c13dafc98b8400f..407415a5163ab62f 100644
> > --- a/arch/arm64/kernel/vmlinux.lds.S
> > +++ b/arch/arm64/kernel/vmlinux.lds.S
> > @@ -179,7 +179,6 @@ SECTIONS
> >                       LOCK_TEXT
> >                       KPROBES_TEXT
> >                       HYPERVISOR_TEXT
> > -                     IDMAP_TEXT
> >                       *(.gnu.warning)
> >               . = ALIGN(16);
> >               *(.got)                 /* Global offset table          */
> > @@ -206,6 +205,7 @@ SECTIONS
> >               TRAMP_TEXT
> >               HIBERNATE_TEXT
> >               KEXEC_TEXT
> > +             IDMAP_TEXT
> >               . = ALIGN(PAGE_SIZE);
> >       }
> >
> > diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> > index 066fa60b93d24827..91410f48809000a0 100644
> > --- a/arch/arm64/mm/proc.S
> > +++ b/arch/arm64/mm/proc.S
> > @@ -110,7 +110,6 @@ SYM_FUNC_END(cpu_do_suspend)
> >   *
> >   * x0: Address of context pointer
> >   */
> > -     .pushsection ".idmap.text", "awx"
> >  SYM_FUNC_START(cpu_do_resume)
> >       ldp     x2, x3, [x0]
> >       ldp     x4, x5, [x0, #16]
> > @@ -166,7 +165,6 @@ alternative_else_nop_endif
> >       isb
> >       ret
> >  SYM_FUNC_END(cpu_do_resume)
> > -     .popsection
> >  #endif
> >
> >       .pushsection ".idmap.text", "awx"
> > --
> > 2.39.0
> >
>
> Sorry you have to keep hearing from me; I am starting to feel like a
> nuisance :) Apologies if this has already been reported; I did a search
> of lore and did not find anything.
>

Don't be silly. If my patch broke something, it is my responsibility
to fix it, and I'd rather hear about it from you (with a high quality
report, as usual) than from someone who dumps a log on me but cannot
be bothered to follow up, or doesn't have the chops to help narrow it
down.

> I have noticed the following message on my arm64 machines recently and I
> had some time to bisect it down to this change in -next (log below):
>
>   [    0.029481] kprobes: Failed to populate blacklist (error -22), kprobes not restricted, be careful using them!
>
> I can trivially reproduce it with defconfig + CONFIG_KPROBES=y in QEMU.
> If there is any other information I can provide or patches I can test, I
> am more than happy to do so.
>

I had noticed this diagnostic a couple of times as well, but tbh, I
did not realize that it was my own patch that caused it.

I think the below should fix it: we no longer have to blacklist the ID
map for kprobes now that it is no longer part of the .text section to
begin with, and kprobes will disregard it by default.

diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index f35d059a9a366fa6..70b91a8c6bb3f358 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -387,10 +387,6 @@ int __init arch_populate_kprobe_blacklist(void)
                                        (unsigned long)__irqentry_text_end);
        if (ret)
                return ret;
-       ret = kprobe_add_area_blacklist((unsigned long)__idmap_text_start,
-                                       (unsigned long)__idmap_text_end);
-       if (ret)
-               return ret;
        ret = kprobe_add_area_blacklist((unsigned long)__hyp_text_start,
                                        (unsigned long)__hyp_text_end);
        if (ret || is_kernel_in_hyp_mode())
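
(For completeness: this is safe because the kprobes core already refuses
to arm a probe at any address outside core kernel or module text, so once
.idmap.text is linked outside the .text segment it is unprobeable with or
without an explicit blacklist entry. A rough sketch of that
registration-time check, modelled on check_kprobe_address_safe() in
kernel/kprobes.c and heavily simplified:)

/*
 * Rough sketch of the registration-time address check (modelled on
 * check_kprobe_address_safe() in kernel/kprobes.c; simplified and
 * approximate, not the verbatim source).
 */
static int check_kprobe_address_safe(struct kprobe *p)
{
	unsigned long addr = (unsigned long)p->addr;

	/* Not core kernel text and not module text: reject outright */
	if (!(core_kernel_text(addr) || is_module_text_address(addr)))
		return -EINVAL;

	/* Explicitly blacklisted areas are rejected as well */
	if (within_kprobe_blacklist(addr))
		return -EINVAL;

	return 0;
}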

Feel free to turn this into a patch and send it out. (The day has
already ended here :-))

^ permalink raw reply related	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2023-02-03 22:43 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-11 10:22 [PATCH v7 0/6] arm64: Permit EFI boot with MMU and caches on Ard Biesheuvel
2023-01-11 10:22 ` Ard Biesheuvel
2023-01-11 10:22 ` [PATCH v7 1/6] arm64: head: Move all finalise_el2 calls to after __enable_mmu Ard Biesheuvel
2023-01-11 10:22   ` Ard Biesheuvel
2023-01-11 10:22 ` [PATCH v7 2/6] arm64: kernel: move identity map out of .text mapping Ard Biesheuvel
2023-01-11 10:22   ` Ard Biesheuvel
2023-02-03 18:08   ` Nathan Chancellor
2023-02-03 18:08     ` Nathan Chancellor
2023-02-03 22:41     ` Ard Biesheuvel
2023-02-03 22:41       ` Ard Biesheuvel
2023-01-11 10:22 ` [PATCH v7 3/6] arm64: head: record the MMU state at primary entry Ard Biesheuvel
2023-01-11 10:22   ` Ard Biesheuvel
2023-01-11 10:22 ` [PATCH v7 4/6] arm64: head: avoid cache invalidation when entering with the MMU on Ard Biesheuvel
2023-01-11 10:22   ` Ard Biesheuvel
2023-01-25 16:32   ` Nathan Chancellor
2023-01-25 16:32     ` Nathan Chancellor
2023-01-25 16:42     ` Ard Biesheuvel
2023-01-25 16:42       ` Ard Biesheuvel
2023-01-11 10:22 ` [PATCH v7 5/6] arm64: head: Clean the ID map and the HYP text to the PoC if needed Ard Biesheuvel
2023-01-11 10:22   ` Ard Biesheuvel
2023-01-11 10:22 ` [PATCH v7 6/6] efi: arm64: enter with MMU and caches enabled Ard Biesheuvel
2023-01-11 10:22   ` Ard Biesheuvel
2023-01-11 10:26 ` [PATCH v7 0/6] arm64: Permit EFI boot with MMU and caches on Ard Biesheuvel
2023-01-11 10:26   ` Ard Biesheuvel
2023-01-24 12:10 ` Catalin Marinas
2023-01-24 12:10   ` Catalin Marinas
