* [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
From: Ard Biesheuvel @ 2022-11-08 18:21 UTC
To: linux-arm-kernel
Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
Catalin Marinas, Marc Zyngier, Mark Rutland
The purpose of this series is to remove any explicit cache maintenance
for coherency during early boot that becomes unnecessary if we simply
retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
and use it to populate the ID map page tables. After setting up this
preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
TCR and SCTLR registers as before, and proceed as usual, avoiding the
need for any manipulations of memory while the MMU and caches are off.
The only properties of the firmware-provided 1:1 map we rely on are that
it does not require any explicit cache maintenance for coherency, and
that it covers the entire memory footprint of the image, including the
BSS and the padding at the end - all else is under the control of the
kernel itself, as before.
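In rough outline, the primary boot path then looks like this (an
illustrative pseudo-C sketch of the head.S assembly flow across the
patches below, not actual code from the series):

    void primary_entry(void)
    {
            bool mmu_on = record_mmu_state();   /* recorded in x19 */
            preserve_boot_args(mmu_on);         /* skips PoC invalidation if on */
            create_idmap(mmu_on);               /* written via the cacheable 1:1 map */
            if (mmu_on)                         /* make the ID map runnable MMU-off */
                    dcache_clean_poc(__idmap_text_start, __idmap_text_end);
            init_kernel_el(mmu_on);             /* MMU off, drop to EL1 */
            /* MAIR/TCR/SCTLR are programmed and the MMU re-enabled as before */
    }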
Changes since v4:
- add patch to align the callers of finalise_el2()
- also clean HYP text to the PoC when booting at EL2 with the MMU on
- add a warning and a taint when doing non-EFI boot with the MMU and
caches enabled
- rebase onto zboot changes in efi/next - this means that patches #6 and
#7 will not apply onto arm64/for-next so a shared stable branch will
be needed if we want to queue this up for v6.2
Changes since v3:
- drop the EFI_LOADER_CODE memory type patch that has been queued in the
meantime
- rebased onto [partial] series that moves efi-entry.S into the libstub/
source directory
- fixed a correctness issue in patch #2
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Ard Biesheuvel (7):
arm64: head: Move all finalise_el2 calls to after __enable_mmu
arm64: kernel: move identity map out of .text mapping
arm64: head: record the MMU state at primary entry
arm64: head: avoid cache invalidation when entering with the MMU on
arm64: head: Clean the ID map and the HYP text to the PoC if needed
arm64: lds: reduce effective minimum image alignment to 64k
efi: arm64: enter with MMU and caches enabled
arch/arm64/include/asm/efi.h | 9 +-
arch/arm64/kernel/head.S | 93 +++++++++++++++-----
arch/arm64/kernel/image-vars.h | 5 +-
arch/arm64/kernel/setup.c | 9 +-
arch/arm64/kernel/sleep.S | 6 +-
arch/arm64/kernel/vmlinux.lds.S | 13 ++-
arch/arm64/mm/cache.S | 5 +-
arch/arm64/mm/proc.S | 2 -
drivers/firmware/efi/libstub/Makefile | 4 +-
drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------
drivers/firmware/efi/libstub/arm64-stub.c | 26 ++++--
drivers/firmware/efi/libstub/arm64.c | 41 +++++++--
include/linux/efi.h | 6 +-
13 files changed, 159 insertions(+), 127 deletions(-)
delete mode 100644 drivers/firmware/efi/libstub/arm64-entry.S
--
2.35.1
* [PATCH v5 1/7] arm64: head: Move all finalise_el2 calls to after __enable_mmu
From: Ard Biesheuvel @ 2022-11-08 18:21 UTC
To: linux-arm-kernel
Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
Catalin Marinas, Marc Zyngier, Mark Rutland
In the primary boot path, finalise_el2() is called much later than on
the secondary boot or resume-from-suspend paths, and this does not
appear to be intentional.
Since we aim to do as little as possible before enabling the MMU and
caches, align secondary and resume with primary boot, and defer the call
to after the MMU is turned on. This also removes the need to clean
finalise_el2() to the PoC once we enable support for booting with the
MMU on.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/head.S | 5 ++++-
arch/arm64/kernel/sleep.S | 5 ++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2196aad7b55bcef0..c59e0d95b44d0901 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -584,7 +584,6 @@ SYM_FUNC_START_LOCAL(secondary_startup)
* Common entry point for secondary CPUs.
*/
mov x20, x0 // preserve boot mode
- bl finalise_el2
bl __cpu_secondary_check52bitva
#if VA_BITS > 48
ldr_l x0, vabits_actual
@@ -600,6 +599,10 @@ SYM_FUNC_END(secondary_startup)
SYM_FUNC_START_LOCAL(__secondary_switched)
mov x0, x20
bl set_cpu_boot_mode_flag
+
+ mov x0, x20
+ bl finalise_el2
+
str_l xzr, __early_cpu_boot_status, x3
adr_l x5, vectors
msr vbar_el1, x5
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 97c9de57725dfddb..7b7c56e048346e97 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -100,7 +100,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
.pushsection ".idmap.text", "awx"
SYM_CODE_START(cpu_resume)
bl init_kernel_el
- bl finalise_el2
+ mov x19, x0 // preserve boot mode
#if VA_BITS > 48
ldr_l x0, vabits_actual
#endif
@@ -116,6 +116,9 @@ SYM_CODE_END(cpu_resume)
.popsection
SYM_FUNC_START(_cpu_resume)
+ mov x0, x19
+ bl finalise_el2
+
mrs x1, mpidr_el1
adr_l x8, mpidr_hash // x8 = struct mpidr_hash virt address
--
2.35.1
* [PATCH v5 2/7] arm64: kernel: move identity map out of .text mapping
From: Ard Biesheuvel @ 2022-11-08 18:21 UTC
To: linux-arm-kernel
Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
Catalin Marinas, Marc Zyngier, Mark Rutland
Reorganize the ID map slightly so that only code that is executed with
the MMU off or via the 1:1 mapping remains. This allows us to move the
identity map out of the .text segment, as it will no longer need
executable permissions via the kernel mapping.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/head.S | 28 +++++++++++---------
arch/arm64/kernel/vmlinux.lds.S | 2 +-
arch/arm64/mm/proc.S | 2 --
3 files changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index c59e0d95b44d0901..272877c5b4fa1203 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -540,19 +540,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
eret
SYM_FUNC_END(init_kernel_el)
-/*
- * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
- * in w0. See arch/arm64/include/asm/virt.h for more info.
- */
-SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
- adr_l x1, __boot_cpu_mode
- cmp w0, #BOOT_CPU_MODE_EL2
- b.ne 1f
- add x1, x1, #4
-1: str w0, [x1] // Save CPU boot mode
- ret
-SYM_FUNC_END(set_cpu_boot_mode_flag)
-
/*
* This provides a "holding pen" for platforms to hold all secondary
* cores are held until we're ready for them to initialise.
@@ -596,6 +583,7 @@ SYM_FUNC_START_LOCAL(secondary_startup)
br x8
SYM_FUNC_END(secondary_startup)
+ .text
SYM_FUNC_START_LOCAL(__secondary_switched)
mov x0, x20
bl set_cpu_boot_mode_flag
@@ -628,6 +616,19 @@ SYM_FUNC_START_LOCAL(__secondary_too_slow)
b __secondary_too_slow
SYM_FUNC_END(__secondary_too_slow)
+/*
+ * Sets the __boot_cpu_mode flag depending on the CPU boot mode passed
+ * in w0. See arch/arm64/include/asm/virt.h for more info.
+ */
+SYM_FUNC_START_LOCAL(set_cpu_boot_mode_flag)
+ adr_l x1, __boot_cpu_mode
+ cmp w0, #BOOT_CPU_MODE_EL2
+ b.ne 1f
+ add x1, x1, #4
+1: str w0, [x1] // Save CPU boot mode
+ ret
+SYM_FUNC_END(set_cpu_boot_mode_flag)
+
/*
* The booting CPU updates the failed status @__early_cpu_boot_status,
* with MMU turned off.
@@ -659,6 +660,7 @@ SYM_FUNC_END(__secondary_too_slow)
* Checks if the selected granule size is supported by the CPU.
* If it isn't, park the CPU
*/
+ .section ".idmap.text","awx"
SYM_FUNC_START(__enable_mmu)
mrs x3, ID_AA64MMFR0_EL1
ubfx x3, x3, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 45131e354e27f1f8..c7727a1740ce11f5 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -168,7 +168,6 @@ SECTIONS
LOCK_TEXT
KPROBES_TEXT
HYPERVISOR_TEXT
- IDMAP_TEXT
*(.gnu.warning)
. = ALIGN(16);
*(.got) /* Global offset table */
@@ -195,6 +194,7 @@ SECTIONS
TRAMP_TEXT
HIBERNATE_TEXT
KEXEC_TEXT
+ IDMAP_TEXT
. = ALIGN(PAGE_SIZE);
}
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index b9ecbbae1e1abca1..d7ca6f23fb0d1334 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -110,7 +110,6 @@ SYM_FUNC_END(cpu_do_suspend)
*
* x0: Address of context pointer
*/
- .pushsection ".idmap.text", "awx"
SYM_FUNC_START(cpu_do_resume)
ldp x2, x3, [x0]
ldp x4, x5, [x0, #16]
@@ -166,7 +165,6 @@ alternative_else_nop_endif
isb
ret
SYM_FUNC_END(cpu_do_resume)
- .popsection
#endif
.pushsection ".idmap.text", "awx"
--
2.35.1
* [PATCH v5 3/7] arm64: head: record the MMU state at primary entry
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC
To: linux-arm-kernel
Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
Catalin Marinas, Marc Zyngier, Mark Rutland
Prepare for being able to deal with primary entry with the MMU and
caches enabled, by recording whether or not we entered with the MMU on
in register x19 and in a global variable. (Note that setting this
variable to '1' does not require cache invalidation, nor is it required
for storing the bootargs in that case, so the cache maintenance is
omitted.)
Since boot with the MMU enabled is not permitted by the bare metal boot
protocol, ensure that a diagnostic is emitted and a taint bit set if
the MMU was found to be enabled on a non-EFI boot. We will make an
exception for EFI boot later, which has strict requirements for the
mapping of system memory, permitting us to relax the boot protocol and
hand over from the EFI stub to the core kernel with MMU and caches left
enabled.
While at it, add 'pre_disable_mmu_workaround' macro invocations to
init_kernel_el, as its manipulation of SCTLR_ELx may amount to disabling
of the MMU after subsequent patches.
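For clarity, the tst/ccmp/cset sequence in record_mmu_state below
leaves in x19 the following predicate (a C model for illustration
only, with the SCTLR_ELx bit positions from the architecture; it is
not code from the patch):

    #include <stdbool.h>
    #include <stdint.h>

    #define SCTLR_ELx_M (UINT64_C(1) << 0)  /* MMU enable */
    #define SCTLR_ELx_C (UINT64_C(1) << 2)  /* D-cache enable */

    /* x19 := 1 iff both the MMU and the D-cache were enabled at entry */
    static bool mmu_enabled_at_entry(uint64_t sctlr)
    {
            return (sctlr & SCTLR_ELx_M) && (sctlr & SCTLR_ELx_C);
    }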
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/head.S | 21 ++++++++++++++++++++
arch/arm64/kernel/setup.c | 9 +++++++--
2 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 272877c5b4fa1203..3e654e43fa115947 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -77,6 +77,7 @@
* primary lowlevel boot path:
*
* Register Scope Purpose
+ * x19 primary_entry() .. start_kernel() whether we entered with the MMU on
* x20 primary_entry() .. __primary_switch() CPU boot mode
* x21 primary_entry() .. start_kernel() FDT pointer passed at boot in x0
* x22 create_idmap() .. start_kernel() ID map VA of the DT blob
@@ -86,6 +87,7 @@
* x28 create_idmap() callee preserved temp register
*/
SYM_CODE_START(primary_entry)
+ bl record_mmu_state
bl preserve_boot_args
bl init_kernel_el // w0=cpu_boot_mode
mov x20, x0
@@ -109,6 +111,19 @@ SYM_CODE_START(primary_entry)
b __primary_switch
SYM_CODE_END(primary_entry)
+SYM_CODE_START_LOCAL(record_mmu_state)
+ mrs x19, CurrentEL
+ cmp x19, #CurrentEL_EL2
+ mrs x19, sctlr_el1
+ b.ne 0f
+ mrs x19, sctlr_el2
+0: tst x19, #SCTLR_ELx_C // Z := (C == 0)
+ and x19, x19, #SCTLR_ELx_M // isolate M bit
+ ccmp x19, xzr, #4, ne // Z |= (M == 0)
+ cset x19, ne // set x19 if !Z
+ ret
+SYM_CODE_END(record_mmu_state)
+
/*
* Preserve the arguments passed by the bootloader in x0 .. x3
*/
@@ -119,11 +134,14 @@ SYM_CODE_START_LOCAL(preserve_boot_args)
stp x21, x1, [x0] // x0 .. x3 at kernel entry
stp x2, x3, [x0, #16]
+ cbnz x19, 0f // skip cache invalidation if MMU is on
dmb sy // needed before dc ivac with
// MMU off
add x1, x0, #0x20 // 4 x 8 bytes
b dcache_inval_poc // tail call
+0: str_l x19, mmu_enabled_at_boot, x0
+ ret
SYM_CODE_END(preserve_boot_args)
SYM_FUNC_START_LOCAL(clear_page_tables)
@@ -494,6 +512,7 @@ SYM_FUNC_START(init_kernel_el)
SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
mov_q x0, INIT_SCTLR_EL1_MMU_OFF
+ pre_disable_mmu_workaround
msr sctlr_el1, x0
isb
mov_q x0, INIT_PSTATE_EL1
@@ -526,11 +545,13 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
cbz x0, 1f
/* Set a sane SCTLR_EL1, the VHE way */
+ pre_disable_mmu_workaround
msr_s SYS_SCTLR_EL12, x1
mov x2, #BOOT_CPU_FLAG_E2H
b 2f
1:
+ pre_disable_mmu_workaround
msr sctlr_el1, x1
mov x2, xzr
2:
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index fea3223704b6339a..11cf21afafa9f852 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -56,6 +56,7 @@ static int num_standard_resources;
static struct resource *standard_resources;
phys_addr_t __fdt_pointer __initdata;
+u64 mmu_enabled_at_boot __initdata;
/*
* Standard memory resources
@@ -328,8 +329,12 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
xen_early_init();
efi_init();
- if (!efi_enabled(EFI_BOOT) && ((u64)_text % MIN_KIMG_ALIGN) != 0)
- pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
+ if (!efi_enabled(EFI_BOOT)) {
+ if ((u64)_text % MIN_KIMG_ALIGN)
+ pr_warn(FW_BUG "Kernel image misaligned at boot, please fix your bootloader!");
+ WARN_TAINT(mmu_enabled_at_boot, TAINT_FIRMWARE_WORKAROUND,
+ FW_BUG "Booted with MMU enabled!");
+ }
arm64_memblock_init();
--
2.35.1
* [PATCH v5 4/7] arm64: head: avoid cache invalidation when entering with the MMU on
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC
To: linux-arm-kernel
Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
Catalin Marinas, Marc Zyngier, Mark Rutland
If we enter with the MMU on, there is no need for explicit cache
invalidation for stores to memory, as they will be coherent with the
caches.
Let's take advantage of this, and create the ID map with the MMU still
enabled if that is how we entered, and avoid any cache invalidation
calls in that case.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/head.S | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 3e654e43fa115947..a7c84cde67c5c652 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -89,9 +89,9 @@
SYM_CODE_START(primary_entry)
bl record_mmu_state
bl preserve_boot_args
+ bl create_idmap
bl init_kernel_el // w0=cpu_boot_mode
mov x20, x0
- bl create_idmap
/*
* The following calls CPU setup code, see arch/arm64/mm/proc.S for
@@ -378,12 +378,13 @@ SYM_FUNC_START_LOCAL(create_idmap)
* accesses (MMU disabled), invalidate those tables again to
* remove any speculatively loaded cache lines.
*/
+ cbnz x19, 0f // skip cache invalidation if MMU is on
dmb sy
adrp x0, init_idmap_pg_dir
adrp x1, init_idmap_pg_end
bl dcache_inval_poc
- ret x28
+0: ret x28
SYM_FUNC_END(create_idmap)
SYM_FUNC_START_LOCAL(create_kernel_mapping)
--
2.35.1
* [PATCH v5 5/7] arm64: head: Clean the ID map and the HYP text to the PoC if needed
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC
To: linux-arm-kernel
Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
Catalin Marinas, Marc Zyngier, Mark Rutland
If we enter with the MMU and caches enabled, the bootloader may not have
performed any cache maintenance to the PoC. So clean the ID mapped page
to the PoC, to ensure that instruction and data accesses with the MMU
off see the correct data. For similar reasons, clean all the HYP text to
the PoC as well when entering at EL2 with the MMU and caches enabled.
Note that this means primary_entry() itself needs to be moved into the
ID map as well, as we will return from init_kernel_el() with the MMU and
caches off.
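In pseudo-C, the maintenance added here amounts to the following (an
illustrative sketch only; the helper and symbol names are the ones
used in the hunks below):

    extern char __idmap_text_start[], __idmap_text_end[];
    extern char __hyp_idmap_text_start[], __hyp_text_end[];
    extern void dcache_clean_poc(unsigned long start, unsigned long end);

    static void clean_boot_text_to_poc(bool entered_with_mmu_on, bool booted_at_el2)
    {
            if (!entered_with_mmu_on)
                    return;
            /* primary boot code that will run with the MMU off */
            dcache_clean_poc((unsigned long)__idmap_text_start,
                             (unsigned long)__idmap_text_end);
            /* HYP text, which will be executed with the EL2 MMU off */
            if (booted_at_el2)
                    dcache_clean_poc((unsigned long)__hyp_idmap_text_start,
                                     (unsigned long)__hyp_text_end);
    }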
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/head.S | 31 +++++++++++++++++---
arch/arm64/kernel/sleep.S | 1 +
2 files changed, 28 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index a7c84cde67c5c652..825f1d0549661030 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -70,7 +70,7 @@
__EFI_PE_HEADER
- __INIT
+ .section ".idmap.text","awx"
/*
* The following callee saved general purpose registers are used on the
@@ -90,6 +90,17 @@ SYM_CODE_START(primary_entry)
bl record_mmu_state
bl preserve_boot_args
bl create_idmap
+
+ /*
+ * If we entered with the MMU and caches on, clean the ID mapped part
+ * of the primary boot code to the PoC so we can safely execute it with
+ * the MMU off.
+ */
+ cbz x19, 0f
+ adrp x0, __idmap_text_start
+ adr_l x1, __idmap_text_end
+ bl dcache_clean_poc
+0: mov x0, x19
bl init_kernel_el // w0=cpu_boot_mode
mov x20, x0
@@ -111,6 +122,7 @@ SYM_CODE_START(primary_entry)
b __primary_switch
SYM_CODE_END(primary_entry)
+ __INIT
SYM_CODE_START_LOCAL(record_mmu_state)
mrs x19, CurrentEL
cmp x19, #CurrentEL_EL2
@@ -505,10 +517,12 @@ SYM_FUNC_END(__primary_switched)
* Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in x0 if
* booted in EL1 or EL2 respectively, with the top 32 bits containing
* potential context flags. These flags are *not* stored in __boot_cpu_mode.
+ *
+ * x0: whether we are being called from the primary boot path with the MMU on
*/
SYM_FUNC_START(init_kernel_el)
- mrs x0, CurrentEL
- cmp x0, #CurrentEL_EL2
+ mrs x1, CurrentEL
+ cmp x1, #CurrentEL_EL2
b.eq init_el2
SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
@@ -523,6 +537,14 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
eret
SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
+ msr elr_el2, lr
+
+ // clean all HYP code to the PoC if we booted at EL2 with the MMU on
+ cbz x0, 0f
+ adrp x0, __hyp_idmap_text_start
+ adr_l x1, __hyp_text_end
+ bl dcache_clean_poc
+0:
mov_q x0, HCR_HOST_NVHE_FLAGS
msr hcr_el2, x0
isb
@@ -556,7 +578,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
msr sctlr_el1, x1
mov x2, xzr
2:
- msr elr_el2, lr
mov w0, #BOOT_CPU_MODE_EL2
orr x0, x0, x2
eret
@@ -567,6 +588,7 @@ SYM_FUNC_END(init_kernel_el)
* cores are held until we're ready for them to initialise.
*/
SYM_FUNC_START(secondary_holding_pen)
+ mov x0, xzr
bl init_kernel_el // w0=cpu_boot_mode
mrs x2, mpidr_el1
mov_q x1, MPIDR_HWID_BITMASK
@@ -584,6 +606,7 @@ SYM_FUNC_END(secondary_holding_pen)
* be used where CPUs are brought online dynamically by the kernel.
*/
SYM_FUNC_START(secondary_entry)
+ mov x0, xzr
bl init_kernel_el // w0=cpu_boot_mode
b secondary_startup
SYM_FUNC_END(secondary_entry)
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 7b7c56e048346e97..2ae7cff1953aaf87 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -99,6 +99,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
.pushsection ".idmap.text", "awx"
SYM_CODE_START(cpu_resume)
+ mov x0, xzr
bl init_kernel_el
mov x19, x0 // preserve boot mode
#if VA_BITS > 48
--
2.35.1
* [PATCH v5 6/7] arm64: lds: reduce effective minimum image alignment to 64k
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC
To: linux-arm-kernel
Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
Catalin Marinas, Marc Zyngier, Mark Rutland
Our segment alignment is 64k for all configurations, and coincidentally,
this is the largest alignment supported by the PE/COFF executable
format used by EFI. This means that generally, there is no need to move
the image around in memory after it has been loaded by the firmware,
which can be advantageous as it also permits us to rely on the memory
attributes set by the firmware (R-X for [_text, __inittext_end] and RW-
for [__initdata_begin, _end]).
However, the minimum alignment of the image is actually 128k on 64k
pages configurations with CONFIG_VMAP_STACK=y, due to the existence of a
single 128k aligned object in the image, which is the stack of the init
task.
Let's work around this by adding some padding before the init stack
allocation, so we can round down the stack pointer to a suitably aligned
value if the image is not aligned to 128k in memory.
Note that this does not affect the boot protocol, which still requires 2
MiB alignment for bare metal boot, but is only part of the internal
contract between the EFI stub and the kernel proper.
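As a worked example (assuming the 64k pages + VMAP_STACK case, where
THREAD_ALIGN is 128k and SEGMENT_ALIGN is 64k):

    #include <stdint.h>

    #define SEGMENT_ALIGN 0x10000UL /* 64k */
    #define THREAD_ALIGN  0x20000UL /* 128k */

    /*
     * The linker script now places THREAD_ALIGN - SEGMENT_ALIGN (64k)
     * of padding before init_stack, so rounding the stack base down to
     * THREAD_ALIGN moves it back by at most 64k, into that padding.
     */
    static uint64_t fixed_up_stack_base(uint64_t init_stack)
    {
            return init_stack & ~(THREAD_ALIGN - 1);
    }

E.g., if the image is loaded 64k- but not 128k-aligned, init_stack ends
up at an address ending in 0x10000 and is rounded down to one ending in
0x00000, which still lies within the image thanks to the padding.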
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/include/asm/efi.h | 9 +--------
arch/arm64/kernel/head.S | 3 +++
arch/arm64/kernel/vmlinux.lds.S | 11 ++++++++++-
include/linux/efi.h | 6 +-----
4 files changed, 15 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 108b115dbf5b7436..7ed7a0e621a5b0b6 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -54,13 +54,6 @@ efi_status_t __efi_rt_asm_wrapper(void *, const char *, ...);
/* arch specific definitions used by the stub code */
-/*
- * In some configurations (e.g. VMAP_STACK && 64K pages), stacks built into the
- * kernel need greater alignment than we require the segments to be padded to.
- */
-#define EFI_KIMG_ALIGN \
- (SEGMENT_ALIGN > THREAD_ALIGN ? SEGMENT_ALIGN : THREAD_ALIGN)
-
/*
* On arm64, we have to ensure that the initrd ends up in the linear region,
* which is a 1 GB aligned region of size '1UL << (VA_BITS_MIN - 1)' that is
@@ -88,7 +81,7 @@ static inline unsigned long efi_get_kimg_min_align(void)
* 2M alignment if KASLR was explicitly disabled, even if it was not
* going to be activated to begin with.
*/
- return efi_nokaslr ? MIN_KIMG_ALIGN : EFI_KIMG_ALIGN;
+ return efi_nokaslr ? MIN_KIMG_ALIGN : SEGMENT_ALIGN;
}
#define EFI_ALLOC_ALIGN SZ_64K
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 825f1d0549661030..8d7c6155da59e215 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -429,6 +429,9 @@ SYM_FUNC_END(create_kernel_mapping)
msr sp_el0, \tsk
ldr \tmp1, [\tsk, #TSK_STACK]
+#if THREAD_ALIGN > SEGMENT_ALIGN
+ bic \tmp1, \tmp1, #THREAD_ALIGN - 1
+#endif
add sp, \tmp1, #THREAD_SIZE
sub sp, sp, #PT_REGS_SIZE
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index c7727a1740ce11f5..5002d869fa7f1767 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -274,7 +274,16 @@ SECTIONS
_data = .;
_sdata = .;
- RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_ALIGN)
+#if THREAD_ALIGN > SEGMENT_ALIGN
+ /*
+ * Add some padding for the init stack so we can fix up any potential
+ * misalignment at runtime. In practice, this can only occur on 64k
+ * pages configurations with CONFIG_VMAP_STACK=y.
+ */
+ . += THREAD_ALIGN - SEGMENT_ALIGN;
+ ASSERT(. == init_stack, "init_stack not at start of RW_DATA as expected")
+#endif
+ RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, SEGMENT_ALIGN)
/*
* Data written with the MMU off but read with the MMU on requires
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 16b7318957b0709f..19eda0bb4617a4cf 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -421,11 +421,7 @@ void efi_native_runtime_setup(void);
/*
* This GUID may be installed onto the kernel image's handle as a NULL protocol
* to signal to the stub that the placement of the image should be respected,
- * and moving the image in physical memory is undesirable. To ensure
- * compatibility with 64k pages kernels with virtually mapped stacks, and to
- * avoid defeating physical randomization, this protocol should only be
- * installed if the image was placed at a randomized 128k aligned address in
- * memory.
+ * and moving the image in physical memory is undesirable.
*/
#define LINUX_EFI_LOADED_IMAGE_FIXED_GUID EFI_GUID(0xf5a37b6d, 0x3344, 0x42a5, 0xb6, 0xbb, 0x97, 0x86, 0x48, 0xc1, 0x89, 0x0a)
--
2.35.1
* [PATCH v5 7/7] efi: arm64: enter with MMU and caches enabled
From: Ard Biesheuvel @ 2022-11-08 18:22 UTC
To: linux-arm-kernel
Cc: linux-efi, keescook, Ard Biesheuvel, Will Deacon,
Catalin Marinas, Marc Zyngier, Mark Rutland
Instead of cleaning the entire loaded kernel image to the PoC and
disabling the MMU and caches before branching to the kernel's bare metal
entry point, we can leave the MMU and caches enabled, and rely on EFI's
cacheable 1:1 mapping of all of system RAM (which is mandated by the
spec) to populate the initial page tables.
This removes the need for managing coherency in software, which is
tedious and error prone.
Note that we still need to clean the executable region of the image to
the PoU if this is required for I/D coherency, but only if we actually
decided to move the image in memory, as otherwise, this will have been
taken care of by the loader.
This change affects both the builtin EFI stub as well as the zboot
decompressor, which now carries the entire EFI stub along with the
decompression code and the compressed image.
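The I/D coherency decision in efi_cache_sync_image() below boils down
to the following check (a C sketch for illustration; CTR_EL0.IDC is
bit 28 per the architecture):

    #include <stdbool.h>
    #include <stdint.h>

    #define CTR_EL0_IDC (UINT64_C(1) << 28)

    /*
     * If CTR_EL0.IDC is set, instruction fetch does not require a
     * D-cache clean to the PoU, so the 'dc cvau' loop can be skipped;
     * stale I-cache contents are discarded with 'ic ialluis' either way.
     */
    static bool need_dcache_clean_to_pou(uint64_t ctr_el0)
    {
            return !(ctr_el0 & CTR_EL0_IDC);
    }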
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm64/kernel/image-vars.h | 5 +-
arch/arm64/mm/cache.S | 5 +-
drivers/firmware/efi/libstub/Makefile | 4 +-
drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------------
drivers/firmware/efi/libstub/arm64-stub.c | 26 +++++---
drivers/firmware/efi/libstub/arm64.c | 41 ++++++++++--
6 files changed, 61 insertions(+), 87 deletions(-)
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index f31130ba02331060..40ebb882d2d8c97b 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -10,7 +10,7 @@
#error This file should only be included in vmlinux.lds.S
#endif
-PROVIDE(__efistub_primary_entry_offset = primary_entry - _text);
+PROVIDE(__efistub_primary_entry = primary_entry);
/*
* The EFI stub has its own symbol namespace prefixed by __efistub_, to
@@ -21,10 +21,11 @@ PROVIDE(__efistub_primary_entry_offset = primary_entry - _text);
* linked at. The routines below are all implemented in assembler in a
* position independent manner
*/
-PROVIDE(__efistub_dcache_clean_poc = __pi_dcache_clean_poc);
+PROVIDE(__efistub_caches_clean_inval_pou = __pi_caches_clean_inval_pou);
PROVIDE(__efistub__text = _text);
PROVIDE(__efistub__end = _end);
+PROVIDE(__efistub___inittext_end = __inittext_end);
PROVIDE(__efistub__edata = _edata);
PROVIDE(__efistub_screen_info = screen_info);
PROVIDE(__efistub__ctype = _ctype);
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 081058d4e4366edb..8c3b3ee9b1d725c8 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -52,10 +52,11 @@ alternative_else_nop_endif
* - start - virtual start address of region
* - end - virtual end address of region
*/
-SYM_FUNC_START(caches_clean_inval_pou)
+SYM_FUNC_START(__pi_caches_clean_inval_pou)
caches_clean_inval_pou_macro
ret
-SYM_FUNC_END(caches_clean_inval_pou)
+SYM_FUNC_END(__pi_caches_clean_inval_pou)
+SYM_FUNC_ALIAS(caches_clean_inval_pou, __pi_caches_clean_inval_pou)
/*
* caches_clean_inval_user_pou(start,end)
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index 402dfb30ddc7a01e..f838ab98978f1038 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -86,7 +86,7 @@ lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o string.o intrinsics.o systable.o \
screen_info.o efi-stub-entry.o
lib-$(CONFIG_ARM) += arm32-stub.o
-lib-$(CONFIG_ARM64) += arm64.o arm64-stub.o arm64-entry.o
+lib-$(CONFIG_ARM64) += arm64.o arm64-stub.o
lib-$(CONFIG_X86) += x86-stub.o
lib-$(CONFIG_RISCV) += riscv.o riscv-stub.o
lib-$(CONFIG_LOONGARCH) += loongarch.o loongarch-stub.o
@@ -140,7 +140,7 @@ STUBCOPY_RELOC-$(CONFIG_ARM) := R_ARM_ABS
#
STUBCOPY_FLAGS-$(CONFIG_ARM64) += --prefix-alloc-sections=.init \
--prefix-symbols=__efistub_
-STUBCOPY_RELOC-$(CONFIG_ARM64) := R_AARCH64_ABS64
+STUBCOPY_RELOC-$(CONFIG_ARM64) := R_AARCH64_ABS
# For RISC-V, we don't need anything special other than arm64. Keep all the
# symbols in .init section and make sure that no absolute symbols references
diff --git a/drivers/firmware/efi/libstub/arm64-entry.S b/drivers/firmware/efi/libstub/arm64-entry.S
deleted file mode 100644
index b5c17e89a4fc0c21..0000000000000000
--- a/drivers/firmware/efi/libstub/arm64-entry.S
+++ /dev/null
@@ -1,67 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * EFI entry point.
- *
- * Copyright (C) 2013, 2014 Red Hat, Inc.
- * Author: Mark Salter <msalter@redhat.com>
- */
-#include <linux/linkage.h>
-#include <asm/assembler.h>
-
- /*
- * The entrypoint of a arm64 bare metal image is at offset #0 of the
- * image, so this is a reasonable default for primary_entry_offset.
- * Only when the EFI stub is integrated into the core kernel, it is not
- * guaranteed that the PE/COFF header has been copied to memory too, so
- * in this case, primary_entry_offset should be overridden by the
- * linker and point to primary_entry() directly.
- */
- .weak primary_entry_offset
-
-SYM_CODE_START(efi_enter_kernel)
- /*
- * efi_pe_entry() will have copied the kernel image if necessary and we
- * end up here with device tree address in x1 and the kernel entry
- * point stored in x0. Save those values in registers which are
- * callee preserved.
- */
- ldr w2, =primary_entry_offset
- add x19, x0, x2 // relocated Image entrypoint
-
- mov x0, x1 // DTB address
- mov x1, xzr
- mov x2, xzr
- mov x3, xzr
-
- /*
- * Clean the remainder of this routine to the PoC
- * so that we can safely disable the MMU and caches.
- */
- adr x4, 1f
- dc civac, x4
- dsb sy
-
- /* Turn off Dcache and MMU */
- mrs x4, CurrentEL
- cmp x4, #CurrentEL_EL2
- mrs x4, sctlr_el1
- b.ne 0f
- mrs x4, sctlr_el2
-0: bic x4, x4, #SCTLR_ELx_M
- bic x4, x4, #SCTLR_ELx_C
- b.eq 1f
- b 2f
-
- .balign 32
-1: pre_disable_mmu_workaround
- msr sctlr_el2, x4
- isb
- br x19 // jump to kernel entrypoint
-
-2: pre_disable_mmu_workaround
- msr sctlr_el1, x4
- isb
- br x19 // jump to kernel entrypoint
-
- .org 1b + 32
-SYM_CODE_END(efi_enter_kernel)
diff --git a/drivers/firmware/efi/libstub/arm64-stub.c b/drivers/firmware/efi/libstub/arm64-stub.c
index 7f0aab3a8ab302d6..00fb2eab6d0c74ef 100644
--- a/drivers/firmware/efi/libstub/arm64-stub.c
+++ b/drivers/firmware/efi/libstub/arm64-stub.c
@@ -58,7 +58,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
efi_handle_t image_handle)
{
efi_status_t status;
- unsigned long kernel_size, kernel_memsize = 0;
+ unsigned long kernel_size, kernel_codesize, kernel_memsize;
u32 phys_seed = 0;
u64 min_kimg_align = efi_get_kimg_min_align();
@@ -93,6 +93,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
SEGMENT_ALIGN >> 10);
kernel_size = _edata - _text;
+ kernel_codesize = __inittext_end - _text;
kernel_memsize = kernel_size + (_end - _edata);
*reserve_size = kernel_memsize;
@@ -120,7 +121,7 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
*/
*image_addr = (u64)_text;
*reserve_size = 0;
- goto clean_image_to_poc;
+ return EFI_SUCCESS;
}
status = efi_allocate_pages_aligned(*reserve_size, reserve_addr,
@@ -136,14 +137,21 @@ efi_status_t handle_kernel_image(unsigned long *image_addr,
*image_addr = *reserve_addr;
memcpy((void *)*image_addr, _text, kernel_size);
+ caches_clean_inval_pou(*image_addr, *image_addr + kernel_codesize);
-clean_image_to_poc:
+ return EFI_SUCCESS;
+}
+
+asmlinkage void primary_entry(void);
+
+unsigned long primary_entry_offset(void)
+{
/*
- * Clean the copied Image to the PoC, and ensure it is not shadowed by
- * stale icache entries from before relocation.
+ * When built as part of the kernel, the EFI stub cannot branch to the
+ * kernel proper via the image header, as the PE/COFF header is
+ * strictly not part of the in-memory presentation of the image, only
+ * of the file representation. So instead, we need to jump to the
+ * actual entrypoint in the .text region of the image.
*/
- dcache_clean_poc(*image_addr, *image_addr + kernel_size);
- asm("ic ialluis");
-
- return EFI_SUCCESS;
+ return (char *)primary_entry - _text;
}
diff --git a/drivers/firmware/efi/libstub/arm64.c b/drivers/firmware/efi/libstub/arm64.c
index d2e94972c5fad523..99f86ddc91cf10cf 100644
--- a/drivers/firmware/efi/libstub/arm64.c
+++ b/drivers/firmware/efi/libstub/arm64.c
@@ -41,6 +41,12 @@ efi_status_t check_platform_features(void)
return EFI_SUCCESS;
}
+#ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
+#define DCTYPE "civac"
+#else
+#define DCTYPE "cvau"
+#endif
+
void efi_cache_sync_image(unsigned long image_base,
unsigned long alloc_size,
unsigned long code_size)
@@ -49,13 +55,38 @@ void efi_cache_sync_image(unsigned long image_base,
u64 lsize = 4 << cpuid_feature_extract_unsigned_field(ctr,
CTR_EL0_DminLine_SHIFT);
- do {
- asm("dc civac, %0" :: "r"(image_base));
- image_base += lsize;
- alloc_size -= lsize;
- } while (alloc_size >= lsize);
+ /* only perform the cache maintenance if needed for I/D coherency */
+ if (!(ctr & BIT(CTR_EL0_IDC_SHIFT))) {
+ do {
+ asm("dc " DCTYPE ", %0" :: "r"(image_base));
+ image_base += lsize;
+ code_size -= lsize;
+ } while (code_size >= lsize);
+ }
asm("ic ialluis");
dsb(ish);
isb();
}
+
+unsigned long __weak primary_entry_offset(void)
+{
+ /*
+ * By default, we can invoke the kernel via the branch instruction in
+ * the image header, so offset #0. This will be overridden by the EFI
+ * stub build that is linked into the core kernel, as in that case, the
+ * image header may not have been loaded into memory, or may be mapped
+ * with non-executable permissions.
+ */
+ return 0;
+}
+
+void __noreturn efi_enter_kernel(unsigned long entrypoint,
+ unsigned long fdt_addr,
+ unsigned long fdt_size)
+{
+ void (* __noreturn enter_kernel)(u64, u64, u64, u64);
+
+ enter_kernel = (void *)entrypoint + primary_entry_offset();
+ enter_kernel(fdt_addr, 0, 0, 0);
+}
--
2.35.1
^ permalink raw reply related [flat|nested] 30+ messages in thread
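The primary_entry_offset() override in the diff above relies on ordinary
weak/strong symbol resolution at link time: the libstub copy provides a
weak default returning 0 (enter via the image header), and the stub
linked into the core kernel provides a strong definition returning the
offset of primary_entry() within the image. A minimal standalone sketch
of the same pattern (illustrative only; entry_offset is a hypothetical
name, not the stub's):

	#include <stdio.h>

	/* weak default: enter via the image header at offset 0 */
	__attribute__((weak)) unsigned long entry_offset(void)
	{
		return 0;
	}

	int main(void)
	{
		/* With only this file linked, this prints 0. Linking in a
		 * second object that defines a strong entry_offset() makes
		 * that definition win silently, as in the diff above. */
		printf("entering at image base + %lu\n", entry_offset());
		return 0;
	}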
* Re: [PATCH v5 5/7] arm64: head: Clean the ID map and the HYP text to the PoC if needed
2022-11-08 18:22 ` Ard Biesheuvel
@ 2022-11-08 22:11 ` Ard Biesheuvel
0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-08 22:11 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux-efi, keescook, Will Deacon, Catalin Marinas, Marc Zyngier,
Mark Rutland
On Tue, 8 Nov 2022 at 19:22, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> If we enter with the MMU and caches enabled, the bootloader may not have
> performed any cache maintenance to the PoC. So clean the ID mapped page
> to the PoC, to ensure that instruction and data accesses with the MMU
> off see the correct data. For similar reasons, clean all the HYP text to
> the PoC as well when entering at EL2 with the MMU and caches enabled.
>
> Note that this means primary_entry() itself needs to be moved into the
> ID map as well, as we will return from init_kernel_el() with the MMU and
> caches off.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> arch/arm64/kernel/head.S | 31 +++++++++++++++++---
> arch/arm64/kernel/sleep.S | 1 +
> 2 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index a7c84cde67c5c652..825f1d0549661030 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -70,7 +70,7 @@
>
> __EFI_PE_HEADER
>
> - __INIT
> + .section ".idmap.text","awx"
>
> /*
> * The following callee saved general purpose registers are used on the
> @@ -90,6 +90,17 @@ SYM_CODE_START(primary_entry)
> bl record_mmu_state
> bl preserve_boot_args
> bl create_idmap
> +
> + /*
> + * If we entered with the MMU and caches on, clean the ID mapped part
> + * of the primary boot code to the PoC so we can safely execute it with
> + * the MMU off.
> + */
> + cbz x19, 0f
> + adrp x0, __idmap_text_start
> + adr_l x1, __idmap_text_end
> + bl dcache_clean_poc
> +0: mov x19, x0
This is wrong; it should be:
mov x0, x19
> bl init_kernel_el // w0=cpu_boot_mode
> mov x20, x0
>
> @@ -111,6 +122,7 @@ SYM_CODE_START(primary_entry)
> b __primary_switch
> SYM_CODE_END(primary_entry)
>
> + __INIT
> SYM_CODE_START_LOCAL(record_mmu_state)
> mrs x19, CurrentEL
> cmp x19, #CurrentEL_EL2
> @@ -505,10 +517,12 @@ SYM_FUNC_END(__primary_switched)
> * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in x0 if
> * booted in EL1 or EL2 respectively, with the top 32 bits containing
> * potential context flags. These flags are *not* stored in __boot_cpu_mode.
> + *
> + * x0: whether we are being called from the primary boot path with the MMU on
> */
> SYM_FUNC_START(init_kernel_el)
> - mrs x0, CurrentEL
> - cmp x0, #CurrentEL_EL2
> + mrs x1, CurrentEL
> + cmp x1, #CurrentEL_EL2
> b.eq init_el2
>
> SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
> @@ -523,6 +537,14 @@ SYM_INNER_LABEL(init_el1, SYM_L_LOCAL)
> eret
>
> SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
> + msr elr_el2, lr
> +
> + // clean all HYP code to the PoC if we booted at EL2 with the MMU on
> + cbz x0, 0f
> + adrp x0, __hyp_idmap_text_start
> + adr_l x1, __hyp_text_end
> + bl dcache_clean_poc
> +0:
> mov_q x0, HCR_HOST_NVHE_FLAGS
> msr hcr_el2, x0
> isb
> @@ -556,7 +578,6 @@ SYM_INNER_LABEL(init_el2, SYM_L_LOCAL)
> msr sctlr_el1, x1
> mov x2, xzr
> 2:
> - msr elr_el2, lr
> mov w0, #BOOT_CPU_MODE_EL2
> orr x0, x0, x2
> eret
> @@ -567,6 +588,7 @@ SYM_FUNC_END(init_kernel_el)
> * cores are held until we're ready for them to initialise.
> */
> SYM_FUNC_START(secondary_holding_pen)
> + mov x0, xzr
> bl init_kernel_el // w0=cpu_boot_mode
> mrs x2, mpidr_el1
> mov_q x1, MPIDR_HWID_BITMASK
> @@ -584,6 +606,7 @@ SYM_FUNC_END(secondary_holding_pen)
> * be used where CPUs are brought online dynamically by the kernel.
> */
> SYM_FUNC_START(secondary_entry)
> + mov x0, xzr
> bl init_kernel_el // w0=cpu_boot_mode
> b secondary_startup
> SYM_FUNC_END(secondary_entry)
> diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
> index 7b7c56e048346e97..2ae7cff1953aaf87 100644
> --- a/arch/arm64/kernel/sleep.S
> +++ b/arch/arm64/kernel/sleep.S
> @@ -99,6 +99,7 @@ SYM_FUNC_END(__cpu_suspend_enter)
>
> .pushsection ".idmap.text", "awx"
> SYM_CODE_START(cpu_resume)
> + mov x0, xzr
> bl init_kernel_el
> mov x19, x0 // preserve boot mode
> #if VA_BITS > 48
> --
> 2.35.1
>
^ permalink raw reply [flat|nested] 30+ messages in thread
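For clarity, the corrected sequence implied by the fix above would read
as follows (reconstructed from the quoted hunk; dcache_clean_poc takes
the range in x0/x1 and clobbers x0, while init_kernel_el now expects the
recorded MMU state in x0):

	cbz	x19, 0f				// x19 = MMU state recorded at entry
	adrp	x0, __idmap_text_start
	adr_l	x1, __idmap_text_end
	bl	dcache_clean_poc		// clobbers x0
0:	mov	x0, x19				// pass the MMU state to init_kernel_el
	bl	init_kernel_el			// w0=cpu_boot_mode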
* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
2022-11-08 18:21 ` Ard Biesheuvel
@ 2022-11-11 17:36 ` Mark Rutland
0 siblings, 0 replies; 30+ messages in thread
From: Mark Rutland @ 2022-11-11 17:36 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux-efi, keescook, Will Deacon,
Catalin Marinas, Marc Zyngier
Hi Ard,
Sorry for the late-in-the-day reply here...
On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> The purpose of this series is to remove any explicit cache maintenance
> for coherency during early boot that becomes unnecessary if we simply
> retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> and use it to populate the ID map page tables. After setting up this
> preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> TCR and SCTLR registers as before, and proceed as usual, avoiding the
> need for any manipulations of memory while the MMU and caches are off.
>
> The only properties of the firmware provided 1:1 map we rely on is that
> it does not require any explicit cache maintenance for coherency, and
> that it covers the entire memory footprint of the image, including the
> BSS and padding at the end - all else is under control of the kernel
> itself, as before.
As a high-level thing, I'm still very much not keen on entering the kernel with
the MMU on. Given that we have to support booting with the MMU off for !EFI
boot (including kexec when EFI is in use), I think this makes it harder to
reason about the boot code overall (e.g. due to the conditional maintenance
added to head.S), and adds more scope for error, even if it simplifies the EFI
stub itself.
I reckon that (sticking with entering with the MMU off) there's more we
can do to split the table creation into stages, and to minimize the
early portion that has to run with the MMU off. That would benefit
non-EFI boot and kexec, and retain the single boot flow that we
currently have.
My rough thinking was:
1) Reduce the idmap to a single page, such that we only need to clear
NR_PAGETABLE_LEVELS pages to initialize it.
2) Create a small stub at a fixed TTBR1 VA which we use to create a new initial
mapping of the kernel image (either in TTBR0 as with the current idmap, or
in TTBR1 directly). The stub logic could be small enough that it could be
mapped at page granularity, and we'd only need to initialize
NR_PAGETABLE_LEVELS pages before enabling the MMU.
This would then bounce onto the next stage, either in TTBR0 directly, or
bouncing through there as with the TTBR1 replacement logic.
We could plausibly write that in C, and the early page table asm logic could
be simplified.
Thanks,
Mark.
> Changes since v4:
> - add patch to align the callers of finalise_el2()
> - also clean HYP text to the PoC when booting at EL2 with the MMU on
> - add a warning and a taint when doing non-EFI boot with the MMU and
> caches enabled
> - rebase onto zboot changes in efi/next - this means that patches #6 and
> #7 will not apply onto arm64/for-next so a shared stable branch will
> be needed if we want to queue this up for v6.2
>
> Changes since v3:
> - drop EFI_LOADER_CODE memory type patch that has been queued in the
> mean time
> - rebased onto [partial] series that moves efi-entry.S into the libstub/
> source directory
> - fixed a correctness issue in patch #2
>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
>
> Ard Biesheuvel (7):
> arm64: head: Move all finalise_el2 calls to after __enable_mmu
> arm64: kernel: move identity map out of .text mapping
> arm64: head: record the MMU state at primary entry
> arm64: head: avoid cache invalidation when entering with the MMU on
> arm64: head: Clean the ID map and the HYP text to the PoC if needed
> arm64: lds: reduce effective minimum image alignment to 64k
> efi: arm64: enter with MMU and caches enabled
>
> arch/arm64/include/asm/efi.h | 9 +-
> arch/arm64/kernel/head.S | 93 +++++++++++++++-----
> arch/arm64/kernel/image-vars.h | 5 +-
> arch/arm64/kernel/setup.c | 9 +-
> arch/arm64/kernel/sleep.S | 6 +-
> arch/arm64/kernel/vmlinux.lds.S | 13 ++-
> arch/arm64/mm/cache.S | 5 +-
> arch/arm64/mm/proc.S | 2 -
> drivers/firmware/efi/libstub/Makefile | 4 +-
> drivers/firmware/efi/libstub/arm64-entry.S | 67 --------------
> drivers/firmware/efi/libstub/arm64-stub.c | 26 ++++--
> drivers/firmware/efi/libstub/arm64.c | 41 +++++++--
> include/linux/efi.h | 6 +-
> 13 files changed, 159 insertions(+), 127 deletions(-)
> delete mode 100644 drivers/firmware/efi/libstub/arm64-entry.S
>
> --
> 2.35.1
>
^ permalink raw reply [flat|nested] 30+ messages in thread
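To put a rough number on step (1) of the reply above: a single-page ID
map needs one table page per translation level, so the memory that must
be zeroed with the MMU off is NR_PAGETABLE_LEVELS pages. A sketch of the
arithmetic, assuming a 4 KiB granule (512 entries per table, i.e. 9 VA
bits resolved per level) and 48-bit VAs:

	#define VA_BITS		48
	#define PAGE_SHIFT	12
	/* 9 VA bits resolved per level with a 4 KiB granule */
	#define LEVELS		(((VA_BITS - PAGE_SHIFT) + 8) / 9)	/* = 4 */
	/* 4 table pages = 16 KiB to initialize before the MMU comes on */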
* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
2022-11-11 17:36 ` Mark Rutland
@ 2022-11-15 11:17 ` Will Deacon
0 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2022-11-15 11:17 UTC (permalink / raw)
To: Mark Rutland
Cc: Ard Biesheuvel, linux-arm-kernel, linux-efi, keescook,
Catalin Marinas, Marc Zyngier
On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > The purpose of this series is to remove any explicit cache maintenance
> > for coherency during early boot that becomes unnecessary if we simply
> > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > and use it to populate the ID map page tables. After setting up this
> > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > need for any manipulations of memory while the MMU and caches are off.
> >
> > The only properties of the firmware provided 1:1 map we rely on is that
> > it does not require any explicit cache maintenance for coherency, and
> > that it covers the entire memory footprint of the image, including the
> > BSS and padding at the end - all else is under control of the kernel
> > itself, as before.
>
> As a high-level thing, I'm still very much not keen on entering the kernel with
> the MMU on. Given that we have to support booting with the MMU off for !EFI
> boot (including kexec when EFI is in use), I think this makes it harder to
> reason about the boot code overall (e.g. due to the conditional maintenance
> added to head.S), and adds more scope for error, even if it simplifies the EFI
> stub itself.
As discussed offline, two things that would help the current series are:
(1) Some performance numbers comparing MMU off vs MMU on boot
(2) Use of a separate entry point for the MMU on case, potentially failing
the boot if the MMU is on and we're not using EFI
Will
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
2022-11-15 11:17 ` Will Deacon
@ 2022-11-15 11:21 ` Ard Biesheuvel
0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-15 11:21 UTC (permalink / raw)
To: Will Deacon
Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
Catalin Marinas, Marc Zyngier
On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
>
> On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > The purpose of this series is to remove any explicit cache maintenance
> > > for coherency during early boot that becomes unnecessary if we simply
> > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > and use it to populate the ID map page tables. After setting up this
> > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > need for any manipulations of memory while the MMU and caches are off.
> > >
> > > The only properties of the firmware provided 1:1 map we rely on is that
> > > it does not require any explicit cache maintenance for coherency, and
> > > that it covers the entire memory footprint of the image, including the
> > > BSS and padding at the end - all else is under control of the kernel
> > > itself, as before.
> >
> > As a high-level thing, I'm still very much not keen on entering the kernel with
> > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > boot (including kexec when EFI is in use), I think this makes it harder to
> > reason about the boot code overall (e.g. due to the conditional maintenance
> > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > stub itself.
>
> As discussed offline, two things that would help the current series are:
>
> (1) Some performance numbers comparing MMU off vs MMU on boot
>
> (2) Use of a separate entry point for the MMU on case, potentially failing
> the boot if the MMU is on and we're not using EFI
>
Ack.
But thinking about (2) again, failing the boot is better done at a
time when you can inform the user about it, no?
IOW, just going into a deadloop really early if you enter the bare
metal entry point with the MMU on is going to be hard to distinguish
from other issues, whereas panicking after the console is up is more
likely to help get the actual issue diagnosed.
So perhaps we should panic() instead of warn+taint when this condition
occurs, and do it from an early initcall instead of from setup_arch().
^ permalink raw reply [flat|nested] 30+ messages in thread
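A minimal sketch of the early-initcall alternative suggested above
(illustrative only; booted_with_mmu_on is a hypothetical flag standing
in for whatever the primary boot path in head.S would record at entry):

	static int __init check_mmu_on_boot(void)
	{
		/* hypothetical flag recorded by the primary boot path */
		if (booted_with_mmu_on && !efi_enabled(EFI_BOOT))
			panic("booted with MMU and caches on via a non-EFI boot protocol");
		return 0;
	}
	early_initcall(check_mmu_on_boot);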
* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
2022-11-15 11:21 ` Ard Biesheuvel
@ 2022-11-15 11:31 ` Will Deacon
0 siblings, 0 replies; 30+ messages in thread
From: Will Deacon @ 2022-11-15 11:31 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
Catalin Marinas, Marc Zyngier
On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
> >
> > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > The purpose of this series is to remove any explicit cache maintenance
> > > > for coherency during early boot that becomes unnecessary if we simply
> > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > and use it to populate the ID map page tables. After setting up this
> > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > need for any manipulations of memory while the MMU and caches are off.
> > > >
> > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > it does not require any explicit cache maintenance for coherency, and
> > > > that it covers the entire memory footprint of the image, including the
> > > > BSS and padding at the end - all else is under control of the kernel
> > > > itself, as before.
> > >
> > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > stub itself.
> >
> > As discussed offline, two things that would help the current series are:
> >
> > (1) Some performance numbers comparing MMU off vs MMU on boot
> >
> > (2) Use of a separate entry point for the MMU on case, potentially failing
> > the boot if the MMU is on and we're not using EFI
> >
>
> Ack.
>
> But thinking about (2) again, failing the boot is better done at a
> time when you can inform the user about it, no?
>
> IOW, just going into a deadloop really early if you enter the bare
> metal entry point with the MMU on is going to be hard to distinguish
> from other issues, whereas panicking after the console up is more
> likely to help getting the actual issue diagnosed.
Agreed.
> So perhaps we should panic() instead of warn+taint when this condition
> occurs, and do it from an early initcall instead of from setup_arch().
To be honest (and I appreciate that this is unhelpful), I'm fine with
the warn+taint and prefer that to a fatal stop.
Will
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
2022-11-15 11:31 ` Will Deacon
@ 2022-11-26 14:16 ` Ard Biesheuvel
0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-26 14:16 UTC (permalink / raw)
To: Will Deacon
Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
Catalin Marinas, Marc Zyngier
On Tue, 15 Nov 2022 at 12:31, Will Deacon <will@kernel.org> wrote:
>
> On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> > On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
> > >
> > > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > > The purpose of this series is to remove any explicit cache maintenance
> > > > > for coherency during early boot that becomes unnecessary if we simply
> > > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > > and use it to populate the ID map page tables. After setting up this
> > > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > > need for any manipulations of memory while the MMU and caches are off.
> > > > >
> > > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > > it does not require any explicit cache maintenance for coherency, and
> > > > > that it covers the entire memory footprint of the image, including the
> > > > > BSS and padding at the end - all else is under control of the kernel
> > > > > itself, as before.
> > > >
> > > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > > stub itself.
> > >
> > > As discussed offline, two things that would help the current series are:
> > >
> > > (1) Some performance numbers comparing MMU off vs MMU on boot
> > >
Finally got around to measuring this - I lost access to my TX2 machine
for a couple of days during the past week.
With the patch below applied to mainline, I measure ~6 ms spent
cleaning the entire image to the PoC (which is the bulk of it) and
subsequently populating the initial ID map and activating it.
This drops to about 0.6 ms with my changes applied. This is unlikely
to ever matter in practice, perhaps, but I will note that booting a VM
in EFI mode using Tianocore/EDK2 from the point where KVM clears the
counter to the point where we start user space can be done (on the
same machine) in 500-700 ms so it is not entirely insignificant
either.
I could try and measure it on bare metal as well, but I suppose that
launch times are even less relevant there, so I didn't bother.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot
2022-11-26 14:16 ` Ard Biesheuvel
@ 2022-11-26 14:17 ` Ard Biesheuvel
0 siblings, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2022-11-26 14:17 UTC (permalink / raw)
To: Will Deacon
Cc: Mark Rutland, linux-arm-kernel, linux-efi, keescook,
Catalin Marinas, Marc Zyngier
On Sat, 26 Nov 2022 at 15:16, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Tue, 15 Nov 2022 at 12:31, Will Deacon <will@kernel.org> wrote:
> >
> > On Tue, Nov 15, 2022 at 12:21:55PM +0100, Ard Biesheuvel wrote:
> > > On Tue, 15 Nov 2022 at 12:17, Will Deacon <will@kernel.org> wrote:
> > > >
> > > > On Fri, Nov 11, 2022 at 05:36:19PM +0000, Mark Rutland wrote:
> > > > > On Tue, Nov 08, 2022 at 07:21:57PM +0100, Ard Biesheuvel wrote:
> > > > > > The purpose of this series is to remove any explicit cache maintenance
> > > > > > for coherency during early boot that becomes unnecessary if we simply
> > > > > > retain the cacheable 1:1 mapping of all of system RAM provided by EFI,
> > > > > > and use it to populate the ID map page tables. After setting up this
> > > > > > preliminary ID map, we disable the MMU, drop to EL1, reprogram the MAIR,
> > > > > > TCR and SCTLR registers as before, and proceed as usual, avoiding the
> > > > > > need for any manipulations of memory while the MMU and caches are off.
> > > > > >
> > > > > > The only properties of the firmware provided 1:1 map we rely on is that
> > > > > > it does not require any explicit cache maintenance for coherency, and
> > > > > > that it covers the entire memory footprint of the image, including the
> > > > > > BSS and padding at the end - all else is under control of the kernel
> > > > > > itself, as before.
> > > > >
> > > > > As a high-level thing, I'm still very much not keen on entering the kernel with
> > > > > the MMU on. Given that we have to support booting with the MMU off for !EFI
> > > > > boot (including kexec when EFI is in use), I think this makes it harder to
> > > > > reason about the boot code overall (e.g. due to the conditional maintenance
> > > > > added to head.S), and adds more scope for error, even if it simplifies the EFI
> > > > > stub itself.
> > > >
> > > > As discussed offline, two things that would help the current series are:
> > > >
> > > > (1) Some performance numbers comparing MMU off vs MMU on boot
> > > >
>
> Finally got around to measuring this - I lost access to my TX2 machine
> for a couple of days during the past week,
>
> With the patch below applied to mainline, I measure ~6 ms spent
> cleaning the entire image to the PoC (which is the bulk of it) and
> subsequently populating the initial ID map and activating it.
>
> This drops to about 0.6 ms with my changes applied. This is unlikely
> to ever matter in practice, perhaps, but I will note that booting a VM
> in EFI mode using Tianocore/EDK2 from the point where KVM clears the
> counter to the point where we start user space can be done (on the
> same machine) in 500-700 ms so it is not entirely insignificant
> either.
>
> I could try and measure it on bare metal as well, but I suppose that
> launch times are even less relevant there so I didn't bother.
diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index 61a87fa1c3055e26..27f59784a1c0be2c 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -22,6 +22,7 @@ SYM_CODE_START(efi_enter_kernel)
ldr w2, =primary_entry_offset
add x19, x0, x2 // relocated Image entrypoint
mov x20, x1 // DTB address
+ mrs x27, cntvct_el0
/*
* Clean the copied Image to the PoC, and ensure it is not shadowed by
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 2196aad7b55bcef0..068a7d111836382b 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -430,6 +430,8 @@ SYM_FUNC_START_LOCAL(__primary_switched)
str_l x21, __fdt_pointer, x5 // Save FDT pointer
+ str_l x27, boot_args + 8, x5
+
ldr_l x4, kimage_vaddr // Save the offset between
sub x4, x4, x0 // the kernel virtual and
str_l x4, kimage_voffset, x5 // physical mappings
@@ -797,6 +799,10 @@ SYM_FUNC_START_LOCAL(__primary_switch)
adrp x1, reserved_pg_dir
adrp x2, init_idmap_pg_dir
bl __enable_mmu
+
+ mrs x0, cntvct_el0
+ sub x27, x0, x27
+
#ifdef CONFIG_RELOCATABLE
adrp x23, KERNEL_START
and x23, x23, MIN_KIMG_ALIGN - 1
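For reference, the delta recorded by this patch is a raw count of
virtual counter (CNTVCT_EL0) ticks, stashed in boot_args[1] -
presumably so that it shows up in the existing x1-x3 boot protocol
warning printed from setup.c. A minimal sketch of the conversion to
wall-clock time, assuming a hypothetical 50 MHz counter frequency (the
real value would be read from CNTFRQ_EL0); both input values below are
made up for illustration:
#include <stdio.h>
#include <stdint.h>
/* Convert a CNTVCT_EL0 tick delta to microseconds. */
static uint64_t ticks_to_us(uint64_t delta_ticks, uint64_t cntfrq_hz)
{
	/* Multiply first; realistic boot-time deltas will not overflow. */
	return (delta_ticks * 1000000ULL) / cntfrq_hz;
}
int main(void)
{
	uint64_t cntfrq_hz   = 50000000; /* assumed 50 MHz system counter */
	uint64_t delta_ticks = 300000;   /* hypothetical boot_args[1] value */
	/* 300000 ticks at 50 MHz -> 6000 us, i.e. the ~6 ms measured above */
	printf("elapsed: %llu us\n",
	       (unsigned long long)ticks_to_us(delta_ticks, cntfrq_hz));
	return 0;
}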
^ permalink raw reply related [flat|nested] 30+ messages in thread
end of thread, other threads:[~2022-11-26 14:18 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-08 18:21 [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot Ard Biesheuvel
2022-11-08 18:21 ` [PATCH v5 1/7] arm64: head: Move all finalise_el2 calls to after __enable_mmu Ard Biesheuvel
2022-11-08 18:21 ` [PATCH v5 2/7] arm64: kernel: move identity map out of .text mapping Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 3/7] arm64: head: record the MMU state at primary entry Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 4/7] arm64: head: avoid cache invalidation when entering with the MMU on Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 5/7] arm64: head: Clean the ID map and the HYP text to the PoC if needed Ard Biesheuvel
2022-11-08 22:11 ` Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 6/7] arm64: lds: reduce effective minimum image alignment to 64k Ard Biesheuvel
2022-11-08 18:22 ` [PATCH v5 7/7] efi: arm64: enter with MMU and caches enabled Ard Biesheuvel
2022-11-11 17:36 ` [PATCH v5 0/7] arm64: efi: leave MMU and caches on at boot Mark Rutland
2022-11-15 11:17 ` Will Deacon
2022-11-15 11:21 ` Ard Biesheuvel
2022-11-15 11:31 ` Will Deacon
2022-11-26 14:16 ` Ard Biesheuvel
2022-11-26 14:17 ` Ard Biesheuvel