* [PATCH 0/7] Ensure stack is aligned for kernel entries
@ 2018-09-26 13:56 Julien Thierry
  2018-09-26 13:56 ` [PATCH 1/7] arm64: Add static check for pt_regs alignment Julien Thierry
                   ` (10 more replies)
  0 siblings, 11 replies; 50+ messages in thread
From: Julien Thierry @ 2018-09-26 13:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

Having SCTLR_ELx.SA enabled requires the SP to be 16-byte aligned before
using it to access memory. When taking an exception, it is possible that
the context in which the exception occurred had a misaligned SP. The
entry code needs to make sure that the stack is aligned before using it to
save the context.

This is only a concern when taking an exception from an EL using the same
SP_ELx as the handler. In other cases, the SP picked up on exception entry
can be assumed to be aligned, provided that SP is always aligned when doing
an eret to an EL using a different SP.

On Juno I see a runtime difference of <1% for hackbench. If I do not include
the fast path at EL1 (patch 4) I see a difference of 1-2%.

For EL2 entries, some cleanup of the code that gets patched into the vectors
was needed.

Cheers,

Julien

-->

Julien Thierry (7):
  arm64: Add static check for pt_regs alignment
  arm64: sdei: Always use sdei stack for sdei events
  arm64: Align stack when taking exception from EL1
  arm64: Add fast-path for stack alignment
  arm64: Do not apply BP hardening for hyp entries from EL2
  arm64: Do not apply vector hardening for hyp entries from EL2
  arm64: kvm: Align stack for exception coming from EL2

 arch/arm64/include/asm/assembler.h |  9 +++++
 arch/arm64/include/asm/ptrace.h    |  2 +
 arch/arm64/include/asm/sdei.h      |  2 -
 arch/arm64/kernel/cpu_errata.c     | 10 ++++-
 arch/arm64/kernel/entry.S          | 43 +++++++++++++++++++--
 arch/arm64/kernel/sdei.c           | 23 ++++-------
 arch/arm64/kvm/hyp/hyp-entry.S     | 78 +++++++++++++++++++++++++++++++-------
 drivers/firmware/Kconfig           |  1 +
 8 files changed, 132 insertions(+), 36 deletions(-)

--
1.9.1


* [PATCH 1/7] arm64: Add static check for pt_regs alignment
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
@ 2018-09-26 13:56 ` Julien Thierry
  2018-11-07 21:58   ` Will Deacon
  2018-11-08 18:35   ` Ard Biesheuvel
  2018-09-26 13:56 ` [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events Julien Thierry
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 50+ messages in thread
From: Julien Thierry @ 2018-09-26 13:56 UTC (permalink / raw)
  To: linux-arm-kernel

The pt_regs structure's size is required to be a multiple of 16 in order to
maintain stack alignment on kernel entry.

Add a static check to ensure this is the case.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
---
 arch/arm64/include/asm/ptrace.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 177b851..cc17fad 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -229,6 +229,8 @@ static inline u64 regs_get_register(struct pt_regs *regs, unsigned int offset)
 
 	WARN_ON(offset & 7);
 
+	BUILD_BUG_ON((sizeof (struct pt_regs) % 16 != 0));
+
 	offset >>= 3;
 	switch (offset) {
 	case 0 ... 30:
-- 
1.9.1


* [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
  2018-09-26 13:56 ` [PATCH 1/7] arm64: Add static check for pt_regs alignment Julien Thierry
@ 2018-09-26 13:56 ` Julien Thierry
  2018-11-07 21:58   ` Will Deacon
  2018-09-26 13:56 ` [PATCH 3/7] arm64: Align stack when taking exception from EL1 Julien Thierry
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Julien Thierry @ 2018-09-26 13:56 UTC (permalink / raw)
  To: linux-arm-kernel

SDEI events can occur at any point, including when the stack pointer is
not aligned. SP could be modified to respect 16-byte alignment, but we
would still need to deal with code using SP as a scratch register.

Always reserve the SDEI stacks to handle SDEI events.

Since stack allocation relies on VMAP'd stacks, let's make SDEI depend on
VMAP_STACK.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
---
 arch/arm64/include/asm/sdei.h |  2 --
 arch/arm64/kernel/entry.S     |  2 --
 arch/arm64/kernel/sdei.c      | 23 ++++++++---------------
 drivers/firmware/Kconfig      |  1 +
 4 files changed, 9 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/sdei.h b/arch/arm64/include/asm/sdei.h
index ffe47d7..149dffe 100644
--- a/arch/arm64/include/asm/sdei.h
+++ b/arch/arm64/include/asm/sdei.h
@@ -46,8 +46,6 @@ asmlinkage unsigned long __sdei_handler(struct pt_regs *regs,
 static inline bool on_sdei_stack(unsigned long sp,
 				struct stack_info *info)
 {
-	if (!IS_ENABLED(CONFIG_VMAP_STACK))
-		return false;
 	if (!IS_ENABLED(CONFIG_ARM_SDE_INTERFACE))
 		return false;
 	if (in_nmi())
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 09dbea22..fc5842b 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -1187,7 +1187,6 @@ ENTRY(__sdei_asm_handler)
 
 	mov	x19, x1
 
-#ifdef CONFIG_VMAP_STACK
 	/*
 	 * entry.S may have been using sp as a scratch register, find whether
 	 * this is a normal or critical event and switch to the appropriate
@@ -1201,7 +1200,6 @@ ENTRY(__sdei_asm_handler)
 2:	mov	x6, #SDEI_STACK_SIZE
 	add	x5, x5, x6
 	mov	sp, x5
-#endif
 
 	/*
 	 * We may have interrupted userspace, or a guest, or exit-from or
diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
index 5ba4465..2e2fb0b 100644
--- a/arch/arm64/kernel/sdei.c
+++ b/arch/arm64/kernel/sdei.c
@@ -20,21 +20,16 @@
 unsigned long sdei_exit_mode;
 
 /*
- * VMAP'd stacks checking for stack overflow on exception using sp as a scratch
- * register, meaning SDEI has to switch to its own stack. We need two stacks as
- * a critical event may interrupt a normal event that has just taken a
- * synchronous exception, and is using sp as scratch register. For a critical
- * event interrupting a normal event, we can't reliably tell if we were on the
- * sdei stack.
+ * SDEI events can occur at a time when SP_EL1 is misaligned or invalid; a
+ * solution is to give SDEI its own stack. We need two stacks as a critical
+ * event may interrupt a normal event that has just taken a synchronous
+ * exception, and is using sp as scratch register. For a critical event
+ * interrupting a normal event, we can't reliably tell if we were on the sdei
+ * stack.
  * For now, we allocate stacks when the driver is probed.
  */
-DECLARE_PER_CPU(unsigned long *, sdei_stack_normal_ptr);
-DECLARE_PER_CPU(unsigned long *, sdei_stack_critical_ptr);
-
-#ifdef CONFIG_VMAP_STACK
 DEFINE_PER_CPU(unsigned long *, sdei_stack_normal_ptr);
 DEFINE_PER_CPU(unsigned long *, sdei_stack_critical_ptr);
-#endif
 
 static void _free_sdei_stack(unsigned long * __percpu *ptr, int cpu)
 {
@@ -150,10 +145,8 @@ unsigned long sdei_arch_get_entry_point(int conduit)
 		return 0;
 	}
 
-	if (IS_ENABLED(CONFIG_VMAP_STACK)) {
-		if (init_sdei_stacks())
-			return 0;
-	}
+	if (init_sdei_stacks())
+		return 0;
 
 	sdei_exit_mode = (conduit == CONDUIT_HVC) ? SDEI_EXIT_HVC : SDEI_EXIT_SMC;
 
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 6e83880..c63df31 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -85,6 +85,7 @@ config ARM_SCPI_POWER_DOMAIN
 config ARM_SDE_INTERFACE
 	bool "ARM Software Delegated Exception Interface (SDEI)"
 	depends on ARM64
+	depends on VMAP_STACK
 	help
 	  The Software Delegated Exception Interface (SDEI) is an ARM
 	  standard for registering callbacks from the platform firmware
-- 
1.9.1


* [PATCH 3/7] arm64: Align stack when taking exception from EL1
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
  2018-09-26 13:56 ` [PATCH 1/7] arm64: Add static check for pt_regs alignment Julien Thierry
  2018-09-26 13:56 ` [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events Julien Thierry
@ 2018-09-26 13:56 ` Julien Thierry
  2018-11-07 21:59   ` Will Deacon
  2018-09-26 13:56 ` [PATCH 4/7] arm64: Add fast-path for stack alignment Julien Thierry
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Julien Thierry @ 2018-09-26 13:56 UTC (permalink / raw)
  To: linux-arm-kernel

On arm64, SP needs to be aligned to 16 bytes before being used as a base
address for loads and stores. When taking some valid exceptions from EL1
(e.g. irq, dbg, data abort), there is no guarantee that SP_EL1 was aligned
when the exception was taken.

Pad the stack on EL1 entries when misaligned.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
---
 arch/arm64/include/asm/assembler.h |  9 +++++++++
 arch/arm64/kernel/entry.S          | 32 ++++++++++++++++++++++++++++++--
 2 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 0bcc98d..a0a5415 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -701,4 +701,13 @@
 .Lyield_out_\@ :
 	.endm
 
+/*
+ * Echange content of register xt with sp.
+ */
+	.macro xchg_sp	xt
+	add	sp, sp, \xt	// sp' = sp + xt
+	sub	\xt, sp, \xt	// xt' = sp' - xt = sp + xt - xt = sp
+	sub	sp, sp, \xt	// sp'' = sp' - xt' = sp + xt - sp = xt
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index fc5842b..8fb66e4 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -59,6 +59,19 @@
 	.endr
 	.endm
 
+	.macro force_stack_align
+	xchg_sp	x0
+	str	x1, [x0] // store x1 far away from S_SP
+
+	// aligned_sp[S_SP] = old_sp
+	bic	x1, x0, #0xf	// align down to 16-byte
+	str	x0, [x1, #S_SP]
+
+	ldr	x1, [x0]
+	bic	x0, x0, #0xf	// x0 = aligned_sp
+	xchg_sp	x0
+	.endm
+
 /*
  * Bad Abort numbers
  *-----------------
@@ -158,6 +171,10 @@ alternative_cb_end
 	.if	\regsize == 32
 	mov	w0, w0				// zero upper 32 bits of x0
 	.endif
+	.if	\el != 0
+	force_stack_align
+	.endif
+
 	stp	x0, x1, [sp, #16 * 0]
 	stp	x2, x3, [sp, #16 * 1]
 	stp	x4, x5, [sp, #16 * 2]
@@ -184,7 +201,8 @@ alternative_cb_end
 	apply_ssbd 1, x22, x23
 
 	.else
-	add	x21, sp, #S_FRAME_SIZE
+	ldr	x21, [sp, #S_SP]
+	add	x21, x21, #S_FRAME_SIZE		// adjust stored sp
 	get_thread_info tsk
 	/* Save the task's original addr_limit and set USER_DS */
 	ldr	x20, [tsk, #TSK_TI_ADDR_LIMIT]
@@ -327,7 +345,6 @@ alternative_else_nop_endif
 
 	msr	elr_el1, x21			// set up the return data
 	msr	spsr_el1, x22
-	ldp	x0, x1, [sp, #16 * 0]
 	ldp	x2, x3, [sp, #16 * 1]
 	ldp	x4, x5, [sp, #16 * 2]
 	ldp	x6, x7, [sp, #16 * 3]
@@ -343,7 +360,18 @@ alternative_else_nop_endif
 	ldp	x26, x27, [sp, #16 * 13]
 	ldp	x28, x29, [sp, #16 * 14]
 	ldr	lr, [sp, #S_LR]
+
+	/* Restore x0, x1 and sp */
+	.if	\el != 0
+	mov	x1, sp
+	ldr	x0, [sp, #S_SP]
+	mov	sp, x0
+	ldp	x0, x1, [x1, #16 * 0]
+	.else
+	ldp	x0, x1, [sp, #16 * 0]
 	add	sp, sp, #S_FRAME_SIZE		// restore sp
+	.endif
+
 	/*
 	 * ARCH_HAS_MEMBARRIER_SYNC_CORE rely on eret context synchronization
 	 * when returning from IPI handler, and when returning to user-space.
-- 
1.9.1


* [PATCH 4/7] arm64: Add fast-path for stack alignment
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
                   ` (2 preceding siblings ...)
  2018-09-26 13:56 ` [PATCH 3/7] arm64: Align stack when taking exception from EL1 Julien Thierry
@ 2018-09-26 13:56 ` Julien Thierry
  2018-11-07 21:59   ` Will Deacon
  2018-09-26 13:56   ` Julien Thierry
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 50+ messages in thread
From: Julien Thierry @ 2018-09-26 13:56 UTC (permalink / raw)
  To: linux-arm-kernel

Avoiding the stack alignment work when the stack is already aligned can give
a small performance boost.

Branch out of the aligning code when the stack is known to be aligned.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
---
 arch/arm64/kernel/entry.S | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 8fb66e4..bd8d52c 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -61,6 +61,10 @@
 
 	.macro force_stack_align
 	xchg_sp	x0
+
+	tst	x0, #0xf
+	b.eq	0f
+
 	str	x1, [x0] // store x1 far away from S_SP
 
 	// aligned_sp[S_SP] = old_sp
@@ -69,6 +73,11 @@
 
 	ldr	x1, [x0]
 	bic	x0, x0, #0xf	// x0 = aligned_sp
+	b	1f
+
+0:
+	str	x0, [x0, #S_SP]
+1:
 	xchg_sp	x0
 	.endm
 
-- 
1.9.1


* [PATCH 5/7] arm64: Do not apply BP hardening for hyp entries from EL2
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
@ 2018-09-26 13:56   ` Julien Thierry
  2018-09-26 13:56 ` [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events Julien Thierry
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-09-26 13:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: marc.zyngier, catalin.marinas, will.deacon, kvmarm, Dave.Martin

When an EL2 entry of __kvm_hyp_vector is taken, it means an entry from a
lower EL was previously taken to exit the guest. Taking that lower EL entry
already applied BP hardening if it was needed, so there is no need to do
it again.

Only apply BP hardening for exceptions coming from a lower EL.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
---
 arch/arm64/kernel/cpu_errata.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index dec1089..9db5ecc 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -103,10 +103,16 @@ static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
 	void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K);
 	int i;

-	for (i = 0; i < SZ_2K; i += 0x80)
+	/*
+	 * Only overwrite hyp entries for exceptions from lower EL.
+	 * The exception vector is 2K bytes: the first 1K bytes handle
+	 * exceptions from EL2 (EL2t, EL2h), the last 1K handle lower
+	 * exception levels (ELx-64bit, ELx-32bit).
+	 */
+	for (i = SZ_1K; i < SZ_2K; i += 0x80)
 		memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);

-	__flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
+	__flush_icache_range((uintptr_t)dst + SZ_1K, (uintptr_t)dst + SZ_2K);
 }

 static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
--
1.9.1


* [PATCH 6/7] arm64: Do not apply vector hardening for hyp entries from EL2
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
@ 2018-09-26 13:56   ` Julien Thierry
  2018-09-26 13:56 ` [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events Julien Thierry
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-09-26 13:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: marc.zyngier, catalin.marinas, will.deacon, kvmarm, Dave.Martin

When an EL2 entry of __kvm_hyp_vector is taken, it means an entry from a
lower EL was previously taken to exit the guest. Taking that lower EL entry
already applied vector hardening if needed, so there is no need to do it
again.

Only apply vector hardening for exceptions coming from a lower EL.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
---
 arch/arm64/kvm/hyp/hyp-entry.S | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index 24b4fba..da31386 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -257,7 +257,15 @@ ENTRY(__kvm_hyp_vector)
 ENDPROC(__kvm_hyp_vector)

 #ifdef CONFIG_KVM_INDIRECT_VECTORS
-.macro hyp_ventry
+.macro hyp_el2_ventry
+	.align 7
+1:	b	__kvm_hyp_vector + (1b - 0b)
+	.rept 31
+	nop
+	.endr
+.endm
+
+.macro hyp_el1_ventry
 	.align 7
 1:	.rept 27
 	nop
@@ -290,8 +298,11 @@ alternative_cb_end

 .macro generate_vectors
 0:
-	.rept 16
-	hyp_ventry
+	.rept 8
+	hyp_el2_ventry
+	.endr
+	.rept 8
+	hyp_el1_ventry
 	.endr
 	.org 0b + SZ_2K		// Safety measure
 .endm
--
1.9.1


* [PATCH 7/7] arm64: kvm: Align stack for exception coming from EL2
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
@ 2018-09-26 13:56   ` Julien Thierry
  2018-09-26 13:56 ` [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events Julien Thierry
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-09-26 13:56 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: marc.zyngier, catalin.marinas, will.deacon, kvmarm, Dave.Martin

SP needs to be 16-byte aligned before accessing memory through it. When
handling exceptions from EL2, there is no guarantee that SP is already
aligned.

Ensure SP is aligned upon entries from EL2.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
---
 arch/arm64/kvm/hyp/hyp-entry.S | 63 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 52 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index da31386..f611072 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -184,6 +184,10 @@ el2_error:
 	ccmp	x0, x1, #4, ne
 	b.ne	__hyp_panic
 	mov	x0, #(1 << ARM_EXIT_WITH_SERROR_BIT)
+
+	/* Restore the stack */
+	ldr	x1, [sp]
+	mov	sp, x1
 	eret

 ENTRY(__hyp_do_panic)
@@ -221,29 +225,66 @@ ENDPROC(\label)

 	.align 11

-.macro valid_vect target
+/*
+ * Aligns the stack and stores the old sp value. This is what the stack
+ * looks like after this code:
+ *
+ *  +--------+
+ *  |        |
+ *  |--------|--> old_sp (upon entry)
+ *  |   x1   |--> (only used for local save/restore)
+ *  |--------|--> old_sp - 8
+ *  |        |--> padding
+ *  |--------|--> aligned_sp + 8
+ *  | old_sp |
+ *  |--------|--> sp = aligned_sp
+ *  |        |
+ *  +--------+
+ */
+.macro align_sp_el2
+	xchg_sp	x0
+	str	x1, [x0, #-8]	// save x1
+	mov	x1, x0		// x1 = old_sp
+	sub	x0, x0, #0x10	// save space for x1 + old_sp
+	bic	x0, x0, #0xf	// align down to 16-bytes
+	xchg_sp	x0
+	str	x1, [sp]	// save old_sp
+	ldr	x1, [x1, #-8]
+.endm
+
+.macro valid_vect target, el=1
 	.align 7
+	.if \el == 2
+	align_sp_el2
+	.endif
+
 	stp	x0, x1, [sp, #-16]!
 	b	\target
 .endm

-.macro invalid_vect target
+.macro invalid_vect target, el=1
 	.align 7
+	.if \el == 2
+	align_sp_el2
+	.endif
+
 	b	\target
+	.if \el == 1
 	ldp	x0, x1, [sp], #16
 	b	\target
+	.endif
 .endm

 ENTRY(__kvm_hyp_vector)
-	invalid_vect	el2t_sync_invalid	// Synchronous EL2t
-	invalid_vect	el2t_irq_invalid	// IRQ EL2t
-	invalid_vect	el2t_fiq_invalid	// FIQ EL2t
-	invalid_vect	el2t_error_invalid	// Error EL2t
-
-	invalid_vect	el2h_sync_invalid	// Synchronous EL2h
-	invalid_vect	el2h_irq_invalid	// IRQ EL2h
-	invalid_vect	el2h_fiq_invalid	// FIQ EL2h
-	valid_vect	el2_error		// Error EL2h
+	invalid_vect	el2t_sync_invalid, 2	// Synchronous EL2t
+	invalid_vect	el2t_irq_invalid, 2	// IRQ EL2t
+	invalid_vect	el2t_fiq_invalid, 2	// FIQ EL2t
+	invalid_vect	el2t_error_invalid, 2	// Error EL2t
+
+	invalid_vect	el2h_sync_invalid, 2	// Synchronous EL2h
+	invalid_vect	el2h_irq_invalid, 2	// IRQ EL2h
+	invalid_vect	el2h_fiq_invalid, 2	// FIQ EL2h
+	valid_vect	el2_error, 2		// Error EL2h

 	valid_vect	el1_sync		// Synchronous 64-bit EL1
 	valid_vect	el1_irq			// IRQ 64-bit EL1
--
1.9.1


* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
                   ` (6 preceding siblings ...)
  2018-09-26 13:56   ` Julien Thierry
@ 2018-10-19  8:07 ` Julien Thierry
  2018-11-07 21:58 ` Will Deacon
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-10-19  8:07 UTC (permalink / raw)
  To: linux-arm-kernel

Gentle ping on this series.

On 26/09/18 14:56, Julien Thierry wrote:
> Hi,
> 
> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
> using it to access memory. When taking an exception, it is possible that
> the context during which the exception occured had SP mis-aligned. The
> entry code needs to make sure that the stack is aligned before using it to
> save the context.
> 
> This is only a concern when taking exception from an EL using the same
> SP_ELx as the handler. In other cases it can be assumed that the SP being
> picked up on exception entry is aligned under the condition that SP is
> always aligned when doing eret to an EL using a different SP.
> 
> On Juno I see a runtime difference <1% for hackbench. If I do not include
> the fast path at EL1 (patch 4) I see a diff of 1-2%.
> 
> For EL2 entries, a bit of clean up of stuff getting patched in the vector
> has been needed.
> 
> Cheers,
> 
> Julien
> 
> -->
> 
> Julien Thierry (7):
>    arm64: Add static check for pt_regs alignment
>    arm64: sdei: Always use sdei stack for sdei events
>    arm64: Align stack when taking exception from EL1
>    arm64: Add fast-path for stack alignment
>    arm64: Do not apply BP hardening for hyp entries from EL2
>    arm64: Do not apply vector harderning for hyp entries from EL2
>    arm64: kvm: Align stack for exception coming from EL2
> 
>   arch/arm64/include/asm/assembler.h |  9 +++++
>   arch/arm64/include/asm/ptrace.h    |  2 +
>   arch/arm64/include/asm/sdei.h      |  2 -
>   arch/arm64/kernel/cpu_errata.c     | 10 ++++-
>   arch/arm64/kernel/entry.S          | 43 +++++++++++++++++++--
>   arch/arm64/kernel/sdei.c           | 23 ++++-------
>   arch/arm64/kvm/hyp/hyp-entry.S     | 78 +++++++++++++++++++++++++++++++-------
>   drivers/firmware/Kconfig           |  1 +
>   8 files changed, 132 insertions(+), 36 deletions(-)
> 
> --
> 1.9.1
> 

-- 
Julien Thierry


* [PATCH 1/7] arm64: Add static check for pt_regs alignment
  2018-09-26 13:56 ` [PATCH 1/7] arm64: Add static check for pt_regs alignment Julien Thierry
@ 2018-11-07 21:58   ` Will Deacon
  2018-11-08 10:16     ` Mark Rutland
  2018-11-08 18:35   ` Ard Biesheuvel
  1 sibling, 1 reply; 50+ messages in thread
From: Will Deacon @ 2018-11-07 21:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 26, 2018 at 02:56:18PM +0100, Julien Thierry wrote:
> The pt_regs structure's size is required to be a multiple of 16 to maintain
> the stack alignment upon kernel entries.
> 
> Add a static check to ensure this is the case.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> ---
>  arch/arm64/include/asm/ptrace.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> index 177b851..cc17fad 100644
> --- a/arch/arm64/include/asm/ptrace.h
> +++ b/arch/arm64/include/asm/ptrace.h
> @@ -229,6 +229,8 @@ static inline u64 regs_get_register(struct pt_regs *regs, unsigned int offset)
>  
>  	WARN_ON(offset & 7);
>  
> +	BUILD_BUG_ON((sizeof (struct pt_regs) % 16 != 0));

This seems like a fairly random place to put this, and given that it's not
dependent on context I reckon it would be better to put it in a C file
rather than in a header that gets included multiple times. I still couldn't
really find a good place though :/

Will


* [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events
  2018-09-26 13:56 ` [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events Julien Thierry
@ 2018-11-07 21:58   ` Will Deacon
  2018-11-21 11:15     ` James Morse
  0 siblings, 1 reply; 50+ messages in thread
From: Will Deacon @ 2018-11-07 21:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 26, 2018 at 02:56:19PM +0100, Julien Thierry wrote:
> SDEI events can occur at any point, including when the stack pointer is
> not aligned. SP could be modified to respect alignment 16-byte alignement,
> but there is a need to deal with code using SP as scratch register.
> 
> Always reserve the SDEI stacks to handle its events.
> 
> Since stack allocation relies on VMAPed stacks, lets make SDEI depend on
> VMAP_STACK.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> ---
>  arch/arm64/include/asm/sdei.h |  2 --
>  arch/arm64/kernel/entry.S     |  2 --
>  arch/arm64/kernel/sdei.c      | 23 ++++++++---------------
>  drivers/firmware/Kconfig      |  1 +
>  4 files changed, 9 insertions(+), 19 deletions(-)

One side-effect of this change is that SDEI cannot be enabled in conjunction
with KASAN. James -- are you ok with that?

Will


* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
                   ` (7 preceding siblings ...)
  2018-10-19  8:07 ` [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
@ 2018-11-07 21:58 ` Will Deacon
  2018-11-08 12:43   ` Julien Thierry
  2018-11-08 13:04 ` Ard Biesheuvel
  2018-11-22 11:49 ` Julien Thierry
  10 siblings, 1 reply; 50+ messages in thread
From: Will Deacon @ 2018-11-07 21:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Julien,

On Wed, Sep 26, 2018 at 02:56:17PM +0100, Julien Thierry wrote:
> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
> using it to access memory. When taking an exception, it is possible that
> the context during which the exception occured had SP mis-aligned. The
> entry code needs to make sure that the stack is aligned before using it to
> save the context.

Do you know why we haven't had reports of this crashing? Is it because GCC
tends to keep the SP aligned anyway, so we're getting away with it for the
moment? I'm trying to work out whether this is a candidate for -stable.

Cheers,

Will


* [PATCH 3/7] arm64: Align stack when taking exception from EL1
  2018-09-26 13:56 ` [PATCH 3/7] arm64: Align stack when taking exception from EL1 Julien Thierry
@ 2018-11-07 21:59   ` Will Deacon
  2018-11-08 12:20     ` Julien Thierry
  0 siblings, 1 reply; 50+ messages in thread
From: Will Deacon @ 2018-11-07 21:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 26, 2018 at 02:56:20PM +0100, Julien Thierry wrote:
> Arm64 SP needs to be aligned to 16 bytes before being used as base address
> for loads and stores. When taking some valid exceptions from EL1 (e.g. irq,
> dbg, data abort), there is no guarantee that SP_EL1 was aligned when
> taking the exception.
> 
> Pad the stack on EL1 entries when misaligned.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> ---
>  arch/arm64/include/asm/assembler.h |  9 +++++++++
>  arch/arm64/kernel/entry.S          | 32 ++++++++++++++++++++++++++++++--
>  2 files changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index 0bcc98d..a0a5415 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -701,4 +701,13 @@
>  .Lyield_out_\@ :
>  	.endm
>  
> +/*
> + * Echange content of register xt with sp.

Exchange contents of register Xt with SP

> + */
> +	.macro xchg_sp	xt
> +	add	sp, sp, \xt	// sp' = sp + xt
> +	sub	\xt, sp, \xt	// xt' = sp' - xt = sp + xt - xt = sp
> +	sub	sp, sp, \xt	// sp'' = sp' - xt' = sp + xt - sp = xt
> +	.endm

Having said that, this is clearly inspired by the code in kernel_ventry for
dealing with stack overflow, yet the macro you've got here cannot be reused
there.

Maybe we're better off only macro-ising:

	.macro sp_fetch_add xt
	add	sp, sp, \xt
	sub	\xt, sp, \xt
	.endm

and then letting the callers handle the rest, or building a swizzle_sp macro
on top of it?
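
The exchange would then just be the fetch-add plus one extra sub, roughly
(untested):

	.macro xchg_sp	xt
	sp_fetch_add	\xt		// sp = sp + xt, xt = original sp
	sub	sp, sp, \xt		// sp = original xt
	.endm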

> +
>  #endif	/* __ASM_ASSEMBLER_H */
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index fc5842b..8fb66e4 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -59,6 +59,19 @@
>  	.endr
>  	.endm
>  
> +	.macro force_stack_align
> +	xchg_sp	x0
> +	str	x1, [x0] // store x1 far away from S_SP

So this temporarily trashes pt_regs.regs[0] if the stack is already aligned,
right? I think that's fine, but we should comment /that/ rather than vaguely
saying it's far away from S_SP. In fact, having a block comment above this
macro saying exactly what is clobbered would generally be a good idea (esp
as you nobble the flags in your next patch).
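
Something along these lines, say (wording is only a sketch):

	/*
	 * force_stack_align: align sp down to 16 bytes and store the
	 * pre-alignment sp at offset S_SP of the new frame.
	 * x0 and x1 are preserved, but pt_regs.regs[0] may be written
	 * transiently; flags get clobbered once the fast path is added.
	 */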

> +
> +	// aligned_sp[S_SP] = old_sp
> +	bic	x1, x0, #0xf	// align down to 16-byte
> +	str	x0, [x1, #S_SP]
> +
> +	ldr	x1, [x0]
> +	bic	x0, x0, #0xf	// x0 = aligned_sp
> +	xchg_sp	x0
> +	.endm
> +
>  /*
>   * Bad Abort numbers
>   *-----------------
> @@ -158,6 +171,10 @@ alternative_cb_end
>  	.if	\regsize == 32
>  	mov	w0, w0				// zero upper 32 bits of x0
>  	.endif
> +	.if	\el != 0
> +	force_stack_align
> +	.endif
> +
>  	stp	x0, x1, [sp, #16 * 0]
>  	stp	x2, x3, [sp, #16 * 1]
>  	stp	x4, x5, [sp, #16 * 2]
> @@ -184,7 +201,8 @@ alternative_cb_end
>  	apply_ssbd 1, x22, x23
>  
>  	.else
> -	add	x21, sp, #S_FRAME_SIZE
> +	ldr	x21, [sp, #S_SP]
> +	add	x21, x21, #S_FRAME_SIZE		// adjust stored sp

Can we not just store the corrected SP in force_stack_align, rather than
store something bogus and then have to add the frame size back here?

Will


* [PATCH 4/7] arm64: Add fast-path for stack alignment
  2018-09-26 13:56 ` [PATCH 4/7] arm64: Add fast-path for stack alignment Julien Thierry
@ 2018-11-07 21:59   ` Will Deacon
  2018-11-08 12:21     ` Julien Thierry
  0 siblings, 1 reply; 50+ messages in thread
From: Will Deacon @ 2018-11-07 21:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 26, 2018 at 02:56:21PM +0100, Julien Thierry wrote:
> Avoiding stack alignment for already aligned stack can give a small
> performance boost.
> 
> Branch out of the aligning code when the stack is known to be aligned.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> ---
>  arch/arm64/kernel/entry.S | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 8fb66e4..bd8d52c 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -61,6 +61,10 @@
>  
>  	.macro force_stack_align
>  	xchg_sp	x0
> +
> +	tst	x0, #0xf
> +	b.eq	0f
> +
>  	str	x1, [x0] // store x1 far away from S_SP
>  
>  	// aligned_sp[S_SP] = old_sp
> @@ -69,6 +73,11 @@
>  
>  	ldr	x1, [x0]
>  	bic	x0, x0, #0xf	// x0 = aligned_sp
> +	b	1f
> +
> +0:
> +	str	x0, [x0, #S_SP]
> +1:

I couldn't figure out a better way to do this, but please remove the empty
lines that you're adding here. With that:

Acked-by: Will Deacon <will.deacon@arm.com>

Will


* Re: [PATCH 5/7] arm64: Do not apply BP hardening for hyp entries from EL2
  2018-09-26 13:56   ` Julien Thierry
@ 2018-11-07 21:59     ` Will Deacon
  -1 siblings, 0 replies; 50+ messages in thread
From: Will Deacon @ 2018-11-07 21:59 UTC (permalink / raw)
  To: Julien Thierry
  Cc: marc.zyngier, catalin.marinas, kvmarm, Dave.Martin, linux-arm-kernel

On Wed, Sep 26, 2018 at 02:56:22PM +0100, Julien Thierry wrote:
> When an EL2 entry of __kvm_hyp_vector is taken, it means an entry from a
> lower EL was previously taken to exit the guest. Taking that lower EL entry
> already applied BP hardening if it was needed, so there is no need to do
> it again.
> 
> Only apply BP hardening for exception coming from lower EL.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: kvmarm@lists.cs.columbia.edu
> ---
>  arch/arm64/kernel/cpu_errata.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index dec1089..9db5ecc 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -103,10 +103,16 @@ static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
>  	void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K);
>  	int i;
> 
> -	for (i = 0; i < SZ_2K; i += 0x80)
> +	/*
> +	 * Only overwrite hyp entries for exceptions from lower EL.
> +	 * Exception vection vector is 2K bytes, first 1K bytes concern
> +	 * exception from EL2 (EL2t, EL2h), last 1K concert lower exception
> +	 * levels (ELx-64bits, ELx-32bits).
> +	 */
> +	for (i = SZ_1K; i < SZ_2K; i += 0x80)
>  		memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
> 
> -	__flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
> +	__flush_icache_range((uintptr_t)dst + SZ_1K, (uintptr_t)dst + SZ_2K);
>  }

I'd personally find this clearer if you did:

	dst += SZ_1K;

before the for loop and with your comment above it. Then the for loop
becomes:

	for (i = 0; i < SZ_1K; i += 0x80)

and the range of the cache maintenance is [dst, dst + SZ_1K)
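
Putting that together (untested):

	/* Only the lower-EL half of each 2K vector needs patching */
	dst += SZ_1K;

	for (i = 0; i < SZ_1K; i += 0x80)
		memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);

	__flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_1K);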

But I'll leave it up to Marc.

Will


* Re: [PATCH 6/7] arm64: Do not apply vector hardening for hyp entries from EL2
  2018-09-26 13:56   ` Julien Thierry
@ 2018-11-07 21:59     ` Will Deacon
  -1 siblings, 0 replies; 50+ messages in thread
From: Will Deacon @ 2018-11-07 21:59 UTC (permalink / raw)
  To: Julien Thierry
  Cc: marc.zyngier, catalin.marinas, kvmarm, Dave.Martin, linux-arm-kernel

On Wed, Sep 26, 2018 at 02:56:23PM +0100, Julien Thierry wrote:
> When an EL2 entry of __kvm_hyp_vector is taken, it means an entry from a
> lower EL was previously taken to exit the guest. Taking that lower EL entry
> already applied vector hardening if needed, so there is no need to do it
> again.
> 
> Only apply vector hardening for exception coming from lower EL.
> 
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> Cc: kvmarm@lists.cs.columbia.edu
> ---
>  arch/arm64/kvm/hyp/hyp-entry.S | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
> index 24b4fba..da31386 100644
> --- a/arch/arm64/kvm/hyp/hyp-entry.S
> +++ b/arch/arm64/kvm/hyp/hyp-entry.S
> @@ -257,7 +257,15 @@ ENTRY(__kvm_hyp_vector)
>  ENDPROC(__kvm_hyp_vector)
> 
>  #ifdef CONFIG_KVM_INDIRECT_VECTORS
> -.macro hyp_ventry
> +.macro hyp_el2_ventry
> +	.align 7
> +1:	b	__kvm_hyp_vector + (1b - 0b)
> +	.rept 31
> +	nop
> +	.endr
> +.endm
> +
> +.macro hyp_el1_ventry
>  	.align 7
>  1:	.rept 27
>  	nop
> @@ -290,8 +298,11 @@ alternative_cb_end

I think it would be cleaner to take the EL as a macro parameter, but again,
I defer to Marc and Christoffer.
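
e.g. keeping a single macro, roughly (untested, with the existing lower-EL
sequence left as-is in the .else branch):

.macro hyp_ventry el
	.align 7
	.if \el == 2
1:	b	__kvm_hyp_vector + (1b - 0b)
	.rept 31
	nop
	.endr
	.else
	/* existing hyp_ventry body for lower-EL entries */
	.endif
.endm

and generate_vectors would then emit "hyp_ventry 2" for the first 8 slots
and "hyp_ventry 1" for the last 8.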

Will


* [PATCH 1/7] arm64: Add static check for pt_regs alignment
  2018-11-07 21:58   ` Will Deacon
@ 2018-11-08 10:16     ` Mark Rutland
  2018-11-08 11:57       ` Julien Thierry
  0 siblings, 1 reply; 50+ messages in thread
From: Mark Rutland @ 2018-11-08 10:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 07, 2018 at 09:58:35PM +0000, Will Deacon wrote:
> On Wed, Sep 26, 2018 at 02:56:18PM +0100, Julien Thierry wrote:
> > The pt_regs structure's size is required to be a multiple of 16 to maintain
> > the stack alignment upon kernel entries.
> > 
> > Add a static check to ensure this is the case.
> > 
> > Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> > ---
> >  arch/arm64/include/asm/ptrace.h | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> > index 177b851..cc17fad 100644
> > --- a/arch/arm64/include/asm/ptrace.h
> > +++ b/arch/arm64/include/asm/ptrace.h
> > @@ -229,6 +229,8 @@ static inline u64 regs_get_register(struct pt_regs *regs, unsigned int offset)
> >  
> >  	WARN_ON(offset & 7);
> >  
> > +	BUILD_BUG_ON((sizeof (struct pt_regs) % 16 != 0));
> 
> This seems like a fairly random place to put this, and given that it's not
> dependent on context I reckon it would be better to put it in a C file
> rather than in a header that gets included multiple times. I still couldn't
> really find a good place though :/

Perhaps asm-offsets.c, when we define S_FRAME_SIZE?
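
Something like this (rough sketch, untested, assuming asm-offsets.c already
pulls in a header providing BUILD_BUG_ON):

	/* arch/arm64/kernel/asm-offsets.c, in main(), next to S_FRAME_SIZE */
	BUILD_BUG_ON(sizeof(struct pt_regs) % 16 != 0);
	DEFINE(S_FRAME_SIZE,	sizeof(struct pt_regs));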

Thanks,
Mark.


* [PATCH 1/7] arm64: Add static check for pt_regs alignment
  2018-11-08 10:16     ` Mark Rutland
@ 2018-11-08 11:57       ` Julien Thierry
  2018-11-08 16:53         ` Will Deacon
  0 siblings, 1 reply; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 11:57 UTC (permalink / raw)
  To: linux-arm-kernel



On 08/11/18 10:16, Mark Rutland wrote:
> On Wed, Nov 07, 2018 at 09:58:35PM +0000, Will Deacon wrote:
>> On Wed, Sep 26, 2018 at 02:56:18PM +0100, Julien Thierry wrote:
>>> The pt_regs structure's size is required to be a multiple of 16 to maintain
>>> the stack alignment upon kernel entries.
>>>
>>> Add a static check to ensure this is the case.
>>>
>>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>>> ---
>>>   arch/arm64/include/asm/ptrace.h | 2 ++
>>>   1 file changed, 2 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
>>> index 177b851..cc17fad 100644
>>> --- a/arch/arm64/include/asm/ptrace.h
>>> +++ b/arch/arm64/include/asm/ptrace.h
>>> @@ -229,6 +229,8 @@ static inline u64 regs_get_register(struct pt_regs *regs, unsigned int offset)
>>>   
>>>   	WARN_ON(offset & 7);
>>>   
>>> +	BUILD_BUG_ON((sizeof (struct pt_regs) % 16 != 0));
>>
>> This seems like a fairly random place to put this, and given that it's not
>> dependent on context I reckon it would be better to put it in a C file
>> rather than in a header that gets included multiple times. I still couldn't
>> really find a good place though :/
> 
> Perhaps asm-offsets.c, when we define S_FRAME_SIZE?
> 

That sounds like a good option to me. If Will thinks so as well I'll 
move that into asm-offsets.c.

Thanks,

-- 
Julien Thierry


* [PATCH 3/7] arm64: Align stack when taking exception from EL1
  2018-11-07 21:59   ` Will Deacon
@ 2018-11-08 12:20     ` Julien Thierry
  0 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 12:20 UTC (permalink / raw)
  To: linux-arm-kernel



On 07/11/18 21:59, Will Deacon wrote:
> On Wed, Sep 26, 2018 at 02:56:20PM +0100, Julien Thierry wrote:
>> Arm64 SP needs to be aligned to 16 bytes before being used as base address
>> for loads and stores. When taking some valid exceptions from EL1 (e.g. irq,
>> dbg, data abort), there is no guarantee that SP_EL1 was aligned when
>> taking the exception.
>>
>> Pad the stack on EL1 entries when misaligned.
>>
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> ---
>>   arch/arm64/include/asm/assembler.h |  9 +++++++++
>>   arch/arm64/kernel/entry.S          | 32 ++++++++++++++++++++++++++++++--
>>   2 files changed, 39 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>> index 0bcc98d..a0a5415 100644
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -701,4 +701,13 @@
>>   .Lyield_out_\@ :
>>   	.endm
>>   
>> +/*
>> + * Echange content of register xt with sp.
> 
> Exchange contents of register Xt with SP
> 
>> + */
>> +	.macro xchg_sp	xt
>> +	add	sp, sp, \xt	// sp' = sp + xt
>> +	sub	\xt, sp, \xt	// xt' = sp' - xt = sp + xt - xt = sp
>> +	sub	sp, sp, \xt	// sp'' = sp' - xt' = sp + xt - sp = xt
>> +	.endm
> 
> Having said that, this is clearly inspired by the code in kernel_ventry for
> dealing with stack overflow, yet the macro you've got here cannot be reused
> there.
> 
> Maybe we're better off only macro-ising:
> 
> 	.macro sp_fetch_add xt
> 	add	sp, sp, \xt
> 	sub	\xt, sp, \xt
> 	.endm
> 
> and then letting the callers handle the rest, or build an swizzle_sp macro
> on top of it?
> 

Hmm, I'll try to do that and see how it looks.

>> +
>>   #endif	/* __ASM_ASSEMBLER_H */
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index fc5842b..8fb66e4 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -59,6 +59,19 @@
>>   	.endr
>>   	.endm
>>   
>> +	.macro force_stack_align
>> +	xchg_sp	x0
>> +	str	x1, [x0] // store x1 far away from S_SP
> 
> So this temporarily trashes pt_regs.regs[0] if the stack is already aligned,
> right? I think that's fine, but we should comment /that/ rather than vaguely
> saying it's far away from S_SP. In fact, having a block comment above this
> macro saying exactly what is clobbered would generally be a good idea (esp
> as you nobble the flags in your next patch).
> 

Is it a big deal to trash pt_regs.regs[0] if the stack is already 
aligned? This is a freshly allocated stack frame, and we expect 
pt_regs.regs[0] to be overwritten afterwards.

I don't mind adding a comment about that, I'm just wondering about the 
relevance of it.

However, I do agree that I should add a comment about what this does 
with pt_regs.sp since the code using that macro needs to be aware of it.

>> +
>> +	// aligned_sp[S_SP] = old_sp
>> +	bic	x1, x0, #0xf	// align down to 16-byte
>> +	str	x0, [x1, #S_SP]
>> +
>> +	ldr	x1, [x0]
>> +	bic	x0, x0, #0xf	// x0 = aligned_sp
>> +	xchg_sp	x0
>> +	.endm
>> +
>>   /*
>>    * Bad Abort numbers
>>    *-----------------
>> @@ -158,6 +171,10 @@ alternative_cb_end
>>   	.if	\regsize == 32
>>   	mov	w0, w0				// zero upper 32 bits of x0
>>   	.endif
>> +	.if	\el != 0
>> +	force_stack_align
>> +	.endif
>> +
>>   	stp	x0, x1, [sp, #16 * 0]
>>   	stp	x2, x3, [sp, #16 * 1]
>>   	stp	x4, x5, [sp, #16 * 2]
>> @@ -184,7 +201,8 @@ alternative_cb_end
>>   	apply_ssbd 1, x22, x23
>>   
>>   	.else
>> -	add	x21, sp, #S_FRAME_SIZE
>> +	ldr	x21, [sp, #S_SP]
>> +	add	x21, x21, #S_FRAME_SIZE		// adjust stored sp
> 
> Can we not just store the corrected SP in force_stack_align, rather than
> store something bogus and then have to add the frame size back here?

We can, it would add 2 instructions to the alignment macro. I think I 
mainly tried to keep the alignment stuff short in case we want to add 
the fast path from the next patch, but if we're keeping it anyway we 
might as well add the code to save the correct SP.

Thanks,

-- 
Julien Thierry


* [PATCH 4/7] arm64: Add fast-path for stack alignment
  2018-11-07 21:59   ` Will Deacon
@ 2018-11-08 12:21     ` Julien Thierry
  0 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 12:21 UTC (permalink / raw)
  To: linux-arm-kernel



On 07/11/18 21:59, Will Deacon wrote:
> On Wed, Sep 26, 2018 at 02:56:21PM +0100, Julien Thierry wrote:
>> Avoiding stack alignment for already aligned stack can give a small
>> performance boost.
>>
>> Branch out of the aligning code when the stack is known to be aligned.
>>
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> ---
>>   arch/arm64/kernel/entry.S | 9 +++++++++
>>   1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index 8fb66e4..bd8d52c 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -61,6 +61,10 @@
>>   
>>   	.macro force_stack_align
>>   	xchg_sp	x0
>> +
>> +	tst	x0, #0xf
>> +	b.eq	0f
>> +
>>   	str	x1, [x0] // store x1 far away from S_SP
>>   
>>   	// aligned_sp[S_SP] = old_sp
>> @@ -69,6 +73,11 @@
>>   
>>   	ldr	x1, [x0]
>>   	bic	x0, x0, #0xf	// x0 = aligned_sp
>> +	b	1f
>> +
>> +0:
>> +	str	x0, [x0, #S_SP]
>> +1:
> 
> I couldn't figure out a better way to do this, but please remove the empty
> lines that you're adding here. With that:
> 

Sure, I'll do that.

> Acked-by: Will Deacon <will.deacon@arm.com>
> 

Thanks,

-- 
Julien Thierry


* Re: [PATCH 5/7] arm64: Do not apply BP hardening for hyp entries from EL2
  2018-11-07 21:59     ` Will Deacon
@ 2018-11-08 12:23       ` Julien Thierry
  -1 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 12:23 UTC (permalink / raw)
  To: Will Deacon
  Cc: marc.zyngier, catalin.marinas, kvmarm, Dave.Martin, linux-arm-kernel



On 07/11/18 21:59, Will Deacon wrote:
> On Wed, Sep 26, 2018 at 02:56:22PM +0100, Julien Thierry wrote:
>> When an EL2 entry of __kvm_hyp_vector is taken, it means an entry from a
>> lower EL was previously taken to exit the guest. Taking that lower EL entry
>> already applied BP hardening if it was needed, so there is no need to do
>> it again.
>>
>> Only apply BP hardening for exception coming from lower EL.
>>
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> Cc: kvmarm@lists.cs.columbia.edu
>> ---
>>   arch/arm64/kernel/cpu_errata.c | 10 ++++++++--
>>   1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
>> index dec1089..9db5ecc 100644
>> --- a/arch/arm64/kernel/cpu_errata.c
>> +++ b/arch/arm64/kernel/cpu_errata.c
>> @@ -103,10 +103,16 @@ static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
>>   	void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K);
>>   	int i;
>>
>> -	for (i = 0; i < SZ_2K; i += 0x80)
>> +	/*
>> +	 * Only overwrite hyp entries for exceptions from lower EL.
>> +	 * Exception vection vector is 2K bytes, first 1K bytes concern
>> +	 * exception from EL2 (EL2t, EL2h), last 1K concert lower exception
>> +	 * levels (ELx-64bits, ELx-32bits).
>> +	 */
>> +	for (i = SZ_1K; i < SZ_2K; i += 0x80)
>>   		memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
>>
>> -	__flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
>> +	__flush_icache_range((uintptr_t)dst + SZ_1K, (uintptr_t)dst + SZ_2K);
>>   }
> 
> I'd personally find this clearer if you did:
> 
> 	dst += SZ_1K;
> 
> before the for loop and with your comment above it. Then the for loop
> becomes:
> 
> 	for (i = 0; i < SZ_1K; i += 0x80)
> 
> and the range of the cache maintenance is [dst, dst + SZ_1K)
> 
> But I'll leave it up to Marc.
> 

I'd say I agree with your suggestion, so unless Marc finds it better in 
the current state, I'll make the change.
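
For reference, something like the following is roughly what I'd end up
with (just an untested sketch of the loop body, the rest of
__copy_hyp_vect_bpi staying as in the patch):

	/*
	 * Only overwrite hyp entries for exceptions from a lower EL,
	 * i.e. the second 1K half of the 2K vector.
	 */
	dst += SZ_1K;

	for (i = 0; i < SZ_1K; i += 0x80)
		memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);

	__flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_1K);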

Thanks,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 6/7] arm64: Do not apply vector harderning for hyp entries from EL2
  2018-11-07 21:59     ` Will Deacon
@ 2018-11-08 12:31       ` Julien Thierry
  -1 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 12:31 UTC (permalink / raw)
  To: Will Deacon
  Cc: marc.zyngier, catalin.marinas, kvmarm, Dave.Martin, linux-arm-kernel



On 07/11/18 21:59, Will Deacon wrote:
> On Wed, Sep 26, 2018 at 02:56:23PM +0100, Julien Thierry wrote:
>> When an EL2 entry of __kvm_hyp_vector is taken, it means an entry from a
>> lower EL was previously taken to exit the guest. Taking that lower EL entry
>> already applied vector hardening if needed, so there is no need to do it
>> again.
>>
>> Only apply vector hardening for exceptions coming from a lower EL.
>>
>> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
>> Cc: kvmarm@lists.cs.columbia.edu
>> ---
>>   arch/arm64/kvm/hyp/hyp-entry.S | 17 ++++++++++++++---
>>   1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
>> index 24b4fba..da31386 100644
>> --- a/arch/arm64/kvm/hyp/hyp-entry.S
>> +++ b/arch/arm64/kvm/hyp/hyp-entry.S
>> @@ -257,7 +257,15 @@ ENTRY(__kvm_hyp_vector)
>>   ENDPROC(__kvm_hyp_vector)
>>
>>   #ifdef CONFIG_KVM_INDIRECT_VECTORS
>> -.macro hyp_ventry
>> +.macro hyp_el2_ventry
>> +	.align 7
>> +1:	b	__kvm_hyp_vector + (1b - 0b)
>> +	.rept 31
>> +	nop
>> +	.endr
>> +.endm
>> +
>> +.macro hyp_el1_ventry
>>   	.align 7
>>   1:	.rept 27
>>   	nop
>> @@ -290,8 +298,11 @@ alternative_cb_end
> 
> I think it would be cleaner to take the EL as a macro parameter, but again,
> I defer to Marc and Christoffer.

I'd argue they don't have much in common here. Unless I put the branch 
for the el2 ventry after 27 nops like for the el1, we wouldn't gain much 
from putting the two in the same macro IMO.

I'll see what Marc and Christoffer think about it.

Thanks,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-07 21:58 ` Will Deacon
@ 2018-11-08 12:43   ` Julien Thierry
  0 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 12:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On 07/11/18 21:58, Will Deacon wrote:
> Hi Julien,
> 
> On Wed, Sep 26, 2018 at 02:56:17PM +0100, Julien Thierry wrote:
>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>> using it to access memory. When taking an exception, it is possible that
>> the context during which the exception occured had SP mis-aligned. The
>> entry code needs to make sure that the stack is aligned before using it to
>> save the context.
> 
> Do you know why we haven't had reports of this crashing? Is it because GCC
> tends to keep the SP aligned anyway, so we're getting away with it for the
> moment? Trying to work out whether this is a candidate for -stable.
> 

I think the most likely explanation is that GCC tends to keep the SP 
aligned anyway, yes.

I tried looking for specific options that could make this more likely, 
but all I could find was -mpreferred-stack-boundary, which is only 
available for x86, and -mstack-alignment, which is only provided by clang.

I couldn't find anything yet on the GCC arm64 side that would either 
guarantee an aligned stack or make it very likely.

I can try to investigate a bit more.

Thanks,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
                   ` (8 preceding siblings ...)
  2018-11-07 21:58 ` Will Deacon
@ 2018-11-08 13:04 ` Ard Biesheuvel
  2018-11-08 13:27   ` Julien Thierry
  2018-11-22 11:49 ` Julien Thierry
  10 siblings, 1 reply; 50+ messages in thread
From: Ard Biesheuvel @ 2018-11-08 13:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com> wrote:
> Hi,
>
> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
> using it to access memory. When taking an exception, it is possible that
> the context during which the exception occured had SP mis-aligned.

How is this possible? GCC clearly only manipulates the stack pointer
in 16 byte multiples, and so if we do the same in our asm code (which
I think we already do, given the lack of reports about this issue), is
this handling really necessary?


> The
> entry code needs to make sure that the stack is aligned before using it to
> save the context.
>
> This is only a concern when taking exception from an EL using the same
> SP_ELx as the handler. In other cases it can be assumed that the SP being
> picked up on exception entry is aligned under the condition that SP is
> always aligned when doing eret to an EL using a different SP.
>
> On Juno I see a runtime difference <1% for hackbench. If I do not include
> the fast path at EL1 (patch 4) I see a diff of 1-2%.
>
> For EL2 entries, a bit of clean up of stuff getting patched in the vector
> has been needed.
>
> Cheers,
>
> Julien
>
> -->
>
> Julien Thierry (7):
>   arm64: Add static check for pt_regs alignment
>   arm64: sdei: Always use sdei stack for sdei events
>   arm64: Align stack when taking exception from EL1
>   arm64: Add fast-path for stack alignment
>   arm64: Do not apply BP hardening for hyp entries from EL2
>   arm64: Do not apply vector harderning for hyp entries from EL2
>   arm64: kvm: Align stack for exception coming from EL2
>
>  arch/arm64/include/asm/assembler.h |  9 +++++
>  arch/arm64/include/asm/ptrace.h    |  2 +
>  arch/arm64/include/asm/sdei.h      |  2 -
>  arch/arm64/kernel/cpu_errata.c     | 10 ++++-
>  arch/arm64/kernel/entry.S          | 43 +++++++++++++++++++--
>  arch/arm64/kernel/sdei.c           | 23 ++++-------
>  arch/arm64/kvm/hyp/hyp-entry.S     | 78 +++++++++++++++++++++++++++++++-------
>  drivers/firmware/Kconfig           |  1 +
>  8 files changed, 132 insertions(+), 36 deletions(-)
>
> --
> 1.9.1
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 13:04 ` Ard Biesheuvel
@ 2018-11-08 13:27   ` Julien Thierry
  2018-11-08 14:10     ` Ard Biesheuvel
  0 siblings, 1 reply; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 13:27 UTC (permalink / raw)
  To: linux-arm-kernel



On 08/11/18 13:04, Ard Biesheuvel wrote:
> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com> wrote:
>> Hi,
>>
>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>> using it to access memory. When taking an exception, it is possible that
>> the context during which the exception occured had SP mis-aligned.
> 
> How is this possible? GCC clearly only manipulates the stack pointer
> in 16 byte multiples, and so if we do the same in our asm code (which
> I think we already do, given the lack of reports about this issue), is
> this handling really necessary?
> 

Is there anything that actually gives us that guarantee from GCC? I 
agree that currently it looks like aarch64-<...>-gcc only manipulates SP 
aligned to 16 bytes, but I don't know whether that is certain.

The series can be dropped if there is enough confidence that this won't 
happen.

Thanks,

> 
>> The
>> entry code needs to make sure that the stack is aligned before using it to
>> save the context.
>>
>> This is only a concern when taking exception from an EL using the same
>> SP_ELx as the handler. In other cases it can be assumed that the SP being
>> picked up on exception entry is aligned under the condition that SP is
>> always aligned when doing eret to an EL using a different SP.
>>
>> On Juno I see a runtime difference <1% for hackbench. If I do not include
>> the fast path at EL1 (patch 4) I see a diff of 1-2%.
>>
>> For EL2 entries, a bit of clean up of stuff getting patched in the vector
>> has been needed.
>>
>> Cheers,
>>
>> Julien
>>
>> -->
>>
>> Julien Thierry (7):
>>    arm64: Add static check for pt_regs alignment
>>    arm64: sdei: Always use sdei stack for sdei events
>>    arm64: Align stack when taking exception from EL1
>>    arm64: Add fast-path for stack alignment
>>    arm64: Do not apply BP hardening for hyp entries from EL2
>>    arm64: Do not apply vector harderning for hyp entries from EL2
>>    arm64: kvm: Align stack for exception coming from EL2
>>
>>   arch/arm64/include/asm/assembler.h |  9 +++++
>>   arch/arm64/include/asm/ptrace.h    |  2 +
>>   arch/arm64/include/asm/sdei.h      |  2 -
>>   arch/arm64/kernel/cpu_errata.c     | 10 ++++-
>>   arch/arm64/kernel/entry.S          | 43 +++++++++++++++++++--
>>   arch/arm64/kernel/sdei.c           | 23 ++++-------
>>   arch/arm64/kvm/hyp/hyp-entry.S     | 78 +++++++++++++++++++++++++++++++-------
>>   drivers/firmware/Kconfig           |  1 +
>>   8 files changed, 132 insertions(+), 36 deletions(-)
>>
>> --
>> 1.9.1
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 13:27   ` Julien Thierry
@ 2018-11-08 14:10     ` Ard Biesheuvel
  2018-11-08 14:19       ` Ramana Radhakrishnan
  0 siblings, 1 reply; 50+ messages in thread
From: Ard Biesheuvel @ 2018-11-08 14:10 UTC (permalink / raw)
  To: linux-arm-kernel

(+ Ramana)

On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
>
>
> On 08/11/18 13:04, Ard Biesheuvel wrote:
>>
>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>>> using it to access memory. When taking an exception, it is possible that
>>> the context during which the exception occured had SP mis-aligned.
>>
>>
>> How is this possible? GCC clearly only manipulates the stack pointer
>> in 16 byte multiples, and so if we do the same in our asm code (which
>> I think we already do, given the lack of reports about this issue), is
>> this handling really necessary?
>>
>
> Is there anything that actually gives us that guarantee from GCC? I agree
> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
> to 16 bytes, but I don't know whether that is certain.
>

I think we should get that clarified then. I don't think it makes
sense for GCC to have to reason about whether SP currently has a value
that permits dereferencing.

> The series can be dropped if there is enough confidence that this won't
> happen.
>
> Thanks,
>
>
>>
>>> The
>>> entry code needs to make sure that the stack is aligned before using it
>>> to
>>> save the context.
>>>
>>> This is only a concern when taking exception from an EL using the same
>>> SP_ELx as the handler. In other cases it can be assumed that the SP being
>>> picked up on exception entry is aligned under the condition that SP is
>>> always aligned when doing eret to an EL using a different SP.
>>>
>>> On Juno I see a runtime difference <1% for hackbench. If I do not include
>>> the fast path at EL1 (patch 4) I see a diff of 1-2%.
>>>
>>> For EL2 entries, a bit of clean up of stuff getting patched in the vector
>>> has been needed.
>>>
>>> Cheers,
>>>
>>> Julien
>>>
>>> -->
>>>
>>> Julien Thierry (7):
>>>    arm64: Add static check for pt_regs alignment
>>>    arm64: sdei: Always use sdei stack for sdei events
>>>    arm64: Align stack when taking exception from EL1
>>>    arm64: Add fast-path for stack alignment
>>>    arm64: Do not apply BP hardening for hyp entries from EL2
>>>    arm64: Do not apply vector harderning for hyp entries from EL2
>>>    arm64: kvm: Align stack for exception coming from EL2
>>>
>>>   arch/arm64/include/asm/assembler.h |  9 +++++
>>>   arch/arm64/include/asm/ptrace.h    |  2 +
>>>   arch/arm64/include/asm/sdei.h      |  2 -
>>>   arch/arm64/kernel/cpu_errata.c     | 10 ++++-
>>>   arch/arm64/kernel/entry.S          | 43 +++++++++++++++++++--
>>>   arch/arm64/kernel/sdei.c           | 23 ++++-------
>>>   arch/arm64/kvm/hyp/hyp-entry.S     | 78
>>> +++++++++++++++++++++++++++++++-------
>>>   drivers/firmware/Kconfig           |  1 +
>>>   8 files changed, 132 insertions(+), 36 deletions(-)
>>>
>>> --
>>> 1.9.1
>>>
>>> _______________________________________________
>>> linux-arm-kernel mailing list
>>> linux-arm-kernel at lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
>
> --
> Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 14:10     ` Ard Biesheuvel
@ 2018-11-08 14:19       ` Ramana Radhakrishnan
  2018-11-08 15:01         ` Julien Thierry
  2018-11-08 15:30         ` Dave Martin
  0 siblings, 2 replies; 50+ messages in thread
From: Ramana Radhakrishnan @ 2018-11-08 14:19 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/11/2018 14:10, Ard Biesheuvel wrote:
> (+ Ramana)
> 
> On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
>>
>>
>> On 08/11/18 13:04, Ard Biesheuvel wrote:
>>>
>>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>>>> using it to access memory. When taking an exception, it is possible that
>>>> the context during which the exception occured had SP mis-aligned.
>>>
>>>
>>> How is this possible? GCC clearly only manipulates the stack pointer
>>> in 16 byte multiples, and so if we do the same in our asm code (which
>>> I think we already do, given the lack of reports about this issue), is
>>> this handling really necessary?
>>>
>>
>> Is there anything that actually gives us that guarantee from GCC? I agree
>> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
>> to 16 bytes, but I don't know whether that is certain.
>>
> 
> I think we should get that clarified then. I don't think it makes
> sense for GCC to have to reason about whether SP currently has a value
> that permits dereferencing.

The ABI gives that guarantee.

http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf

<quote>

5.2.2.1 Universal stack constraints

<...>

Additionally, at any point at which memory is accessed
via SP, the hardware requires that SP mod 16 = 0.  The stack must be 
quad-word aligned

</end quote>

regards
Ramana

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 14:19       ` Ramana Radhakrishnan
@ 2018-11-08 15:01         ` Julien Thierry
  2018-11-08 15:33           ` Ramana Radhakrishnan
  2018-11-08 15:30         ` Dave Martin
  1 sibling, 1 reply; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 15:01 UTC (permalink / raw)
  To: linux-arm-kernel



On 08/11/18 14:19, Ramana Radhakrishnan wrote:
> On 08/11/2018 14:10, Ard Biesheuvel wrote:
>> (+ Ramana)
>>
>> On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
>>>
>>>
>>> On 08/11/18 13:04, Ard Biesheuvel wrote:
>>>>
>>>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>>>>> using it to access memory. When taking an exception, it is possible that
>>>>> the context during which the exception occured had SP mis-aligned.
>>>>
>>>>
>>>> How is this possible? GCC clearly only manipulates the stack pointer
>>>> in 16 byte multiples, and so if we do the same in our asm code (which
>>>> I think we already do, given the lack of reports about this issue), is
>>>> this handling really necessary?
>>>>
>>>
>>> Is there anything that actually gives us that guarantee from GCC? I agree
>>> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
>>> to 16 bytes, but I don't know whether that is certain.
>>>
>>
>> I think we should get that clarified then. I don't think it makes
>> sense for GCC to have to reason about whether SP currently has a value
>> that permits dereferencing.
> 
> The ABI gives that guarantee.
> 
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
> 
> <quote>
> 
> 5.2.2.1 Universal stack constraints
> 
> <...>
> 
> Additionally, at any point at which memory is accessed
> via SP, the hardware requires that SP mod 16 = 0.  The stack must be
> quad-word aligned
> 
> </end quote>
> 

Thanks Ramana. Sadly I don't think this clarifies things. The issue is 
that the guarantee is only that SP is aligned when we use it to access 
memory, which still allows for a scenario like:

-> SP is updated with a non-16-byte-aligned value (which is fine as long 
as the following code takes care to align it before accessing memory)
-> An IRQ is raised before SP gets aligned
-> The vector entry starts saving context on the misaligned stack
-> Misaligned SP exception

The only thing that would save us the trouble is a guarantee that GCC 
never modifies SP in such a way that SP is not 16-byte aligned.

Thanks,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 14:19       ` Ramana Radhakrishnan
  2018-11-08 15:01         ` Julien Thierry
@ 2018-11-08 15:30         ` Dave Martin
  2018-11-08 15:33           ` Ramana Radhakrishnan
  1 sibling, 1 reply; 50+ messages in thread
From: Dave Martin @ 2018-11-08 15:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 08, 2018 at 02:19:14PM +0000, Ramana Radhakrishnan wrote:
> On 08/11/2018 14:10, Ard Biesheuvel wrote:
> > (+ Ramana)
> > 
> > On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
> >>
> >>
> >> On 08/11/18 13:04, Ard Biesheuvel wrote:
> >>>
> >>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
> >>> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
> >>>> using it to access memory. When taking an exception, it is possible that
> >>>> the context during which the exception occured had SP mis-aligned.
> >>>
> >>>
> >>> How is this possible? GCC clearly only manipulates the stack pointer
> >>> in 16 byte multiples, and so if we do the same in our asm code (which
> >>> I think we already do, given the lack of reports about this issue), is
> >>> this handling really necessary?
> >>>
> >>
> >> Is there anything that actually gives us that guarantee from GCC? I agree
> >> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
> >> to 16 bytes, but I don't know whether that is certain.
> >>
> > 
> > I think we should get that clarified then. I don't think it makes
> > sense for GCC to have to reason about whether SP currently has a value
> > that permits dereferencing.
> 
> The ABI gives that guarantee.
> 
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
> 
> <quote>

Surely this only applies at public interfaces?

We may be re-entering the kernel due to an exception taken in the
middle of a function.

In theory, the compiler is allowed to temporarily misalign SP (i.e., the
procedure call standard doesn't forbid that).


So, we'd need to be confident that GCC and LLVM and any other compiler
we care about _guarantee_ never to do that.  (And that there is no asm
in the kernel that will do so.)

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 15:30         ` Dave Martin
@ 2018-11-08 15:33           ` Ramana Radhakrishnan
  2018-11-08 15:39             ` Dave Martin
  0 siblings, 1 reply; 50+ messages in thread
From: Ramana Radhakrishnan @ 2018-11-08 15:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/11/2018 15:30, Dave Martin wrote:
> On Thu, Nov 08, 2018 at 02:19:14PM +0000, Ramana Radhakrishnan wrote:
>> On 08/11/2018 14:10, Ard Biesheuvel wrote:
>>> (+ Ramana)
>>>
>>> On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
>>>>
>>>>
>>>> On 08/11/18 13:04, Ard Biesheuvel wrote:
>>>>>
>>>>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>>>>>> using it to access memory. When taking an exception, it is possible that
>>>>>> the context during which the exception occured had SP mis-aligned.
>>>>>
>>>>>
>>>>> How is this possible? GCC clearly only manipulates the stack pointer
>>>>> in 16 byte multiples, and so if we do the same in our asm code (which
>>>>> I think we already do, given the lack of reports about this issue), is
>>>>> this handling really necessary?
>>>>>
>>>>
>>>> Is there anything that actually gives us that guarantee from GCC? I agree
>>>> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
>>>> to 16 bytes, but I don't know whether that is certain.
>>>>
>>>
>>> I think we should get that clarified then. I don't think it makes
>>> sense for GCC to have to reason about whether SP currently has a value
>>> that permits dereferencing.
>>
>> The ABI gives that guarantee.
>>
>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
>>
>> <quote>
> 
> Surely This only applies at public interfaces?
> 

I don't think this has anything to do with public interfaces. If there 
is a trap because of a 16-byte-misaligned access through SP, then it 
doesn't matter whether it's a public interface or not.

regards
Ramana

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 15:01         ` Julien Thierry
@ 2018-11-08 15:33           ` Ramana Radhakrishnan
  2018-11-08 15:41             ` Julien Thierry
  0 siblings, 1 reply; 50+ messages in thread
From: Ramana Radhakrishnan @ 2018-11-08 15:33 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/11/2018 15:01, Julien Thierry wrote:
> 
> 
> On 08/11/18 14:19, Ramana Radhakrishnan wrote:
>> On 08/11/2018 14:10, Ard Biesheuvel wrote:
>>> (+ Ramana)
>>>
>>> On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
>>>>
>>>>
>>>> On 08/11/18 13:04, Ard Biesheuvel wrote:
>>>>>
>>>>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>>>>>> using it to access memory. When taking an exception, it is possible that
>>>>>> the context during which the exception occured had SP mis-aligned.
>>>>>
>>>>>
>>>>> How is this possible? GCC clearly only manipulates the stack pointer
>>>>> in 16 byte multiples, and so if we do the same in our asm code (which
>>>>> I think we already do, given the lack of reports about this issue), is
>>>>> this handling really necessary?
>>>>>
>>>>
>>>> Is there anything that actually gives us that guarantee from GCC? I agree
>>>> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
>>>> to 16 bytes, but I don't know whether that is certain.
>>>>
>>>
>>> I think we should get that clarified then. I don't think it makes
>>> sense for GCC to have to reason about whether SP currently has a value
>>> that permits dereferencing.
>>
>> The ABI gives that guarantee.
>>
>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
>>
>> <quote>
>>
>> 5.2.2.1 Universal stack constraints
>>
>> <...>
>>
>> Additionally, at any point at which memory is accessed
>> via SP, the hardware requires that SP mod 16 = 0.  The stack must be
>> quad-word aligned
>>
>> </end quote>
>>
> 
> Thanks Ramana. Sadly I don't think this clarifies things. The issue is
> that the guarantee is only that SP is aligned when we will use it to
> access memory, but still allows for a scenario like:


That is certainly a correct interpretation of the ABI language.

> 
> -> Updating SP with non 16bytes-aligned value (it's fine as long as the
> following code takes care to align it before accessing memory)
> -> IRQ is raised before SP gets aligned
> -> Vector entry starts saving context on misaligned stack
> -> Misaligned SP exception
> 
> The only thing that would avoid us the trouble is a guarantee that GCC
> never modifies SP in such a way that SP is not 16-bytes aligned.


I think it sort of falls out of the implementation, from what I remember 
and see (and have empirically checked with a couple of colleagues).

I don't think there is an ABI requirement that SP should be left 16-byte 
aligned even for intermediate calculations.

Whether GCC does this or not today is immaterial, and for userland it's 
certainly not the only code generator in town for this sort of question. 
The question also arises with other JITs and code generators, which may 
leave the stack temporarily not aligned at 16 bytes. So it does sound 
like the right thing to do in the kernel, defensively.

regards
Ramana
> 
> Thanks,
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 15:33           ` Ramana Radhakrishnan
@ 2018-11-08 15:39             ` Dave Martin
  2018-11-08 15:44               ` Ard Biesheuvel
  0 siblings, 1 reply; 50+ messages in thread
From: Dave Martin @ 2018-11-08 15:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 08, 2018 at 03:33:01PM +0000, Ramana Radhakrishnan wrote:
> On 08/11/2018 15:30, Dave Martin wrote:
> > On Thu, Nov 08, 2018 at 02:19:14PM +0000, Ramana Radhakrishnan wrote:
> >> On 08/11/2018 14:10, Ard Biesheuvel wrote:
> >>> (+ Ramana)
> >>>
> >>> On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
> >>>>
> >>>>
> >>>> On 08/11/18 13:04, Ard Biesheuvel wrote:
> >>>>>
> >>>>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
> >>>>>> using it to access memory. When taking an exception, it is possible that
> >>>>>> the context during which the exception occured had SP mis-aligned.
> >>>>>
> >>>>>
> >>>>> How is this possible? GCC clearly only manipulates the stack pointer
> >>>>> in 16 byte multiples, and so if we do the same in our asm code (which
> >>>>> I think we already do, given the lack of reports about this issue), is
> >>>>> this handling really necessary?
> >>>>>
> >>>>
> >>>> Is there anything that actually gives us that guarantee from GCC? I agree
> >>>> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
> >>>> to 16 bytes, but I don't know whether that is certain.
> >>>>
> >>>
> >>> I think we should get that clarified then. I don't think it makes
> >>> sense for GCC to have to reason about whether SP currently has a value
> >>> that permits dereferencing.
> >>
> >> The ABI gives that guarantee.
> >>
> >> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
> >>
> >> <quote>
> > 
> > Surely This only applies at public interfaces?
> > 
> 
> I don't think this has anything to do with public interfaces. If there 
> is a trap with a 16byte misaligned access of the SP then it doesn't 
> matter whether it's a public interface or not.

We're not talking about SP alignment faults here particularly.

We're talking about any exception that may be taken from EL1h to EL1h,
which may happen on random instructions inside a function, for random
reasons.

There was talk about running the kernel mostly in EL1t but I don't think
we currently do this (somebody please put me right if I'm wrong here!)

Cheers
---Dave

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 15:33           ` Ramana Radhakrishnan
@ 2018-11-08 15:41             ` Julien Thierry
  0 siblings, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-11-08 15:41 UTC (permalink / raw)
  To: linux-arm-kernel



On 08/11/18 15:33, Ramana Radhakrishnan wrote:
> On 08/11/2018 15:01, Julien Thierry wrote:
>>
>>
>> On 08/11/18 14:19, Ramana Radhakrishnan wrote:
>>> On 08/11/2018 14:10, Ard Biesheuvel wrote:
>>>> (+ Ramana)
>>>>
>>>> On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
>>>>>
>>>>>
>>>>> On 08/11/18 13:04, Ard Biesheuvel wrote:
>>>>>>
>>>>>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>>>>>>> using it to access memory. When taking an exception, it is possible that
>>>>>>> the context during which the exception occured had SP mis-aligned.
>>>>>>
>>>>>>
>>>>>> How is this possible? GCC clearly only manipulates the stack pointer
>>>>>> in 16 byte multiples, and so if we do the same in our asm code (which
>>>>>> I think we already do, given the lack of reports about this issue), is
>>>>>> this handling really necessary?
>>>>>>
>>>>>
>>>>> Is there anything that actually gives us that guarantee from GCC? I agree
>>>>> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
>>>>> to 16 bytes, but I don't know whether that is certain.
>>>>>
>>>>
>>>> I think we should get that clarified then. I don't think it makes
>>>> sense for GCC to have to reason about whether SP currently has a value
>>>> that permits dereferencing.
>>>
>>> The ABI gives that guarantee.
>>>
>>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
>>>
>>> <quote>
>>>
>>> 5.2.2.1 Universal stack constraints
>>>
>>> <...>
>>>
>>> Additionally, at any point at which memory is accessed
>>> via SP, the hardware requires that SP mod 16 = 0.  The stack must be
>>> quad-word aligned
>>>
>>> </end quote>
>>>
>>
>> Thanks Ramana. Sadly I don't think this clarifies things. The issue is
>> that the guarantee is only that SP is aligned when we will use it to
>> access memory, but still allows for a scenario like:
> 
> 
> That is certainly a correct interpretation of the ABI language.
> 
>>
>> -> Updating SP with non 16bytes-aligned value (it's fine as long as the
>> following code takes care to align it before accessing memory)
>> -> IRQ is raised before SP gets aligned
>> -> Vector entry starts saving context on misaligned stack
>> -> Misaligned SP exception
>>
>> The only thing that would avoid us the trouble is a guarantee that GCC
>> never modifies SP in such a way that SP is not 16-bytes aligned.
> 
> 
> I think it sort of falls out in the implementation from what I remember
> and see (and empirically checked with a couple of colleagues).
> 
> I don't think there is an ABI requirement that SP should left 16 byte
> aligned even for intermediate calculations.
> 
> Whether GCC does this or not today is immaterial and for userland it's
> certainly not the only code generator in town for this sort of question.
> The question also arises with other jits and other code generators which
> may leave the stack temporarily not aligned at 16 bytes. So it does
> sound like the right thing to do in the kernel defensively.
> 

I don't think we really mind about userland. EL1 can (and, in Linux, 
does) use its own SP. So having an interrupt while EL0 has a misaligned 
SP is not an issue for the kernel.

The cases covered by this series are only for exceptions received at the 
same EL that will handle that exception.

Cheers,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 15:39             ` Dave Martin
@ 2018-11-08 15:44               ` Ard Biesheuvel
  2018-11-08 16:57                 ` Ramana Radhakrishnan
  0 siblings, 1 reply; 50+ messages in thread
From: Ard Biesheuvel @ 2018-11-08 15:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 November 2018 at 16:39, Dave Martin <Dave.Martin@arm.com> wrote:
> On Thu, Nov 08, 2018 at 03:33:01PM +0000, Ramana Radhakrishnan wrote:
>> On 08/11/2018 15:30, Dave Martin wrote:
>> > On Thu, Nov 08, 2018 at 02:19:14PM +0000, Ramana Radhakrishnan wrote:
>> >> On 08/11/2018 14:10, Ard Biesheuvel wrote:
>> >>> (+ Ramana)
>> >>>
>> >>> On 8 November 2018 at 14:27, Julien Thierry <julien.thierry@arm.com> wrote:
>> >>>>
>> >>>>
>> >>>> On 08/11/18 13:04, Ard Biesheuvel wrote:
>> >>>>>
>> >>>>> On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com>
>> >>>>> wrote:
>> >>>>>>
>> >>>>>> Hi,
>> >>>>>>
>> >>>>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>> >>>>>> using it to access memory. When taking an exception, it is possible that
>> >>>>>> the context during which the exception occured had SP mis-aligned.
>> >>>>>
>> >>>>>
>> >>>>> How is this possible? GCC clearly only manipulates the stack pointer
>> >>>>> in 16 byte multiples, and so if we do the same in our asm code (which
>> >>>>> I think we already do, given the lack of reports about this issue), is
>> >>>>> this handling really necessary?
>> >>>>>
>> >>>>
>> >>>> Is there anything that actually gives us that guarantee from GCC? I agree
>> >>>> that currently it looks like aarch64-<...>-gcc only manipulates SP aligned
>> >>>> to 16 bytes, but I don't know whether that is certain.
>> >>>>
>> >>>
>> >>> I think we should get that clarified then. I don't think it makes
>> >>> sense for GCC to have to reason about whether SP currently has a value
>> >>> that permits dereferencing.
>> >>
>> >> The ABI gives that guarantee.
>> >>
>> >> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf
>> >>
>> >> <quote>
>> >
>> > Surely This only applies at public interfaces?
>> >
>>
>> I don't think this has anything to do with public interfaces. If there
>> is a trap with a 16byte misaligned access of the SP then it doesn't
>> matter whether it's a public interface or not.
>
> We're not talking about SP alignment faults here particularly.
>
> We're talking about any exception that may be taken from EL1h to EL1h,
> which may happen on random instructions inside a function, for random
> reasons.
>

Indeed.

But my question remains how likely it is that the compiler we use for
generating the kernel code (so we are not talking about userland JITs
or other crazy stuff here) would play with SP like that, especially
since it is no longer a general purpose register. To me, it would make
sense to attempt to reach an agreement with the GCC folks that
compiler generated code does not muck about with SP like that.

In asm, there is one notable exception, of course, where we swap x0
and sp temporarily in the entry path, but we can't get interrupted
there anyway.

In any case, if people insist, I'm ok with it. To me, it just seems
like unnecessary clutter for a theoretical issue.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 1/7] arm64: Add static check for pt_regs alignment
  2018-11-08 11:57       ` Julien Thierry
@ 2018-11-08 16:53         ` Will Deacon
  0 siblings, 0 replies; 50+ messages in thread
From: Will Deacon @ 2018-11-08 16:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 08, 2018 at 11:57:27AM +0000, Julien Thierry wrote:
> 
> 
> On 08/11/18 10:16, Mark Rutland wrote:
> > On Wed, Nov 07, 2018 at 09:58:35PM +0000, Will Deacon wrote:
> > > On Wed, Sep 26, 2018 at 02:56:18PM +0100, Julien Thierry wrote:
> > > > The pt_regs structure's size is required to be a multiple of 16 to maintain
> > > > the stack alignment upon kernel entries.
> > > > 
> > > > Add a static check to ensure this is the case.
> > > > 
> > > > Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> > > > ---
> > > >   arch/arm64/include/asm/ptrace.h | 2 ++
> > > >   1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> > > > index 177b851..cc17fad 100644
> > > > --- a/arch/arm64/include/asm/ptrace.h
> > > > +++ b/arch/arm64/include/asm/ptrace.h
> > > > @@ -229,6 +229,8 @@ static inline u64 regs_get_register(struct pt_regs *regs, unsigned int offset)
> > > >   	WARN_ON(offset & 7);
> > > > +	BUILD_BUG_ON((sizeof (struct pt_regs) % 16 != 0));
> > > 
> > > This seems like a fairly random place to put this, and given that it's not
> > > dependent on context I reckon it would be better to put it in a C file
> > > rather than in a header that gets included multiple times. I still couldn't
> > > really find a good place though :/
> > 
> > Perhaps asm-offsets.c, when we define S_FRAME_SIZE?
> > 
> 
> That sounds like a good option to me. If Will thinks so as well I'll move
> that in asm-offsets.c.

Fine by me.
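
FWIW, I'd expect it to end up looking something like this next to the
S_FRAME_SIZE definition (just a sketch, exact placement in asm-offsets.c
up to you):

	DEFINE(S_FRAME_SIZE,	sizeof(struct pt_regs));
	BUILD_BUG_ON(sizeof(struct pt_regs) % 16 != 0);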

Will

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 15:44               ` Ard Biesheuvel
@ 2018-11-08 16:57                 ` Ramana Radhakrishnan
  2018-11-08 17:14                   ` Ard Biesheuvel
  0 siblings, 1 reply; 50+ messages in thread
From: Ramana Radhakrishnan @ 2018-11-08 16:57 UTC (permalink / raw)
  To: linux-arm-kernel


> 
> Indeed.
> 
> But my question remains how likely it is that the compiler we use for
> generating the kernel code (so we are not talking about userland JITs
> or other crazy stuff here) would play with SP like that, especially
> since it is no longer a general purpose register. To me, it would make
> sense to attempt to reach an agreement with the GCC folks that
> compiler generated code does not muck about with SP like that.


It falls out from the choice of instructions we use for this sort of 
thing and because frame sizes really get rounded up to 16 bytes.

In the explanation below assume the stack is aligned to 16 bytes on 
entry. Frame sizes are always a multiple of 16.

For frame sizes up to 240 bytes that's a stp fp, lr, [sp, #-<off>]!

For frame sizes up to 4k bytes, that's a single sub instruction. Again 
all frame sizes are 16 byte aligned and therefore it's all ok.

For frame sizes greater than this and up to 64k, that's a mov to a 
temporary register with a 16-bit immediate followed by a single 
subtract, again not going to leave your stack frame misaligned.

For frame sizes above that, we split this into a subtract of a multiple 
of 4k (again 16-byte aligned) and the rest.

For frame sizes above 16MB we use a sequence of mov / movk's with a 
temporary register and then do a single subtract from SP, and therefore 
that's also fine.
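
As a concrete illustration (C purely for illustration; the exact code
generation of course depends on compiler version and options), both of
the functions below end up with SP only ever being adjusted by a
multiple of 16:

extern void use(char *p);

/*
 * Small frame: typically a single pre-indexed store such as
 * "stp x29, x30, [sp, #-32]!", i.e. one SP adjustment by a multiple
 * of 16.
 */
void small_frame(void)
{
	char buf[16];

	use(buf);
}

/*
 * Multi-KB frame: per the above, either a single sub or the frame
 * size moved into a scratch register followed by one subtract; either
 * way the frame size is rounded up to a multiple of 16.
 */
void large_frame(void)
{
	char buf[5000];

	use(buf);
}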

For the sake of this conversation with respect to compiling the kernel, 
I think we can safely say that GCC will end up leaving the stack 16-byte 
aligned even in the intermediate computations, and that this shouldn't 
be an issue from my reading of the AArch64 backend.

So, probably move on, but if you really want to be defensive you may want 
to carry this patch. However, there's nothing I can say about 
hand-written assembly (and I say that in the interest of completeness).

regards
Ramana

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-08 16:57                 ` Ramana Radhakrishnan
@ 2018-11-08 17:14                   ` Ard Biesheuvel
  0 siblings, 0 replies; 50+ messages in thread
From: Ard Biesheuvel @ 2018-11-08 17:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 8 November 2018 at 17:57, Ramana Radhakrishnan
<Ramana.Radhakrishnan@arm.com> wrote:
>
>>
>> Indeed.
>>
>> But my question remains how likely it is that the compiler we use for
>> generating the kernel code (so we are not talking about userland JITs
>> or other crazy stuff here) would play with SP like that, especially
>> since it is no longer a general purpose register. To me, it would make
>> sense to attempt to reach an agreement with the GCC folks that
>> compiler generated code does not muck about with SP like that.
>
>
> It falls out from the choice of instructions we use for this sort of
> thing and because frame sizes really get rounded up to 16 bytes.
>
> In the explanation below assume the stack is aligned to 16 bytes on
> entry. Frame sizes are always a multiple of 16.
>
> For frame sizes up to 240 bytes that's a stp fp, lr, [sp, #-<off>]!
>
> For frame sizes up to 4k bytes, that's a single sub instruction. Again
> all frame sizes are 16 byte aligned and therefore it's all ok.
>
> For frame sizes greater than this and up to 64k that's a mov to a
> temporary register with a 16 bit immediate followed by a single
> subtract, again not going to leave your stack frame misaligned.
>
>  From frame sizes above that we split this into a subtract with a
> multiple of 4k (again 16 byte aligned) and the rest.
>
> For frame size above 16MB we use a sequence of mov / movk's with a
> temporary register and then do a single subtract of SP and therefore
> that's also fine.
>

Thanks Ramana. In the kernel, stack frames are typically limited to ~1
KB in size, so most of these will never even be encountered in our
case.

> I think for the sake of this conversation with respect to compiling the
> kernel, I think we can safely say that GCC will end up leaving the stack
> 16 byte aligned even with the intermediate computations and that
> shouldn't be an issue from my reading of the AArch64 backend.
>
> So, probably move on but if you really want to be defensive you may want
> to carry this patch. However there's nothing that I will say about
> hand-written assembly (and I say that in the interest of completeness).
>

Indeed, hand-written assembly is an entirely different matter, but
that is a manageable quantity of code which is currently clean in this
regard, as far as we know.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 1/7] arm64: Add static check for pt_regs alignment
  2018-09-26 13:56 ` [PATCH 1/7] arm64: Add static check for pt_regs alignment Julien Thierry
  2018-11-07 21:58   ` Will Deacon
@ 2018-11-08 18:35   ` Ard Biesheuvel
  1 sibling, 0 replies; 50+ messages in thread
From: Ard Biesheuvel @ 2018-11-08 18:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 26 September 2018 at 15:56, Julien Thierry <julien.thierry@arm.com> wrote:
> The pt_regs structure's size is required to be a multiple of 16 to maintain
> the stack alignment upon kernel entries.
>
> Add a static check to ensure this is the case.
>
> Signed-off-by: Julien Thierry <julien.thierry@arm.com>
> ---
>  arch/arm64/include/asm/ptrace.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> index 177b851..cc17fad 100644
> --- a/arch/arm64/include/asm/ptrace.h
> +++ b/arch/arm64/include/asm/ptrace.h
> @@ -229,6 +229,8 @@ static inline u64 regs_get_register(struct pt_regs *regs, unsigned int offset)
>
>         WARN_ON(offset & 7);
>
> +       BUILD_BUG_ON((sizeof (struct pt_regs) % 16 != 0));

The parens inside BUILD_BUG_ON() are redundant (unless you move the
closing one right after '16', which I would prefer)
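
i.e. either of the following, the latter being what I'd prefer:

	BUILD_BUG_ON(sizeof(struct pt_regs) % 16 != 0);
	BUILD_BUG_ON((sizeof(struct pt_regs) % 16) != 0);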

> +
>         offset >>= 3;
>         switch (offset) {
>         case 0 ... 30:
> --
> 1.9.1
>

Wherever you manage to put this:

Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events
  2018-11-07 21:58   ` Will Deacon
@ 2018-11-21 11:15     ` James Morse
  0 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2018-11-21 11:15 UTC (permalink / raw)
  To: linux-arm-kernel

Hi guys,

On 07/11/2018 21:58, Will Deacon wrote:
> On Wed, Sep 26, 2018 at 02:56:19PM +0100, Julien Thierry wrote:
>> SDEI events can occur at any point, including when the stack pointer is
>> not aligned. SP could be modified to respect 16-byte alignment, but there
>> is a need to deal with code using SP as a scratch register.
>>
>> Always reserve the SDEI stacks to handle SDEI events.
>>
>> Since stack allocation relies on VMAPed stacks, let's make SDEI depend on
>> VMAP_STACK.

> One side-effect of this change is that SDEI cannot be enabled in conjunction
> with KASAN. James -- are you ok with that?

So it does. I don't think this matters; the only SDEI users in the short 
term can be exercised by other notifications.
I mean to make the stacks for this thing dynamically allocated (so it's 
less of a memory hog), and I can fix this KASAN hole at the same time.
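
Roughly what I have in mind is something along these lines (sketch only;
the names and the allocator are placeholders, assuming the usual percpu
and vmalloc includes):

static DEFINE_PER_CPU(unsigned long *, sdei_stack_ptr);

static int init_sdei_stacks(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		/* One dynamically allocated stack per possible CPU. */
		unsigned long *s = vmalloc(SDEI_STACK_SIZE);

		if (!s)
			return -ENOMEM;
		per_cpu(sdei_stack_ptr, cpu) = s;
	}

	return 0;
}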


Thanks,

James

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
                   ` (9 preceding siblings ...)
  2018-11-08 13:04 ` Ard Biesheuvel
@ 2018-11-22 11:49 ` Julien Thierry
  2018-11-22 12:11   ` Ard Biesheuvel
  10 siblings, 1 reply; 50+ messages in thread
From: Julien Thierry @ 2018-11-22 11:49 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 26/09/18 14:56, Julien Thierry wrote:
> Hi,
> 
> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
> using it to access memory. When taking an exception, it is possible that
> the context during which the exception occured had SP mis-aligned. The
> entry code needs to make sure that the stack is aligned before using it to
> save the context.
> 

So, as discussed in this thread:
https://www.spinics.net/lists/arm-kernel/msg688342.html

We might have the compiler provide the guarantee that SP is kept at 
multiples of 16-bytes throughout C functions. If this is accepted, it 
avoids the complexity of this series.

There is just one case remaining, which is less important as it is 
mostly a debug concern. If we take an SP alignment fault from EL1 (due 
to bad implementation) we take recursive exceptions:

- Without CONFIG_VMAP_STACK, the kernel just freezes, endlessly taking SP 
alignment faults
- With CONFIG_VMAP_STACK, after taking a number of exceptions we 
overflow the stack and the kernel switches to the overflow stack, where 
it finally manages to display a kernel panic message

So the question is, do we care about doing something for the above case?

Thanks,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-22 11:49 ` Julien Thierry
@ 2018-11-22 12:11   ` Ard Biesheuvel
  2018-11-22 15:10     ` Julien Thierry
  2018-11-22 15:12     ` Will Deacon
  0 siblings, 2 replies; 50+ messages in thread
From: Ard Biesheuvel @ 2018-11-22 12:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 22 Nov 2018 at 12:49, Julien Thierry <julien.thierry@arm.com> wrote:
>
> Hi,
>
> On 26/09/18 14:56, Julien Thierry wrote:
> > Hi,
> >
> > Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
> > using it to access memory. When taking an exception, it is possible that
> > the context during which the exception occured had SP mis-aligned. The
> > entry code needs to make sure that the stack is aligned before using it to
> > save the context.
> >
>
> So, as discussed in this thread:
> https://www.spinics.net/lists/arm-kernel/msg688342.html
>
> We might have the compiler provide the guarantee that SP is kept at
> multiples of 16-bytes throughout C functions. If this is accepted, it
> avoids the complexity of this series.
>

Yes, but we still want the build time sizeof(pt_regs) check and
perhaps other pieces of this series.

> There is just one case remaining, which is less important as it is
> mostly a debug concern. If we take an SP alignment fault from EL1 (due
> to bad implementation) we take recursive exceptions:
>
> - Without CONFIG_VMAP_STACK, kernel just freezes always taking SP
> alignment faults
> - With CONFIG_VMAP_STACK after taking a number of exceptions, we
> overflow the stack and the kernel switches to the overflow stack where
> it finally manages to display a kernel panic message
>
> So the question is, do we care about doing something for the above case?
>

So in the latter case, it is obvious from the debug output that the
stack pointer was misaligned? I'd expect so, given that each
recursively taken exception has the same cause, and so it would be
clear what happened.

So I'm leaning towards just relying on that, given that
CONFIG_VMAP_STACK is the default, although it is unfortunate that
KASAN still disables it.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-22 12:11   ` Ard Biesheuvel
@ 2018-11-22 15:10     ` Julien Thierry
  2018-11-22 15:12     ` Will Deacon
  1 sibling, 0 replies; 50+ messages in thread
From: Julien Thierry @ 2018-11-22 15:10 UTC (permalink / raw)
  To: linux-arm-kernel



On 22/11/18 12:11, Ard Biesheuvel wrote:
> On Thu, 22 Nov 2018 at 12:49, Julien Thierry <julien.thierry@arm.com> wrote:
>>
>> Hi,
>>
>> On 26/09/18 14:56, Julien Thierry wrote:
>>> Hi,
>>>
>>> Having SCTLR_ELx.SA enabled requires the SP to be 16-bytes aligned before
>>> using it to access memory. When taking an exception, it is possible that
>>> the context during which the exception occured had SP mis-aligned. The
>>> entry code needs to make sure that the stack is aligned before using it to
>>> save the context.
>>>
>>
>> So, as discussed in this thread:
>> https://www.spinics.net/lists/arm-kernel/msg688342.html
>>
>> We might have the compiler provide the guarantee that SP is kept at
>> multiples of 16-bytes throughout C functions. If this is accepted, it
>> avoids the complexity of this series.
>>
> 
> Yes, but we still want the build time sizeof(pt_regs) check and
> perhaps other pieces of this series.
> 

Good point.

>> There is just one case remaining, which is less important as it is
>> mostly a debug concern. If we take an SP alignment fault from EL1 (due
>> to bad implementation) we take recursive exceptions:
>>
>> - Without CONFIG_VMAP_STACK, kernel just freezes always taking SP
>> alignment faults
>> - With CONFIG_VMAP_STACK after taking a number of exceptions, we
>> overflow the stack and the kernel switches to the overflow stack where
>> it finally manages to display a kernel panic message
>>
>> So the question is, do we care about doing something for the above case?
>>
> 
> So in the latter case, it is obvious from the debug output that the
> stack pointer was misaligned? I'd expect so, given that each
> recursively taken exception has the same cause, and so it would be
> clear what happened.
> 

So, it is obvious that the error is a misaligned stack pointer, as it is 
described in the ESR and, as you said, we are always taking the same kind 
of exception.

However, the ELR will get overwritten by the recursive exception, and its 
last value will be the first instruction accessing memory through SP in 
the exception handler. This means we lose the address where the first 
misaligned stack access happened.

But then I don't know whether this case warrants adding complexity to 
the entry code just in case someone one day fiddles around with the SP 
and does not align it to 16 bytes before accessing memory (or does so 
while interrupts are enabled).

> So I'm leaning towards just relying on that, given that
> CONFIG_VMAP_STACK is the default, although it is unfortunate that
> KASAN still disables it.

Thanks,

-- 
Julien Thierry

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 0/7] Ensure stack is aligned for kernel entries
  2018-11-22 12:11   ` Ard Biesheuvel
  2018-11-22 15:10     ` Julien Thierry
@ 2018-11-22 15:12     ` Will Deacon
  1 sibling, 0 replies; 50+ messages in thread
From: Will Deacon @ 2018-11-22 15:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 22, 2018 at 01:11:43PM +0100, Ard Biesheuvel wrote:
> On Thu, 22 Nov 2018 at 12:49, Julien Thierry <julien.thierry@arm.com> wrote:
> > On 26/09/18 14:56, Julien Thierry wrote:
> > There is just one case remaining, which is less important as it is
> > mostly a debug concern. If we take an SP alignment fault from EL1 (due
> > to bad implementation) we take recursive exceptions:
> >
> > - Without CONFIG_VMAP_STACK, kernel just freezes always taking SP
> > alignment faults
> > - With CONFIG_VMAP_STACK after taking a number of exceptions, we
> > overflow the stack and the kernel switches to the overflow stack where
> > it finally manages to display a kernel panic message
> >
> > So the question is, do we care about doing something for the above case?
> >
> 
> So in the latter case, it is obvious from the debug output that the
> stack pointer was misaligned? I'd expect so, given that each
> recursively taken exception has the same cause, and so it would be
> clear what happened.
> 
> So I'm leaning towards just relying on that, given that
> CONFIG_VMAP_STACK is the default, although it is unfortunate that
> KASAN still disables it.

Whilst it's unfortunate, I don't think it's the end of the world as long
as KASAN remains a debug-only feature. In that case, you'd only enable it
if you were debugging a suspected use-after-free, and I think the splat
you'd see from the overflow is clear enough that it's not one of those.

However, before we drop the meat of this series, I really would like a
written confirmation from GCC (i.e. in some documentation) that generated
code won't misalign the SP.

Will

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2018-11-22 15:12 UTC | newest]

Thread overview: 50+ messages
2018-09-26 13:56 [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
2018-09-26 13:56 ` [PATCH 1/7] arm64: Add static check for pt_regs alignment Julien Thierry
2018-11-07 21:58   ` Will Deacon
2018-11-08 10:16     ` Mark Rutland
2018-11-08 11:57       ` Julien Thierry
2018-11-08 16:53         ` Will Deacon
2018-11-08 18:35   ` Ard Biesheuvel
2018-09-26 13:56 ` [PATCH 2/7] arm64: sdei: Always use sdei stack for sdei events Julien Thierry
2018-11-07 21:58   ` Will Deacon
2018-11-21 11:15     ` James Morse
2018-09-26 13:56 ` [PATCH 3/7] arm64: Align stack when taking exception from EL1 Julien Thierry
2018-11-07 21:59   ` Will Deacon
2018-11-08 12:20     ` Julien Thierry
2018-09-26 13:56 ` [PATCH 4/7] arm64: Add fast-path for stack alignment Julien Thierry
2018-11-07 21:59   ` Will Deacon
2018-11-08 12:21     ` Julien Thierry
2018-09-26 13:56 ` [PATCH 5/7] arm64: Do not apply BP hardening for hyp entries from EL2 Julien Thierry
2018-09-26 13:56   ` Julien Thierry
2018-11-07 21:59   ` Will Deacon
2018-11-07 21:59     ` Will Deacon
2018-11-08 12:23     ` Julien Thierry
2018-11-08 12:23       ` Julien Thierry
2018-09-26 13:56 ` [PATCH 6/7] arm64: Do not apply vector harderning " Julien Thierry
2018-09-26 13:56   ` Julien Thierry
2018-11-07 21:59   ` Will Deacon
2018-11-07 21:59     ` Will Deacon
2018-11-08 12:31     ` Julien Thierry
2018-11-08 12:31       ` Julien Thierry
2018-09-26 13:56 ` [PATCH 7/7] arm64: kvm: Align stack for exception coming " Julien Thierry
2018-09-26 13:56   ` Julien Thierry
2018-10-19  8:07 ` [PATCH 0/7] Ensure stack is aligned for kernel entries Julien Thierry
2018-11-07 21:58 ` Will Deacon
2018-11-08 12:43   ` Julien Thierry
2018-11-08 13:04 ` Ard Biesheuvel
2018-11-08 13:27   ` Julien Thierry
2018-11-08 14:10     ` Ard Biesheuvel
2018-11-08 14:19       ` Ramana Radhakrishnan
2018-11-08 15:01         ` Julien Thierry
2018-11-08 15:33           ` Ramana Radhakrishnan
2018-11-08 15:41             ` Julien Thierry
2018-11-08 15:30         ` Dave Martin
2018-11-08 15:33           ` Ramana Radhakrishnan
2018-11-08 15:39             ` Dave Martin
2018-11-08 15:44               ` Ard Biesheuvel
2018-11-08 16:57                 ` Ramana Radhakrishnan
2018-11-08 17:14                   ` Ard Biesheuvel
2018-11-22 11:49 ` Julien Thierry
2018-11-22 12:11   ` Ard Biesheuvel
2018-11-22 15:10     ` Julien Thierry
2018-11-22 15:12     ` Will Deacon
