* [PATCH v2 00/16] SError rework + v8.2 RAS and IESB cpufeature support
@ 2017-07-28 14:10 ` James Morse
  0 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

Hello,

This series reworks the exception masking so that SError is unmasked ~all the
time, and adds the RAS and IESB cpufeatures.

The major change from v1 is that the priority order for DAIF exceptions
after the SError rework is different, due to IESB.

The SError rework is needed so that an SError pending at the esb in
__switch_to() is delivered as an exception, instead of being 'deferred',
which requires a sysreg read to check.
The esb cost should be small compared to the dsb(ish) immediately before.
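
As a sketch of the idea (not taken from the patch; the esb() helper name
and its exact placement here are assumptions based on the description
above):

/*
 * Sketch only: ESB is encoded as "hint #16" and behaves as a NOP on
 * CPUs without the RAS Extensions.
 */
#define esb()	asm volatile("hint #16" : : : "memory")

static void sketch_switch_point(void)
{
	dsb(ish);	/* the existing barrier in __switch_to() */
	esb();		/* with SError unmasked, a pending error is taken
			 * as an exception here rather than deferred */
}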

Systems with the RAS Extensions[0] are likely to be servers with APEI's
firmware-first support. For these any SError will be taken to EL3, and Linux
will be notified by some other mechanism... we will never get a physical SError
and don't need to check DISR_EL1 on these systems.
But, if we don't handle RAS SErrors and check DISR_EL1, systems with the RAS
Extensions but no APEI firmware-first will lose SErrors. This series adds the
minimum handling for non-APEI systems.

The ESR AET 'severity' field has 'corrected' and not-yet-consumed values;
we ignore RAS SErrors with these severity values, and panic() for
everything else.
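
A minimal sketch of that policy (the AET shift and values below are
written out locally for illustration, they are not the series' esr.h
additions; the function name is made up):

/* AET is the 'severity' field in the SError ISS (ESR bits [12:10]). */
#define AET_SHIFT		10
#define AET_MASK		(0x7U << AET_SHIFT)
#define AET_CORRECTED		(0x6U << AET_SHIFT)	/* CE  */
#define AET_NOT_YET_CONSUMED	(0x2U << AET_SHIFT)	/* UEO */

static void sketch_ras_serror(unsigned int esr)
{
	switch (esr & AET_MASK) {
	case AET_CORRECTED:
	case AET_NOT_YET_CONSUMED:
		return;		/* survivable: ignore the SError */
	default:
		panic("unhandled RAS SError");
	}
}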

v8.2's IESB adds an implicit ESB 'after' TakeException() and 'before'
ExceptionReturn() when entering/returning-from EL1. The TakeException()
esb will always be deferred, so we have to check DISR_EL1. For
ExceptionReturn() we need to unmask SError over the kernel's ERET, so that
any deferred SError isn't left in DISR_EL1 while we are running at EL0.
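
A hedged sketch of the kernel-entry check described above (the helper
name is made up; DISR_EL1.A is the 'deferred error valid' bit, and older
assemblers may need the S3_0_C12_C1_1 encoding for the register name):

/* Sketch only: read and consume an SError deferred by the implicit ESB. */
static void sketch_check_deferred_serror(void)
{
	unsigned long disr;

	asm volatile("mrs %0, disr_el1" : "=r" (disr));
	if (disr & (1UL << 31)) {			/* DISR_EL1.A */
		asm volatile("msr disr_el1, xzr");	/* clear the record */
		/* treat 'disr' like an SError syndrome from here on */
	}
}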

This means being able to restore SPSR and ELR if we take a 'survivable' SError
during kernel_exit, which is done by stashing the values in a per-cpu variable.
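
For illustration only, the stash might look something like this (the
names are invented here, not taken from the patch):

/*
 * Sketch: kernel_exit copies ELR_EL1/SPSR_EL1 here before unmasking
 * SError for the ERET; a survivable SError handler restores them so
 * the interrupted ERET can be restarted.
 */
struct eret_regs {
	u64	elr;
	u64	spsr;
};
static DEFINE_PER_CPU(struct eret_regs, stashed_eret_regs);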

There is no SCTLR_EL2.IESB bit, so KVM only has IESB's behaviour if it is
using VHE and we have RAS & IESB. To avoid losing SErrors, KVM needs to
check DISR_EL1 on __guest_exit(). For ExceptionReturn(), so that a pending
host-error isn't blamed on a guest, unmask SError over the ERET... if
there is an error and the system doesn't have APEI firmware-first this
will cause a hyp-panic.
Future work: add the same minimum-handling to KVM's EL2 panic code.

The DISR_EL1 read on every TakeException() may have an impact on performance.
I'd be interested in seeing any numbers that can be shared; I only have a
software model to test this with. If we know a system has APEI
firmware-first,
(indicated by GHES entries in the HEST), we can assume firmware has set
SCR_EL3.EA, making DISR_EL1 RAZ/WI for EL{1,2}. More future work is to use
a static key to disable the DISR_EL1 checking on kernel-entry if we know
all reads will be zero.
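
That future-work static key might look roughly like this (entirely
speculative, the key name is made up):

/* Speculative sketch of the future-work item above. */
static DEFINE_STATIC_KEY_FALSE(ras_firmware_first);

static void sketch_kernel_entry_ras_check(void)
{
	/* With firmware-first, DISR_EL1 is RAZ/WI: skip the read. */
	if (static_branch_likely(&ras_firmware_first))
		return;

	sketch_check_deferred_serror();	/* see the earlier sketch */
}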

Known-issues:
 * A synchronous exception taken from the SError handler will overwrite
   the per-cpu SPSR/ELR values on return. Getting the asm code to save these
   and do the restore makes it more complicated and shouldn't be necessary for
   the SError handler as it is today.
 * Handling v8.2's RAS without v8.1's VHE is weird (see kvm_explicit_esb). Would
   anyone object to making 'all v8.1 features' a runtime requirement of any
   v8.2 feature?


This series can be retrieved from:
git://linux-arm.org/linux-jm.git -b serror_rework/v2


Thanks,

James

[0] https://static.docs.arm.com/ddi0587/a/RAS%20Extension-release%20candidate_march_29.pdf

James Morse (14):
  arm64: explicitly mask all exceptions
  arm64: introduce an order for exceptions
  arm64: unmask all exceptions from C code on CPU startup
  arm64: entry.S: mask all exceptions during kernel_exit
  arm64: entry.S: move enable_step_tsk into kernel_exit
  arm64: entry.S: convert elX_sync
  arm64: entry.S: convert elX_irq
  arm64: kernel: Survive corrected RAS errors notified by SError
  arm64: kernel: Handle deferred SError on kernel entry
  arm64: entry.S: Make eret restartable
  arm64: cpufeature: Enable Implicit ESB on entry/return-from EL1
  KVM: arm64: Take pending SErrors on entry to the guest
  KVM: arm64: Save ESR_EL2 on guest SError
  KVM: arm64: Handle deferred SErrors consumed on guest exit

Xie XiuQi (2):
  arm64: entry.S: move SError handling into a C function for future
    expansion
  arm64: cpufeature: Detect CPU RAS Extentions

 arch/arm64/Kconfig                   |  33 ++++-
 arch/arm64/include/asm/assembler.h   |  75 ++++++++---
 arch/arm64/include/asm/barrier.h     |   1 +
 arch/arm64/include/asm/cpucaps.h     |   4 +-
 arch/arm64/include/asm/esr.h         |  17 +++
 arch/arm64/include/asm/exception.h   |  34 +++++
 arch/arm64/include/asm/irqflags.h    |  58 +++++++--
 arch/arm64/include/asm/kvm_emulate.h |   5 +
 arch/arm64/include/asm/kvm_host.h    |   1 +
 arch/arm64/include/asm/processor.h   |   2 +
 arch/arm64/include/asm/sysreg.h      |   4 +
 arch/arm64/kernel/asm-offsets.c      |   1 +
 arch/arm64/kernel/cpufeature.c       |  43 +++++++
 arch/arm64/kernel/entry.S            | 241 +++++++++++++++++++++++++----------
 arch/arm64/kernel/hibernate.c        |   4 +-
 arch/arm64/kernel/machine_kexec.c    |   3 +-
 arch/arm64/kernel/process.c          |   3 +
 arch/arm64/kernel/setup.c            |   7 +-
 arch/arm64/kernel/smp.c              |  12 +-
 arch/arm64/kernel/suspend.c          |   6 +-
 arch/arm64/kernel/traps.c            |  68 +++++++++-
 arch/arm64/kvm/handle_exit.c         |  84 +++++++++---
 arch/arm64/kvm/hyp.S                 |   1 +
 arch/arm64/kvm/hyp/entry.S           |  27 ++++
 arch/arm64/kvm/hyp/switch.c          |  15 ++-
 arch/arm64/mm/proc.S                 |  17 +--
 26 files changed, 618 insertions(+), 148 deletions(-)

-- 
2.13.2

* [PATCH v2 01/16] arm64: explicitly mask all exceptions
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

There are a few places where we want to mask all exceptions. Today we
do this in a piecemeal fashion: typically we expect the caller to
have masked IRQs and the arch code masks debug exceptions, ignoring
SError, which is probably masked.

Make it clear that 'mask all exceptions' is the intention by adding
helpers to do exactly that.

The caller should update trace_hardirqs where appropriate; adding this
logic to the mask/unmask helpers would cause asm/irqflags.h and
linux/irqflags.h to include each other.

Signed-off-by: James Morse <james.morse@arm.com>
---
Remove the 'disable IRQs' comment above cpu_die(); nothing returns via
this path, CPUs are resurrected via kernel/smp.c's
secondary_start_kernel().
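
Not part of the patch: a usage sketch of the new helpers, following the
note about trace_hardirqs in the commit message (the helper calls are
from the patch, the wrapper function is just for illustration):

static void sketch_fully_masked_section(void)
{
	unsigned long flags;

	flags = local_mask_daif();	/* masks D, A, I and F */
	trace_hardirqs_off();		/* the caller's responsibility */

	/* work that must not be interrupted by any exception */

	trace_hardirqs_on();
	local_restore_daif(flags);	/* put DAIF back as it was */
}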

 arch/arm64/include/asm/assembler.h | 19 +++++++++++++++++++
 arch/arm64/include/asm/irqflags.h  | 34 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/hibernate.c      |  4 ++--
 arch/arm64/kernel/machine_kexec.c  |  3 +--
 arch/arm64/kernel/smp.c            |  8 ++------
 arch/arm64/kernel/suspend.c        |  6 +++---
 arch/arm64/kernel/traps.c          |  2 +-
 arch/arm64/mm/proc.S               |  9 ++++-----
 8 files changed, 66 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 1b67c3782d00..896ddd9b21a6 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -31,6 +31,25 @@
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
+	.macro save_and_disable_daif, flags
+	.ifnb	\flags
+	mrs	\flags, daif
+	.endif
+	msr	daifset, #0xf
+	.endm
+
+	.macro disable_daif
+	msr	daifset, #0xf
+	.endm
+
+	.macro enable_daif
+	msr	daifclr, #0xf
+	.endm
+
+	.macro	restore_daif, flags:req
+	msr	daif, \flags
+	.endm
+
 /*
  * Enable and disable interrupts.
  */
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 8c581281fa12..578d14f376ce 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -110,5 +110,39 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
 		: : "r" (flags) : "memory");				\
 	} while (0)
 
+/*
+ * Mask/unmask/restore all exceptions, including interrupts. If the I bit
+ * is modified the caller should call trace_hardirqs_{on,off}().
+ */
+static inline unsigned long local_mask_daif(void)
+{
+	unsigned long flags;
+
+	asm volatile(
+		"mrs	%0, daif		// local_mask_daif\n"
+		"msr	daifset, #0xf"
+		: "=r" (flags)
+		:
+		: "memory");
+	return flags;
+}
+
+static inline void local_unmask_daif(void)
+{
+	asm volatile(
+		"msr	daifclr, #0xf		// local_unmask_daif"
+		:
+		:
+		: "memory");
+}
+
+static inline void local_restore_daif(unsigned long flags)
+{
+	asm volatile(
+		"msr	daif, %0		// local_restore_daif"
+		:
+		: "r" (flags)
+		: "memory");
+}
 #endif
 #endif
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index a44e13942d30..e29f85938ef5 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -285,7 +285,7 @@ int swsusp_arch_suspend(void)
 		return -EBUSY;
 	}
 
-	local_dbg_save(flags);
+	flags = local_mask_daif();
 
 	if (__cpu_suspend_enter(&state)) {
 		/* make the crash dump kernel image visible/saveable */
@@ -315,7 +315,7 @@ int swsusp_arch_suspend(void)
 		__cpu_suspend_exit();
 	}
 
-	local_dbg_restore(flags);
+	local_restore_daif(flags);
 
 	return ret;
 }
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 481f54a866c5..24e0df967400 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -195,8 +195,7 @@ void machine_kexec(struct kimage *kimage)
 
 	pr_info("Bye!\n");
 
-	/* Disable all DAIF exceptions. */
-	asm volatile ("msr daifset, #0xf" : : : "memory");
+	local_mask_daif();
 
 	/*
 	 * cpu_soft_restart will shutdown the MMU, disable data caches, then
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 321119881abf..cb2e5dd0f429 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -368,10 +368,6 @@ void __cpu_die(unsigned int cpu)
 /*
  * Called from the idle thread for the CPU which has been shutdown.
  *
- * Note that we disable IRQs here, but do not re-enable them
- * before returning to the caller. This is also the behaviour
- * of the other hotplug-cpu capable cores, so presumably coming
- * out of idle fixes this.
  */
 void cpu_die(void)
 {
@@ -379,7 +375,7 @@ void cpu_die(void)
 
 	idle_task_exit();
 
-	local_irq_disable();
+	local_mask_daif();
 
 	/* Tell __cpu_die() that this CPU is now safe to dispose of */
 	(void)cpu_report_death();
@@ -837,7 +833,7 @@ static void ipi_cpu_stop(unsigned int cpu)
 {
 	set_cpu_online(cpu, false);
 
-	local_irq_disable();
+	local_mask_daif();
 
 	while (1)
 		cpu_relax();
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 1e3be9064cfa..d135c70dec97 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -57,7 +57,7 @@ void notrace __cpu_suspend_exit(void)
 	/*
 	 * Restore HW breakpoint registers to sane values
 	 * before debug exceptions are possibly reenabled
-	 * through local_dbg_restore.
+	 * by cpu_suspend()s local_daif_restore() call.
 	 */
 	if (hw_breakpoint_restore)
 		hw_breakpoint_restore(cpu);
@@ -81,7 +81,7 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	 * updates to mdscr register (saved and restored along with
 	 * general purpose registers) from kernel debuggers.
 	 */
-	local_dbg_save(flags);
+	flags = local_mask_daif();
 
 	/*
 	 * Function graph tracer state gets incosistent when the kernel
@@ -114,7 +114,7 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	 * restored, so from this point onwards, debugging is fully
 	 * renabled if it was enabled when core started shutdown.
 	 */
-	local_dbg_restore(flags);
+	local_restore_daif(flags);
 
 	return ret;
 }
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index c7c7088097be..59efec10be15 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -656,7 +656,7 @@ asmlinkage void bad_mode(struct pt_regs *regs, int reason, unsigned int esr)
 		esr_get_class_string(esr));
 
 	die("Oops - bad mode", regs, 0);
-	local_irq_disable();
+	local_mask_daif();
 	panic("bad mode");
 }
 
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 877d42fb0df6..95233dfc4c39 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -109,10 +109,10 @@ ENTRY(cpu_do_resume)
 	/*
 	 * __cpu_setup() cleared MDSCR_EL1.MDE and friends, before unmasking
 	 * debug exceptions. By restoring MDSCR_EL1 here, we may take a debug
-	 * exception. Mask them until local_dbg_restore() in cpu_suspend()
+	 * exception. Mask them until local_daif_restore() in cpu_suspend()
 	 * resets them.
 	 */
-	disable_dbg
+	disable_daif
 	msr	mdscr_el1, x10
 
 	msr	sctlr_el1, x12
@@ -155,8 +155,7 @@ ENDPROC(cpu_do_switch_mm)
  * called by anything else. It can only be executed from a TTBR0 mapping.
  */
 ENTRY(idmap_cpu_replace_ttbr1)
-	mrs	x2, daif
-	msr	daifset, #0xf
+	save_and_disable_daif flags=x2
 
 	adrp	x1, empty_zero_page
 	msr	ttbr1_el1, x1
@@ -169,7 +168,7 @@ ENTRY(idmap_cpu_replace_ttbr1)
 	msr	ttbr1_el1, x0
 	isb
 
-	msr	daif, x2
+	restore_daif x2
 
 	ret
 ENDPROC(idmap_cpu_replace_ttbr1)
-- 
2.13.2

* [PATCH v2 02/16] arm64: introduce an order for exceptions
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

Let's define an order for masking and unmasking exceptions. To support
v8.2's RAS extensions, which are notified by SError, 'A' needs to be
the highest priority (so we can leave PSTATE.A unmasked over an eret).
Debug should come next, so our order is 'ADI'.

Masking debug exceptions should cause interrupts to be masked, but not
SError. Masking SError should mask all exceptions. Masking Interrupts has
no side effects for other flags. Keeping to this order makes it easier for
entry.S to know which exceptions should be unmasked.

FIQ is never expected, but we mask it when we mask SError exceptions, and
unmask it at all other times.

Change our local_dbg_{save,restore}() helpers to mask Interrupts too.

Signed-off-by: James Morse <james.morse@arm.com>
---
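
Not part of the patch: for reference, the daifset immediates implied by
the 'adi' order described above, written as C macros (the macro names
are invented here; the values follow from the commit message and the
#(8+2) hunk below):

/* daifset/daifclr immediates are a 4-bit D|A|I|F mask. */
#define DAIF_MASK_SERROR	0xf	/* masking A masks everything */
#define DAIF_MASK_DEBUG		0xa	/* masking D also masks I (8 + 2) */
#define DAIF_MASK_IRQ		0x2	/* masking I has no side effects */
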
 arch/arm64/include/asm/irqflags.h | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 578d14f376ce..6904f2247394 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -21,6 +21,20 @@
 #include <asm/ptrace.h>
 
 /*
+ * AArch64 has flags for masking: Debug, Asynchronous (serror), Interrupts and
+ * FIQ exceptions, in the 'daif' register. We mask and unmask them in 'adi'
+ * order:
+ * Masking Asynchronous (serror) exceptions should causes all other exceptions
+ * to be masked too. Masking Debug should mask Interrupts, but not Asynchronous
+ * (serror) exceptions. Masking Interrupts has no side effects for other flags.
+ * Keeping to this order makes it easier for entry.S to know which exceptions
+ * should be unmasked.
+ *
+ * FIQ is never expected, but we mask it when we mask Asynchronous (serror)
+ * exceptions, and unmask it at all other times.
+ */
+
+/*
  * CPU interrupt mask handling.
  */
 static inline unsigned long arch_local_irq_save(void)
@@ -91,14 +105,14 @@ static inline int arch_irqs_disabled_flags(unsigned long flags)
 }
 
 /*
- * save and restore debug state
+ * save and restore debug and interrupt flags
  */
 #define local_dbg_save(flags)						\
 	do {								\
 		typecheck(unsigned long, flags);			\
 		asm volatile(						\
 		"mrs    %0, daif		// local_dbg_save\n"	\
-		"msr    daifset, #8"					\
+		"msr    daifset, #(8+2)"				\
 		: "=r" (flags) : : "memory");				\
 	} while (0)
 
-- 
2.13.2

* [PATCH v2 03/16] arm64: unmask all exceptions from C code on CPU startup
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

On startup (and before any C code) __cpu_setup() resets the debug
configuration register MDSCR_EL1 to disable MDE and KDE; it then unmasks
Debug exceptions.

On first boot, once we get into setup.c on CPU0, we unmask SError.
IRQs are unmasked some time later by core code. FIQ is only unmasked
when we enter user-space.
On secondary CPUs the __cpu_setup() code unmasks Debug exceptions early
(cpuidle then has to mask them when it restores MDSCR_EL1); smp.c then
unmasks SError and Interrupts.

Move the Debug unmask into {setup,smp}.c to preserve our expected
priority order: SError should be unmasked before Debug and we can't
unmask SError until we have the earlycon available.

The same goes for secondary CPUs, only here we are ready to receive
interrupts. Just unmask everything.

Remove the local_fiq_{en,dis}able macros as they don't respect our newly
defined order, and don't have any users.

This patch removes the isb that synchronized the MDSCR_EL1 write in
__cpu_setup() with the PSTATE.D write. Now the PSTATE.D write happens
after __enable_mmu().

Signed-off-by: James Morse <james.morse@arm.com>
---
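
Not part of the patch: a sketch of the resulting unmask points, mirroring
the hunks below (the calls shown are the ones the patch adds):

	/* CPU0, setup_arch(), once earlycon is up: unmask D, A and F,
	 * leaving only IRQs masked for core code to enable later. */
	local_restore_daif(PSR_I_BIT);

	/* Secondary CPUs, secondary_start_kernel(): ready for anything,
	 * so unmask everything. */
	trace_hardirqs_on();
	local_unmask_daif();
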
 arch/arm64/include/asm/irqflags.h |  6 ------
 arch/arm64/kernel/setup.c         |  7 ++++---
 arch/arm64/kernel/smp.c           |  4 ++--
 arch/arm64/mm/proc.S              | 12 +-----------
 4 files changed, 7 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
index 6904f2247394..df61dcbce22e 100644
--- a/arch/arm64/include/asm/irqflags.h
+++ b/arch/arm64/include/asm/irqflags.h
@@ -67,12 +67,6 @@ static inline void arch_local_irq_disable(void)
 		: "memory");
 }
 
-#define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
-#define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
-
-#define local_async_enable()	asm("msr	daifclr, #4" : : : "memory")
-#define local_async_disable()	asm("msr	daifset, #4" : : : "memory")
-
 /*
  * Save the current interrupt enable state.
  */
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index d4b740538ad5..133735cede73 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -262,10 +262,11 @@ void __init setup_arch(char **cmdline_p)
 	parse_early_param();
 
 	/*
-	 *  Unmask asynchronous aborts after bringing up possible earlycon.
-	 * (Report possible System Errors once we can report this occurred)
+	 * Unmask asynchronous aborts, debug and fiq exceptions after bringing
+	 * up possible earlycon. (Report possible System Errors once we can
+	 * report this occurred).
 	 */
-	local_async_enable();
+	local_restore_daif(PSR_I_BIT);
 
 	/*
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index cb2e5dd0f429..ccd63681c327 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -271,8 +271,8 @@ asmlinkage void secondary_start_kernel(void)
 	set_cpu_online(cpu, true);
 	complete(&cpu_running);
 
-	local_irq_enable();
-	local_async_enable();
+	trace_hardirqs_on();
+	local_unmask_daif();
 
 	/*
 	 * OK, it's off to the idle thread for us
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 95233dfc4c39..9b18ef0c8ae0 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -83,6 +83,7 @@ ENDPROC(cpu_do_suspend)
 
 /**
  * cpu_do_resume - restore CPU register context
+ * Call with debug exceptions masked
  *
  * x0: Address of context pointer
  */
@@ -105,16 +106,7 @@ ENTRY(cpu_do_resume)
 
 	msr	tcr_el1, x8
 	msr	vbar_el1, x9
-
-	/*
-	 * __cpu_setup() cleared MDSCR_EL1.MDE and friends, before unmasking
-	 * debug exceptions. By restoring MDSCR_EL1 here, we may take a debug
-	 * exception. Mask them until local_daif_restore() in cpu_suspend()
-	 * resets them.
-	 */
-	disable_daif
 	msr	mdscr_el1, x10
-
 	msr	sctlr_el1, x12
 	msr	tpidr_el1, x13
 	msr	sp_el0, x14
@@ -189,8 +181,6 @@ ENTRY(__cpu_setup)
 	msr	cpacr_el1, x0			// Enable FP/ASIMD
 	mov	x0, #1 << 12			// Reset mdscr_el1 and disable
 	msr	mdscr_el1, x0			// access to the DCC from EL0
-	isb					// Unmask debug exceptions now,
-	enable_dbg				// since this is per-cpu
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
 	/*
 	 * Memory region attributes for LPAE:
-- 
2.13.2

* [PATCH v2 04/16] arm64: entry.S: mask all exceptions during kernel_exit
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

Add a disable_daif call to kernel_exit to mask all exceptions
before restoring registers that are overwritten by an exception.

This should be done before we restore sp_el0, as any exception taken
from EL1 will assume this register is set correctly.

After this patch it is no longer necessary to mask interrupts before
kernel_exit.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/entry.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index b738880350f9..491182f0abb5 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -207,6 +207,8 @@ alternative_else_nop_endif
 2:
 #endif
 
+	disable_daif
+
 	.if	\el == 0
 	ldr	x23, [sp, #S_SP]		// load return stack pointer
 	msr	sp_el0, x23
@@ -438,8 +440,6 @@ el1_da:
 	mov	x2, sp				// struct pt_regs
 	bl	do_mem_abort
 
-	// disable interrupts before pulling preserved data off the stack
-	disable_irq
 	kernel_exit 1
 el1_sp_pc:
 	/*
-- 
2.13.2

* [PATCH v2 05/16] arm64: entry.S: move enable_step_tsk into kernel_exit
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

enable_step_tsk may enable single-step, so it needs to mask debug
exceptions to prevent us from single-stepping kernel_exit. This
should be the caller's problem.

Earlier cleanup (2a2830703a23) moved disable_step_tsk into kernel_entry.
enable_step_tsk has two callers, both immediately before kernel_exit 0.
Move the macro call into kernel_exit after local_mask_daif.

enable_step_tsk is now only called with debug exceptions masked.
This was the last user of disable_dbg, remove it.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 9 +--------
 arch/arm64/kernel/entry.S          | 7 ++++---
 2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 896ddd9b21a6..f4dc435406ea 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -70,13 +70,6 @@
 	msr	daif, \flags
 	.endm
 
-/*
- * Enable and disable debug exceptions.
- */
-	.macro	disable_dbg
-	msr	daifset, #8
-	.endm
-
 	.macro	enable_dbg
 	msr	daifclr, #8
 	.endm
@@ -90,9 +83,9 @@
 9990:
 	.endm
 
+	/* call with debug exceptions masked */
 	.macro	enable_step_tsk, flgs, tmp
 	tbz	\flgs, #TIF_SINGLESTEP, 9990f
-	disable_dbg
 	mrs	\tmp, mdscr_el1
 	orr	\tmp, \tmp, #1
 	msr	mdscr_el1, \tmp
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 491182f0abb5..0836b65d4c84 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -212,6 +212,10 @@ alternative_else_nop_endif
 	.if	\el == 0
 	ldr	x23, [sp, #S_SP]		// load return stack pointer
 	msr	sp_el0, x23
+
+	ldr	x1, [tsk, #TSK_TI_FLAGS]
+	enable_step_tsk flgs=x1, tmp=x2
+
 #ifdef CONFIG_ARM64_ERRATUM_845719
 alternative_if ARM64_WORKAROUND_845719
 	tbz	x22, #4, 1f
@@ -750,7 +754,6 @@ ret_fast_syscall:
 	cbnz	x2, ret_fast_syscall_trace
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
-	enable_step_tsk x1, x2
 	kernel_exit 0
 ret_fast_syscall_trace:
 	enable_irq				// enable interrupts
@@ -765,7 +768,6 @@ work_pending:
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_on		// enabled while in userspace
 #endif
-	ldr	x1, [tsk, #TSK_TI_FLAGS]	// re-check for single-step
 	b	finish_ret_to_user
 /*
  * "slow" syscall return path.
@@ -776,7 +778,6 @@ ret_to_user:
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
 finish_ret_to_user:
-	enable_step_tsk x1, x2
 	kernel_exit 0
 ENDPROC(ret_to_user)
 
-- 
2.13.2

* [PATCH v2 06/16] arm64: entry.S: convert elX_sync
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

el1_sync unmasks exceptions on a case-by-case basis: debug exceptions
are unmasked unless this was a debug exception. IRQs are unmasked
for instruction and data aborts only if the interrupted context had
IRQs unmasked.

Following our 'adi' order, el1_dbg should run with Debug and Interrupt
exceptions masked. For the other cases we can inherit whatever we
interrupted.

Add a macro inherit_daif to set daif based on the interrupted pstate.

el0_sync also unmasks exceptions on a case-by-case basis: debug exceptions
are unmasked unless this was a debug exception. IRQs are unmasked for
some exception types but not for others.

el0_dbg should run with Debug and Interrupt exceptions masked; for the
other cases we can unmask everything. This changes the behaviour of
fpsimd_{acc,exc} and el0_inv, which previously ran with Interrupts masked.

All of these el0 exception types call ct_user_exit after unmasking IRQs.
Move this into the switch statement. el0_dbg needs to do this itself once
it has finished its work and el0_svc needs to pass a flag to restore the
syscall args.

This patch removes the last user of enable_dbg_and_irq, so remove it.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 19 +++++------
 arch/arm64/kernel/entry.S          | 66 +++++++++++++-------------------------
 2 files changed, 33 insertions(+), 52 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index f4dc435406ea..6a2512da468a 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -50,6 +50,12 @@
 	msr	daif, \flags
 	.endm
 
+	/* Only on aarch64 pstate, PSR_D_BIT is different for aarch32 */
+	.macro	inherit_daif, pstate:req, tmp:req
+	and	\tmp, \pstate, #(PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
+	msr	daif, \tmp
+	.endm
+
 /*
  * Enable and disable interrupts.
  */
@@ -70,6 +76,10 @@
 	msr	daif, \flags
 	.endm
 
+	.macro	enable_serror
+	msr	daifclr, #4
+	.endm
+
 	.macro	enable_dbg
 	msr	daifclr, #8
 	.endm
@@ -93,15 +103,6 @@
 	.endm
 
 /*
- * Enable both debug exceptions and interrupts. This is likely to be
- * faster than two daifclr operations, since writes to this register
- * are self-synchronising.
- */
-	.macro	enable_dbg_and_irq
-	msr	daifclr, #(8 | 2)
-	.endm
-
-/*
  * SMP data memory barrier
  */
 	.macro	smp_dmb, opt
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 0836b65d4c84..51e704e46c29 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -408,8 +408,13 @@ ENDPROC(el1_error_invalid)
 	.align	6
 el1_sync:
 	kernel_entry 1
+	mrs	x0, far_el1
 	mrs	x1, esr_el1			// read the syndrome register
 	lsr	x24, x1, #ESR_ELx_EC_SHIFT	// exception class
+	cmp	x24, #ESR_ELx_EC_BREAKPT_CUR	// debug exception in EL1
+	b.ge	el1_dbg
+
+	inherit_daif	pstate=x23, tmp=x2
 	cmp	x24, #ESR_ELx_EC_DABT_CUR	// data abort in EL1
 	b.eq	el1_da
 	cmp	x24, #ESR_ELx_EC_IABT_CUR	// instruction abort in EL1
@@ -422,8 +427,6 @@ el1_sync:
 	b.eq	el1_sp_pc
 	cmp	x24, #ESR_ELx_EC_UNKNOWN	// unknown exception in EL1
 	b.eq	el1_undef
-	cmp	x24, #ESR_ELx_EC_BREAKPT_CUR	// debug exception in EL1
-	b.ge	el1_dbg
 	b	el1_inv
 
 el1_ia:
@@ -434,12 +437,7 @@ el1_da:
 	/*
 	 * Data abort handling
 	 */
-	mrs	x3, far_el1
-	enable_dbg
-	// re-enable interrupts if they were enabled in the aborted context
-	tbnz	x23, #7, 1f			// PSR_I_BIT
-	enable_irq
-1:
+	mov	x3, x0
 	clear_address_tag x0, x3
 	mov	x2, sp				// struct pt_regs
 	bl	do_mem_abort
@@ -449,31 +447,27 @@ el1_sp_pc:
 	/*
 	 * Stack or PC alignment exception handling
 	 */
-	mrs	x0, far_el1
-	enable_dbg
 	mov	x2, sp
 	b	do_sp_pc_abort
 el1_undef:
 	/*
 	 * Undefined instruction
 	 */
-	enable_dbg
 	mov	x0, sp
 	b	do_undefinstr
 el1_dbg:
 	/*
 	 * Debug exception handling
 	 */
+	enable_serror
 	cmp	x24, #ESR_ELx_EC_BRK64		// if BRK64
 	cinc	x24, x24, eq			// set bit '0'
 	tbz	x24, #0, el1_inv		// EL1 only
-	mrs	x0, far_el1
 	mov	x2, sp				// struct pt_regs
 	bl	do_debug_exception
 	kernel_exit 1
 el1_inv:
 	// TODO: add support for undefined instructions in kernel mode
-	enable_dbg
 	mov	x0, sp
 	mov	x2, x1
 	mov	x1, #BAD_SYNC
@@ -520,9 +514,16 @@ el1_preempt:
 el0_sync:
 	kernel_entry 0
 	mrs	x25, esr_el1			// read the syndrome register
+	mrs	x26, far_el1
 	lsr	x24, x25, #ESR_ELx_EC_SHIFT	// exception class
+	cmp	x24, #ESR_ELx_EC_BREAKPT_LOW	// debug exception in EL0
+	b.ge	el0_dbg
+
+	enable_daif
 	cmp	x24, #ESR_ELx_EC_SVC64		// SVC in 64-bit state
 	b.eq	el0_svc
+
+	ct_user_exit
 	cmp	x24, #ESR_ELx_EC_DABT_LOW	// data abort in EL0
 	b.eq	el0_da
 	cmp	x24, #ESR_ELx_EC_IABT_LOW	// instruction abort in EL0
@@ -539,8 +540,6 @@ el0_sync:
 	b.eq	el0_sp_pc
 	cmp	x24, #ESR_ELx_EC_UNKNOWN	// unknown exception in EL0
 	b.eq	el0_undef
-	cmp	x24, #ESR_ELx_EC_BREAKPT_LOW	// debug exception in EL0
-	b.ge	el0_dbg
 	b	el0_inv
 
 #ifdef CONFIG_COMPAT
@@ -548,9 +547,16 @@ el0_sync:
 el0_sync_compat:
 	kernel_entry 0, 32
 	mrs	x25, esr_el1			// read the syndrome register
+	mrs	x26, far_el1
 	lsr	x24, x25, #ESR_ELx_EC_SHIFT	// exception class
+	cmp	x24, #ESR_ELx_EC_BREAKPT_LOW	// debug exception in EL0
+	b.ge	el0_dbg
+
+	enable_daif
 	cmp	x24, #ESR_ELx_EC_SVC32		// SVC in 32-bit state
 	b.eq	el0_svc_compat
+
+	ct_user_exit
 	cmp	x24, #ESR_ELx_EC_DABT_LOW	// data abort in EL0
 	b.eq	el0_da
 	cmp	x24, #ESR_ELx_EC_IABT_LOW	// instruction abort in EL0
@@ -573,8 +579,6 @@ el0_sync_compat:
 	b.eq	el0_undef
 	cmp	x24, #ESR_ELx_EC_CP14_64	// CP14 MRRC/MCRR trap
 	b.eq	el0_undef
-	cmp	x24, #ESR_ELx_EC_BREAKPT_LOW	// debug exception in EL0
-	b.ge	el0_dbg
 	b	el0_inv
 el0_svc_compat:
 	/*
@@ -595,10 +599,6 @@ el0_da:
 	/*
 	 * Data abort handling
 	 */
-	mrs	x26, far_el1
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
-	ct_user_exit
 	clear_address_tag x0, x26
 	mov	x1, x25
 	mov	x2, sp
@@ -608,10 +608,6 @@ el0_ia:
 	/*
 	 * Instruction abort handling
 	 */
-	mrs	x26, far_el1
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
-	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
 	mov	x2, sp
@@ -621,8 +617,6 @@ el0_fpsimd_acc:
 	/*
 	 * Floating Point or Advanced SIMD access
 	 */
-	enable_dbg
-	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
 	bl	do_fpsimd_acc
@@ -631,8 +625,6 @@ el0_fpsimd_exc:
 	/*
 	 * Floating Point or Advanced SIMD exception
 	 */
-	enable_dbg
-	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
 	bl	do_fpsimd_exc
@@ -641,10 +633,6 @@ el0_sp_pc:
 	/*
 	 * Stack or PC alignment exception handling
 	 */
-	mrs	x26, far_el1
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
-	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
 	mov	x2, sp
@@ -654,9 +642,6 @@ el0_undef:
 	/*
 	 * Undefined instruction
 	 */
-	// enable interrupts before calling the main handler
-	enable_dbg_and_irq
-	ct_user_exit
 	mov	x0, sp
 	bl	do_undefinstr
 	b	ret_to_user
@@ -664,8 +649,6 @@ el0_sys:
 	/*
 	 * System instructions, for trapped cache maintenance instructions
 	 */
-	enable_dbg_and_irq
-	ct_user_exit
 	mov	x0, x25
 	mov	x1, sp
 	bl	do_sysinstr
@@ -675,16 +658,14 @@ el0_dbg:
 	 * Debug exception handling
 	 */
 	tbnz	x24, #0, el0_inv		// EL0 only
-	mrs	x0, far_el1
+	mov	x0, x26
 	mov	x1, x25
 	mov	x2, sp
 	bl	do_debug_exception
-	enable_dbg
+	enable_daif
 	ct_user_exit
 	b	ret_to_user
 el0_inv:
-	enable_dbg
-	ct_user_exit
 	mov	x0, sp
 	mov	x1, #BAD_SYNC
 	mov	x2, x25
@@ -803,7 +784,6 @@ el0_svc:
 	mov	sc_nr, #__NR_syscalls
 el0_svc_naked:					// compat entry point
 	stp	x0, scno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
-	enable_dbg_and_irq
 	ct_user_exit 1
 
 	ldr	x16, [tsk, #TSK_TI_FLAGS]	// check for syscall hooks
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 07/16] arm64: entry.S: convert elX_irq
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

Following our 'adi' order, Interrupts should be processed with Debug and
SError exceptions unmasked.

Add a helper to unmask these two (and FIQ, for good measure).
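
For reference, a minimal sketch of the PSTATE.DAIF bit positions that the
daifset/daifclr immediates encode (the macro names are illustrative, not
part of this patch):

	/* Illustrative names only: immediate bits for 'msr daifset/daifclr, #imm' */
	#define EXAMPLE_DAIF_D	(1 << 3)	/* 8: debug exceptions */
	#define EXAMPLE_DAIF_A	(1 << 2)	/* 4: SError (asynchronous abort) */
	#define EXAMPLE_DAIF_I	(1 << 1)	/* 2: IRQ */
	#define EXAMPLE_DAIF_F	(1 << 0)	/* 1: FIQ */

so enable_da_f's "msr daifclr, #(8 | 4 | 1)" clears D, A and F, leaving
IRQ masked.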

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 10 +++++-----
 arch/arm64/kernel/entry.S          |  4 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 6a2512da468a..a013ab05210d 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -56,6 +56,10 @@
 	msr	daif, \tmp
 	.endm
 
+	.macro enable_da_f
+	msr	daifclr, #(8 | 4 | 1)
+	.endm
+
 /*
  * Enable and disable interrupts.
  */
@@ -80,16 +84,12 @@
 	msr	daifclr, #4
 	.endm
 
-	.macro	enable_dbg
-	msr	daifclr, #8
-	.endm
-
 	.macro	disable_step_tsk, flgs, tmp
 	tbz	\flgs, #TIF_SINGLESTEP, 9990f
 	mrs	\tmp, mdscr_el1
 	bic	\tmp, \tmp, #1
 	msr	mdscr_el1, \tmp
-	isb	// Synchronise with enable_dbg
+	isb	// Synchronise with any daif write that enables debug
 9990:
 	.endm
 
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 51e704e46c29..2fde60f96239 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -477,7 +477,7 @@ ENDPROC(el1_sync)
 	.align	6
 el1_irq:
 	kernel_entry 1
-	enable_dbg
+	enable_da_f
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
 #endif
@@ -677,7 +677,7 @@ ENDPROC(el0_sync)
 el0_irq:
 	kernel_entry 0
 el0_irq_naked:
-	enable_dbg
+	enable_da_f
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
 #endif
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 08/16] arm64: entry.S: move SError handling into a C function for future expansion
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

From: Xie XiuQi <xiexiuqi@huawei.com>

Today SError is taken using the inv_entry macro that ends up in
bad_mode.

SError can be used by the RAS Extensions to notify either the OS or
firmware of CPU problems, some of which may have been corrected.

To allow this handling to be added, add a do_serror() C function
that just panic()s. Add the entry.S boilerplate to save/restore the
CPU registers. Future patches may change do_serror() to return if the
SError Interrupt was notification of a corrected error.

Use nmi_panic() so that an SError taken during a regular panic() lets
the first panic() continue instead of starting a second one.
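
A hypothetical sketch (not the real kernel/panic.c implementation, just
the behaviour relied on here): exactly one CPU owns the panic, a nested
SError on that CPU returns to let it continue, and any other CPU stays
out of the way:

	static atomic_t example_panic_owner = ATOMIC_INIT(-1);

	static void example_nmi_panic(const char *msg)
	{
		int cpu = smp_processor_id();
		int old = atomic_cmpxchg(&example_panic_owner, -1, cpu);

		if (old == -1)
			panic("%s", msg);	/* first panic: proceed */
		else if (old == cpu)
			return;			/* SError during our own panic(): carry on with it */
		else
			while (1)
				cpu_relax();	/* another CPU is panicking: keep quiet */
	}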

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Wang Xiongfeng <wangxiongfengi2@huawei.com>
[Split out of a bigger patch, added compat path, renamed, enabled debug
 exceptions]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/Kconfig        |  2 +-
 arch/arm64/kernel/entry.S | 34 +++++++++++++++++++++++++++-------
 arch/arm64/kernel/traps.c | 15 +++++++++++++++
 3 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index dfd908630631..d3913cffa3ac 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -97,7 +97,7 @@ config ARM64
 	select HAVE_IRQ_TIME_ACCOUNTING
 	select HAVE_MEMBLOCK
 	select HAVE_MEMBLOCK_NODE_MAP if NUMA
-	select HAVE_NMI if ACPI_APEI_SEA
+	select HAVE_NMI
 	select HAVE_PATA_PLATFORM
 	select HAVE_PERF_EVENTS
 	select HAVE_PERF_REGS
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 2fde60f96239..9e63f69e1366 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -329,18 +329,18 @@ ENTRY(vectors)
 	ventry	el1_sync			// Synchronous EL1h
 	ventry	el1_irq				// IRQ EL1h
 	ventry	el1_fiq_invalid			// FIQ EL1h
-	ventry	el1_error_invalid		// Error EL1h
+	ventry	el1_serror			// Error EL1h
 
 	ventry	el0_sync			// Synchronous 64-bit EL0
 	ventry	el0_irq				// IRQ 64-bit EL0
 	ventry	el0_fiq_invalid			// FIQ 64-bit EL0
-	ventry	el0_error_invalid		// Error 64-bit EL0
+	ventry	el0_serror			// Error 64-bit EL0
 
 #ifdef CONFIG_COMPAT
 	ventry	el0_sync_compat			// Synchronous 32-bit EL0
 	ventry	el0_irq_compat			// IRQ 32-bit EL0
 	ventry	el0_fiq_invalid_compat		// FIQ 32-bit EL0
-	ventry	el0_error_invalid_compat	// Error 32-bit EL0
+	ventry	el0_serror_compat		// Error 32-bit EL0
 #else
 	ventry	el0_sync_invalid		// Synchronous 32-bit EL0
 	ventry	el0_irq_invalid			// IRQ 32-bit EL0
@@ -380,10 +380,6 @@ ENDPROC(el0_error_invalid)
 el0_fiq_invalid_compat:
 	inv_entry 0, BAD_FIQ, 32
 ENDPROC(el0_fiq_invalid_compat)
-
-el0_error_invalid_compat:
-	inv_entry 0, BAD_ERROR, 32
-ENDPROC(el0_error_invalid_compat)
 #endif
 
 el1_sync_invalid:
@@ -593,6 +589,10 @@ el0_svc_compat:
 el0_irq_compat:
 	kernel_entry 0, 32
 	b	el0_irq_naked
+
+el0_serror_compat:
+	kernel_entry 0, 32
+	b	el0_serror_naked
 #endif
 
 el0_da:
@@ -691,6 +691,26 @@ el0_irq_naked:
 	b	ret_to_user
 ENDPROC(el0_irq)
 
+el1_serror:
+	kernel_entry 1
+	mrs	x1, esr_el1
+	mov	x0, sp
+	bl	do_serror
+	kernel_exit 1
+ENDPROC(el1_serror)
+
+el0_serror:
+	kernel_entry 0
+el0_serror_naked:
+	mrs	x1, esr_el1
+	mov	x0, sp
+	bl	do_serror
+	enable_daif
+	ct_user_exit
+	b	ret_to_user
+ENDPROC(el0_serror)
+
+
 /*
  * Register switch for AArch64. The callee-saved registers need to be saved
  * and restored. On entry:
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 59efec10be15..943a0e242dbc 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -685,6 +685,21 @@ asmlinkage void bad_el0_sync(struct pt_regs *regs, int reason, unsigned int esr)
 	force_sig_info(info.si_signo, &info, current);
 }
 
+asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+{
+	nmi_enter();
+
+	console_verbose();
+
+	pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n",
+		smp_processor_id(), esr, esr_get_class_string(esr));
+	__show_regs(regs);
+
+	nmi_panic(regs, "Asynchronous SError Interrupt");
+
+	nmi_exit();
+}
+
 void __pte_error(const char *file, int line, unsigned long val)
 {
 	pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 09/16] arm64: cpufeature: Detect CPU RAS Extensions
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

From: Xie XiuQi <xiexiuqi@huawei.com>

ARM's v8.2 Extensions add support for Reliability, Availability and
Serviceability (RAS). On CPUs with these extensions, system software
can use additional barriers to isolate errors and determine if faults
are pending.

Add cpufeature detection and a barrier in the context-switch code.
There is no need to use alternatives for this as CPUs that don't
support this feature will treat the instruction as a nop.

Platform level RAS support may require additional firmware support.
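
As a minimal illustration (assumed usage, not part of this patch): the
barrier can be issued unconditionally because ESB is a HINT instruction
and behaves as a NOP without the RAS extensions, while anything that
interprets its results should be gated on the new capability:

	#include <linux/printk.h>
	#include <asm/barrier.h>
	#include <asm/cpufeature.h>

	static void example_ras_probe(void)
	{
		esb();		/* a NOP on CPUs without the RAS extensions */

		if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
			pr_info("RAS extensions present\n");
	}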

Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
[Rebased, added esb and config option, reworded commit message]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/Kconfig               | 16 ++++++++++++++++
 arch/arm64/include/asm/barrier.h |  1 +
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/include/asm/sysreg.h  |  2 ++
 arch/arm64/kernel/cpufeature.c   | 13 +++++++++++++
 arch/arm64/kernel/process.c      |  3 +++
 6 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index d3913cffa3ac..6e417e25672f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -960,6 +960,22 @@ config ARM64_UAO
 	  regular load/store instructions if the cpu does not implement the
 	  feature.
 
+config ARM64_RAS_EXTN
+	bool "Enable support for RAS CPU Extensions"
+	default y
+	help
+	  CPUs that support the Reliability, Availability and Serviceability
+	  (RAS) Extensions, part of ARMv8.2 are able to track faults and
+	  errors, classify them and report them to software.
+
+	  On CPUs with these extensions system software can use additional
+	  barriers to determine if faults are pending and read the
+	  classification from a new set of registers.
+
+	  Selecting this feature will allow the kernel to use these barriers
+	  and access the new registers if the system supports the extension.
+	  Platform RAS features may additionally depend on firmware support.
+
 endmenu
 
 config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 0fe7e43b7fbc..8b0a0eb67625 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -30,6 +30,7 @@
 #define isb()		asm volatile("isb" : : : "memory")
 #define dmb(opt)	asm volatile("dmb " #opt : : : "memory")
 #define dsb(opt)	asm volatile("dsb " #opt : : : "memory")
+#define esb()		asm volatile("hint #16"  : : : "memory")
 
 #define mb()		dsb(sy)
 #define rmb()		dsb(ld)
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8d2272c6822c..f93bf77f1f74 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -39,7 +39,8 @@
 #define ARM64_WORKAROUND_QCOM_FALKOR_E1003	18
 #define ARM64_WORKAROUND_858921			19
 #define ARM64_WORKAROUND_CAVIUM_30115		20
+#define ARM64_HAS_RAS_EXTN			21
 
-#define ARM64_NCAPS				21
+#define ARM64_NCAPS				22
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 16e44fa9b3b6..58358acf7c9b 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -331,6 +331,7 @@
 #define ID_AA64ISAR1_JSCVT_SHIFT	12
 
 /* id_aa64pfr0 */
+#define ID_AA64PFR0_RAS_SHIFT		28
 #define ID_AA64PFR0_GIC_SHIFT		24
 #define ID_AA64PFR0_ASIMD_SHIFT		20
 #define ID_AA64PFR0_FP_SHIFT		16
@@ -339,6 +340,7 @@
 #define ID_AA64PFR0_EL1_SHIFT		4
 #define ID_AA64PFR0_EL0_SHIFT		0
 
+#define ID_AA64PFR0_RAS_V1		0x1
 #define ID_AA64PFR0_FP_NI		0xf
 #define ID_AA64PFR0_FP_SUPPORTED	0x0
 #define ID_AA64PFR0_ASIMD_NI		0xf
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9f9e0064c8c1..a807ab55ee10 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -124,6 +124,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_RAS_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_EXACT, ID_AA64PFR0_GIC_SHIFT, 4, 0),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -888,6 +889,18 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.min_field_value = 0,
 		.matches = has_no_fpsimd,
 	},
+#ifdef CONFIG_ARM64_RAS_EXTN
+	{
+		.desc = "RAS Extension Support",
+		.capability = ARM64_HAS_RAS_EXTN,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR0_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64PFR0_RAS_SHIFT,
+		.min_field_value = ID_AA64PFR0_RAS_V1,
+	},
+#endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
 };
 
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 659ae8094ed5..2def5ce75867 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -363,6 +363,9 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
 	 */
 	dsb(ish);
 
+	/* Deliver any pending SError from prev */
+	esb();
+
 	/* the actual thread switch */
 	last = cpu_switch_to(prev, next);
 
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 10/16] arm64: kernel: Survive corrected RAS errors notified by SError
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

On v8.0, SError is an uncontainable fatal exception. The v8.2 RAS
extensions use SError to notify software about RAS errors, which can be
contained by the ESB instruction.

An ACPI system with firmware-first may use SError as its 'SEI'
notification. Future patches may add code to 'claim' this SError as
notification.

Other systems can recognise these RAS errors from the SError ESR, and
use the AET bits and additional data from the RAS error registers to
handle the error. Future patches may add this kernel-first handling.

In the meantime, on both kinds of system we can safely ignore corrected
errors.
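
As a worked example (values constructed for illustration, not taken from
a real error report), on a CPU with the RAS extensions a corrected error
reported via SError could carry:

	/*
	 * Constructed ESR value that the handler below would ignore:
	 *   EC   = 0x2f   (SError interrupt)        bits [31:26]
	 *   IDS  = 0      (architected ISS format)  bit  [24]
	 *   AET  = 0b110  (corrected error, CE)     bits [12:10]
	 *   DFSC = 0x11   (asynchronous SError)     bits [5:0]
	 * giving esr = 0xbc001811: _do_serror() sees AET == ESR_ELx_AET_CE
	 * and returns without panicking.
	 */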

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/esr.h | 10 ++++++++++
 arch/arm64/kernel/traps.c    | 35 ++++++++++++++++++++++++++++++++---
 2 files changed, 42 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 8cabd57b6348..77d5b1baf1a4 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -83,6 +83,15 @@
 /* ISS field definitions shared by different classes */
 #define ESR_ELx_WNR		(UL(1) << 6)
 
+/* Asynchronous Error Type */
+#define ESR_ELx_AET		(UL(0x7) << 10)
+
+#define ESR_ELx_AET_UC		(UL(0) << 10)	/* Uncontainable */
+#define ESR_ELx_AET_UEU		(UL(1) << 10)	/* Uncorrected Unrecoverable */
+#define ESR_ELx_AET_UEO		(UL(2) << 10)	/* Uncorrected Restartable */
+#define ESR_ELx_AET_UER		(UL(3) << 10)	/* Uncorrected Recoverable */
+#define ESR_ELx_AET_CE		(UL(6) << 10)	/* Corrected */
+
 /* Shared ISS field definitions for Data/Instruction aborts */
 #define ESR_ELx_FnV		(UL(1) << 10)
 #define ESR_ELx_EA		(UL(1) << 9)
@@ -92,6 +101,7 @@
 #define ESR_ELx_FSC		(0x3F)
 #define ESR_ELx_FSC_TYPE	(0x3C)
 #define ESR_ELx_FSC_EXTABT	(0x10)
+#define ESR_ELx_FSC_SERROR	(0x11)
 #define ESR_ELx_FSC_ACCESS	(0x08)
 #define ESR_ELx_FSC_FAULT	(0x04)
 #define ESR_ELx_FSC_PERM	(0x0C)
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 943a0e242dbc..e1eaccc66548 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -685,10 +685,8 @@ asmlinkage void bad_el0_sync(struct pt_regs *regs, int reason, unsigned int esr)
 	force_sig_info(info.si_signo, &info, current);
 }
 
-asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+static void do_serror_panic(struct pt_regs *regs, unsigned int esr)
 {
-	nmi_enter();
-
 	console_verbose();
 
 	pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n",
@@ -696,6 +694,37 @@ asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
 	__show_regs(regs);
 
 	nmi_panic(regs, "Asynchronous SError Interrupt");
+}
+
+static void _do_serror(struct pt_regs *regs, unsigned int esr)
+{
+	bool impdef_syndrome = esr & ESR_ELx_ISV;	/* aka IDS */
+	unsigned int aet = esr & ESR_ELx_AET;
+
+	if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN) || impdef_syndrome)
+		return do_serror_panic(regs, esr);
+
+	/*
+	 * AET is RES0 if 'the value returned in the DFSC field is not
+	 * [ESR_ELx_FSC_SERROR]'
+	 */
+	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)
+		return do_serror_panic(regs, esr);
+
+	switch (aet) {
+	case ESR_ELx_AET_CE:	/* corrected error */
+	case ESR_ELx_AET_UEO:	/* restartable, not yet consumed */
+		break;
+	default:
+		return do_serror_panic(regs, esr);
+	}
+}
+
+asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+{
+	nmi_enter();
+
+	_do_serror(regs, esr);
 
 	nmi_exit();
 }
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 11/16] arm64: kernel: Handle deferred SError on kernel entry
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

Before we can enable Implicit ESB on exception level change, we need to
handle deferred SErrors that may appear on exception entry.

Add code to kernel_entry to synchronize errors, then read and clear
DISR_EL1. Call do_deferred_serror() if it held a non-zero value.
(The IESB feature will later allow this explicit ESB to be removed.)

These checks are needed in the SError vector too: we may take a pending
SError that was signalled by a device and, on entry to EL1, synchronize
and then defer a RAS SError that had not yet become pending. We process
the 'taken' SError first as it is more likely to be fatal.

Clear DISR_EL1 from the RAS cpufeature enable call. This means any value
we find in DISR_EL1 was triggered and deferred by our actions. We have
executed ESB prior to this point, but those barriers ran with SError
unmasked, so nothing will have been deferred.
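
A minimal C-level sketch (an assumption, mirroring the disr_read and
disr_check asm macros added below; the patch's do_deferred_serror()
wraps the same conversion):

	#include <linux/types.h>
	#include <asm/exception.h>
	#include <asm/ptrace.h>
	#include <asm/sysreg.h>

	static void example_check_disr(struct pt_regs *regs)
	{
		u64 disr = read_sysreg_s(SYS_DISR_EL1);

		if (disr) {
			write_sysreg_s(0, SYS_DISR_EL1);	/* consume the deferred SError */
			do_serror(regs, disr_to_esr(disr));	/* report it via the common handler */
		}
	}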

Signed-off-by: James Morse <james.morse@arm.com>
---
Remove the 'x21 - aborted SP' line from entry.S: it's not true.

 arch/arm64/include/asm/assembler.h | 23 ++++++++++++
 arch/arm64/include/asm/esr.h       |  7 ++++
 arch/arm64/include/asm/exception.h | 14 ++++++++
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  1 +
 arch/arm64/kernel/cpufeature.c     |  9 +++++
 arch/arm64/kernel/entry.S          | 73 ++++++++++++++++++++++++++++++++------
 arch/arm64/kernel/traps.c          |  5 +++
 8 files changed, 122 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index a013ab05210d..e2bb551f59f7 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -110,6 +110,13 @@
 	.endm
 
 /*
+ * RAS Error Synchronization barrier
+ */
+	.macro	esb
+	hint	#16
+	.endm
+
+/*
  * NOP sequence
  */
 	.macro	nops, num
@@ -455,6 +462,22 @@ alternative_endif
 	.endm
 
 /*
+ * Read and clear DISR if supported
+ */
+	.macro disr_read, reg
+	alternative_if ARM64_HAS_RAS_EXTN
+	mrs_s	\reg, SYS_DISR_EL1
+	cbz	\reg, 99f
+	msr_s	SYS_DISR_EL1, xzr
+99:
+	alternative_else
+	mov	\reg,	xzr
+	nop
+	nop
+	alternative_endif
+	.endm
+
+/*
  * Errata workaround prior to TTBR0_EL1 update
  *
  * 	val:	TTBR value with new BADDR, preserved
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 77d5b1baf1a4..41a0767e2600 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -124,6 +124,13 @@
 #define ESR_ELx_WFx_ISS_WFE	(UL(1) << 0)
 #define ESR_ELx_xVC_IMM_MASK	((1UL << 16) - 1)
 
+#define DISR_EL1_IDS		 (UL(1) << 24)
+/*
+ * DISR_EL1 and ESR_ELx share the bottom 13 bits, but the RES0 bits may mean
+ * different things in the future...
+ */
+#define DISR_EL1_ESR_MASK	(ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)
+
 /* ESR value templates for specific events */
 
 /* BRK instruction trap from AArch64 state */
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index 0c2eec490abf..bc30429d8e91 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -18,6 +18,8 @@
 #ifndef __ASM_EXCEPTION_H
 #define __ASM_EXCEPTION_H
 
+#include <asm/esr.h>
+
 #include <linux/interrupt.h>
 
 #define __exception	__attribute__((section(".exception.text")))
@@ -27,4 +29,16 @@
 #define __exception_irq_entry	__exception
 #endif
 
+static inline u32 disr_to_esr(u64 disr)
+{
+	unsigned int esr = ESR_ELx_EC_SERROR << ESR_ELx_EC_SHIFT;
+
+	if ((disr & DISR_EL1_IDS) == 0)
+		esr |= (disr & DISR_EL1_ESR_MASK);
+	else
+		esr |= (disr & ESR_ELx_ISS_MASK);
+
+	return esr;
+}
+
 #endif	/* __ASM_EXCEPTION_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 64c9e78f9882..82e8ff01153d 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -193,5 +193,6 @@ static inline void spin_lock_prefetch(const void *ptr)
 
 int cpu_enable_pan(void *__unused);
 int cpu_enable_cache_maint_trap(void *__unused);
+int cpu_clear_disr(void *__unused);
 
 #endif /* __ASM_PROCESSOR_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 58358acf7c9b..18cabd92af22 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -179,6 +179,7 @@
 #define SYS_AMAIR_EL1			sys_reg(3, 0, 10, 3, 0)
 
 #define SYS_VBAR_EL1			sys_reg(3, 0, 12, 0, 0)
+#define SYS_DISR_EL1			sys_reg(3, 0, 12, 1,  1)
 
 #define SYS_ICC_IAR0_EL1		sys_reg(3, 0, 12, 8, 0)
 #define SYS_ICC_EOIR0_EL1		sys_reg(3, 0, 12, 8, 1)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a807ab55ee10..6dbefe401dc4 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -899,6 +899,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.sign = FTR_UNSIGNED,
 		.field_pos = ID_AA64PFR0_RAS_SHIFT,
 		.min_field_value = ID_AA64PFR0_RAS_V1,
+		.enable = cpu_clear_disr,
 	},
 #endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
@@ -1308,3 +1309,11 @@ static int __init enable_mrs_emulation(void)
 }
 
 late_initcall(enable_mrs_emulation);
+
+int cpu_clear_disr(void *__unused)
+{
+	/* Firmware may have left a deferred SError in this register. */
+	write_sysreg_s(0, SYS_DISR_EL1);
+
+	return 0;
+}
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 9e63f69e1366..8cdfca4060e3 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -34,6 +34,18 @@
 #include <asm/asm-uaccess.h>
 #include <asm/unistd.h>
 
+
+	/*
+	 * Restore syscall arguments from the values already saved on stack
+	 * during kernel_entry.
+	 */
+	.macro restore_syscall_args
+	ldp	x0, x1, [sp]
+	ldp	x2, x3, [sp, #S_X2]
+	ldp	x4, x5, [sp, #S_X4]
+	ldp	x6, x7, [sp, #S_X6]
+	.endm
+
 /*
  * Context tracking subsystem.  Used to instrument transitions
  * between user and kernel mode.
@@ -42,14 +54,8 @@
 #ifdef CONFIG_CONTEXT_TRACKING
 	bl	context_tracking_user_exit
 	.if \syscall == 1
-	/*
-	 * Save/restore needed during syscalls.  Restore syscall arguments from
-	 * the values already saved on stack during kernel_entry.
-	 */
-	ldp	x0, x1, [sp]
-	ldp	x2, x3, [sp, #S_X2]
-	ldp	x4, x5, [sp, #S_X4]
-	ldp	x6, x7, [sp, #S_X6]
+	/* Save/restore needed during syscalls. */
+	restore_syscall_args
 	.endif
 #endif
 	.endm
@@ -153,10 +159,13 @@ alternative_else_nop_endif
 	msr	sp_el0, tsk
 	.endif
 
+	esb
+	disr_read reg=x15
+
 	/*
 	 * Registers that may be useful after this macro is invoked:
 	 *
-	 * x21 - aborted SP
+	 * x15 - Deferred Interrupt Status value
 	 * x22 - aborted PC
 	 * x23 - aborted PSTATE
 	*/
@@ -312,6 +321,31 @@ tsk	.req	x28		// current thread_info
 	irq_stack_exit
 	.endm
 
+/* Handle any non-zero DISR value if supported.
+ *
+ * @reg     the register holding DISR
+ * @syscall whether the syscall args should be restored if we call
+ *          do_deferred_serror (default: no)
+ */
+	.macro disr_check	reg, syscall = 0
+#ifdef CONFIG_ARM64_RAS_EXTN
+alternative_if_not ARM64_HAS_RAS_EXTN
+	b	9998f
+alternative_else_nop_endif
+	cbz	\reg, 9998f
+
+	mov	x1, \reg
+	mov	x0, sp
+	bl	do_deferred_serror
+
+	.if \syscall == 1
+	restore_syscall_args
+	.endif
+9998:
+#endif /* CONFIG_ARM64_RAS_EXTN */
+	.endm
+
+
 	.text
 
 /*
@@ -404,8 +438,12 @@ ENDPROC(el1_error_invalid)
 	.align	6
 el1_sync:
 	kernel_entry 1
-	mrs	x0, far_el1
-	mrs	x1, esr_el1			// read the syndrome register
+	mrs	x26, far_el1
+	mrs	x25, esr_el1			// read the syndrome register
+	disr_check	reg=x15
+	mov	x0, x26
+	mov	x1, x25
+
 	lsr	x24, x1, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_BREAKPT_CUR	// debug exception in EL1
 	b.ge	el1_dbg
@@ -461,6 +499,7 @@ el1_dbg:
 	tbz	x24, #0, el1_inv		// EL1 only
 	mov	x2, sp				// struct pt_regs
 	bl	do_debug_exception
+
 	kernel_exit 1
 el1_inv:
 	// TODO: add support for undefined instructions in kernel mode
@@ -473,6 +512,7 @@ ENDPROC(el1_sync)
 	.align	6
 el1_irq:
 	kernel_entry 1
+	disr_check	reg=x15
 	enable_da_f
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
@@ -511,6 +551,8 @@ el0_sync:
 	kernel_entry 0
 	mrs	x25, esr_el1			// read the syndrome register
 	mrs	x26, far_el1
+	disr_check	reg=x15, syscall=1
+
 	lsr	x24, x25, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_BREAKPT_LOW	// debug exception in EL0
 	b.ge	el0_dbg
@@ -544,6 +586,8 @@ el0_sync_compat:
 	kernel_entry 0, 32
 	mrs	x25, esr_el1			// read the syndrome register
 	mrs	x26, far_el1
+	disr_check	reg=x15, syscall=1
+
 	lsr	x24, x25, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_BREAKPT_LOW	// debug exception in EL0
 	b.ge	el0_dbg
@@ -677,6 +721,7 @@ ENDPROC(el0_sync)
 el0_irq:
 	kernel_entry 0
 el0_irq_naked:
+	disr_check	reg=x15
 	enable_da_f
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
@@ -693,18 +738,24 @@ ENDPROC(el0_irq)
 
 el1_serror:
 	kernel_entry 1
+	mov	x20, x15
 	mrs	x1, esr_el1
 	mov	x0, sp
 	bl	do_serror
+
+	disr_check	reg=x20
 	kernel_exit 1
 ENDPROC(el1_serror)
 
 el0_serror:
 	kernel_entry 0
 el0_serror_naked:
+	mov	x20, x15
 	mrs	x1, esr_el1
 	mov	x0, sp
 	bl	do_serror
+
+	disr_check	reg=x20
 	enable_daif
 	ct_user_exit
 	b	ret_to_user
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index e1eaccc66548..27ebcaa2f0b6 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -729,6 +729,11 @@ asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
 	nmi_exit();
 }
 
+asmlinkage void do_deferred_serror(struct pt_regs *regs, u64 disr)
+{
+	return do_serror(regs, disr_to_esr(disr));
+}
+
 void __pte_error(const char *file, int line, unsigned long val)
 {
 	pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 11/16] arm64: kernel: Handle deferred SError on kernel entry
@ 2017-07-28 14:10   ` James Morse
  0 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel

Before we can enable Implicit ESB on exception level change, we need to
handle deferred SErrors that may appear on exception entry.

Add code to kernel_entry to synchronize errors then read and clear
DISR_EL1. Call do_deferred_serror() if it had a non-zero value.
(The IESB feature will allow this explicit ESB to be removed)

These checks are needed in the SError vector too, as we may take a pending
SError that was signalled by a device, and on entry to EL1 synchronize
then defer a RAS SError that hadn't yet been made pending. We process the
'taken' SError first as it is more likely to be fatal.

Clear DISR_EL1 from the RAS cpufeature enable call. This means any value
we find in DISR_EL1 was triggered and deferred by our actions. We have
executed ESB prior to this point, but these occur with SError unmasked so
will not have been deferred.

Signed-off-by: James Morse <james.morse@arm.com>
---
Remove the 'x21  aborted SP' line from entry.S - its not true.

 arch/arm64/include/asm/assembler.h | 23 ++++++++++++
 arch/arm64/include/asm/esr.h       |  7 ++++
 arch/arm64/include/asm/exception.h | 14 ++++++++
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  1 +
 arch/arm64/kernel/cpufeature.c     |  9 +++++
 arch/arm64/kernel/entry.S          | 73 ++++++++++++++++++++++++++++++++------
 arch/arm64/kernel/traps.c          |  5 +++
 8 files changed, 122 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index a013ab05210d..e2bb551f59f7 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -110,6 +110,13 @@
 	.endm
 
 /*
+ * RAS Error Synchronization barrier
+ */
+	.macro	esb
+	hint	#16
+	.endm
+
+/*
  * NOP sequence
  */
 	.macro	nops, num
@@ -455,6 +462,22 @@ alternative_endif
 	.endm
 
 /*
+ * Read and clear DISR if supported
+ */
+	.macro disr_read, reg
+	alternative_if ARM64_HAS_RAS_EXTN
+	mrs_s	\reg, SYS_DISR_EL1
+	cbz	\reg, 99f
+	msr_s	SYS_DISR_EL1, xzr
+99:
+	alternative_else
+	mov	\reg,	xzr
+	nop
+	nop
+	alternative_endif
+	.endm
+
+/*
  * Errata workaround prior to TTBR0_EL1 update
  *
  * 	val:	TTBR value with new BADDR, preserved
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 77d5b1baf1a4..41a0767e2600 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -124,6 +124,13 @@
 #define ESR_ELx_WFx_ISS_WFE	(UL(1) << 0)
 #define ESR_ELx_xVC_IMM_MASK	((1UL << 16) - 1)
 
+#define DISR_EL1_IDS		 (UL(1) << 24)
+/*
+ * DISR_EL1 and ESR_ELx share the bottom 13 bits, but the RES0 bits may mean
+ * different things in the future...
+ */
+#define DISR_EL1_ESR_MASK	(ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)
+
 /* ESR value templates for specific events */
 
 /* BRK instruction trap from AArch64 state */
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index 0c2eec490abf..bc30429d8e91 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -18,6 +18,8 @@
 #ifndef __ASM_EXCEPTION_H
 #define __ASM_EXCEPTION_H
 
+#include <asm/esr.h>
+
 #include <linux/interrupt.h>
 
 #define __exception	__attribute__((section(".exception.text")))
@@ -27,4 +29,16 @@
 #define __exception_irq_entry	__exception
 #endif
 
+static inline u32 disr_to_esr(u64 disr)
+{
+	unsigned int esr = ESR_ELx_EC_SERROR << ESR_ELx_EC_SHIFT;
+
+	if ((disr & DISR_EL1_IDS) == 0)
+		esr |= (disr & DISR_EL1_ESR_MASK);
+	else
+		esr |= (disr & ESR_ELx_ISS_MASK);
+
+	return esr;
+}
+
 #endif	/* __ASM_EXCEPTION_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 64c9e78f9882..82e8ff01153d 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -193,5 +193,6 @@ static inline void spin_lock_prefetch(const void *ptr)
 
 int cpu_enable_pan(void *__unused);
 int cpu_enable_cache_maint_trap(void *__unused);
+int cpu_clear_disr(void *__unused);
 
 #endif /* __ASM_PROCESSOR_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 58358acf7c9b..18cabd92af22 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -179,6 +179,7 @@
 #define SYS_AMAIR_EL1			sys_reg(3, 0, 10, 3, 0)
 
 #define SYS_VBAR_EL1			sys_reg(3, 0, 12, 0, 0)
+#define SYS_DISR_EL1			sys_reg(3, 0, 12, 1,  1)
 
 #define SYS_ICC_IAR0_EL1		sys_reg(3, 0, 12, 8, 0)
 #define SYS_ICC_EOIR0_EL1		sys_reg(3, 0, 12, 8, 1)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a807ab55ee10..6dbefe401dc4 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -899,6 +899,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.sign = FTR_UNSIGNED,
 		.field_pos = ID_AA64PFR0_RAS_SHIFT,
 		.min_field_value = ID_AA64PFR0_RAS_V1,
+		.enable = cpu_clear_disr,
 	},
 #endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
@@ -1308,3 +1309,11 @@ static int __init enable_mrs_emulation(void)
 }
 
 late_initcall(enable_mrs_emulation);
+
+int cpu_clear_disr(void *__unused)
+{
+	/* Firmware may have left a deferred SError in this register. */
+	write_sysreg_s(0, SYS_DISR_EL1);
+
+	return 0;
+}
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 9e63f69e1366..8cdfca4060e3 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -34,6 +34,18 @@
 #include <asm/asm-uaccess.h>
 #include <asm/unistd.h>
 
+
+	/*
+	 * Restore syscall arguments from the values already saved on stack
+	 * during kernel_entry.
+	 */
+	.macro restore_syscall_args
+	ldp	x0, x1, [sp]
+	ldp	x2, x3, [sp, #S_X2]
+	ldp	x4, x5, [sp, #S_X4]
+	ldp	x6, x7, [sp, #S_X6]
+	.endm
+
 /*
  * Context tracking subsystem.  Used to instrument transitions
  * between user and kernel mode.
@@ -42,14 +54,8 @@
 #ifdef CONFIG_CONTEXT_TRACKING
 	bl	context_tracking_user_exit
 	.if \syscall == 1
-	/*
-	 * Save/restore needed during syscalls.  Restore syscall arguments from
-	 * the values already saved on stack during kernel_entry.
-	 */
-	ldp	x0, x1, [sp]
-	ldp	x2, x3, [sp, #S_X2]
-	ldp	x4, x5, [sp, #S_X4]
-	ldp	x6, x7, [sp, #S_X6]
+	/* Save/restore needed during syscalls. */
+	restore_syscall_args
 	.endif
 #endif
 	.endm
@@ -153,10 +159,13 @@ alternative_else_nop_endif
 	msr	sp_el0, tsk
 	.endif
 
+	esb
+	disr_read reg=x15
+
 	/*
 	 * Registers that may be useful after this macro is invoked:
 	 *
-	 * x21 - aborted SP
+	 * x15 - Deferred Interrupt Status value
 	 * x22 - aborted PC
 	 * x23 - aborted PSTATE
 	*/
@@ -312,6 +321,31 @@ tsk	.req	x28		// current thread_info
 	irq_stack_exit
 	.endm
 
+/* Handle any non-zero DISR value if supported.
+ *
+ * @reg     the register holding DISR
+ * @syscall whether the syscall args should be restored if we call
+ *          do_deferred_serror (default: no)
+ */
+	.macro disr_check	reg, syscall = 0
+#ifdef CONFIG_ARM64_RAS_EXTN
+alternative_if_not ARM64_HAS_RAS_EXTN
+	b	9998f
+alternative_else_nop_endif
+	cbz	\reg, 9998f
+
+	mov	x1, \reg
+	mov	x0, sp
+	bl	do_deferred_serror
+
+	.if \syscall == 1
+	restore_syscall_args
+	.endif
+9998:
+#endif /* CONFIG_ARM64_RAS_EXTN */
+	.endm
+
+
 	.text
 
 /*
@@ -404,8 +438,12 @@ ENDPROC(el1_error_invalid)
 	.align	6
 el1_sync:
 	kernel_entry 1
-	mrs	x0, far_el1
-	mrs	x1, esr_el1			// read the syndrome register
+	mrs	x26, far_el1
+	mrs	x25, esr_el1			// read the syndrome register
+	disr_check	reg=x15
+	mov	x0, x26
+	mov	x1, x25
+
 	lsr	x24, x1, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_BREAKPT_CUR	// debug exception in EL1
 	b.ge	el1_dbg
@@ -461,6 +499,7 @@ el1_dbg:
 	tbz	x24, #0, el1_inv		// EL1 only
 	mov	x2, sp				// struct pt_regs
 	bl	do_debug_exception
+
 	kernel_exit 1
 el1_inv:
 	// TODO: add support for undefined instructions in kernel mode
@@ -473,6 +512,7 @@ ENDPROC(el1_sync)
 	.align	6
 el1_irq:
 	kernel_entry 1
+	disr_check	reg=x15
 	enable_da_f
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
@@ -511,6 +551,8 @@ el0_sync:
 	kernel_entry 0
 	mrs	x25, esr_el1			// read the syndrome register
 	mrs	x26, far_el1
+	disr_check	reg=x15, syscall=1
+
 	lsr	x24, x25, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_BREAKPT_LOW	// debug exception in EL0
 	b.ge	el0_dbg
@@ -544,6 +586,8 @@ el0_sync_compat:
 	kernel_entry 0, 32
 	mrs	x25, esr_el1			// read the syndrome register
 	mrs	x26, far_el1
+	disr_check	reg=x15, syscall=1
+
 	lsr	x24, x25, #ESR_ELx_EC_SHIFT	// exception class
 	cmp	x24, #ESR_ELx_EC_BREAKPT_LOW	// debug exception in EL0
 	b.ge	el0_dbg
@@ -677,6 +721,7 @@ ENDPROC(el0_sync)
 el0_irq:
 	kernel_entry 0
 el0_irq_naked:
+	disr_check	reg=x15
 	enable_da_f
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
@@ -693,18 +738,24 @@ ENDPROC(el0_irq)
 
 el1_serror:
 	kernel_entry 1
+	mov	x20, x15
 	mrs	x1, esr_el1
 	mov	x0, sp
 	bl	do_serror
+
+	disr_check	reg=x20
 	kernel_exit 1
 ENDPROC(el1_serror)
 
 el0_serror:
 	kernel_entry 0
 el0_serror_naked:
+	mov	x20, x15
 	mrs	x1, esr_el1
 	mov	x0, sp
 	bl	do_serror
+
+	disr_check	reg=x20
 	enable_daif
 	ct_user_exit
 	b	ret_to_user
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index e1eaccc66548..27ebcaa2f0b6 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -729,6 +729,11 @@ asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
 	nmi_exit();
 }
 
+asmlinkage void do_deferred_serror(struct pt_regs *regs, u64 disr)
+{
+	return do_serror(regs, disr_to_esr(disr));
+}
+
 void __pte_error(const char *file, int line, unsigned long val)
 {
 	pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 12/16] arm64: entry.S: Make eret restartable
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

To gain any benefit from IESB on exception return we must unmask SError
over ERET instructions so that the SError is taken to EL1, instead of
deferred. SErrors deferred like this would only be processed once we take
another exception, at which point they may be overwritten by a new (less
severe) deferred SError.

The 'IESB' bit in the ESR isn't enough for us to fix up this error, as
we may take a pending SError the moment we unmask it, instead of it being
synchronized by IESB when we ERET.

Instead we move exception return out of the kernel_exit macro so that its
PC range is well-known, and stash the SPSR and ELR which would be lost if
we take an SError from this code.

_do_serror() is extended to match the interrupted PC against the well-known
do_kernel_exit range and restore the stashed values.

Now if we take a survivable SError from EL1 to EL1, we must check whether
kernel_exit had already restored the EL0 state; if so we must call
'kernel_entry 0' from el1_serror. _do_serror() restores the clobbered SPSR
value, and we then return to EL0 from el1_serror. This keeps the
enter/exit calls balanced.
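
As a rough standalone C restatement of that decision (the PSR_MODE_*
values below are my reading of arm64's ptrace.h, and the helper and
struct names are illustrative, not the kernel's):

#include <stdbool.h>
#include <stdint.h>

#define PSR_MODE_MASK	0x0000000fUL	/* assumed value */
#define PSR_MODE_EL0t	0x00000000UL	/* assumed value */

struct stashed_regs { uint64_t elr, spsr; };

/*
 * Did the SError interrupt do_kernel_exit while it was eret'ing to EL0?
 * If so, el1_serror must use the 'kernel_entry 0'/'kernel_exit 0' pair so
 * that the EL0 state (e.g. sp_el0) is saved and restored as user state.
 */
static bool serror_hit_eret_to_el0(uint64_t elr_el1, uint64_t exit_start,
				   uint64_t exit_end,
				   const struct stashed_regs *stash)
{
	if (elr_el1 < exit_start || elr_el1 >= exit_end)
		return false;			/* take the EL1 exit path */

	/* the stashed SPSR says where the interrupted eret was heading */
	return (stash->spsr & PSR_MODE_MASK) == PSR_MODE_EL0t;
}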

None of this code is specific to IESB, so enable it for all platforms. On
systems without the IESB feature we may take a pending SError earlier.

Signed-off-by: James Morse <james.morse@arm.com>
---
Known issue: If _do_serror() takes a synchronous exception the per-cpu SPSR
and ELR will be overwritten. A WARN_ON() firing is the most likely way of
doing this. Fixing it requires the asm to do the restore, which makes it
three times as complicated. This shouldn't be a problem for _do_serror()
as it is today.

 arch/arm64/include/asm/exception.h | 20 +++++++++++++++
 arch/arm64/kernel/entry.S          | 51 +++++++++++++++++++++++++++++++++++++-
 arch/arm64/kernel/traps.c          | 23 ++++++++++++++---
 3 files changed, 90 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index bc30429d8e91..a0ef187127ea 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -18,7 +18,10 @@
 #ifndef __ASM_EXCEPTION_H
 #define __ASM_EXCEPTION_H
 
+#ifndef __ASSEMBLY__
+
 #include <asm/esr.h>
+#include <asm/ptrace.h>
 
 #include <linux/interrupt.h>
 
@@ -41,4 +44,21 @@ static inline u32 disr_to_esr(u64 disr)
 	return esr;
 }
 
+extern char __do_kernel_exit_start;
+extern char __do_kernel_exit_end;
+
+static inline bool __is_kernel_exit(unsigned long pc)
+{
+	return ((unsigned long)&__do_kernel_exit_start <= pc &&
+		pc < (unsigned long)&__do_kernel_exit_end);
+}
+#else
+/* result returned in flags, 'lo' true, 'hs' false */
+.macro	is_kernel_exit, reg, tmp
+	adr	\tmp, __do_kernel_exit_start
+	cmp	\reg, \tmp
+	adr	\tmp, __do_kernel_exit_end
+	ccmp	\reg, \tmp, #2, hs
+.endm
+#endif /* __ASSEMBLY__*/
 #endif	/* __ASM_EXCEPTION_H */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 8cdfca4060e3..173b86fac066 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -26,6 +26,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/cpufeature.h>
 #include <asm/errno.h>
+#include <asm/exception.h>
 #include <asm/esr.h>
 #include <asm/irq.h>
 #include <asm/memory.h>
@@ -239,6 +240,10 @@ alternative_else_nop_endif
 #endif
 	.endif
 
+	/* Stash elr and spsr so we can restart this eret */
+	adr_this_cpu x15, __exit_exception_regs, tmp=x16
+	stp	x21, x22, [x15]
+
 	msr	elr_el1, x21			// set up the return data
 	msr	spsr_el1, x22
 	ldp	x0, x1, [sp, #16 * 0]
@@ -258,7 +263,7 @@ alternative_else_nop_endif
 	ldp	x28, x29, [sp, #16 * 14]
 	ldr	lr, [sp, #S_LR]
 	add	sp, sp, #S_FRAME_SIZE		// restore sp
-	eret					// return to kernel
+	b	do_kernel_exit
 	.endm
 
 	.macro	irq_stack_entry
@@ -432,6 +437,17 @@ el1_error_invalid:
 	inv_entry 1, BAD_ERROR
 ENDPROC(el1_error_invalid)
 
+.global __do_kernel_exit_start
+.global __do_kernel_exit_end
+ENTRY(do_kernel_exit)
+__do_kernel_exit_start:
+	enable_serror
+	esb
+
+	eret
+__do_kernel_exit_end:
+ENDPROC(do_kernel_exit)
+
 /*
  * EL1 mode handlers.
  */
@@ -737,13 +753,46 @@ el0_irq_naked:
 ENDPROC(el0_irq)
 
 el1_serror:
+	/*
+	 * If this SError was taken while returning from EL1
+	 * to EL0, then sp_el0 is a user space address, even though we took the
+	 * exception from EL1.
+	 * Did we interrupt __do_kernel_exit()?
+	 */
+	stp	x0, x1, [sp, #-16]!
+	mrs	x0, elr_el1
+	is_kernel_exit x0, x1
+	b.hs	1f
+
+	/*
+	 * Retrieve the per-cpu stashed SPSR to check if the interrupted
+	 * kernel_exit was heading for EL0.
+	 */
+	adr_this_cpu x0, __exit_exception_regs, tmp=x1
+	ldr	x1, [x0, #8]
+	and	x1, x1, #PSR_MODE_MASK
+	cmp	x1, #PSR_MODE_EL0t
+	b.ne	1f
+
+	ldp	x0, x1, [sp], #16
+	kernel_entry 0
+	mov	x24, #0		// do EL0 exit
+	b	2f
+
+1:	ldp	x0, x1, [sp], #16
 	kernel_entry 1
+	mov	x24, #1		// do EL1 exit
+2:
 	mov	x20, x15
 	mrs	x1, esr_el1
 	mov	x0, sp
 	bl	do_serror
 
 	disr_check	reg=x20
+
+	cbnz	x24, 9f
+	kernel_exit 0		// do_serror() restored the clobbered ELR, SPSR
+9:
 	kernel_exit 1
 ENDPROC(el1_serror)
 
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 27ebcaa2f0b6..18f53e3afd06 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -29,6 +29,7 @@
 #include <linux/kexec.h>
 #include <linux/delay.h>
 #include <linux/init.h>
+#include <linux/percpu.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/debug.h>
 #include <linux/sched/task_stack.h>
@@ -40,6 +41,7 @@
 #include <asm/debug-monitors.h>
 #include <asm/esr.h>
 #include <asm/insn.h>
+#include <asm/kprobes.h>
 #include <asm/traps.h>
 #include <asm/stack_pointer.h>
 #include <asm/stacktrace.h>
@@ -56,6 +58,9 @@ static const char *handler[]= {
 
 int show_unhandled_signals = 1;
 
+/* Stashed ELR/SPSR pair for restoring after taking an SError during eret */
+DEFINE_PER_CPU(u64 [2], __exit_exception_regs);
+
 /*
  * Dump out the contents of some kernel memory nicely...
  */
@@ -696,7 +701,7 @@ static void do_serror_panic(struct pt_regs *regs, unsigned int esr)
 	nmi_panic(regs, "Asynchronous SError Interrupt");
 }
 
-static void _do_serror(struct pt_regs *regs, unsigned int esr)
+static void __kprobes _do_serror(struct pt_regs *regs, unsigned int esr)
 {
 	bool impdef_syndrome = esr & ESR_ELx_ISV;	/* aka IDS */
 	unsigned int aet = esr & ESR_ELx_AET;
@@ -718,9 +723,21 @@ static void _do_serror(struct pt_regs *regs, unsigned int esr)
 	default:
 		return do_serror_panic(regs, esr);
 	}
+
+	/*
+	 * If we took this SError during kernel_exit restore the ELR and SPSR.
+	 * We can only do this if the interrupted PC points into do_kernel_exit.
+	 * We can't return into do_kernel_exit code and restore the ELR and
+	 * SPSR, so instead we skip the rest of do_kernel_exit and unmask SError
+	 * and eret with the stashed values on our own return path.
+	 */
+	if (__is_kernel_exit(regs->pc)) {
+		regs->pc = this_cpu_read(__exit_exception_regs[0]);
+		regs->pstate = this_cpu_read(__exit_exception_regs[1]);
+	}
 }
 
-asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+asmlinkage void __kprobes do_serror(struct pt_regs *regs, unsigned int esr)
 {
 	nmi_enter();
 
@@ -729,7 +746,7 @@ asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
 	nmi_exit();
 }
 
-asmlinkage void do_deferred_serror(struct pt_regs *regs, u64 disr)
+asmlinkage void __kprobes do_deferred_serror(struct pt_regs *regs, u64 disr)
 {
 	return do_serror(regs, disr_to_esr(disr));
 }
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 13/16] arm64: cpufeature: Enable Implicit ESB on entry/return-from EL1
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

ARM v8.2 adds a feature that inserts implicit error synchronization
barriers whenever the CPU enters or returns from EL1. Add code to detect
this feature and enable the SCTLR_EL1.IESB bit.
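
As a standalone sketch of what that amounts to (the position of the IESB
field in ID_AA64MMFR2_EL1 is my assumption; SCTLR_EL1_IESB matches the
define added by this patch):

#include <stdbool.h>
#include <stdint.h>

#define ID_AA64MMFR2_IESB_SHIFT	12		/* assumed field position */
#define SCTLR_EL1_IESB		(1UL << 21)

/* true if this CPU advertises the Implicit ESB feature */
static bool cpu_has_iesb(uint64_t id_aa64mmfr2)
{
	return ((id_aa64mmfr2 >> ID_AA64MMFR2_IESB_SHIFT) & 0xf) != 0;
}

/* what cpu_enable_iesb() boils down to: set the bit, then write it back */
static uint64_t sctlr_with_iesb(uint64_t sctlr_el1)
{
	return sctlr_el1 | SCTLR_EL1_IESB;
}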

The explicit ESBs on entry/return-from EL1 are replaced with nops
by this feature.

Platform level RAS support may require additional firmware support.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/Kconfig                 | 15 +++++++++++++++
 arch/arm64/include/asm/cpucaps.h   |  3 ++-
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/include/asm/sysreg.h    |  1 +
 arch/arm64/kernel/cpufeature.c     | 21 +++++++++++++++++++++
 arch/arm64/kernel/entry.S          |  4 ++++
 6 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 6e417e25672f..f6367cc2e180 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -976,6 +976,21 @@ config ARM64_RAS_EXTN
 	  and access the new registers if the system supports the extension.
 	  Platform RAS features may additionally depend on firmware support.
 
+config ARM64_IESB
+	bool "Enable Implicit Error Synchronization Barrier (IESB)"
+	default y
+	depends on ARM64_RAS_EXTN
+	help
+	  ARM v8.2 adds a feature to add implicit error synchronization
+	  barriers whenever the CPU enters or exits a particular exception
+	  level.
+
+	  On CPUs with this feature and the 'RAS Extensions' feature, we can
+	  use this to contain detected (but not yet reported) errors to the
+	  relevant exception level.
+
+	  The feature is detected at runtime, selecting this option will
+	  enable these implicit barriers if the CPU supports the feature.
 endmenu
 
 config ARM64_MODULE_CMODEL_LARGE
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index f93bf77f1f74..716545c96714 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -40,7 +40,8 @@
 #define ARM64_WORKAROUND_858921			19
 #define ARM64_WORKAROUND_CAVIUM_30115		20
 #define ARM64_HAS_RAS_EXTN			21
+#define ARM64_HAS_IESB				22
 
-#define ARM64_NCAPS				22
+#define ARM64_NCAPS				23
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 82e8ff01153d..fe353f5a4a7b 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -194,5 +194,6 @@ static inline void spin_lock_prefetch(const void *ptr)
 int cpu_enable_pan(void *__unused);
 int cpu_enable_cache_maint_trap(void *__unused);
 int cpu_clear_disr(void *__unused);
+int cpu_enable_iesb(void *__unused);
 
 #endif /* __ASM_PROCESSOR_H */
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 18cabd92af22..e817e802f0b9 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -314,6 +314,7 @@
 /* SCTLR_EL1 specific flags. */
 #define SCTLR_EL1_UCI		(1 << 26)
 #define SCTLR_EL1_SPAN		(1 << 23)
+#define SCTLR_EL1_IESB		(1 << 21)
 #define SCTLR_EL1_UCT		(1 << 15)
 #define SCTLR_EL1_SED		(1 << 8)
 #define SCTLR_EL1_CP15BEN	(1 << 5)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6dbefe401dc4..12cb1b7ef46b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -901,6 +901,19 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.min_field_value = ID_AA64PFR0_RAS_V1,
 		.enable = cpu_clear_disr,
 	},
+#ifdef CONFIG_ARM64_IESB
+	{
+		.desc = "Implicit Error Synchronization Barrier",
+		.capability = ARM64_HAS_IESB,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64MMFR2_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64MMFR2_IESB_SHIFT,
+		.min_field_value = 1,
+		.enable = cpu_enable_iesb,
+	},
+#endif /* CONFIG_ARM64_IESB */
 #endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
 };
@@ -1317,3 +1330,11 @@ int cpu_clear_disr(void *__unused)
 
 	return 0;
 }
+
+int cpu_enable_iesb(void *__unused)
+{
+	if (cpus_have_cap(ARM64_HAS_RAS_EXTN))
+		config_sctlr_el1(0, SCTLR_EL1_IESB);
+
+	return 0;
+}
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 173b86fac066..0e9b056385c2 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -160,7 +160,9 @@ alternative_else_nop_endif
 	msr	sp_el0, tsk
 	.endif
 
+	alternative_if_not ARM64_HAS_IESB
 	esb
+	alternative_else_nop_endif
 	disr_read reg=x15
 
 	/*
@@ -442,7 +444,9 @@ ENDPROC(el1_error_invalid)
 ENTRY(do_kernel_exit)
 __do_kernel_exit_start:
 	enable_serror
+	alternative_if_not ARM64_HAS_IESB
 	esb
+	alternative_else_nop_endif
 
 	eret
 __do_kernel_exit_end:
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 14/16] KVM: arm64: Take pending SErrors on entry to the guest
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

SErrors due to RAS are either taken as an SError, or deferred because
of an Error Synchronization Barrier (ESB). Systems that support the RAS
extensions are very likely to have firmware-first handling of these
errors, taking all SErrors to EL3.
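
For reference, the 'deferred' case relies on the pattern below: an ESB
turns any pending physical SError into a syndrome readable from DISR_EL1.
This is a minimal sketch assuming ESB is the HINT #16 encoding and using
the DISR_EL1 encoding added earlier in the series (sys_reg(3, 0, 12, 1, 1));
it is not code from this patch:

#include <stdint.h>

static inline uint64_t esb_and_read_disr(void)
{
	uint64_t disr;

	/* ESB: defer any pending SError, recording it in DISR_EL1 */
	asm volatile("hint #16" ::: "memory");
	asm volatile("mrs %0, s3_0_c12_c1_1" : "=r" (disr));
	return disr;
}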

Add {I,}ESB support to KVM and be prepared to handle any resulting SError
if we are notified directly (i.e. no firmware-first handling). Do this
for the cases where we can take the SError instead of deferring it.

With VHE, KVM is covered by the host's setting of SCTLR_EL1.IESB: unmask
SError when entering a guest. This will hyp-panic if there was an SError
pending during world switch (and we don't have firmware-first). Make
sure this only happens when it's KVM's 'fault' by adding an ESB to
__kvm_call_hyp().
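
In other words, an explicit barrier is only needed where the implicit one
isn't there; the kvm_explicit_esb macro below encodes roughly this
condition (standalone restatement, illustrative only):

#include <stdbool.h>

/* does KVM's guest entry/exit path need an explicit esb? */
static bool kvm_needs_explicit_esb(bool has_ras, bool has_vhe, bool has_iesb)
{
	if (!has_ras)
		return false;		/* esb would do nothing useful */

	/* with VHE and IESB the host's SCTLR_EL1.IESB already covers us */
	return !(has_vhe && has_iesb);
}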

On systems without the RAS extensions a pending SError triggered by KVM's
world switch will no longer be blamed on the guest, causing a panic
instead.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h |  1 +
 arch/arm64/kvm/hyp.S               |  1 +
 arch/arm64/kvm/hyp/entry.S         | 18 ++++++++++++++++++
 3 files changed, 20 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index e2bb551f59f7..e440fba6d0fe 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -29,6 +29,7 @@
 #include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
+#include <asm/sysreg.h>
 #include <asm/thread_info.h>
 
 	.macro save_and_disable_daif, flags
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 952f6cb9cf72..e96a5f6afecd 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -40,6 +40,7 @@
  * arch/arm64/kernel/hyp_stub.S.
  */
 ENTRY(__kvm_call_hyp)
+	esb
 alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
 	hvc	#0
 	ret
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 12ee62d6d410..cec18df5a324 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -49,6 +49,21 @@
 	ldp	x29, lr,  [\ctxt, #CPU_XREG_OFFSET(29)]
 .endm
 
+/* We have an implicit esb if we have VHE and IESB. */
+.macro kvm_explicit_esb
+	alternative_if_not ARM64_HAS_RAS_EXTN
+	b	998f
+	alternative_else_nop_endif
+	alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
+	esb
+	b	998f
+	alternative_else_nop_endif
+	alternative_if_not ARM64_HAS_IESB
+	esb
+	alternative_else_nop_endif
+998:
+.endm
+
 /*
  * u64 __guest_enter(struct kvm_vcpu *vcpu,
  *		     struct kvm_cpu_context *host_ctxt);
@@ -85,6 +100,9 @@ ENTRY(__guest_enter)
 	ldr	x18,      [x18, #CPU_XREG_OFFSET(18)]
 
 	// Do not touch any register after this!
+
+	enable_serror		// Don't defer an IESB SError
+	kvm_explicit_esb
 	eret
 ENDPROC(__guest_enter)
 
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 15/16] KVM: arm64: Save ESR_EL2 on guest SError
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

When we exit a guest due to an SError the vcpu fault info isn't updated
with the ESR. Today this is only done for traps.

The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
fault_info with the ESR on SError so that handle_exit() can determine
if this was a RAS SError and decode its severity.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kvm/hyp/switch.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index 945e79c641c4..c6f17c7675ad 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -226,13 +226,20 @@ static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar)
 	return true;
 }
 
+static void __hyp_text __populate_fault_info_esr(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
+}
+
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-	u64 esr = read_sysreg_el2(esr);
-	u8 ec = ESR_ELx_EC(esr);
+	u8 ec;
+	u64 esr;
 	u64 hpfar, far;
 
-	vcpu->arch.fault.esr_el2 = esr;
+	__populate_fault_info_esr(vcpu);
+	esr = vcpu->arch.fault.esr_el2;
+	ec = ESR_ELx_EC(esr);
 
 	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
 		return true;
@@ -321,6 +328,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
 		goto again;
+	else if (ARM_EXCEPTION_CODE(exit_code) == ARM_EXCEPTION_EL1_SERROR)
+		__populate_fault_info_esr(vcpu);
 
 	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
 	    exit_code == ARM_EXCEPTION_TRAP) {
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit
  2017-07-28 14:10 ` James Morse
@ 2017-07-28 14:10   ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-07-28 14:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng, kvmarm

On systems with VHE, the RAS extensions and IESB support, KVM gets an
implicit ESB whenever it enters/exits a guest, because the host sets
SCTLR_EL1.IESB.

To prevent errors being lost, add code to __guest_exit() to read DISR_EL1,
and save it in the kvm_vcpu_fault_info. Add code to handle_exit() to
process this deferred SError. This data is in addition to the reason the
guest exited.

Future patches may add a firmware-first callout from
kvm_handle_guest_serror() to decode CPER records populated by firmware,
or call some arm64 arch code to process the RAS 'ERR' registers for
kernel-first handling. Without either of these, we just make a judgement
on the severity: corrected and restartable errors are ignored, all others
result in an SError being given to the guest.
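
Restated as standalone C, the policy is roughly the following; the ESR
field encodings below are my reading of the RAS spec and the esr.h
additions earlier in the series, so treat the values as illustrative:

#include <stdbool.h>
#include <stdint.h>

#define ESR_ELx_ISV		(1UL << 24)	/* 'IDS' for SError */
#define ESR_ELx_FSC		0x3fUL
#define ESR_ELx_FSC_SERROR	0x11UL
#define ESR_ELx_AET		(0x7UL << 10)
#define ESR_ELx_AET_CE		(6UL << 10)	/* corrected */
#define ESR_ELx_AET_UEO		(2UL << 10)	/* restartable */

/* true: safe to ignore the deferred SError, false: give it to the guest */
static bool deferred_serror_is_ignorable(uint64_t esr, bool have_ras)
{
	if (!have_ras || (esr & ESR_ELx_ISV))
		return false;
	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)
		return false;			/* AET is RES0 here */

	switch (esr & ESR_ELx_AET) {
	case ESR_ELx_AET_CE:
	case ESR_ELx_AET_UEO:
		return true;
	default:
		return false;
	}
}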

On systems with EL2 but where VHE has been disabled in the build config,
add an explicit ESB in the __guest_exit() path. This lets us skip the
SError VAXorcism on all systems with the RAS extensions.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/kvm_emulate.h |  5 +++
 arch/arm64/include/asm/kvm_host.h    |  1 +
 arch/arm64/kernel/asm-offsets.c      |  1 +
 arch/arm64/kvm/handle_exit.c         | 84 +++++++++++++++++++++++++++++-------
 arch/arm64/kvm/hyp/entry.S           |  9 ++++
 5 files changed, 85 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index fe39e6841326..9bec1e572bee 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -168,6 +168,11 @@ static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
 	return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
 }
 
+static inline u64 kvm_vcpu_get_disr(const struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.fault.disr_el1;
+}
+
 static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d68630007b14..f65cc6d497e6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -88,6 +88,7 @@ struct kvm_vcpu_fault_info {
 	u32 esr_el2;		/* Hyp Syndrom Register */
 	u64 far_el2;		/* Hyp Fault Address Register */
 	u64 hpfar_el2;		/* Hyp IPA Fault Address Register */
+	u64 disr_el1;		/* Deferred [SError] Status Register */
 };
 
 /*
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index b3bb7ef97bc8..331cccbd38cf 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -129,6 +129,7 @@ int main(void)
   BLANK();
 #ifdef CONFIG_KVM_ARM_HOST
   DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));
+  DEFINE(VCPU_FAULT_DISR,	offsetof(struct kvm_vcpu, arch.fault.disr_el1));
   DEFINE(CPU_GP_REGS,		offsetof(struct kvm_cpu_context, gp_regs));
   DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_regs, regs));
   DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 17d8a1677a0b..b8e2477853eb 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -23,6 +23,7 @@
 #include <linux/kvm_host.h>
 
 #include <asm/esr.h>
+#include <asm/exception.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_coproc.h>
 #include <asm/kvm_emulate.h>
@@ -67,6 +68,67 @@ static int handle_no_fpsimd(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	return 1;
 }
 
+static void kvm_inject_serror(struct kvm_vcpu *vcpu)
+{
+	u8 hsr_ec = ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu));
+
+	/*
+	 * HVC/SMC already have an adjusted PC, which we need
+	 * to correct in order to return to after having
+	 * injected the SError.
+	 */
+	if (hsr_ec == ESR_ELx_EC_HVC32 || hsr_ec == ESR_ELx_EC_HVC64 ||
+	    hsr_ec == ESR_ELx_EC_SMC32 || hsr_ec == ESR_ELx_EC_SMC64) {
+		u32 adj =  kvm_vcpu_trap_il_is32bit(vcpu) ? 4 : 2;
+		*vcpu_pc(vcpu) -= adj;
+	}
+
+	kvm_inject_vabt(vcpu);
+}
+
+/**
+ * kvm_handle_guest_serror
+ *
+ * @vcpu:	the vcpu pointer
+ * @esr:	the esr_el2 value read at guest exit for an SError, or
+ * 		disr_el1 for a deferred SError.
+ *
+ * Either the guest took an SError, or we found one pending while handling
+ * __guest_exit() at EL2. With the v8.2 RAS extensions SErrors are either
+ * 'implementation defined' or categorised as a RAS exception.
+ * Without any further information we ignore SErrors categorised as
+ * 'corrected' or 'restartable' by RAS, and hand the guest an SError in
+ * all other cases.
+ */
+static int kvm_handle_guest_serror(struct kvm_vcpu *vcpu, u32 esr)
+{
+	bool impdef_syndrome = esr & ESR_ELx_ISV;	/* aka IDS */
+	unsigned int aet = esr & ESR_ELx_AET;
+
+	if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN) || impdef_syndrome) {
+		kvm_inject_serror(vcpu);
+		return 1;
+	}
+
+	/*
+	 * AET is RES0 if 'the value returned in the DFSC field is not
+	 * [ESR_ELx_FSC_SERROR]'
+	 */
+	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) {
+		kvm_inject_serror(vcpu);
+		return 1;
+	}
+
+	switch (aet) {
+	case ESR_ELx_AET_CE:	/* corrected error */
+	case ESR_ELx_AET_UEO:	/* restartable error, not yet consumed */
+		return 0;	/* continue processing the guest exit */
+	default:
+		kvm_inject_serror(vcpu);
+		return 1;
+	}
+}
+
 /**
  * kvm_handle_wfx - handle a wait-for-interrupts or wait-for-event
  *		    instruction executed by a guest
@@ -187,21 +249,13 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 {
 	exit_handle_fn exit_handler;
 
-	if (ARM_SERROR_PENDING(exception_index)) {
-		u8 hsr_ec = ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu));
-
-		/*
-		 * HVC/SMC already have an adjusted PC, which we need
-		 * to correct in order to return to after having
-		 * injected the SError.
-		 */
-		if (hsr_ec == ESR_ELx_EC_HVC32 || hsr_ec == ESR_ELx_EC_HVC64 ||
-		    hsr_ec == ESR_ELx_EC_SMC32 || hsr_ec == ESR_ELx_EC_SMC64) {
-			u32 adj =  kvm_vcpu_trap_il_is32bit(vcpu) ? 4 : 2;
-			*vcpu_pc(vcpu) -= adj;
-		}
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+		u64 disr = kvm_vcpu_get_disr(vcpu);
 
-		kvm_inject_vabt(vcpu);
+		if (disr)
+			kvm_handle_guest_serror(vcpu, disr_to_esr(disr));
+	} else if (ARM_SERROR_PENDING(exception_index)) {
+		kvm_inject_serror(vcpu);
 		return 1;
 	}
 
@@ -211,7 +265,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	case ARM_EXCEPTION_IRQ:
 		return 1;
 	case ARM_EXCEPTION_EL1_SERROR:
-		kvm_inject_vabt(vcpu);
+		kvm_handle_guest_serror(vcpu, kvm_vcpu_get_hsr(vcpu));
 		return 1;
 	case ARM_EXCEPTION_TRAP:
 		/*
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index cec18df5a324..f1baaffa6922 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -142,6 +142,15 @@ ENTRY(__guest_exit)
 	// Now restore the host regs
 	restore_callee_saved_regs x2
 
+	kvm_explicit_esb
+	disr_read	reg=x2
+alternative_if ARM64_HAS_RAS_EXTN
+	str	x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
+	cbz	x2, 1f
+	orr	x0, x0, #(1<<ARM_EXIT_WITH_SERROR_BIT)
+1:	ret
+alternative_else_nop_endif
+
 	// If we have a pending asynchronous abort, now is the
 	// time to find out. From your VAXorcist book, page 666:
 	// "Threaten me not, oh Evil one!  For I speak with
-- 
2.13.2

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 14/16] KVM: arm64: Take pending SErrors on entry to the guest
  2017-07-28 14:10   ` James Morse
@ 2017-08-01 12:53     ` Christoffer Dall
  -1 siblings, 0 replies; 56+ messages in thread
From: Christoffer Dall @ 2017-08-01 12:53 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng,
	linux-arm-kernel, kvmarm

Hi James,

On Fri, Jul 28, 2017 at 03:10:17PM +0100, James Morse wrote:
> SErrors due to RAS are either taken as an SError, or deferred because
> of an Error Synchronization Barrier (ESB). Systems that support the RAS
> extensions are very likely to have firmware-first handling of these
> errors, taking all SErrors to EL3.
> 
> Add {I,}ESB support to KVM and be prepared to handle any resulting SError
> if we are notified directly. (i.e. no firmware-first handling). Do this
> for the cases where we can take the SError instead of deferring it.

Sorry, I forgot how this works again.  If we do have firmware-first, are
we sure that the firmware doesn't just emulate the SError to EL2,
resulting in it looking the same whether or not we have firmware-first?
(perhaps the firmware-first thing clearly defines what should happen
when the firmware detects an SError?)

> 
> With VHE KVM is covered by the host's setting of SCTLR_EL1.IESB: unmask
> SError when entering a guest. This will hyp-panic if there was an SError
> pending during world switch (and we don't have firmware first). Make
> sure this only happens when its KVM's 'fault' by adding an ESB to

nit: it's

> __kvm_call_hyp().
> 
> On systems without the RAS extensions a pending SError triggered by KVM's
> world switch will no longer be blamed on the guest, causing a panic
> instead.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  arch/arm64/include/asm/assembler.h |  1 +
>  arch/arm64/kvm/hyp.S               |  1 +
>  arch/arm64/kvm/hyp/entry.S         | 18 ++++++++++++++++++
>  3 files changed, 20 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index e2bb551f59f7..e440fba6d0fe 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -29,6 +29,7 @@
>  #include <asm/page.h>
>  #include <asm/pgtable-hwdef.h>
>  #include <asm/ptrace.h>
> +#include <asm/sysreg.h>
>  #include <asm/thread_info.h>
>  
>  	.macro save_and_disable_daif, flags
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index 952f6cb9cf72..e96a5f6afecd 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -40,6 +40,7 @@
>   * arch/arm64/kernel/hyp_stub.S.
>   */
>  ENTRY(__kvm_call_hyp)
> +	esb

are esb instructions ignored on earlier versions than ARMv8.2?  (I
couldn't easily determine this from the spec)

So this is not necessarily the world switch path we're entering (as
hinted in the commit message), but we could also be doing TLB
maintenance.  Is this still going to work in that case?

>  alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
>  	hvc	#0
>  	ret
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index 12ee62d6d410..cec18df5a324 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -49,6 +49,21 @@
>  	ldp	x29, lr,  [\ctxt, #CPU_XREG_OFFSET(29)]
>  .endm
>  
> +/* We have an implicit esb if we have VHE and IESB. */
> +.macro kvm_explicit_esb
> +	alternative_if_not ARM64_HAS_RAS_EXTN
> +	b	998f
> +	alternative_else_nop_endif
> +	alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
> +	esb
> +	b	998f
> +	alternative_else_nop_endif
> +	alternative_if_not ARM64_HAS_IESB
> +	esb
> +	alternative_else_nop_endif
> +998:
> +.endm
> +
>  /*
>   * u64 __guest_enter(struct kvm_vcpu *vcpu,
>   *		     struct kvm_cpu_context *host_ctxt);
> @@ -85,6 +100,9 @@ ENTRY(__guest_enter)
>  	ldr	x18,      [x18, #CPU_XREG_OFFSET(18)]
>  
>  	// Do not touch any register after this!
> +
> +	enable_serror		// Don't defer an IESB SError
> +	kvm_explicit_esb
>  	eret
>  ENDPROC(__guest_enter)
>  
> -- 
> 2.13.2
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit
  2017-07-28 14:10   ` James Morse
@ 2017-08-01 13:18     ` Christoffer Dall
  -1 siblings, 0 replies; 56+ messages in thread
From: Christoffer Dall @ 2017-08-01 13:18 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng,
	linux-arm-kernel, kvmarm

On Fri, Jul 28, 2017 at 03:10:19PM +0100, James Morse wrote:
> On systems with VHE, the RAS extensions and IESB support, KVM gets an
> implicit ESB whenever it enters/exits a guest, because the host sets
> SCTLR_EL1.IESB.
> 
> To prevent errors being lost, add code to __guest_exit() to read DISR_EL1,
> and save it in the kvm_vcpu_fault_info. Add code to handle_exit() to
> process this deferred SError. This data is in addition to the reason the
> guest exitted.

Two questions:

First, am I reading the spec incorrectly when it says "The implicit form
of Error Synchronization Barrier: [...] Has no effect on DISR_EL1 or
VDISR_EL2" and I understand this as we wouldn't actually read anything
from DISR_EL1 if we rely on the IESB?

Second, what if we have several SErrors, and one happens upon entering
the guest and another one happens when returning from the guest - do we
end up overwriting the DISR_EL1 by only looking at it during exit and
potentially miss errors?

> 
> Future patches may add a firmware-first callout from
> kvm_handle_deferred_serror() to decode CPER records populated by firmware,
> or call some arm64 arch code to process the RAS 'ERR' registers for
> kernel-first handling. Without either of these, we just make a judgement
> on the severity: corrected and restartable errors are ignored, all others
> result it an SError being given to the guest.

*in an* ?

Why do we give the remaining types of SErrors to the guest?  What would
the kernel normally do for any other workload than running a VM when
discovering this type of error?

> 
> On systems with EL2 but where VHE has been disabled in the build config,
> add an explicit ESB in the __guest_exit() path. This lets us skip the
> SError VAXorcism on all systems with the RAS extensions.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  arch/arm64/include/asm/kvm_emulate.h |  5 +++
>  arch/arm64/include/asm/kvm_host.h    |  1 +
>  arch/arm64/kernel/asm-offsets.c      |  1 +
>  arch/arm64/kvm/handle_exit.c         | 84 +++++++++++++++++++++++++++++-------
>  arch/arm64/kvm/hyp/entry.S           |  9 ++++
>  5 files changed, 85 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> index fe39e6841326..9bec1e572bee 100644
> --- a/arch/arm64/include/asm/kvm_emulate.h
> +++ b/arch/arm64/include/asm/kvm_emulate.h
> @@ -168,6 +168,11 @@ static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
>  	return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
>  }
>  
> +static inline u64 kvm_vcpu_get_disr(const struct kvm_vcpu *vcpu)
> +{
> +	return vcpu->arch.fault.disr_el1;
> +}
> +
>  static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu)
>  {
>  	return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index d68630007b14..f65cc6d497e6 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -88,6 +88,7 @@ struct kvm_vcpu_fault_info {
>  	u32 esr_el2;		/* Hyp Syndrom Register */
>  	u64 far_el2;		/* Hyp Fault Address Register */
>  	u64 hpfar_el2;		/* Hyp IPA Fault Address Register */
> +	u64 disr_el1;		/* Deferred [SError] Status Register */
>  };
>  
>  /*
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index b3bb7ef97bc8..331cccbd38cf 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -129,6 +129,7 @@ int main(void)
>    BLANK();
>  #ifdef CONFIG_KVM_ARM_HOST
>    DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));
> +  DEFINE(VCPU_FAULT_DISR,	offsetof(struct kvm_vcpu, arch.fault.disr_el1));
>    DEFINE(CPU_GP_REGS,		offsetof(struct kvm_cpu_context, gp_regs));
>    DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_regs, regs));
>    DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 17d8a1677a0b..b8e2477853eb 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -23,6 +23,7 @@
>  #include <linux/kvm_host.h>
>  
>  #include <asm/esr.h>
> +#include <asm/exception.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_coproc.h>
>  #include <asm/kvm_emulate.h>
> @@ -67,6 +68,67 @@ static int handle_no_fpsimd(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  	return 1;
>  }
>  
> +static void kvm_inject_serror(struct kvm_vcpu *vcpu)
> +{
> +	u8 hsr_ec = ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu));
> +
> +	/*
> +	 * HVC/SMC already have an adjusted PC, which we need
> +	 * to correct in order to return to after having
> +	 * injected the SError.
> +	 */
> +	if (hsr_ec == ESR_ELx_EC_HVC32 || hsr_ec == ESR_ELx_EC_HVC64 ||
> +	    hsr_ec == ESR_ELx_EC_SMC32 || hsr_ec == ESR_ELx_EC_SMC64) {
> +		u32 adj =  kvm_vcpu_trap_il_is32bit(vcpu) ? 4 : 2;
> +		*vcpu_pc(vcpu) -= adj;
> +	}
> +
> +	kvm_inject_vabt(vcpu);
> +}
> +
> +/**
> + * kvm_handle_guest_serror
> + *
> + * @vcpu:	the vcpu pointer
> + * @esr:	the esr_el2 value read at guest exit for an SError, or
> + * 		disr_el1 for a deferred SError.
> + *
> + * Either the guest took an SError, or we found one pending while handling
> + * __guest_exit() at EL2. With the v8.2 RAS extensions SErrors are either
> + * 'impementation defined' or categorised as a RAS exception.
> + * Without any further information we ignore SErrors categorised as
> + * 'corrected' or 'restartable' by RAS, and hand the guest an SError in
> + * all other cases.
> + */
> +static int kvm_handle_guest_serror(struct kvm_vcpu *vcpu, u32 esr)
> +{
> +	bool impdef_syndrome = esr & ESR_ELx_ISV;	/* aka IDS */
> +	unsigned int aet = esr & ESR_ELx_AET;
> +
> +	if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN) || impdef_syndrome) {
> +		kvm_inject_serror(vcpu);
> +		return 1;
> +	}
> +
> +	/*
> +	 * AET is RES0 if 'the value returned in the DFSC field is not
> +	 * [ESR_ELx_FSC_SERROR]'
> +	 */
> +	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) {
> +		kvm_inject_serror(vcpu);
> +		return 1;
> +	}
> +
> +	switch (aet) {
> +	case ESR_ELx_AET_CE:	/* corrected error */
> +	case ESR_ELx_AET_UEO:	/* restartable error, not yet consumed */
> +		return 0;	/* continue processing the guest exit */
> +	default:
> +		kvm_inject_serror(vcpu);
> +		return 1;
> +	}
> +}
> +
>  /**
>   * kvm_handle_wfx - handle a wait-for-interrupts or wait-for-event
>   *		    instruction executed by a guest
> @@ -187,21 +249,13 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  {
>  	exit_handle_fn exit_handler;
>  
> -	if (ARM_SERROR_PENDING(exception_index)) {
> -		u8 hsr_ec = ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu));
> -
> -		/*
> -		 * HVC/SMC already have an adjusted PC, which we need
> -		 * to correct in order to return to after having
> -		 * injected the SError.
> -		 */
> -		if (hsr_ec == ESR_ELx_EC_HVC32 || hsr_ec == ESR_ELx_EC_HVC64 ||
> -		    hsr_ec == ESR_ELx_EC_SMC32 || hsr_ec == ESR_ELx_EC_SMC64) {
> -			u32 adj =  kvm_vcpu_trap_il_is32bit(vcpu) ? 4 : 2;
> -			*vcpu_pc(vcpu) -= adj;
> -		}
> +	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
> +		u64 disr = kvm_vcpu_get_disr(vcpu);
>  
> -		kvm_inject_vabt(vcpu);
> +		if (disr)
> +			kvm_handle_guest_serror(vcpu, disr_to_esr(disr));
> +	} else if (ARM_SERROR_PENDING(exception_index)) {
> +		kvm_inject_serror(vcpu);
>  		return 1;
>  	}
>  
> @@ -211,7 +265,7 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
>  	case ARM_EXCEPTION_IRQ:
>  		return 1;
>  	case ARM_EXCEPTION_EL1_SERROR:
> -		kvm_inject_vabt(vcpu);
> +		kvm_handle_guest_serror(vcpu, kvm_vcpu_get_hsr(vcpu));
>  		return 1;
>  	case ARM_EXCEPTION_TRAP:
>  		/*
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index cec18df5a324..f1baaffa6922 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -142,6 +142,15 @@ ENTRY(__guest_exit)
>  	// Now restore the host regs
>  	restore_callee_saved_regs x2
>  
> +	kvm_explicit_esb
> +	disr_read	reg=x2
> +alternative_if ARM64_HAS_RAS_EXTN
> +	str	x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
> +	cbz	x2, 1f
> +	orr	x0, x0, #(1<<ARM_EXIT_WITH_SERROR_BIT)
> +1:	ret
> +alternative_else_nop_endif
> +
>  	// If we have a pending asynchronous abort, now is the
>  	// time to find out. From your VAXorcist book, page 666:
>  	// "Threaten me not, oh Evil one!  For I speak with
> -- 
> 2.13.2
> 

Otherwise this patch looks good to me.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 15/16] KVM: arm64: Save ESR_EL2 on guest SError
  2017-07-28 14:10   ` James Morse
@ 2017-08-01 13:25     ` Christoffer Dall
  -1 siblings, 0 replies; 56+ messages in thread
From: Christoffer Dall @ 2017-08-01 13:25 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng,
	linux-arm-kernel, kvmarm

On Fri, Jul 28, 2017 at 03:10:18PM +0100, James Morse wrote:
> When we exit a guest due to an SError the vcpu fault info isn't updated
> with the ESR. Today this is only done for traps.
>
> The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
> fault_info with the ESR on SError so that handle_exit() can determine
> if this was a RAS SError and decode its severity.
>
> Signed-off-by: James Morse <james.morse@arm.com>

Reviewed-by: Christoffer Dall <cdall@linaro.org>

> ---
>  arch/arm64/kvm/hyp/switch.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 945e79c641c4..c6f17c7675ad 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -226,13 +226,20 @@ static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar)
>  	return true;
>  }
>  
> +static void __hyp_text __populate_fault_info_esr(struct kvm_vcpu *vcpu)
> +{
> +	vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
> +}
> +
>  static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
>  {
> -	u64 esr = read_sysreg_el2(esr);
> -	u8 ec = ESR_ELx_EC(esr);
> +	u8 ec;
> +	u64 esr;
>  	u64 hpfar, far;
>  
> -	vcpu->arch.fault.esr_el2 = esr;
> +	__populate_fault_info_esr(vcpu);
> +	esr = vcpu->arch.fault.esr_el2;
> +	ec = ESR_ELx_EC(esr);
>  
>  	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
>  		return true;
> @@ -321,6 +328,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>  	 */
>  	if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
>  		goto again;
> +	else if (ARM_EXCEPTION_CODE(exit_code) == ARM_EXCEPTION_EL1_SERROR)
> +		__populate_fault_info_esr(vcpu);
>  
>  	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
>  	    exit_code == ARM_EXCEPTION_TRAP) {
> --
> 2.13.2
>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit
  2017-08-01 13:18     ` Christoffer Dall
@ 2017-08-03 17:03       ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-08-03 17:03 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng,
	linux-arm-kernel, kvmarm

Hi Christoffer,

On 01/08/17 14:18, Christoffer Dall wrote:
> On Fri, Jul 28, 2017 at 03:10:19PM +0100, James Morse wrote:
>> On systems with VHE, the RAS extensions and IESB support, KVM gets an
>> implicit ESB whenever it enters/exits a guest, because the host sets
>> SCTLR_EL1.IESB.
>>
>> To prevent errors being lost, add code to __guest_exit() to read DISR_EL1,
>> and save it in the kvm_vcpu_fault_info. Add code to handle_exit() to
>> process this deferred SError. This data is in addition to the reason the
>> guest exitted.
> 
> Two questions:
> 
> First, am I reading the spec incorrectly when it says "The implicit form
> of Error Synchronization Barrier: [...] Has no effect on DISR_EL1 or
> VDISR_EL2" and I understand this as we wouldn't actually read anything
> from DISR_EL1 if we rely on the IESB?

(This is from section 2.4.5 Extension for barrier at exception entry and exit of
DDI 0587A.)

Well spotted ... that's embarrassing!

The DISR write is in the pseudocode's ESBOperation() which is not the same as
ErrorSynchronizationBarrier(). Running an 'ESB' does both, but an IESB only does
ErrorSynchronizationBarrier().

I think this distinction exists because the CPU may know about RAS errors that
it hasn't yet made into pending SErrors (they must be given a severity for the
ESR by that point).

So IESB makes hidden RAS errors pending SErrors, it doesn't do what ESB does.

Yes, this means the DISR_EL1 check on kernel-entry and guest exit is useless.
Given this, the host kernel entry/exit can be simplified, probably getting rid of
the SError-over-eret horror. I will need to re-think the KVM changes (we may
just need the ESR from the existing vaxorcism code).
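
To illustrate how I now read it, a toy C model (names and structure here are
mine for illustration, they aren't the ARM ARM pseudocode or any kernel
interface):

#include <stdbool.h>
#include <stdint.h>

/* Toy state, just enough to show the difference. */
struct toy_cpu {
	bool	 pstate_a;		/* SError masked */
	bool	 serror_pending;
	uint64_t pending_syndrome;
	uint64_t disr_el1;
};

/*
 * What both the explicit ESB instruction and IESB do: make any
 * architecturally 'hidden' RAS error visible as a pending SError.
 */
static void toy_error_synchronization_barrier(struct toy_cpu *cpu)
{
	/* hidden-error -> pending-SError promotion, modelled elsewhere */
	(void)cpu;
}

/*
 * Only the explicit ESB instruction goes on to defer a masked pending
 * SError into DISR_EL1. IESB stops after the barrier, so it never
 * touches DISR_EL1.
 */
static void toy_esb_instruction(struct toy_cpu *cpu)
{
	toy_error_synchronization_barrier(cpu);

	if (cpu->pstate_a && cpu->serror_pending) {
		cpu->disr_el1 = cpu->pending_syndrome;
		cpu->serror_pending = false;
	} else {
		cpu->disr_el1 = 0;	/* IIRC ESB clears DISR if it defers nothing */
	}
}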


> Second, what if we have several SErrors, and one happens upon entering
> the guest and another one happens when returning from the guest - do we
> end up overwriting the DISR_EL1 by only looking at it during exit and
> potentially miss errors?

There can only be one pending SError at a time, but if we have PSTATE.A set, a
pending SError and a hidden RAS error, then ESB has to pick one to defer,
and IESB has to discard one. I suspect the answer is 'implementation
defined', but I will ask!


>> Future patches may add a firmware-first callout from
>> kvm_handle_deferred_serror() to decode CPER records populated by firmware,
>> or call some arm64 arch code to process the RAS 'ERR' registers for
>> kernel-first handling. Without either of these, we just make a judgement
>> on the severity: corrected and restartable errors are ignored, all others
>> result it an SError being given to the guest.
> 
> *in an* ?


> Why do we give the remaining types of SErrors to the guest?

Just because that is what KVM does today.

> What would the kernel normally do for any other workload than running a VM when
> discovering this type of error?

I'm trying to make that clearer! Today we 'kill the running task'; if it's the
kernel, we would panic(). But because the CPU masks SError on exception entry,
and we never touch PSTATE.A, SError is always masked in the kernel, so we take
the SError and kill the next user space task that gets run.

We should panic() like we do in the early boot code if an SError was pending
from firmware.


Should the host panic because of an SError taken during a guest? Not
necessarily. All the system registers are saved/restored by world-switch, and the
host doesn't depend on anything in guest memory. The host should be immune to
any corruption that occurs while a guest was running.
Gengdongjiu's example of device pass-through is the exception to this reasoning;
I think we need a way for the host to contain/reset pass-through devices that
trigger an SError.



Thanks!

James

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 11/16] arm64: kernel: Handle deferred SError on kernel entry
  2017-07-28 14:10   ` James Morse
@ 2017-08-03 17:03     ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-08-03 17:03 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Christoffer Dall, Marc Zyngier, Catalin Marinas, Will Deacon,
	kvmarm, Wang Xiongfeng

Hello!

On 28/07/17 15:10, James Morse wrote:
> Before we can enable Implicit ESB on exception level change, we need to
> handle deferred SErrors that may appear on exception entry.

Christoffer has pointed out on patch 16 that I've misunderstood IESB's behaviour:
> The implicit form of Error Synchronization Barrier: [...] Has no effect on
> DISR_EL1

Turns out the ARM-ARM pseudocode means subtly different things by 'ESB' and
'ErrorSynchronizationBarrier'.

Patches 11->16 will need rethinking, but it looks like they can be simplified.


Thanks,

James

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit
  2017-08-03 17:03       ` James Morse
@ 2017-08-04 13:12         ` Christoffer Dall
  -1 siblings, 0 replies; 56+ messages in thread
From: Christoffer Dall @ 2017-08-04 13:12 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, Wang Xiongfeng,
	linux-arm-kernel, kvmarm

On Thu, Aug 03, 2017 at 06:03:33PM +0100, James Morse wrote:
> Hi Christoffer,
> 
> On 01/08/17 14:18, Christoffer Dall wrote:
> > On Fri, Jul 28, 2017 at 03:10:19PM +0100, James Morse wrote:
> >> On systems with VHE, the RAS extensions and IESB support, KVM gets an
> >> implicit ESB whenever it enters/exits a guest, because the host sets
> >> SCTLR_EL1.IESB.
> >>
> >> To prevent errors being lost, add code to __guest_exit() to read DISR_EL1,
> >> and save it in the kvm_vcpu_fault_info. Add code to handle_exit() to
> >> process this deferred SError. This data is in addition to the reason the
> >> guest exitted.
> > 
> > Two questions:
> > 
> > First, am I reading the spec incorrectly when it says "The implicit form
> > of Error Synchronization Barrier: [...] Has no effect on DISR_EL1 or
> > VDISR_EL2" and I understand this as we wouldn't actually read anything
> > from DISR_EL1 if we rely on the IESB?
> 
> (This is from section 2.4.5 Extension for barrier at exception entry and exit of
> DDI 0587A.)
> 
> Well spotted ... that's embarrassing!

Not at all, that spec is a little dense.

> 
> The DISR write is in the pseudocode's ESBOperation() which is not the same as
> ErrorSynchronizationBarrier(). Running an 'ESB' does both, but an IESB only does
> ErrorSynchronizationBarrier().
> 
> I think this distinction is because the CPU may know about RAS errors it hasn't
> yet made pending SErrors. (they must have to have a severity for the ESR by this
> point).
> 
> So IESB makes hidden RAS errors pending SErrors, it doesn't do what ESB does.
> 
> Yes, this means the DISR_EL1 check on kernel-entry and guest exit is useless.
> Given this the host kernel entry/exit can be simplified, probably getting rid of
> the SError over eret horror. I will need to re-think the KVM changes, (we may
> just need the ESR from the existing vaxorcism code).
> 
> 
> > Second, what if we have several SErrors, and one happens upon entering
> > the guest and another one happens when returning from the guest - do we
> > end up overwriting the DISR_EL1 by only looking at it during exit and
> > potentially miss errors?
> 
> There can only be one pending SError at a time, but if we have PSTATE.A set, a
> pending SError and a hidden RAS error, then ESB must have to pick one to defer,
> and IESB must have to discard one. I suspect the answer is 'implementation
> defined', but I will ask!
> 

As long as we're doing what we can, and we're not missing something that
the architecture gives us a way to retrieve, then that's probably the
best we can do.

> 
> >> Future patches may add a firmware-first callout from
> >> kvm_handle_deferred_serror() to decode CPER records populated by firmware,
> >> or call some arm64 arch code to process the RAS 'ERR' registers for
> >> kernel-first handling. Without either of these, we just make a judgement
> >> on the severity: corrected and restartable errors are ignored, all others
> >> result it an SError being given to the guest.
> > 
> > *in an* ?
> 
> 
> > Why do we give the remaining types of SErrors to the guest?
> 
> Just because that is what KVM does today.
> 
> > What would the kernel normally do for any other workload than running a VM when
> > discovering this type of error?
> 
> I'm trying to make that clearer! Today we 'kill the running task', if its the
> kernel, we would panic(). But because the CPU masks SError on exception entry,
> and we never touch PSTATE.A, its always masked in the kernel, so we take the
> SError and kill the next user space task that gets run.
> 
> We should panic() like we do in the early boot code if an SError was pending
> from firmware.
> 
> 
> Should the host panic because of an SError taken during a guest?, not
> necessarily. All the system registers are save/restored by world-switch, and the
> host doesn't depend on anything in guest memory. The host should be immune to
> any corruption that occurs while a guest was running.
> Gengdongjiu's example of device pass-through is the exception to this reasoning,
> I think we need a way for the host to contain/reset pass-through devices that
> trigger an SError.
> 

I'm not an expert on what can generate the SError.  If it's because the
guest misprogrammed a system register, then it makes sense to just tell
the guest.

However, if this could be related to corrupted memory, or a CPU fault,
or really any resource that the guest is using which can be used by the
host later on (memory, CPU, GIC, passthrough devices, ...) then it feels
a little dangerous to just signal the guest and carry on.

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/16] arm64: entry.S: convert elX_sync
  2017-07-28 14:10   ` James Morse
@ 2017-08-09 17:25     ` Catalin Marinas
  -1 siblings, 0 replies; 56+ messages in thread
From: Catalin Marinas @ 2017-08-09 17:25 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, Will Deacon, kvmarm, Wang Xiongfeng, linux-arm-kernel

Hi James,

On Fri, Jul 28, 2017 at 03:10:09PM +0100, James Morse wrote:
> @@ -520,9 +514,16 @@ el1_preempt:
>  el0_sync:
>  	kernel_entry 0
>  	mrs	x25, esr_el1			// read the syndrome register
> +	mrs	x26, far_el1

Just checking, since we are going to access far_el1 even when we get a
syscall, have you noticed any overhead?

-- 
Catalin

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/16] arm64: entry.S: convert elX_sync
  2017-08-09 17:25     ` Catalin Marinas
@ 2017-08-10 16:57       ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-08-10 16:57 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Marc Zyngier, Will Deacon, kvmarm, Wang Xiongfeng, linux-arm-kernel

Hi Catalin,

On 09/08/17 18:25, Catalin Marinas wrote:
> On Fri, Jul 28, 2017 at 03:10:09PM +0100, James Morse wrote:
>> @@ -520,9 +514,16 @@ el1_preempt:
>>  el0_sync:
>>  	kernel_entry 0
>>  	mrs	x25, esr_el1			// read the syndrome register
>> +	mrs	x26, far_el1
> 
> Just checking, since we are going to access far_el1 even when we get a
> syscall, have you noticed any overhead?

Good point, I haven't checked because I've been doing all this with the software
model.

I will set this running on Seattle overnight, results in v3's cover letter.


Thanks!

James

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 06/16] arm64: entry.S: convert elX_sync
  2017-08-10 16:57       ` James Morse
@ 2017-08-11 17:24         ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-08-11 17:24 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Marc Zyngier, Will Deacon, kvmarm, linux-arm-kernel, Wang Xiongfeng

Hi Catalin,

On 10/08/17 17:57, James Morse wrote:
> On 09/08/17 18:25, Catalin Marinas wrote:
>> On Fri, Jul 28, 2017 at 03:10:09PM +0100, James Morse wrote:
>>> @@ -520,9 +514,16 @@ el1_preempt:
>>>  el0_sync:
>>>  	kernel_entry 0
>>>  	mrs	x25, esr_el1			// read the syndrome register
>>> +	mrs	x26, far_el1
>>
>> Just checking, since we are going to access far_el1 even when we get a
>> syscall, have you noticed any overhead?

(I can get rid of the extra far_el1 reads by doing a better job of this patch.)


> Good point, I haven't checked because I've been doing all this with the software
> model.
> 
> I will set this running on Seattle overnight, results in v3's cover letter.

So the series does make microbenchmarks like calling getpid() in a loop slower,
but it's not the far_el1 read causing this, it's the unconditional masking of
exceptions in kernel_exit. This doesn't show up once I start doing real work
(like fork or exec).
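
(For reference, the microbenchmark was nothing more sophisticated than the
sketch below; the iteration count and the use of raw syscall() are just
illustrative.)

#include <stdio.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#define ITERATIONS	10000000UL

int main(void)
{
	struct timespec start, end;
	unsigned long i;
	double ns;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < ITERATIONS; i++)
		syscall(SYS_getpid);	/* force a real syscall every time */
	clock_gettime(CLOCK_MONOTONIC, &end);

	ns = (end.tv_sec - start.tv_sec) * 1e9 + (end.tv_nsec - start.tv_nsec);
	printf("%.1f ns per getpid() syscall\n", ns / ITERATIONS);

	return 0;
}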

I may be able to get rid of this but keep SError unmasked in the kernel and
masked over eret by merging EL0-returns disable_daif with its existing
irq-masked ret-to-user loop.



Thanks,

James

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 10/16] arm64: kernel: Survive corrected RAS errors notified by SError
  2017-07-28 14:10   ` James Morse
@ 2017-09-13 20:52     ` Baicar, Tyler
  -1 siblings, 0 replies; 56+ messages in thread
From: Baicar, Tyler @ 2017-09-13 20:52 UTC (permalink / raw)
  To: James Morse, linux-arm-kernel
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, kvmarm, Wang Xiongfeng

On 7/28/2017 8:10 AM, James Morse wrote:
> On v8.0, SError is an uncontainable fatal exception. The v8.2 RAS
> extensions use SError to notify software about RAS errors, these can be
> contained by the ESB instruction.
>
> An ACPI system with firmware-first may use SError as its 'SEI'
> notification. Future patches may add code to 'claim' this SError as
> notification.
>
> Other systems can distinguish these RAS errors from the SError ESR and
> use the AET bits and additional data from RAS-Error registers to handle
> the error.  Future patches may add this kernel-first handling.
>
> In the meantime, on both kinds of system we can safely ignore corrected
> errors.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>   arch/arm64/include/asm/esr.h | 10 ++++++++++
>   arch/arm64/kernel/traps.c    | 35 ++++++++++++++++++++++++++++++++---
>   2 files changed, 42 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
> index 8cabd57b6348..77d5b1baf1a4 100644
> --- a/arch/arm64/include/asm/esr.h
> +++ b/arch/arm64/include/asm/esr.h
> @@ -83,6 +83,15 @@
>   /* ISS field definitions shared by different classes */
>   #define ESR_ELx_WNR		(UL(1) << 6)
>   
> +/* Asynchronous Error Type */
> +#define ESR_ELx_AET		(UL(0x7) << 10)
> +
> +#define ESR_ELx_AET_UC		(UL(0) << 10)	/* Uncontainable */
> +#define ESR_ELx_AET_UEU		(UL(1) << 10)	/* Uncorrected Unrecoverable */
> +#define ESR_ELx_AET_UEO		(UL(2) << 10)	/* Uncorrected Restartable */
> +#define ESR_ELx_AET_UER		(UL(3) << 10)	/* Uncorrected Recoverable */
> +#define ESR_ELx_AET_CE		(UL(6) << 10)	/* Corrected */
> +
>   /* Shared ISS field definitions for Data/Instruction aborts */
>   #define ESR_ELx_FnV		(UL(1) << 10)
>   #define ESR_ELx_EA		(UL(1) << 9)
> @@ -92,6 +101,7 @@
>   #define ESR_ELx_FSC		(0x3F)
>   #define ESR_ELx_FSC_TYPE	(0x3C)
>   #define ESR_ELx_FSC_EXTABT	(0x10)
> +#define ESR_ELx_FSC_SERROR	(0x11)
>   #define ESR_ELx_FSC_ACCESS	(0x08)
>   #define ESR_ELx_FSC_FAULT	(0x04)
>   #define ESR_ELx_FSC_PERM	(0x0C)
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 943a0e242dbc..e1eaccc66548 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -685,10 +685,8 @@ asmlinkage void bad_el0_sync(struct pt_regs *regs, int reason, unsigned int esr)
>   	force_sig_info(info.si_signo, &info, current);
>   }
>   
> -asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
> +static void do_serror_panic(struct pt_regs *regs, unsigned int esr)
>   {
> -	nmi_enter();
> -
>   	console_verbose();
>   
>   	pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n",
> @@ -696,6 +694,37 @@ asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
>   	__show_regs(regs);
>   
>   	nmi_panic(regs, "Asynchronous SError Interrupt");
> +}
> +
> +static void _do_serror(struct pt_regs *regs, unsigned int esr)
> +{
> +	bool impdef_syndrome = esr & ESR_ELx_ISV;	/* aka IDS */
> +	unsigned int aet = esr & ESR_ELx_AET;
> +
> +	if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN) || impdef_syndrome)
> +		return do_serror_panic(regs, esr);
> +
> +	/*
> +	 * AET is RES0 if 'the value returned in the DFSC field is not
> +	 * [ESR_ELx_FSC_SERROR]'
> +	 */
> +	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)
> +		return do_serror_panic(regs, esr);
> +
> +	switch (aet) {
Hello James,

Here you just have corrected and restartable errors being ignored and
all other errors causing a panic. For corrected and restartable errors, we
should at least be logging that an error happened and provide the syndrome
info (address, context, etc.). We also should be triggering a trace event to
notify user space that an error happened so that tools like RAS Daemon can
report the error. This will involve a new trace event since the current ones
are based on the CPER structures from the firmware-first case.
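
As a rough illustration of the sort of non-CPER trace event being asked for
here; the event name and fields are hypothetical, and the usual TRACE_SYSTEM /
CREATE_TRACE_POINTS boilerplate for a trace header is omitted:

#include <linux/tracepoint.h>
#include <asm/esr.h>

TRACE_EVENT(arm64_serror,

	TP_PROTO(int cpu, unsigned int esr),

	TP_ARGS(cpu, esr),

	TP_STRUCT__entry(
		__field(int, cpu)
		__field(unsigned int, esr)
	),

	TP_fast_assign(
		__entry->cpu = cpu;
		__entry->esr = esr;
	),

	/* Print the raw ESR plus the AET severity bits defined in this patch */
	TP_printk("cpu=%d esr=0x%08x aet=0x%lx",
		  __entry->cpu, __entry->esr,
		  (unsigned long)(__entry->esr & ESR_ELx_AET))
);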

Recoverable UEs should not need to trigger a panic; we should be able to do
the recovery similarly to the memory fault handling in the mm/memory-failure.c
code. Recoverable UEs should also trigger a trace event to user space, since
they won't cause a panic either.
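
A rough sketch of that direction, assuming a hypothetical
arm64_serror_recover() helper that would eventually feed into the
mm/memory-failure.c machinery once the affected address is known; this is
not part of the series:

static void _do_serror(struct pt_regs *regs, unsigned int esr)
{
	bool impdef_syndrome = esr & ESR_ELx_ISV;	/* aka IDS */
	unsigned int aet = esr & ESR_ELx_AET;

	if (!cpus_have_const_cap(ARM64_HAS_RAS_EXTN) || impdef_syndrome)
		return do_serror_panic(regs, esr);

	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR)
		return do_serror_panic(regs, esr);

	switch (aet) {
	case ESR_ELx_AET_CE:	/* corrected error */
	case ESR_ELx_AET_UEO:	/* restartable, not yet consumed */
		break;
	case ESR_ELx_AET_UER:	/* uncorrected, recoverable */
		/*
		 * Hypothetical: needs the failing address from the RAS
		 * ERR<n> registers before any memory-failure style
		 * recovery can be attempted.
		 */
		arm64_serror_recover(regs, esr);
		break;
	default:
		return do_serror_panic(regs, esr);
	}
}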

Thanks,
Tyler
> +	case ESR_ELx_AET_CE:	/* corrected error */
> +	case ESR_ELx_AET_UEO:	/* restartable, not yet consumed */
> +		break;
> +	default:
> +		return do_serror_panic(regs, esr);
> +	}
> +}
> +
> +asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
> +{
> +	nmi_enter();
> +
> +	_do_serror(regs, esr);
>   
>   	nmi_exit();
>   }

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 10/16] arm64: kernel: Survive corrected RAS errors notified by SError
  2017-09-13 20:52     ` Baicar, Tyler
@ 2017-09-14 12:58       ` James Morse
  -1 siblings, 0 replies; 56+ messages in thread
From: James Morse @ 2017-09-14 12:58 UTC (permalink / raw)
  To: Baicar, Tyler
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, kvmarm,
	Wang Xiongfeng, linux-arm-kernel

Hi Tyler,

On 13/09/17 21:52, Baicar, Tyler wrote:
> On 7/28/2017 8:10 AM, James Morse wrote:
>> On v8.0, SError is an uncontainable fatal exception. The v8.2 RAS
>> extensions use SError to notify software about RAS errors, these can be
>> contained by the ESB instruction.
>>
>> An ACPI system with firmware-first may use SError as its 'SEI'
>> notification. Future patches may add code to 'claim' this SError as
>> notification.
>>
>> Other systems can distinguish these RAS errors from the SError ESR and
>> use the AET bits and additional data from RAS-Error registers to handle
>> the error.  Future patches may add this kernel-first handling.
>>
>> In the meantime, on both kinds of system we can safely ignore corrected
>> errors.

> Here you just have corrected and restartable errors being ignored and all other
> errors panic. For corrected and restartable errors, we should at least be
> logging that an error happened and provide the syndrome info (address, context,
> etc.). 

Yes, that would be great, but it's all wrapped up in 'kernel-first handling'
for RAS... which we don't yet have.

This series is 'fixing' the kernel's SError mask behaviour so that the SEI
firmware-first mechanism can (almost) always deliver its notifications, and
has somewhere to hook the APEI code into (like you did for do_sea()).
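
A sketch of where such a hook could sit, modelled on the existing SEA
notification; ghes_notify_sei() and CONFIG_ACPI_APEI_SEI are hypothetical
names here, the real hook would come with the APEI SEI support:

asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
{
	nmi_enter();

	/*
	 * Give APEI firmware-first a chance to claim this SError as an SEI
	 * notification; fall back to the in-kernel severity check if it
	 * isn't claimed (or the hypothetical SEI support isn't built in).
	 */
	if (!IS_ENABLED(CONFIG_ACPI_APEI_SEI) || ghes_notify_sei())
		_do_serror(regs, esr);

	nmi_exit();
}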

Of course not all systems will have this firmware, so if we took a v8.2 RAS
SError on bare-metal we need to do something. This selective-ignoring is an
interim fudge to avoid bringing the machine down for something that isn't (yet?)
a problem.


From the commit message:
> Future patches may add this kernel-first handling.
> In the meantime, on both kinds of system we can safely ignore corrected
> errors.


> We also should be triggering a trace event to notify the user space that
> an error happened so that tools like RAS Daemon can report the error. This will
> involve a new trace event since the current ones are based of the CPER
> structures from the firmware-first case.

Hmm, so RAS Daemon is going to end up knowing whether an error was handled
kernel-first or firmware-first; that is unfortunate for RAS Daemon (more code)
and means we have duplicate trace points.


> Recoverable UEs should not need to trigger the panic, we should be able to do
> the recovery similar to the memory fault handling in mm/memory-failure.c code.
> The recoverable UEs should also trigger a trace event to user space since they
> won't cause a panic as well.

I agree, but only once we have code to dig into v8.2's RAS ERR registers to
pick out the class of error and the affected component or address. Until then
we can't know the component or address, so can't handle the error.
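
For a sense of what digging into the ERR registers involves, a very rough
sketch that walks the standard error records; it assumes sysreg accessor
definitions for the RAS registers (SYS_ERRIDR_EL1, SYS_ERRSELR_EL1,
SYS_ERXSTATUS_EL1, SYS_ERXADDR_EL1) exist, the field masks below are from
the RAS spec, and the helper name is made up:

#define ERR_STATUS_AV	(1UL << 31)	/* Address valid */
#define ERR_STATUS_V	(1UL << 30)	/* Status valid */
#define ERR_STATUS_UE	(1UL << 29)	/* Uncorrected error */

static void arm64_dump_ras_error_records(void)
{
	u64 i, nr_records = read_sysreg_s(SYS_ERRIDR_EL1) & 0xffff;

	for (i = 0; i < nr_records; i++) {
		u64 status, addr = 0;

		write_sysreg_s(i, SYS_ERRSELR_EL1);	/* select record i */
		isb();
		status = read_sysreg_s(SYS_ERXSTATUS_EL1);
		if (!(status & ERR_STATUS_V))
			continue;
		if (status & ERR_STATUS_AV)
			addr = read_sysreg_s(SYS_ERXADDR_EL1);

		pr_err("RAS record %llu: status=0x%llx addr=0x%llx%s\n",
		       i, status, addr,
		       (status & ERR_STATUS_UE) ? " (uncorrected)" : "");
	}
}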

This is still an improvement over a non-v8.2-RAS-aware kernel, as that would
panic() for corrected errors too (depending on when they arrived ... the
SError masking is somewhat broken).


Thanks,

James

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2017-09-14 12:58 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-07-28 14:10 [PATCH v2 00/16] SError rework + v8.2 RAS and IESB cpufeature support James Morse
2017-07-28 14:10 ` James Morse
2017-07-28 14:10 ` [PATCH v2 01/16] arm64: explicitly mask all exceptions James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 02/16] arm64: introduce an order for exceptions James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 03/16] arm64: unmask all exceptions from C code on CPU startup James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 04/16] arm64: entry.S: mask all exceptions during kernel_exit James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 05/16] arm64: entry.S: move enable_step_tsk into kernel_exit James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 06/16] arm64: entry.S: convert elX_sync James Morse
2017-07-28 14:10   ` James Morse
2017-08-09 17:25   ` Catalin Marinas
2017-08-09 17:25     ` Catalin Marinas
2017-08-10 16:57     ` James Morse
2017-08-10 16:57       ` James Morse
2017-08-11 17:24       ` James Morse
2017-08-11 17:24         ` James Morse
2017-07-28 14:10 ` [PATCH v2 07/16] arm64: entry.S: convert elX_irq James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 08/16] arm64: entry.S: move SError handling into a C function for future expansion James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 09/16] arm64: cpufeature: Detect CPU RAS Extentions James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 10/16] arm64: kernel: Survive corrected RAS errors notified by SError James Morse
2017-07-28 14:10   ` James Morse
2017-09-13 20:52   ` Baicar, Tyler
2017-09-13 20:52     ` Baicar, Tyler
2017-09-14 12:58     ` James Morse
2017-09-14 12:58       ` James Morse
2017-07-28 14:10 ` [PATCH v2 11/16] arm64: kernel: Handle deferred SError on kernel entry James Morse
2017-07-28 14:10   ` James Morse
2017-08-03 17:03   ` James Morse
2017-08-03 17:03     ` James Morse
2017-07-28 14:10 ` [PATCH v2 12/16] arm64: entry.S: Make eret restartable James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 13/16] arm64: cpufeature: Enable Implicit ESB on entry/return-from EL1 James Morse
2017-07-28 14:10   ` James Morse
2017-07-28 14:10 ` [PATCH v2 14/16] KVM: arm64: Take pending SErrors on entry to the guest James Morse
2017-07-28 14:10   ` James Morse
2017-08-01 12:53   ` Christoffer Dall
2017-08-01 12:53     ` Christoffer Dall
2017-07-28 14:10 ` [PATCH v2 15/16] KVM: arm64: Save ESR_EL2 on guest SError James Morse
2017-07-28 14:10   ` James Morse
2017-08-01 13:25   ` Christoffer Dall
2017-08-01 13:25     ` Christoffer Dall
2017-07-28 14:10 ` [PATCH v2 16/16] KVM: arm64: Handle deferred SErrors consumed on guest exit James Morse
2017-07-28 14:10   ` James Morse
2017-08-01 13:18   ` Christoffer Dall
2017-08-01 13:18     ` Christoffer Dall
2017-08-03 17:03     ` James Morse
2017-08-03 17:03       ` James Morse
2017-08-04 13:12       ` Christoffer Dall
2017-08-04 13:12         ` Christoffer Dall
